merge gpt-pilot 0.2 codebase

This is a complete rewrite of the GPT Pilot core, making the
agentic architecture front and center and fixing some long-standing
problems with the database architecture that weren't feasible to
solve without breaking compatibility.

As the database structure and config file syntax have changed,
automatic imports are provided for existing projects and current
configs; see the README.md file for details.

This also relicenses the project under the FSL-1.1-MIT license.
Senko Rasic
2024-05-22 21:42:25 +02:00
parent 391998ab67
commit 5b474ccc1f
203 changed files with 15412 additions and 0 deletions

35
.github/workflows/ci.yml vendored Normal file

@@ -0,0 +1,35 @@
name: Run unit tests
on:
push:
branches: [ "main" ]
pull_request:
branches: [ "main" ]
jobs:
build:
runs-on: ${{ matrix.os }}
timeout-minutes: 10
strategy:
fail-fast: false
matrix:
python-version: ["3.9", "3.12"]
os: [ubuntu-latest, macos-latest, windows-latest]
steps:
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
run: |
python -m pip install poetry
poetry install --with=dev
- name: Lint with ruff
run: poetry run ruff check --output-format github
- name: Check code style with ruff
run: poetry run ruff format --check --diff
- name: Test with pytest
run: poetry run pytest

18
.gitignore vendored Normal file

@@ -0,0 +1,18 @@
__pycache__/
.venv/
.vscode/
.idea/
htmlcov/
dist/
workspace/
.coverage
*.code-workspace
.*_cache
.env
*.pyc
*.db
config.json
poetry.lock
.DS_Store
*.log

21
.pre-commit-config.yaml Normal file

@@ -0,0 +1,21 @@
fail_fast: true
repos:
- repo: https://github.com/astral-sh/ruff-pre-commit
# Ruff version.
rev: v0.3.5
hooks:
# Run the linter.
- id: ruff
args: [ --fix ]
# Run the formatter.
- id: ruff-format
- repo: local
hooks:
# Run the tests
- id: pytest
name: pytest
stages: [commit]
types: [python]
entry: pytest
language: system
pass_filenames: false

110
LICENSE Normal file

@@ -0,0 +1,110 @@
# Functional Source License, Version 1.1, MIT Future License
## Abbreviation
FSL-1.1-MIT
## Notice
Copyright 2024 Pythagora Technologies, Inc.
## Terms and Conditions
### Licensor ("We")
The party offering the Software under these Terms and Conditions.
### The Software
The "Software" is each version of the software that we make available under
these Terms and Conditions, as indicated by our inclusion of these Terms and
Conditions with the Software.
### License Grant
Subject to your compliance with this License Grant and the Patents,
Redistribution and Trademark clauses below, we hereby grant you the right to
use, copy, modify, create derivative works, publicly perform, publicly display
and redistribute the Software for any Permitted Purpose identified below.
### Permitted Purpose
A Permitted Purpose is any purpose other than a Competing Use. A Competing Use
means making the Software available to others in a commercial product or
service that:
1. substitutes for the Software;
2. substitutes for any other product or service we offer using the Software
that exists as of the date we make the Software available; or
3. offers the same or substantially similar functionality as the Software.
Permitted Purposes specifically include using the Software:
1. for your internal use and access;
2. for non-commercial education;
3. for non-commercial research; and
4. in connection with professional services that you provide to a licensee
using the Software in accordance with these Terms and Conditions.
### Patents
To the extent your use for a Permitted Purpose would necessarily infringe our
patents, the license grant above includes a license under our patents. If you
make a claim against any party that the Software infringes or contributes to
the infringement of any patent, then your patent license to the Software ends
immediately.
### Redistribution
The Terms and Conditions apply to all copies, modifications and derivatives of
the Software.
If you redistribute any copies, modifications or derivatives of the Software,
you must include a copy of or a link to these Terms and Conditions and not
remove any copyright notices provided in or with the Software.
### Disclaimer
THE SOFTWARE IS PROVIDED "AS IS" AND WITHOUT WARRANTIES OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING WITHOUT LIMITATION WARRANTIES OF FITNESS FOR A PARTICULAR
PURPOSE, MERCHANTABILITY, TITLE OR NON-INFRINGEMENT.
IN NO EVENT WILL WE HAVE ANY LIABILITY TO YOU ARISING OUT OF OR RELATED TO THE
SOFTWARE, INCLUDING INDIRECT, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES,
EVEN IF WE HAVE BEEN INFORMED OF THEIR POSSIBILITY IN ADVANCE.
### Trademarks
Except for displaying the License Details and identifying us as the origin of
the Software, you have no right under these Terms and Conditions to use our
trademarks, trade names, service marks or product names.
## Grant of Future License
We hereby irrevocably grant you an additional license to use the Software under
the MIT license that is effective on the second anniversary of the date we make
the Software available. On or after that date, you may use the Software under
the MIT license, in which case the following will apply:
Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
of the Software, and to permit persons to whom the Software is furnished to do
so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

238
README.md Normal file

@@ -0,0 +1,238 @@
<div align="center">
# 🧑‍✈️ GPT PILOT 🧑‍✈️
</div>
---
<div align="center">
[![Discord Follow](https://dcbadge.vercel.app/api/server/HaqXugmxr9?style=flat)](https://discord.gg/HaqXugmxr9)
[![GitHub Repo stars](https://img.shields.io/github/stars/Pythagora-io/gpt-pilot?style=social)](https://github.com/Pythagora-io/gpt-pilot)
[![Twitter Follow](https://img.shields.io/twitter/follow/HiPythagora?style=social)](https://twitter.com/HiPythagora)
</div>
---
<div align="center">
<a href="https://www.ycombinator.com/" target="_blank"><img src="https://s3.amazonaws.com/assets.pythagora.ai/yc/PNG/Black.png" alt="Pythagora-io%2Fgpt-pilot | Trendshift" style="width: 250px; height: 93px;"/></a>
</div>
<br>
<div align="center">
<a href="https://trendshift.io/repositories/466" target="_blank"><img src="https://trendshift.io/api/badge/repositories/466" alt="Pythagora-io%2Fgpt-pilot | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
</div>
<br>
<br>
<div align="center">
### GPT Pilot doesn't just generate code, it builds apps!
</div>
---
<div align="center">
[![See it in action](https://i3.ytimg.com/vi/4g-1cPGK0GA/maxresdefault.jpg)](https://youtu.be/4g-1cPGK0GA)
(click to watch the video on YouTube, 1:40 min)
</div>
---
<div align="center">
<a href="vscode:extension/PythagoraTechnologies.gpt-pilot-vs-code" target="_blank"><img src="https://github.com/Pythagora-io/gpt-pilot/assets/10895136/5792143e-77c7-47dd-ad96-6902be1501cd" alt="Pythagora-io%2Fgpt-pilot | Trendshift" style="width: 185px; height: 55px;" width="185" height="55"/></a>
</div>
GPT Pilot is the core technology for the [Pythagora VS Code extension](https://bit.ly/3IeZxp6) that aims to provide **the first real AI developer companion**. Not just an autocomplete or a helper for PR messages but rather a real AI developer that can write full features, debug them, talk to you about issues, ask for review, etc.
---
📫 If you would like to get updates on future releases or just get in touch, join our [Discord server](https://discord.gg/HaqXugmxr9) or you [can add your email here](http://eepurl.com/iD6Mpo). 📬
---
<!-- TOC -->
* [🔌 Requirements](#-requirements)
* [🚦How to start using gpt-pilot?](#how-to-start-using-gpt-pilot)
* [🔎 Examples](#-examples)
* [🐳 How to start gpt-pilot in docker?](#-how-to-start-gpt-pilot-in-docker)
* [🧑‍💻️ CLI arguments](#-cli-arguments)
* [🏗 How does GPT Pilot work?](#-how-does-gpt-pilot-work)
* [🕴How's GPT Pilot different from _Smol developer_ and _GPT engineer_?](#hows-gpt-pilot-different-from-smol-developer-and-gpt-engineer)
* [🍻 Contributing](#-contributing)
* [🔗 Connect with us](#-connect-with-us)
* [🌟 Star history](#-star-history)
<!-- TOC -->
---
GPT Pilot aims to research how much LLMs can be utilized to generate fully working, production-ready apps while the developer oversees the implementation.
**The main idea is that AI can write most of the code for an app (maybe 95%), but for the remaining 5%, a developer is and will be needed until we get full AGI**.
If you are interested in our learnings during this project, you can check [our latest blog posts](https://blog.pythagora.ai/2024/02/19/gpt-pilot-what-did-we-learn-in-6-months-of-working-on-a-codegen-pair-programmer/).
---
<br>
<div align="center">
### **[👉 Examples of apps written by GPT Pilot 👈](https://github.com/Pythagora-io/gpt-pilot/wiki/Apps-created-with-GPT-Pilot)**
</div>
<br>
---
# 🔌 Requirements
- **Python 3.9+**
# 🚦How to start using gpt-pilot?
👉 If you are using VS Code as your IDE, the easiest way to start is by downloading [GPT Pilot VS Code extension](https://bit.ly/3IeZxp6). 👈
Otherwise, you can use the CLI tool.
### If you're new to GPT Pilot:
After you have Python and (optionally) PostgreSQL installed, follow these steps:
1. `git clone https://github.com/Pythagora-io/gpt-pilot.git` (clone the repo)
2. `cd gpt-pilot` (go to the repo folder)
3. `python -m venv venv` (create a virtual environment)
4. `source venv/bin/activate` (or on Windows `venv\Scripts\activate`) (activate the virtual environment)
5. `pip install -r requirements.txt` (install the dependencies)
6. `cp example-config.json config.json` (create `config.json` file)
7. Set your key and other settings in the `config.json` file (a minimal sketch is shown below):
- LLM provider (`openai`, `anthropic` or `groq`) key and endpoints (leave `null` for the defaults); note that Azure and OpenRouter are supported via the `openai` setting
- your API key (if `null`, it will be read from the environment variables)
- database settings: SQLite is used by default, PostgreSQL should also work
- optionally, update `fs.ignore_paths` to add files or folders that GPT Pilot shouldn't track in the workspace; this is useful for ignoring folders created by compilers
8. `python main.py` (start GPT Pilot)
All generated code will be stored in the `workspace` folder, in a subfolder named after the app name you enter when starting the pilot.
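For illustration, here is a hypothetical sketch of what such a config could look like (the key names here are illustrative, not authoritative; `example-config.json` in the repo is the definitive reference):
```json
{
  "llm": {
    "openai": {
      "base_url": null,
      "api_key": null
    }
  },
  "db": {
    "url": "sqlite:///pythagora.db"
  },
  "fs": {
    "ignore_paths": ["node_modules", "dist"]
  }
}
```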
### If you're upgrading from GPT Pilot v0.1
Assuming you already have the git repository with an earlier version:
1. `git pull` (update the repo)
2. `source pilot-env/bin/activate` (or on Windows `pilot-env\Scripts\activate`) (activate the virtual environment)
3. `pip install -r requirements.txt` (install the new dependencies)
4. `python main.py --import-v0 pilot/gpt-pilot` (this should import your settings and existing projects)
This will create a new database `pythagora.db` and import all apps from the old database. For each app,
it will import the start of the latest task you were working on.
To verify that the import was successful, you can run `python main.py --list` to see all the apps you have created,
and inspect `config.json` to verify the settings were correctly converted to the new config file format (making
any adjustments if needed).
# 🔎 [Examples](https://github.com/Pythagora-io/gpt-pilot/wiki/Apps-created-with-GPT-Pilot)
[Click here](https://github.com/Pythagora-io/gpt-pilot/wiki/Apps-created-with-GPT-Pilot) to see all example apps created with GPT Pilot.
## 🐳 How to start gpt-pilot in docker?
1. `git clone https://github.com/Pythagora-io/gpt-pilot.git` (clone the repo)
2. Update the environment variables in `docker-compose.yml` (you can inspect the resulting configuration with `docker compose config`). If you wish to use a local model, please go to [https://localai.io/basics/getting_started/](https://localai.io/basics/getting_started/).
3. By default, GPT Pilot will read & write to `~/gpt-pilot-workspace` on your machine; you can also change this in `docker-compose.yml`.
4. Run `docker compose build`. This will build the gpt-pilot container.
5. Run `docker compose up`.
6. Access the web terminal on port `7681`.
7. `python main.py` (start GPT Pilot)
This will start two containers, one being a new image built by the `Dockerfile` and a Postgres database. The new image also has [ttyd](https://github.com/tsl0922/ttyd) installed so that you can easily interact with gpt-pilot. Node is also installed on the image and port 3000 is exposed.
# 🧑‍💻️ CLI arguments
### List created projects (apps)
```bash
python main.py --list
```
Note: for each project (app), this also lists "branches". Currently we only support having one branch (called "main"), and in the future we plan to add support for multiple project branches.
### Load and continue from the latest step in a project (app)
```bash
python main.py --project <app_id>
```
### Load and continue from a specific step in a project (app)
```bash
python main.py --project <app_id> --step <step>
```
Warning: this will delete all progress after the specified step!
### Delete project (app)
```bash
python main.py --delete <app_id>
```
Delete project with the specified `app_id`. Warning: this cannot be undone!
### Import projects from v0.1
```bash
python main.py --import-v0 <path>
```
This will import projects from the old GPT Pilot v0.1 database. The path should point to the old GPT Pilot v0.1 database. For each project, it will import the start of the latest task you were working on. If a project was already imported, the import procedure will skip it (it won't overwrite the project in the database).
### Other command-line options
There are several other command-line options that mostly support calling GPT Pilot from our VSCode extension. To see all the available options, use the `--help` flag:
```bash
python main.py --help
```
# 🏗 How does GPT Pilot work?
Here are the steps GPT Pilot takes to create an app:
1. You enter the app name and the description.
2. **Product Owner agent**, like in real life, does nothing. :)
3. **Specification Writer agent** asks a couple of questions to understand the requirements better if the project description is not detailed enough.
4. **Architect agent** writes up the technologies that will be used for the app, checks whether they are installed on the machine, and installs any that are missing.
5. **Tech Lead agent** writes up development tasks that the Developer must implement.
6. **Developer agent** takes each task and writes up what needs to be done to implement it. The description is in human-readable form.
7. **Code Monkey agent** takes the Developer's description and the existing file and implements the changes.
8. **Reviewer agent** reviews every step of the task and, if something is done wrong, sends it back to the Code Monkey.
9. **Troubleshooter agent** helps you give good feedback to GPT Pilot when something goes wrong.
10. **Debugger agent**: you hate to see him, but he is your best friend when things go south.
11. **Technical Writer agent** writes documentation for the project.
<br>
# 🕴How's GPT Pilot different from _Smol developer_ and _GPT engineer_?
- **GPT Pilot works with the developer to create a fully working production-ready app** - I don't think AI can (at least in the near future) create apps without a developer being involved. So, **GPT Pilot codes the app step by step** just like a developer would in real life. This way, it can debug issues as they arise throughout the development process. If it gets stuck, you, the developer in charge, can review the code and fix the issue. Other similar tools give you the entire codebase at once - this way, bugs are much harder to fix for AI and for you as a developer.
<br><br>
- **Works at scale** - GPT Pilot isn't meant just for creating simple apps; it's designed to work at any scale. It has mechanisms that filter the codebase so that, in each LLM conversation, it doesn't need to keep the entire codebase in context: it shows the LLM only the code relevant to the task it's currently working on. Once an app is finished, you can continue working on it by writing instructions for the features you want to add.
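As a simplified sketch of that filtering step (the actual logic lives in `get_relevant_files()` in `core/agents/developer.py`), the selection boils down to keeping only the LLM-suggested paths that exist in the project:

```python
def select_relevant_files(existing_files: set[str], llm_suggestions: list[str]) -> list[str]:
    """Keep only the LLM-suggested paths that actually exist in the project."""
    return [path for path in llm_suggestions if path in existing_files]
```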
# 🍻 Contributing
If you are interested in contributing to GPT Pilot, join [our Discord server](https://discord.gg/HaqXugmxr9), check out open [GitHub issues](https://github.com/Pythagora-io/gpt-pilot/issues), and see if anything interests you. We would be happy to get help in resolving any of those. The best place to start is by reviewing the blog posts mentioned above to understand how the architecture works before diving into the codebase.
## 🖥 Development
Other than the research, GPT Pilot needs to be debugged to work in different scenarios. For example, we realized that the quality of the generated code is very sensitive to the size of the development task. When the task is too broad, the code has too many bugs that are hard to fix, but when the task is too narrow, GPT also seems to struggle to implement it into the existing code.
## 📊 Telemetry
To improve GPT Pilot, we are tracking some events from which you can opt out at any time. You can read more about it [here](./docs/TELEMETRY.md).
# 🔗 Connect with us
🌟 As an open-source tool, it would mean the world to us if you starred the GPT-pilot repo 🌟
💬 Join [the Discord server](https://discord.gg/HaqXugmxr9) to get in touch.

0
core/agents/__init__.py Normal file

146
core/agents/architect.py Normal file

@@ -0,0 +1,146 @@
from typing import Optional
from pydantic import BaseModel, Field
from core.agents.base import BaseAgent
from core.agents.convo import AgentConvo
from core.agents.response import AgentResponse
from core.llm.parser import JSONParser
from core.telemetry import telemetry
from core.templates.registry import PROJECT_TEMPLATES, ProjectTemplateEnum
from core.ui.base import ProjectStage
ARCHITECTURE_STEP = "architecture"
WARN_SYSTEM_DEPS = ["docker", "kubernetes", "microservices"]
WARN_FRAMEWORKS = ["next.js", "vue", "vue.js", "svelte", "angular"]
WARN_FRAMEWORKS_URL = "https://github.com/Pythagora-io/gpt-pilot/wiki/Using-GPT-Pilot-with-frontend-frameworks"
# FIXME: all the response pydantic models should be strict (see config._StrictModel), also check if we
# can disallow adding custom Python attributes to the model
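# One possible approach (a sketch, not implemented here) would be a shared strict base model:
#
#     class _StrictModel(BaseModel):
#         model_config = ConfigDict(extra="forbid")
#
# which the response models below would then inherit from.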
class SystemDependency(BaseModel):
name: str = Field(
None,
description="Name of the system dependency, for example Node.js or Python.",
)
description: str = Field(
None,
description="One-line description of the dependency.",
)
test: str = Field(
None,
description="Command line to test whether the dependency is available on the system.",
)
required_locally: bool = Field(
None,
description="Whether this dependency must be installed locally (as opposed to connecting to cloud or other server)",
)
class PackageDependency(BaseModel):
name: str = Field(
None,
description="Name of the package dependency, for example Express or React.",
)
description: str = Field(
None,
description="One-line description of the dependency.",
)
class Architecture(BaseModel):
architecture: str = Field(
None,
description="General description of the app architecture.",
)
system_dependencies: list[SystemDependency] = Field(
None,
description="List of system dependencies required to build and run the app.",
)
package_dependencies: list[PackageDependency] = Field(
None,
description="List of framework/language-specific packages used by the app.",
)
template: Optional[ProjectTemplateEnum] = Field(
None,
description="Project template to use for the app, if any (optional, can be null).",
)
class Architect(BaseAgent):
agent_type = "architect"
display_name = "Architect"
async def run(self) -> AgentResponse:
await self.ui.send_project_stage(ProjectStage.ARCHITECTURE)
llm = self.get_llm()
convo = AgentConvo(self).template("technologies", templates=PROJECT_TEMPLATES).require_schema(Architecture)
await self.send_message("Planning project architecture ...")
arch: Architecture = await llm(convo, parser=JSONParser(Architecture))
await self.check_compatibility(arch)
await self.check_system_dependencies(arch.system_dependencies)
spec = self.current_state.specification.clone()
spec.architecture = arch.architecture
spec.system_dependencies = [d.model_dump() for d in arch.system_dependencies]
spec.package_dependencies = [d.model_dump() for d in arch.package_dependencies]
spec.template = arch.template.value if arch.template else None
self.next_state.specification = spec
telemetry.set(
"architecture",
{
"description": spec.architecture,
"system_dependencies": spec.system_dependencies,
"package_dependencies": spec.package_dependencies,
},
)
telemetry.set("template", spec.template)
return AgentResponse.done(self)
async def check_compatibility(self, arch: Architecture) -> bool:
warn_system_deps = [dep.name for dep in arch.system_dependencies if dep.name.lower() in WARN_SYSTEM_DEPS]
warn_package_deps = [dep.name for dep in arch.package_dependencies if dep.name.lower() in WARN_FRAMEWORKS]
if warn_system_deps:
await self.ask_question(
f"Warning: GPT Pilot doesn't officially support {', '.join(warn_system_deps)}. "
f"You can try to use {'it' if len(warn_system_deps) == 1 else 'them'}, but you may run into problems.",
buttons={"continue": "Continue"},
buttons_only=True,
default="continue",
)
if warn_package_deps:
await self.ask_question(
f"Warning: GPT Pilot works best with vanilla JavaScript. "
f"You can try try to use {', '.join(warn_package_deps)}, but you may run into problems. "
f"Visit {WARN_FRAMEWORKS_URL} for more information.",
buttons={"continue": "Continue"},
buttons_only=True,
default="continue",
)
# TODO: add "cancel" option to the above buttons; if pressed, Architect should
# return AgentResponse.revise_spec()
# that SpecWriter should catch and allow the user to reword the initial spec.
return True
async def check_system_dependencies(self, deps: list[SystemDependency]):
"""
Check whether the required system dependencies are installed.
"""
for dep in deps:
status_code, _, _ = await self.process_manager.run_command(dep.test)
if status_code != 0:
if dep.required_locally:
remedy = "Please install it before proceeding with your app."
else:
remedy = "If you would like to use it locally, please install it before proceeding."
await self.send_message(f"{dep.name} is not available. {remedy}")
else:
await self.send_message(f"{dep.name} is available.")

174
core/agents/base.py Normal file

@@ -0,0 +1,174 @@
from typing import Any, Callable, Optional
from core.agents.response import AgentResponse
from core.config import get_config
from core.db.models import ProjectState
from core.llm.base import BaseLLMClient, LLMError
from core.log import get_logger
from core.proc.process_manager import ProcessManager
from core.state.state_manager import StateManager
from core.ui.base import AgentSource, UIBase, UserInput
log = get_logger(__name__)
class BaseAgent:
"""
Base class for agents.
"""
agent_type: str
display_name: str
def __init__(
self,
state_manager: StateManager,
ui: UIBase,
*,
step: Optional[Any] = None,
prev_response: Optional["AgentResponse"] = None,
process_manager: Optional["ProcessManager"] = None,
):
"""
Create a new agent.
"""
self.ui_source = AgentSource(self.display_name, self.agent_type)
self.ui = ui
self.stream_output = True
self.state_manager = state_manager
self.process_manager = process_manager
self.prev_response = prev_response
self.step = step
@property
def current_state(self) -> ProjectState:
"""Current state of the project (read-only)."""
return self.state_manager.current_state
@property
def next_state(self) -> ProjectState:
"""Next state of the project (write-only)."""
return self.state_manager.next_state
async def send_message(self, message: str):
"""
Send a message to the user.
Convenience method, uses `UIBase.send_message()` to send the message,
setting the correct source.
:param message: Message to send.
"""
await self.ui.send_message(message + "\n", source=self.ui_source)
async def ask_question(
self,
question: str,
*,
buttons: Optional[dict[str, str]] = None,
default: Optional[str] = None,
buttons_only: bool = False,
initial_text: Optional[str] = None,
allow_empty: bool = False,
hint: Optional[str] = None,
) -> UserInput:
"""
Ask a question to the user and return the response.
Convenience method, uses `UIBase.ask_question()` to
ask the question, setting the correct source and
logging the question/response.
:param question: Question to ask.
:param buttons: Buttons to display with the question.
:param default: Default button to select.
:param buttons_only: Only display buttons, no text input.
:param allow_empty: Allow empty input.
:param hint: Text to display in a popup as a hint to the question.
:param initial_text: Initial text input.
:return: User response.
"""
response = await self.ui.ask_question(
question,
buttons=buttons,
default=default,
buttons_only=buttons_only,
allow_empty=allow_empty,
hint=hint,
initial_text=initial_text,
source=self.ui_source,
)
await self.state_manager.log_user_input(question, response)
return response
async def stream_handler(self, content: str):
"""
Handle streamed response from the LLM.
Serves as a callback to `AgentBase.llm()` so it can stream the responses to the UI.
This can be turned on/off on a per-request basis by setting `BaseAgent.stream_output`
to True or False.
:param content: Response content.
"""
if self.stream_output:
await self.ui.send_stream_chunk(content, source=self.ui_source)
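# A None chunk presumably marks the end of the stream; the empty message lets the UI close the streamed block.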
if content is None:
await self.ui.send_message("")
async def error_handler(self, error: LLMError, message: Optional[str] = None):
"""
Handle error responses from the LLM.
:param error: The exception that was thrown by the LLM client.
:param message: Optional message to show.
"""
if error == LLMError.KEY_EXPIRED:
await self.ui.send_key_expired(message)
elif error == LLMError.RATE_LIMITED:
await self.stream_handler(message)
def get_llm(self, name=None) -> Callable:
"""
Get a new instance of the agent-specific LLM client.
The client initializes the UI stream handler and stores the
request/response to the current state's log. The agent name
can be overridden in case the agent needs to use a different
model configuration.
:param name: Name of the agent for configuration (default: class name).
:return: LLM client for the agent.
"""
if name is None:
name = self.__class__.__name__
config = get_config()
llm_config = config.llm_for_agent(name)
client_class = BaseLLMClient.for_provider(llm_config.provider)
llm_client = client_class(llm_config, stream_handler=self.stream_handler, error_handler=self.error_handler)
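# Wrap the raw client in a closure so that every request/response pair is logged to the current project state.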
async def client(convo, **kwargs) -> Any:
"""
Agent-specific LLM client.
For details on optional arguments to pass to the LLM client,
see `pythagora.llm.openai_client.OpenAIClient()`.
"""
response, request_log = await llm_client(convo, **kwargs)
await self.state_manager.log_llm_request(request_log, agent=self)
return response
return client
async def run(self) -> AgentResponse:
"""
Run the agent.
:return: Response from the agent.
"""
raise NotImplementedError()

127
core/agents/code_monkey.py Normal file

@@ -0,0 +1,127 @@
from pydantic import BaseModel, Field
from core.agents.base import BaseAgent
from core.agents.convo import AgentConvo
from core.agents.response import AgentResponse, ResponseType
from core.config import DESCRIBE_FILES_AGENT_NAME
from core.llm.parser import JSONParser, OptionalCodeBlockParser
from core.log import get_logger
log = get_logger(__name__)
class FileDescription(BaseModel):
summary: str = Field(
description="Detailed description summarized what the file is about, and what the major classes, functions, elements or other functionality is implemented."
)
references: list[str] = Field(
description="List of references the file imports or includes (only files local to the project), where each element specifies the project-relative path of the referenced file, including the file extension."
)
class CodeMonkey(BaseAgent):
agent_type = "code-monkey"
display_name = "Code Monkey"
async def run(self) -> AgentResponse:
if self.prev_response and self.prev_response.type == ResponseType.DESCRIBE_FILES:
return await self.describe_files()
else:
return await self.implement_changes()
def _get_task_convo(self) -> AgentConvo:
# FIXME: Current prompts reuse task breakdown / iteration messages so we have to resort to this
task = self.current_state.current_task
current_task_index = self.current_state.tasks.index(task)
convo = AgentConvo(self).template(
"breakdown",
task=task,
iteration=None,
current_task_index=current_task_index,
)
# TODO: We currently show last iteration to the code monkey; we might need to show the task
# breakdown and all the iterations instead? To think about when refactoring prompts
if self.current_state.iterations:
convo.assistant(self.current_state.iterations[-1]["description"])
else:
convo.assistant(self.current_state.current_task["instructions"])
return convo
async def implement_changes(self) -> AgentResponse:
file_name = self.step["save_file"]["path"]
current_file = await self.state_manager.get_file_by_path(file_name)
file_content = current_file.content.content if current_file else ""
task = self.current_state.current_task
if self.prev_response and self.prev_response.type == ResponseType.CODE_REVIEW_FEEDBACK:
attempt = self.prev_response.data["attempt"] + 1
feedback = self.prev_response.data["feedback"]
log.debug(f"Fixing file {file_name} after review feedback: {feedback} ({attempt}. attempt)")
await self.send_message(f"Reworking changes I made to {file_name} ...")
else:
log.debug(f"Implementing file {file_name}")
await self.send_message(f"{'Updating existing' if file_content else 'Creating new'} file {file_name} ...")
attempt = 1
feedback = None
llm = self.get_llm()
convo = self._get_task_convo().template(
"implement_changes",
file_name=file_name,
file_content=file_content,
instructions=task["instructions"],
)
if feedback:
convo.assistant(f"```\n{self.prev_response.data['new_content']}\n```\n").template(
"review_feedback",
content=self.prev_response.data["approved_content"],
original_content=file_content,
rework_feedback=feedback,
)
response: str = await llm(convo, temperature=0, parser=OptionalCodeBlockParser())
# FIXME: provide a counter here so that we don't end up in an endless loop
return AgentResponse.code_review(self, file_name, task["instructions"], file_content, response, attempt)
async def describe_files(self) -> AgentResponse:
llm = self.get_llm(DESCRIBE_FILES_AGENT_NAME)
to_describe = {
file.path: file.content.content for file in self.current_state.files if not file.meta.get("description")
}
for file in self.next_state.files:
content = to_describe.get(file.path)
if content is None:
continue
if content == "":
file.meta = {
**file.meta,
"description": "Empty file",
"references": [],
}
continue
log.debug(f"Describing file {file.path}")
await self.send_message(f"Describing file {file.path} ...")
convo = (
AgentConvo(self)
.template(
"describe_file",
path=file.path,
content=content,
)
.require_schema(FileDescription)
)
llm_response: FileDescription = await llm(convo, parser=JSONParser(spec=FileDescription))
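# Assign a new meta dict rather than mutating the existing one, so the modification is reliably picked up when the state is saved.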
file.meta = {
**file.meta,
"description": llm_response.summary,
"references": llm_response.references,
}
return AgentResponse.done(self)

328
core/agents/code_review.py Normal file

@@ -0,0 +1,328 @@
import re
from difflib import unified_diff
from enum import Enum
from pydantic import BaseModel, Field
from core.agents.base import BaseAgent
from core.agents.convo import AgentConvo
from core.agents.response import AgentResponse
from core.llm.parser import JSONParser
from core.log import get_logger
log = get_logger(__name__)
# Constant for indicating missing new line at the end of a file in a unified diff
NO_EOL = "\\ No newline at end of file"
# Regular expression pattern for matching hunk headers
PATCH_HEADER_PATTERN = re.compile(r"^@@ -(\d+),?(\d+)? \+(\d+),?(\d+)? @@")
# Maximum number of attempts to ask for review if it can't be parsed
MAX_REVIEW_RETRIES = 2
# Maximum number of code implementation attempts after which we accept the changes unconditionally
MAX_CODING_ATTEMPTS = 3
class Decision(str, Enum):
APPLY = "apply"
IGNORE = "ignore"
REWORK = "rework"
class Hunk(BaseModel):
number: int = Field(description="Index of the hunk in the diff. Starts from 1.")
reason: str = Field(description="Reason for applying or ignoring this hunk, or for asking for it to be reworked.")
decision: Decision = Field(description="Whether to apply this hunk, rework, or ignore it.")
class ReviewChanges(BaseModel):
hunks: list[Hunk]
review_notes: str = Field(description="Additional review notes (optional, can be empty).")
class CodeReviewer(BaseAgent):
agent_type = "code-reviewer"
display_name = "Code Reviewer"
async def run(self) -> AgentResponse:
if (
not self.prev_response.data["old_content"]
or self.prev_response.data["new_content"] == self.prev_response.data["old_content"]
or self.prev_response.data["attempt"] >= MAX_CODING_ATTEMPTS
):
# we always auto-accept new files and unchanged files, or if we've tried too many times
return await self.accept_changes(self.prev_response.data["path"], self.prev_response.data["new_content"])
approved_content, feedback = await self.review_change(
self.prev_response.data["path"],
self.prev_response.data["instructions"],
self.prev_response.data["old_content"],
self.prev_response.data["new_content"],
)
if feedback:
return AgentResponse.code_review_feedback(
self,
new_content=self.prev_response.data["new_content"],
approved_content=approved_content,
feedback=feedback,
attempt=self.prev_response.data["attempt"],
)
else:
return await self.accept_changes(self.prev_response.data["path"], approved_content)
async def accept_changes(self, path: str, content: str) -> AgentResponse:
await self.state_manager.save_file(path, content)
self.next_state.complete_step()
input_required = self.state_manager.get_input_required(content)
if input_required:
return AgentResponse.input_required(
self,
[{"file": path, "line": line} for line in input_required],
)
else:
return AgentResponse.done(self)
def _get_task_convo(self) -> AgentConvo:
# FIXME: Current prompts reuse conversation from the developer so we have to resort to this
task = self.current_state.current_task
current_task_index = self.current_state.tasks.index(task)
convo = AgentConvo(self).template(
"breakdown",
task=task,
iteration=None,
current_task_index=current_task_index,
)
# TODO: We currently show last iteration to the code monkey; we might need to show the task
# breakdown and all the iterations instead? To think about when refactoring prompts
if self.current_state.iterations:
convo.assistant(self.current_state.iterations[-1]["description"])
else:
convo.assistant(self.current_state.current_task["instructions"])
return convo
async def review_change(
self, file_name: str, instructions: str, old_content: str, new_content: str
) -> tuple[str, str]:
"""
Review changes that were applied to the file.
This asks the LLM to act as a PR reviewer and for each part (hunk) of the
diff, decide if it should be applied (kept) or ignored (removed from the PR).
:param file_name: name of the file being modified
:param instructions: instructions for the reviewer
:param old_content: old file content
:param new_content: new file content (with proposed changes)
:return: tuple with file content update with approved changes, and review feedback
Diff hunk explanation: https://www.gnu.org/software/diffutils/manual/html_node/Hunks.html
"""
hunks = self.get_diff_hunks(file_name, old_content, new_content)
llm = self.get_llm()
convo = (
self._get_task_convo()
.template(
"review_changes",
instructions=instructions,
file_name=file_name,
old_content=old_content,
hunks=hunks,
)
.require_schema(ReviewChanges)
)
llm_response: ReviewChanges = await llm(convo, temperature=0, parser=JSONParser(ReviewChanges))
for i in range(MAX_REVIEW_RETRIES):
reasons = {}
ids_to_apply = set()
ids_to_ignore = set()
ids_to_rework = set()
for hunk in llm_response.hunks:
reasons[hunk.number - 1] = hunk.reason
if hunk.decision == "apply":
ids_to_apply.add(hunk.number - 1)
elif hunk.decision == "ignore":
ids_to_ignore.add(hunk.number - 1)
elif hunk.decision == "rework":
ids_to_rework.add(hunk.number - 1)
n_hunks = len(hunks)
n_review_hunks = len(reasons)
if n_review_hunks == n_hunks:
break
elif n_review_hunks < n_hunks:
error = "Not all hunks have been reviewed. Please review all hunks and add 'apply', 'ignore' or 'rework' decision for each."
elif n_review_hunks > n_hunks:
error = f"Your review contains more hunks ({n_review_hunks}) than in the original diff ({n_hunks}). Note that one hunk may have multiple changed lines."
# Max two retries; if the reviewer still hasn't reviewed all hunks, we'll just use the entire new content
convo.assistant(llm_response.model_dump_json()).user(error)
llm_response = await llm(convo, parser=JSONParser(ReviewChanges))
else:
return new_content, None
hunks_to_apply = [h for i, h in enumerate(hunks) if i in ids_to_apply]
diff_log = f"--- {file_name}\n+++ {file_name}\n" + "\n".join(hunks_to_apply)
hunks_to_rework = [(i, h) for i, h in enumerate(hunks) if i in ids_to_rework]
review_log = (
"\n\n".join([f"## Change\n```{hunk}```\nReviewer feedback:\n{reasons[i]}" for (i, hunk) in hunks_to_rework])
+ "\n\nReview notes:\n"
+ llm_response.review_notes
)
if len(hunks_to_apply) == len(hunks):
await self.send_message("Applying entire change")
log.info(f"Applying entire change to {file_name}")
return new_content, None
elif len(hunks_to_apply) == 0:
if hunks_to_rework:
await self.send_message(
f"Requesting rework for {len(hunks_to_rework)} changes with reason: {llm_response.review_notes}"
)
log.info(f"Requesting rework for {len(hunks_to_rework)} changes to {file_name} (0 hunks to apply)")
return old_content, review_log
else:
# If everything can be safely ignored, it's probably because the files already implement the changes
# from previous tasks (which can happen often). Insisting on a change here is likely to cause problems.
await self.send_message(f"Rejecting entire change with reason: {llm_response.review_notes}")
log.info(f"Rejecting entire change to {file_name} with reason: {llm_response.review_notes}")
return old_content, None
print("Applying code change:\n" + diff_log)
log.info(f"Applying code change to {file_name}:\n{diff_log}")
new_content = self.apply_diff(file_name, old_content, hunks_to_apply, new_content)
if hunks_to_rework:
print(f"Requesting rework for {len(hunks_to_rework)} changes with reason: {llm_response.review_notes}")
log.info(f"Requesting further rework for {len(hunks_to_rework)} changes to {file_name}")
return new_content, review_log
else:
return new_content, None
@staticmethod
def get_diff_hunks(file_name: str, old_content: str, new_content: str) -> list[str]:
"""
Get the diff between two files.
This uses Python difflib to produce a unified diff, then splits
it into hunks that will be separately reviewed by the reviewer.
:param file_name: name of the file being modified
:param old_content: old file content
:param new_content: new file content
:return: change hunks from the unified diff
"""
from_name = "old_" + file_name
to_name = "to_" + file_name
from_lines = old_content.splitlines(keepends=True)
to_lines = new_content.splitlines(keepends=True)
diff_gen = unified_diff(from_lines, to_lines, fromfile=from_name, tofile=to_name)
diff_txt = "".join(diff_gen)
hunks = re.split(r"\n@@", diff_txt, flags=re.MULTILINE)
result = []
for i, h in enumerate(hunks):
# Skip the prologue (file names)
if i == 0:
continue
txt = h.splitlines()
txt[0] = "@@" + txt[0]
result.append("\n".join(txt))
return result
def apply_diff(self, file_name: str, old_content: str, hunks: list[str], fallback: str):
"""
Apply the diff to the original file content.
This uses the internal `_apply_patch` method to apply the
approved diff hunks to the original file content.
If patch apply fails, the fallback is the full new file content
with all the changes applied (as if the reviewer approved everything).
:param file_name: name of the file being modified
:param old_content: old file content
:param hunks: change hunks from the unified diff
:param fallback: proposed new file content (with all the changes applied)
"""
diff = (
"\n".join(
[
f"--- {file_name}",
f"+++ {file_name}",
]
+ hunks
)
+ "\n"
)
try:
fixed_content = self._apply_patch(old_content, diff)
except Exception as e:
# This should never happen but if it does, just use the new version from
# the LLM and hope for the best
print(f"Error applying diff: {e}; hoping all changes are valid")
return fallback
return fixed_content
# Adapted from https://gist.github.com/noporpoise/16e731849eb1231e86d78f9dfeca3abc (Public Domain)
@staticmethod
def _apply_patch(original: str, patch: str, revert: bool = False):
"""
Apply a patch to a string to recover a newer version of the string.
:param original: The original string.
:param patch: The patch to apply.
:param revert: If True, treat the original string as the newer version and recover the older string.
:return: The updated string after applying the patch.
"""
original_lines = original.splitlines(True)
patch_lines = patch.splitlines(True)
updated_text = ""
index_original = start_line = 0
# Choose which group of the regex to use based on the revert flag
match_index, line_sign = (1, "+") if not revert else (3, "-")
# Skip header lines of the patch
while index_original < len(patch_lines) and patch_lines[index_original].startswith(("---", "+++")):
index_original += 1
while index_original < len(patch_lines):
match = PATCH_HEADER_PATTERN.match(patch_lines[index_original])
if not match:
raise Exception("Bad patch -- regex mismatch [line " + str(index_original) + "]")
line_number = int(match.group(match_index)) - 1 + (match.group(match_index + 1) == "0")
if start_line > line_number or line_number > len(original_lines):
raise Exception("Bad patch -- bad line number [line " + str(index_original) + "]")
updated_text += "".join(original_lines[start_line:line_number])
start_line = line_number
index_original += 1
while index_original < len(patch_lines) and patch_lines[index_original][0] != "@":
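# A following line starting with "\" is the "No newline at end of file" marker: drop the trailing newline from this line and skip the marker.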
if index_original + 1 < len(patch_lines) and patch_lines[index_original + 1][0] == "\\":
line_content = patch_lines[index_original][:-1]
index_original += 2
else:
line_content = patch_lines[index_original]
index_original += 1
if line_content:
if line_content[0] == line_sign or line_content[0] == " ":
updated_text += line_content[1:]
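# Context and removed lines consume a line of the original, so advance the pointer; only lines added with the target sign do not.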
start_line += line_content[0] != line_sign
updated_text += "".join(original_lines[start_line:])
return updated_text

75
core/agents/convo.py Normal file

@@ -0,0 +1,75 @@
import json
import sys
from copy import deepcopy
from typing import TYPE_CHECKING, Optional
from pydantic import BaseModel
from core.config import get_config
from core.llm.convo import Convo
from core.llm.prompt import JinjaFileTemplate
from core.log import get_logger
if TYPE_CHECKING:
from core.agents.base import BaseAgent
log = get_logger(__name__)
class AgentConvo(Convo):
prompt_loader: Optional[JinjaFileTemplate] = None
def __init__(self, agent: "BaseAgent"):
self.agent_instance = agent
super().__init__()
try:
system_message = self.render("system")
self.system(system_message)
except ValueError as err:
log.warning(f"Agent {agent.__class__.__name__} has no system prompt: {err}")
@classmethod
def _init_templates(cls):
if cls.prompt_loader is not None:
return
config = get_config()
cls.prompt_loader = JinjaFileTemplate(config.prompt.paths)
def _get_default_template_vars(self) -> dict:
if sys.platform == "win32":
os = "Windows"
elif sys.platform == "darwin":
os = "macOS"
else:
os = "Linux"
return {
"state": self.agent_instance.current_state,
"os": os,
}
def render(self, name: str, **kwargs) -> str:
self._init_templates()
kwargs.update(self._get_default_template_vars())
# Jinja uses "/" even on Windows
template_name = f"{self.agent_instance.agent_type}/{name}.prompt"
log.debug(f"Loading template {template_name}")
return self.prompt_loader(template_name, **kwargs)
def template(self, template_name: str, **kwargs) -> "AgentConvo":
message = self.render(template_name, **kwargs)
self.user(message)
return self
def fork(self) -> "AgentConvo":
child = AgentConvo(self.agent_instance)
child.messages = deepcopy(self.messages)
return child
def require_schema(self, model: BaseModel) -> "AgentConvo":
schema_txt = json.dumps(model.model_json_schema())
self.user(f"IMPORTANT: Your response MUST conform to this JSON schema:\n```\n{schema_txt}\n```")
return self

294
core/agents/developer.py Normal file

@@ -0,0 +1,294 @@
from enum import Enum
from typing import Annotated, Literal, Optional, Union
from uuid import uuid4
from pydantic import BaseModel, Field
from core.agents.base import BaseAgent
from core.agents.convo import AgentConvo
from core.agents.response import AgentResponse, ResponseType
from core.llm.parser import JSONParser
from core.log import get_logger
log = get_logger(__name__)
class StepType(str, Enum):
COMMAND = "command"
SAVE_FILE = "save_file"
HUMAN_INTERVENTION = "human_intervention"
class CommandOptions(BaseModel):
command: str = Field(description="Command to run")
timeout: int = Field(description="Timeout in seconds")
success_message: str = ""
class SaveFileOptions(BaseModel):
path: str
class SaveFileStep(BaseModel):
type: Literal[StepType.SAVE_FILE] = StepType.SAVE_FILE
save_file: SaveFileOptions
class CommandStep(BaseModel):
type: Literal[StepType.COMMAND] = StepType.COMMAND
command: CommandOptions
class HumanInterventionStep(BaseModel):
type: Literal[StepType.HUMAN_INTERVENTION] = StepType.HUMAN_INTERVENTION
human_intervention_description: str
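# The "type" field is the discriminator: pydantic uses its value to pick the correct step model when parsing.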
Step = Annotated[
Union[SaveFileStep, CommandStep, HumanInterventionStep],
Field(discriminator="type"),
]
class TaskSteps(BaseModel):
steps: list[Step]
class Developer(BaseAgent):
agent_type = "developer"
display_name = "Developer"
async def run(self) -> AgentResponse:
if self.prev_response and self.prev_response.type == ResponseType.TASK_REVIEW_FEEDBACK:
return await self.breakdown_current_iteration(self.prev_response.data["feedback"])
# If any of the files are missing metadata/descriptions, those need to be filled-in
missing_descriptions = [file.path for file in self.current_state.files if not file.meta.get("description")]
if missing_descriptions:
log.debug(f"Some files are missing descriptions: {', '.join(missing_descriptions)}, reqesting analysis")
return AgentResponse.describe_files(self)
log.debug(f"Current state files: {len(self.current_state.files)}, relevant {self.current_state.relevant_files}")
# Check which files are relevant to the current task
if self.current_state.files and not self.current_state.relevant_files:
await self.get_relevant_files()
return AgentResponse.done(self)
if not self.current_state.unfinished_tasks:
log.warning("No unfinished tasks found, nothing to do (why am I called? is this a bug?)")
return AgentResponse.done(self)
if self.current_state.unfinished_iterations:
return await self.breakdown_current_iteration()
# By default, we want to ask the user if they want to run the task,
# except in certain cases (such as they've just edited it).
if not self.current_state.current_task.get("run_always", False):
if not await self.ask_to_execute_task():
return AgentResponse.done(self)
return await self.breakdown_current_task()
async def breakdown_current_iteration(self, review_feedback: Optional[str] = None) -> AgentResponse:
"""
Breaks down current iteration or task review into steps.
:param review_feedback: If provided, the task review feedback is broken down instead of the current iteration
:return: AgentResponse.done(self) when the breakdown is done
"""
if self.current_state.unfinished_steps:
# if this happens, it's most probably a bug as we should have gone through all the
# steps before getting new iteration instructions
log.warning(
f"Unfinished steps found before the next iteration is broken down: {self.current_state.unfinished_steps}"
)
if review_feedback is not None:
iteration = None
description = review_feedback
user_feedback = ""
source = "review"
n_tasks = 1
log.debug(f"Breaking down the task review feedback {review_feedback}")
await self.send_message("Breaking down the task review feedback...")
else:
iteration = self.current_state.current_iteration
if iteration is None:
log.error("Iteration breakdown called but there's no current iteration or task review, possible bug?")
return AgentResponse.done(self)
description = iteration["description"]
user_feedback = iteration["user_feedback"]
source = "troubleshooting"
n_tasks = len(self.next_state.iterations)
log.debug(f"Breaking down the iteration {description}")
await self.send_message("Breaking down the current task iteration ...")
await self.ui.send_task_progress(
n_tasks, # iterations and reviews can be created only one at a time, so we are always on last one
n_tasks,
self.current_state.current_task["description"],
source,
"in-progress",
)
llm = self.get_llm()
# FIXME: In case of iteration, parse_task depends on the context (files, tasks, etc) set there.
# Ideally this prompt would be self-contained.
convo = (
AgentConvo(self)
.template(
"iteration",
current_task=self.current_state.current_task,
user_feedback=user_feedback,
user_feedback_qa=None,
next_solution_to_try=None,
)
.assistant(description)
.template("parse_task")
.require_schema(TaskSteps)
)
response: TaskSteps = await llm(convo, parser=JSONParser(TaskSteps), temperature=0)
self.set_next_steps(response, source)
if iteration:
self.next_state.complete_iteration()
return AgentResponse.done(self)
async def breakdown_current_task(self) -> AgentResponse:
task = self.current_state.current_task
source = self.current_state.current_epic.get("source", "app")
await self.ui.send_task_progress(
self.current_state.tasks.index(self.current_state.current_task) + 1,
len(self.current_state.tasks),
self.current_state.current_task["description"],
source,
"in-progress",
)
log.debug(f"Breaking down the current task: {task['description']}")
await self.send_message("Thinking about how to implement this task ...")
current_task_index = self.current_state.tasks.index(task)
llm = self.get_llm()
convo = AgentConvo(self).template(
"breakdown",
task=task,
iteration=None,
current_task_index=current_task_index,
)
response: str = await llm(convo)
# FIXME: check if this is correct, as sqlalchemy can't figure out modifications
# to attributes; however, self.next is not saved yet so maybe this is fine
self.next_state.tasks[current_task_index] = {
**task,
"instructions": response,
}
await self.send_message("Breaking down the task into steps ...")
convo.template("parse_task").require_schema(TaskSteps)
response: TaskSteps = await llm(convo, parser=JSONParser(TaskSteps), temperature=0)
# There might be state leftovers from previous tasks that we need to clean here
self.next_state.modified_files = {}
self.set_next_steps(response, source)
return AgentResponse.done(self)
async def get_relevant_files(self) -> AgentResponse:
log.debug("Getting relevant files for the current task")
await self.send_message("Figuring out which project files are relevant for the next task ...")
llm = self.get_llm()
convo = AgentConvo(self).template("filter_files", current_task=self.current_state.current_task)
# FIXME: this doesn't validate correct structure format, we should use pydantic for that as well
llm_response: list[str] = await llm(convo, parser=JSONParser(), temperature=0)
existing_files = {file.path for file in self.current_state.files}
self.next_state.relevant_files = [path for path in llm_response if path in existing_files]
return AgentResponse.done(self)
def set_next_steps(self, response: TaskSteps, source: str):
# For logging/debugging purposes, we don't want to remove the finished steps
# until we're done with the task.
finished_steps = [step for step in self.current_state.steps if step["completed"]]
self.next_state.steps = finished_steps + [
{
"id": uuid4().hex,
"completed": False,
"source": source,
**step.model_dump(),
}
for step in response.steps
]
if len(self.next_state.unfinished_steps) > 0:
self.next_state.steps += [
# TODO: add refactor step here once we have the refactor agent
{
"id": uuid4().hex,
"completed": False,
"type": "review_task",
"source": source,
},
{
"id": uuid4().hex,
"completed": False,
"type": "create_readme",
"source": source,
},
]
log.debug(f"Next steps: {self.next_state.unfinished_steps}")
async def ask_to_execute_task(self) -> bool:
"""
Asks the user to approve, skip or edit the current task.
If task is edited, the method returns False so that the changes are saved. The
Orchestrator will rerun the agent on the next iteration.
:return: True if the task should be executed as is, False if the task is skipped or edited
"""
description = self.current_state.current_task["description"]
user_response = await self.ask_question(
"Do you want to execute the this task:\n\n" + description,
buttons={"yes": "Yes", "edit": "Edit Task", "skip": "Skip Task"},
default="yes",
buttons_only=True,
)
if user_response.button == "yes":
# Execute the task as is
return True
if user_response.cancelled or user_response.button == "skip":
log.info(f"Skipping task: {description}")
self.next_state.current_task["instructions"] = "(skipped on user request)"
self.next_state.complete_task()
await self.send_message(f"Skipping task {description}")
# We're done here, and will pick up the next task (if any) on the next run
return False
user_response = await self.ask_question(
"Edit the task description:",
buttons={
# FIXME: Continue doesn't actually work, VSCode doesn't send the user
# message if it's clicked. Long term we need to fix the extension.
# "continue": "Continue",
"cancel": "Cancel",
},
default="continue",
initial_text=description,
)
if user_response.button == "cancel" or user_response.cancelled:
# User hasn't edited the task so we can execute it immediately as is
return True
self.next_state.current_task["description"] = user_response.text
self.next_state.current_task["run_always"] = True
self.next_state.relevant_files = []
log.info(f"Task description updated to: {user_response.text}")
# Orchestrator will rerun us with the new task description
return False

108
core/agents/error_handler.py Normal file

@@ -0,0 +1,108 @@
from uuid import uuid4
from core.agents.base import BaseAgent
from core.agents.convo import AgentConvo
from core.agents.response import AgentResponse
from core.log import get_logger
log = get_logger(__name__)
class ErrorHandler(BaseAgent):
"""
Error handler agent.
Error handler is responsible for handling errors returned by other agents. If it's possible
to recover from the error, it should do it (which may include updating the "next" state) and
return DONE. Otherwise it should return EXIT to tell Orchestrator to quit the application.
"""
agent_type = "error-handler"
display_name = "Error Handler"
async def run(self) -> AgentResponse:
from core.agents.executor import Executor
from core.agents.spec_writer import SpecWriter
error = self.prev_response
if error is None:
log.warning("ErrorHandler called without a previous error", stack_info=True)
return AgentResponse.done(self)
log.error(
f"Agent {error.agent.display_name} returned error response: {error.type}",
extra={"data": error.data},
)
if isinstance(error.agent, SpecWriter):
# If SpecWriter wasn't able to get the project description, there's nothing for
# us to do.
return AgentResponse.exit(self)
if isinstance(error.agent, Executor):
return await self.handle_command_error(
error.data.get("message", "Unknown error"), error.data.get("details", {})
)
log.error(
f"Unhandled error response from agent {error.agent.display_name}",
extra={"data": error.data},
)
return AgentResponse.exit(self)
async def handle_command_error(self, message: str, details: dict) -> AgentResponse:
"""
Handle an error returned by Executor agent.
Error message must be the analysis of the command execution, and the details must contain:
* cmd - command that was executed
* timeout - timeout for the command if any (or None if no timeout was used)
* status_code - exit code for the command (or None if the command timed out)
* stdout - standard output of the command
* stderr - standard error of the command
:return: AgentResponse
"""
cmd = details.get("cmd")
timeout = details.get("timeout")
status_code = details.get("status_code")
stdout = details.get("stdout", "")
stderr = details.get("stderr", "")
if not message:
raise ValueError("No error message provided in command error response")
if not cmd:
raise ValueError("No command provided in command error response details")
llm = self.get_llm()
convo = AgentConvo(self).template(
"debug",
task_steps=self.current_state.steps,
current_task=self.current_state.current_task,
# FIXME: can this break?
step_index=self.current_state.steps.index(self.current_state.current_step),
cmd=cmd,
timeout=timeout,
stdout=stdout,
stderr=stderr,
status_code=status_code,
# fixme: everything above copypasted from Executor
analysis=message,
)
llm_response: str = await llm(convo)
# TODO: duplicate from Troubleshooter, maybe extract to a ProjectState method?
self.next_state.iterations = self.current_state.iterations + [
{
"id": uuid4().hex,
"user_feedback": f"Error running command: {cmd}",
"description": llm_response,
"alternative_solutions": [],
"attempts": 1,
"completed": False,
}
]
# TODO: maybe have ProjectState.finished_steps as well? would make the debug/ran_command prompts nicer too
self.next_state.steps = [s for s in self.current_state.steps if s.get("completed") is True]
# No need to call complete_step() here as we've just removed the steps so that Developer can break down the iteration
return AgentResponse.done(self)

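The details dict that handle_command_error() unpacks above is produced by the Executor's error response (see executor.py below). A hedged sketch of the producing side, with illustrative command and output values:

from core.agents.response import AgentResponse

# Inside an agent: the "details" keys must match what handle_command_error() reads.
return AgentResponse.error(
    self,
    "npm install exited with a non-zero status",  # illustrative analysis message
    {
        "cmd": "npm install",  # required
        "timeout": 60,         # or None if no timeout was used
        "status_code": 1,      # or None if the command timed out
        "stdout": "",
        "stderr": "ERESOLVE could not resolve",  # illustrative
    },
)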
166
core/agents/executor.py Normal file
View File

@@ -0,0 +1,166 @@
from datetime import datetime
from typing import Optional
from pydantic import BaseModel, Field
from core.agents.base import BaseAgent
from core.agents.convo import AgentConvo
from core.agents.response import AgentResponse
from core.llm.parser import JSONParser
from core.log import get_logger
from core.proc.exec_log import ExecLog
from core.proc.process_manager import ProcessManager
from core.state.state_manager import StateManager
from core.ui.base import AgentSource, UIBase
log = get_logger(__name__)
class CommandResult(BaseModel):
"""
Analysis of the command run and decision on the next steps.
"""
analysis: str = Field(
description="Analysis of the command output (stdout, stderr) and exit code, in context of the current task"
)
success: bool = Field(
description="True if the command should be treated as successful and the task should continue, false if the command unexpectedly failed and we should debug the issue"
)
class Executor(BaseAgent):
agent_type = "executor"
display_name = "Executor"
def __init__(
self,
state_manager: StateManager,
ui: UIBase,
):
"""
Create a new Executor agent
"""
self.ui_source = AgentSource(self.display_name, self.agent_type)
self.ui = ui
self.state_manager = state_manager
self.process_manager = ProcessManager(
root_dir=state_manager.get_full_project_root(),
output_handler=self.output_handler,
exit_handler=self.exit_handler,
)
self.stream_output = True
def for_step(self, step):
# FIXME: not needed, refactor to use self.current_state.current_step
# in general, passing current step is not needed
self.step = step
return self
async def output_handler(self, out, err):
await self.stream_handler(out)
await self.stream_handler(err)
async def exit_handler(self, process):
pass
async def run(self) -> AgentResponse:
if not self.step:
raise ValueError("No current step set (probably an Orchestrator bug)")
options = self.step["command"]
cmd = options["command"]
timeout = options.get("timeout")
if timeout:
q = f"Can I run command: {cmd} with {timeout}s timeout?"
else:
q = f"Can I run command: {cmd}?"
confirm = await self.ask_question(
q,
buttons={"yes": "Yes", "no": "No"},
default="yes",
buttons_only=True,
)
if confirm.button == "no":
log.info(f"Skipping command execution of `{cmd}` (requested by user)")
await self.send_message(f"Skipping command {cmd}")
self.complete()
return AgentResponse.done(self)
started_at = datetime.now()
log.info(f"Running command `{cmd}` with timeout {timeout}s")
status_code, stdout, stderr = await self.process_manager.run_command(cmd, timeout=timeout)
llm_response = await self.check_command_output(cmd, timeout, stdout, stderr, status_code)
duration = (datetime.now() - started_at).total_seconds()
self.complete()
exec_log = ExecLog(
started_at=started_at,
duration=duration,
cmd=cmd,
cwd=".",
env={},
timeout=timeout,
status_code=status_code,
stdout=stdout,
stderr=stderr,
analysis=llm_response.analysis,
success=llm_response.success,
)
await self.state_manager.log_command_run(exec_log)
if llm_response.success:
return AgentResponse.done(self)
return AgentResponse.error(
self,
llm_response.analysis,
{
"cmd": cmd,
"timeout": timeout,
"stdout": stdout,
"stderr": stderr,
"status_code": status_code,
},
)
async def check_command_output(
self, cmd: str, timeout: Optional[int], stdout: str, stderr: str, status_code: int
) -> CommandResult:
llm = self.get_llm()
convo = (
AgentConvo(self)
.template(
"ran_command",
task_steps=self.current_state.steps,
current_task=self.current_state.current_task,
# FIXME: can step ever happen *not* to be in current steps?
step_index=self.current_state.steps.index(self.step),
cmd=cmd,
timeout=timeout,
stdout=stdout,
stderr=stderr,
status_code=status_code,
)
.require_schema(CommandResult)
)
return await llm(convo, parser=JSONParser(spec=CommandResult), temperature=0)
def complete(self):
"""
Mark the step as complete.
Note that this marks the step complete in the next state. If there's an error,
the state won't get committed and the error handler will have access to the
current state, where this step is still unfinished.
This is intentional, so that the error handler can decide what to do with the
information we give it.
"""
self.step = None
self.next_state.complete_step()

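As the Orchestrator later in this commit shows, a single Executor instance is created per session and re-armed for each command step via for_step(). A condensed sketch of that wiring; the step dict contents are illustrative:

from core.agents.executor import Executor

async def run_command_step(state_manager, ui):
    # Sketch: one Executor per session, re-armed for each "command" step.
    executor = Executor(state_manager, ui)
    step = {"type": "command", "command": {"command": "npm test", "timeout": 60}}
    return await executor.for_step(step).run()  # confirms with user, runs, logs ExecLog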
46
core/agents/human_input.py Normal file
View File

@@ -0,0 +1,46 @@
from core.agents.base import BaseAgent
from core.agents.response import AgentResponse, ResponseType
class HumanInput(BaseAgent):
agent_type = "human-input"
display_name = "Human Input"
async def run(self) -> AgentResponse:
if self.prev_response and self.prev_response.type == ResponseType.INPUT_REQUIRED:
return await self.input_required(self.prev_response.data.get("files", []))
return await self.human_intervention(self.step)
async def human_intervention(self, step) -> AgentResponse:
description = step["human_intervention_description"]
await self.ask_question(
f"I need human intervention: {description}",
buttons={"continue": "Continue"},
default="continue",
buttons_only=True,
)
self.next_state.complete_step()
return AgentResponse.done(self)
async def input_required(self, files: list[dict]) -> AgentResponse:
for item in files:
file = item["file"]
line = item["line"]
# FIXME: this is an ugly hack, we shouldn't need to know how to get to VFS and
# anyways the full path is only available for local vfs, so this is doubly wrong;
# instead, we should just send the relative path to the extension and it should
# figure out where its local files are and how to open it.
full_path = self.state_manager.file_system.get_full_path(file)
await self.send_message(f"Input required on {file}:{line}")
await self.ui.open_editor(full_path, line)
await self.ask_question(
f"Please open {file} and modify line {line} according to the instructions.",
buttons={"continue": "Continue"},
default="continue",
buttons_only=True,
)
return AgentResponse.done(self)

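The files payload consumed by input_required() above is assembled by Orchestrator.import_files() later in this commit. A minimal sketch of its shape; paths and line numbers are illustrative:

from core.agents.response import AgentResponse

files = [
    {"file": "src/config.js", "line": 12},
    {"file": "src/routes/index.js", "line": 3},
]
# Inside an agent: triggers HumanInput.input_required() on the next Orchestrator pass.
return AgentResponse.input_required(self, files)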
37
core/agents/mixins.py Normal file
View File

@@ -0,0 +1,37 @@
from typing import Optional
from core.agents.convo import AgentConvo
class IterationPromptMixin:
"""
Provides a method to find a solution to a problem based on user feedback.
Used by ProblemSolver and Troubleshooter agents.
"""
async def find_solution(
self,
user_feedback: str,
*,
user_feedback_qa: Optional[list[str]] = None,
next_solution_to_try: Optional[str] = None,
) -> str:
"""
Generate a new solution for the problem the user reported.
:param user_feedback: User feedback about the problem.
:param user_feedback_qa: Additional q/a about the problem provided by the user (optional).
:param next_solution_to_try: Hint from ProblemSolver on which solution to try (optional).
:return: The generated solution to the problem.
"""
llm = self.get_llm()
convo = AgentConvo(self).template(
"iteration",
current_task=self.current_state.current_task,
user_feedback=user_feedback,
user_feedback_qa=user_feedback_qa,
next_solution_to_try=next_solution_to_try,
)
llm_solution: str = await llm(convo)
return llm_solution

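A hedged sketch of how an agent gains find_solution() through this mixin, mirroring the Troubleshooter and ProblemSolver definitions elsewhere in this commit; the example agent itself is hypothetical:

from core.agents.base import BaseAgent
from core.agents.mixins import IterationPromptMixin
from core.agents.response import AgentResponse

class MyFixer(IterationPromptMixin, BaseAgent):  # hypothetical agent, not in this commit
    agent_type = "my-fixer"
    display_name = "My Fixer"

    async def run(self) -> AgentResponse:
        solution = await self.find_solution("The login button does nothing")  # illustrative feedback
        self.next_state.current_iteration["description"] = solution
        return AgentResponse.done(self)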
329
core/agents/orchestrator.py Normal file
View File

@@ -0,0 +1,329 @@
from typing import Optional
from core.agents.architect import Architect
from core.agents.base import BaseAgent
from core.agents.code_monkey import CodeMonkey
from core.agents.code_reviewer import CodeReviewer
from core.agents.developer import Developer
from core.agents.error_handler import ErrorHandler
from core.agents.executor import Executor
from core.agents.human_input import HumanInput
from core.agents.problem_solver import ProblemSolver
from core.agents.response import AgentResponse, ResponseType
from core.agents.spec_writer import SpecWriter
from core.agents.task_reviewer import TaskReviewer
from core.agents.tech_lead import TechLead
from core.agents.tech_writer import TechnicalWriter
from core.agents.troubleshooter import Troubleshooter
from core.config import LLMProvider, get_config
from core.llm.convo import Convo
from core.log import get_logger
from core.telemetry import telemetry
from core.ui.base import ProjectStage
log = get_logger(__name__)
class Orchestrator(BaseAgent):
"""
Main agent that controls the flow of the process.
Based on the current state of the project, the orchestrator invokes
all other agents. It is also responsible for determining when each
step is done and the project state needs to be committed to the database.
"""
agent_type = "orchestrator"
display_name = "Orchestrator"
async def run(self) -> bool:
"""
Run the Orchestrator agent.
:return: True if the Orchestrator exited successfully, False otherwise.
"""
response = None
log.info(f"Starting {__name__}.Orchestrator")
self.executor = Executor(self.state_manager, self.ui)
self.process_manager = self.executor.process_manager
# self.chat = Chat() TODO
await self.init_ui()
await self.offline_changes_check()
llm_api_check = await self.test_llm_access()
if not llm_api_check:
return False
# TODO: consider refactoring this into two loops; the outer with one iteration per committed step,
# and the inner which runs the agents for the current step until they're done. This would simplify
# handle_done() and let us do other per-step processing (eg. describing files) in between agent runs.
while True:
await self.update_stats()
agent = self.create_agent(response)
log.debug(f"Running agent {agent.__class__.__name__} (step {self.current_state.step_index})")
response = await agent.run()
if response.type == ResponseType.EXIT:
log.debug(f"Agent {agent.__class__.__name__} requested exit")
break
if response.type == ResponseType.DONE:
response = await self.handle_done(agent, response)
continue
# TODO: rollback changes to "next" so they aren't accidentally committed?
return True
async def test_llm_access(self) -> bool:
"""
Make sure the LLMs for all the defined agents are reachable.
Each LLM provider is only checked once.
Returns True if the check was successful for all LLMs.
"""
config = get_config()
defined_agents = config.agent.keys()
convo = Convo()
convo.user(
" ".join(
[
"This is a connection test. If you can see this,",
"please respond only with 'START' and nothing else.",
]
)
)
success = True
tested_llms: set[LLMProvider] = set()
for agent_name in defined_agents:
llm = self.get_llm(agent_name)
llm_config = config.llm_for_agent(agent_name)
if llm_config.provider in tested_llms:
continue
tested_llms.add(llm_config.provider)
provider_model_combo = f"{llm_config.provider.value} {llm_config.model}"
try:
resp = await llm(convo)
except Exception as err:
log.warning(f"API check for {provider_model_combo} failed: {err}")
success = False
await self.ui.send_message(f"Error connecting to the {provider_model_combo} API: {err}")
continue
if resp and len(resp) > 0:
log.debug(f"API check for {provider_model_combo} passed.")
else:
log.warning(f"API check for {provider_model_combo} failed.")
await self.ui.send_message(
f"Error connecting to the {provider_model_combo} API. Please check your settings and internet connection."
)
success = False
return success
async def offline_changes_check(self):
"""
Check for changes outside of Pythagora.
If there are changes, ask the user if they want to keep them, and
import if needed.
"""
log.info("Checking for offline changes.")
modified_files = await self.state_manager.get_modified_files()
if self.state_manager.workspace_is_empty():
# NOTE: this will currently get triggered on a new project, but will do
# nothing as there are no files in the database.
log.info("Detected empty workspace, restoring state from the database.")
await self.state_manager.restore_files()
elif modified_files:
await self.send_message(f"We found {len(modified_files)} new and/or modified files.")
hint = "".join(
[
"If you would like Pythagora to import those changes, click 'Yes'.\n",
"Clicking 'No' means Pythagora will restore (overwrite) all files to the last stored state.\n",
]
)
use_changes = await self.ask_question(
question="Would you like to keep your changes?",
buttons={
"yes": "Yes, keep my changes",
"no": "No, restore last Pythagora state",
},
buttons_only=True,
hint=hint,
)
if use_changes.button == "yes":
log.debug("Importing offline changes into Pythagora.")
await self.import_files()
else:
log.debug("Restoring last stored state.")
await self.state_manager.restore_files()
log.info("Offline changes check done.")
async def handle_done(self, agent: BaseAgent, response: AgentResponse) -> AgentResponse:
"""
Handle the DONE response from the agent and commit current state to the database.
This also checks for any files created or modified outside Pythagora and
imports them. If any of the files require input from the user, the returned response
will trigger the HumanInput agent to ask the user to provide the required input.
"""
n_epics = len(self.next_state.epics)
n_finished_epics = n_epics - len(self.next_state.unfinished_epics)
n_tasks = len(self.next_state.tasks)
n_finished_tasks = n_tasks - len(self.next_state.unfinished_tasks)
n_iterations = len(self.next_state.iterations)
n_finished_iterations = n_iterations - len(self.next_state.unfinished_iterations)
n_steps = len(self.next_state.steps)
n_finished_steps = n_steps - len(self.next_state.unfinished_steps)
log.debug(
f"Agent {agent.__class__.__name__} is done, "
f"committing state for step {self.current_state.step_index}: "
f"{n_finished_epics}/{n_epics} epics, "
f"{n_finished_tasks}/{n_tasks} tasks, "
f"{n_finished_iterations}/{n_iterations} iterations, "
f"{n_finished_steps}/{n_steps} dev steps."
)
await self.state_manager.commit()
# If there are any new or modified files changed outside Pythagora,
# this is a good time to add them to the project. If any of them have
# INPUT_REQUIRED, we'll first ask the user to provide the required input.
return await self.import_files()
def create_agent(self, prev_response: Optional[AgentResponse]) -> BaseAgent:
state = self.current_state
if prev_response:
if prev_response.type in [ResponseType.CANCEL, ResponseType.ERROR]:
return ErrorHandler(self.state_manager, self.ui, prev_response=prev_response)
if prev_response.type == ResponseType.CODE_REVIEW:
return CodeReviewer(self.state_manager, self.ui, prev_response=prev_response)
if prev_response.type == ResponseType.CODE_REVIEW_FEEDBACK:
return CodeMonkey(self.state_manager, self.ui, prev_response=prev_response, step=state.current_step)
if prev_response.type == ResponseType.DESCRIBE_FILES:
return CodeMonkey(self.state_manager, self.ui, prev_response=prev_response)
if prev_response.type == ResponseType.INPUT_REQUIRED:
# FIXME: HumanInput should be on the whole time and intercept chat/interrupt
return HumanInput(self.state_manager, self.ui, prev_response=prev_response)
if prev_response.type == ResponseType.UPDATE_EPIC:
return TechLead(self.state_manager, self.ui, prev_response=prev_response)
if prev_response.type == ResponseType.TASK_REVIEW_FEEDBACK:
return Developer(self.state_manager, self.ui, prev_response=prev_response)
if not state.specification.description:
# Ask the Spec Writer to refine and save the project specification
return SpecWriter(self.state_manager, self.ui)
elif not state.specification.architecture:
# Ask the Architect to design the project architecture and determine dependencies
return Architect(self.state_manager, self.ui, process_manager=self.process_manager)
elif (
not state.epics
or not self.current_state.unfinished_tasks
or (state.specification.template and not state.files)
):
# Ask the Tech Lead to break down the initial project or feature into tasks and apply the project template
return TechLead(self.state_manager, self.ui, process_manager=self.process_manager)
elif not state.steps and not state.iterations:
# Ask the Developer to break down current task into actionable steps
return Developer(self.state_manager, self.ui)
if state.current_step:
# Execute next step in the task
# TODO: this can be parallelized in the future
return self.create_agent_for_step(state.current_step)
if state.unfinished_iterations:
if state.current_iteration["description"]:
# Break down the next iteration into steps
return Developer(self.state_manager, self.ui)
else:
# We need to iterate over the current task but there's no solution, as Pythagora
# is stuck in a loop, and ProblemSolver needs to find alternative solutions.
return ProblemSolver(self.state_manager, self.ui)
# We have just finished the task, call Troubleshooter to ask the user to review
return Troubleshooter(self.state_manager, self.ui)
def create_agent_for_step(self, step: dict) -> BaseAgent:
step_type = step.get("type")
if step_type == "save_file":
return CodeMonkey(self.state_manager, self.ui, step=step)
elif step_type == "command":
return self.executor.for_step(step)
elif step_type == "human_intervention":
return HumanInput(self.state_manager, self.ui, step=step)
elif step_type == "review_task":
return TaskReviewer(self.state_manager, self.ui)
elif step_type == "create_readme":
return TechnicalWriter(self.state_manager, self.ui)
else:
raise ValueError(f"Unknown step type: {step_type}")
async def import_files(self) -> Optional[AgentResponse]:
imported_files = await self.state_manager.import_files()
if not imported_files:
return None
log.info(f"Imported new/changed files to project: {', '.join(f.path for f in imported_files)}")
input_required_files: list[dict[str, int]] = []
for file in imported_files:
for line in self.state_manager.get_input_required(file.content.content):
input_required_files.append({"file": file.path, "line": line})
if input_required_files:
# This will trigger the HumanInput agent to ask the user to provide the required changes
# If the user changes anything (removes the "required changes"), the file will be re-imported.
return AgentResponse.input_required(self, input_required_files)
# Commit the newly imported file
log.debug(f"Committing imported files as a separate step {self.current_state.step_index}")
await self.state_manager.commit()
return None
async def init_ui(self):
await self.ui.send_project_root(self.state_manager.get_full_project_root())
if self.current_state.epics:
await self.ui.send_project_stage(ProjectStage.CODING)
elif self.current_state.specification:
await self.ui.send_project_stage(ProjectStage.ARCHITECTURE)
else:
await self.ui.send_project_stage(ProjectStage.DESCRIPTION)
async def update_stats(self):
if self.current_state.steps and self.current_state.current_step:
source = self.current_state.current_step.get("source")
source_steps = [s for s in self.current_state.steps if s.get("source") == source]
await self.ui.send_step_progress(
source_steps.index(self.current_state.current_step) + 1,
len(source_steps),
self.current_state.current_step,
source,
)
total_files = 0
total_lines = 0
for file in self.current_state.files:
total_files += 1
total_lines += len(file.content.content.splitlines())
telemetry.set("num_files", total_files)
telemetry.set("num_lines", total_lines)
stats = telemetry.get_project_stats()
await self.ui.send_project_stats(stats)

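create_agent_for_step() above is the single dispatch point for per-step agents, so supporting a new step type is one extra branch. A hedged sketch; the "lint" step type and LintAgent are hypothetical and not part of this commit:

def create_agent_for_step(self, step: dict) -> BaseAgent:
    step_type = step.get("type")
    if step_type == "save_file":
        return CodeMonkey(self.state_manager, self.ui, step=step)
    elif step_type == "lint":  # hypothetical new step type
        return LintAgent(self.state_manager, self.ui, step=step)  # hypothetical agent
    # ... remaining branches as above ...
    else:
        raise ValueError(f"Unknown step type: {step_type}")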
126
core/agents/problem_solver.py Normal file
View File

@@ -0,0 +1,126 @@
from typing import Optional
from pydantic import BaseModel, Field
from core.agents.base import BaseAgent
from core.agents.convo import AgentConvo
from core.agents.response import AgentResponse
from core.agents.mixins import IterationPromptMixin
from core.llm.parser import JSONParser
from core.log import get_logger
log = get_logger(__name__)
class AlternativeSolutions(BaseModel):
# FIXME: This is probably extra leftover from some dead code in the old implementation
description_of_tried_solutions: str = Field(
description="A description of the solutions that were tried to solve the recurring issue that was labeled as loop by the user.",
)
alternative_solutions: list[str] = Field(
description=("List of all alternative solutions to the recurring issue that was labeled as loop by the user.")
)
class ProblemSolver(IterationPromptMixin, BaseAgent):
agent_type = "problem-solver"
display_name = "Problem Solver"
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
self.iteration = self.current_state.current_iteration
self.next_state_iteration = self.next_state.current_iteration
self.previous_solutions = [s for s in self.iteration["alternative_solutions"] if s["tried"]]
self.possible_solutions = [s for s in self.iteration["alternative_solutions"] if not s["tried"]]
async def run(self) -> AgentResponse:
if self.iteration is None:
log.warning("ProblemSolver agent started without an iteration to work on, possible bug?")
return AgentResponse.done(self)
if not self.possible_solutions:
await self.generate_alternative_solutions()
return AgentResponse.done(self)
return await self.try_alternative_solutions()
async def generate_alternative_solutions(self):
llm = self.get_llm()
convo = (
AgentConvo(self)
.template(
"get_alternative_solutions",
user_input=self.iteration["user_feedback"],
iteration=self.iteration,
previous_solutions=self.previous_solutions,
)
.require_schema(AlternativeSolutions)
)
llm_response: AlternativeSolutions = await llm(
convo,
parser=JSONParser(spec=AlternativeSolutions),
temperature=1,
)
self.next_state_iteration["alternative_solutions"] = self.iteration["alternative_solutions"] + [
{
"user_feedback": None,
"description": solution,
"tried": False,
}
for solution in llm_response.alternative_solutions
]
self.next_state.flag_iterations_as_modified()
async def try_alternative_solutions(self) -> AgentResponse:
preferred_solution = await self.ask_for_preferred_solution()
if preferred_solution is None:
# TODO: We have several alternative solutions but the user didn't choose any.
# This means the user either needs expert help, or that they need to go back and
# maybe rephrase the tasks or even the project specs.
# For now, we'll just mark these as not working and try to regenerate.
self.next_state_iteration["alternative_solutions"] = [
{
**s,
"tried": True,
"user_feedback": s["user_feedback"] or "That doesn't sound like a good idea, try something else.",
}
for s in self.possible_solutions
]
self.next_state.flag_iterations_as_modified()
return AgentResponse.done(self)
index, next_solution_to_try = preferred_solution
llm_solution = await self.find_solution(
self.iteration["user_feedback"],
next_solution_to_try=next_solution_to_try,
)
self.next_state_iteration["alternative_solutions"][index]["tried"] = True
self.next_state_iteration["description"] = llm_solution
self.next_state_iteration["attempts"] = self.iteration["attempts"] + 1
self.next_state.flag_iterations_as_modified()
return AgentResponse.done(self)
async def ask_for_preferred_solution(self) -> Optional[tuple[int, str]]:
solutions = self.possible_solutions
buttons = {}
for i in range(len(solutions)):
buttons[str(i)] = str(i + 1)
buttons["none"] = "None of these"
solutions_txt = "\n\n".join([f"{i+1}: {s['description']}" for i, s in enumerate(solutions)])
user_response = await self.ask_question(
"Choose which solution would you like Pythagora to try next:\n\n" + solutions_txt,
buttons=buttons,
default="0",
buttons_only=True,
)
if user_response.button == "none" or user_response.cancelled:
return None
try:
i = int(user_response.button)
return i, solutions[i]
except (ValueError, IndexError):
return None

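For reference, the lifecycle of one alternative_solutions entry as the code above reads and writes it; a descriptive sketch with illustrative values, not a formal schema:

solution = {
    "user_feedback": None,  # set when the user rejects the solution or comments on it
    "description": "Pin the dependency to the last known-good version",  # illustrative
    "tried": False,  # generate_alternative_solutions() starts here;
                     # try_alternative_solutions() flips it to True
}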
139
core/agents/response.py Normal file
View File

@@ -0,0 +1,139 @@
from enum import Enum
from typing import TYPE_CHECKING, Optional
from core.log import get_logger
if TYPE_CHECKING:
from core.agents.base import BaseAgent
from core.agents.error_handler import ErrorHandler
log = get_logger(__name__)
class ResponseType(str, Enum):
DONE = "done"
"""Agent has finished processing."""
ERROR = "error"
"""There was an error processing the request."""
CANCEL = "cancel"
"""User explicitly cancelled the operation."""
EXIT = "exit"
"""Pythagora should exit."""
CODE_REVIEW = "code-review"
"""Agent is requesting a review of the created code."""
CODE_REVIEW_FEEDBACK = "code-review-feedback"
"""Agent is providing feedback on the code review."""
DESCRIBE_FILES = "describe-files"
"""Analysis of the files in the project is requested."""
INPUT_REQUIRED = "input-required"
"""User needs to modify a line in the generated code."""
UPDATE_EPIC = "update-epic"
"""Update the epic development plan after a task was iterated on."""
TASK_REVIEW_FEEDBACK = "task-review-feedback"
"""Agent is providing feedback on the entire task."""
class AgentResponse:
type: ResponseType = ResponseType.DONE
agent: "BaseAgent"
data: Optional[dict]
def __init__(self, type: ResponseType, agent: "BaseAgent", data: Optional[dict] = None):
self.type = type
self.agent = agent
self.data = data
def __repr__(self) -> str:
return f"<AgentResponse type={self.type} agent={self.agent}>"
@staticmethod
def done(agent: "BaseAgent") -> "AgentResponse":
return AgentResponse(type=ResponseType.DONE, agent=agent)
@staticmethod
def error(agent: "BaseAgent", message: str, details: Optional[dict] = None) -> "AgentResponse":
return AgentResponse(
type=ResponseType.ERROR,
agent=agent,
data={"message": message, "details": details},
)
@staticmethod
def cancel(agent: "BaseAgent") -> "AgentResponse":
return AgentResponse(type=ResponseType.CANCEL, agent=agent)
@staticmethod
def exit(agent: "ErrorHandler") -> "AgentResponse":
return AgentResponse(type=ResponseType.EXIT, agent=agent)
@staticmethod
def code_review(
agent: "BaseAgent",
path: str,
instructions: str,
old_content: str,
new_content: str,
attempt: int,
) -> "AgentResponse":
return AgentResponse(
type=ResponseType.CODE_REVIEW,
agent=agent,
data={
"path": path,
"instructions": instructions,
"old_content": old_content,
"new_content": new_content,
"attempt": attempt,
},
)
@staticmethod
def code_review_feedback(
agent: "BaseAgent",
new_content: str,
approved_content: str,
feedback: str,
attempt: int,
) -> "AgentResponse":
return AgentResponse(
type=ResponseType.CODE_REVIEW_FEEDBACK,
agent=agent,
data={
"new_content": new_content,
"approved_content": approved_content,
"feedback": feedback,
"attempt": attempt,
},
)
@staticmethod
def describe_files(agent: "BaseAgent") -> "AgentResponse":
return AgentResponse(type=ResponseType.DESCRIBE_FILES, agent=agent)
@staticmethod
def input_required(agent: "BaseAgent", files: list[dict[str, int]]) -> "AgentResponse":
return AgentResponse(type=ResponseType.INPUT_REQUIRED, agent=agent, data={"files": files})
@staticmethod
def update_epic(agent: "BaseAgent") -> "AgentResponse":
return AgentResponse(type=ResponseType.UPDATE_EPIC, agent=agent)
@staticmethod
def task_review_feedback(agent: "BaseAgent", feedback: str) -> "AgentResponse":
return AgentResponse(
type=ResponseType.TASK_REVIEW_FEEDBACK,
agent=agent,
data={
"feedback": feedback,
},
)

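Agents construct responses exclusively through these static factories rather than calling the constructor directly. A hedged usage sketch; the build_succeeded flag and the details values are illustrative:

from core.agents.response import AgentResponse

# Inside an agent's run() method (sketch):
if build_succeeded:  # hypothetical flag
    return AgentResponse.done(self)  # Orchestrator commits state and moves on
else:
    return AgentResponse.error(self, "Build failed", {"cmd": "make", "status_code": 2})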
143
core/agents/spec_writer.py Normal file
View File

@@ -0,0 +1,143 @@
from core.agents.base import BaseAgent
from core.agents.convo import AgentConvo
from core.agents.response import AgentResponse
from core.db.models import Complexity
from core.llm.parser import StringParser
from core.telemetry import telemetry
from core.templates.example_project import (
EXAMPLE_PROJECT_ARCHITECTURE,
EXAMPLE_PROJECT_DESCRIPTION,
EXAMPLE_PROJECT_PLAN,
)
# If the project description is less than this, perform an analysis using LLM
ANALYZE_THRESHOLD = 1500
# URL to the wiki page with tips on how to write a good project description
INITIAL_PROJECT_HOWTO_URL = (
"https://github.com/Pythagora-io/gpt-pilot/wiki/How-to-write-a-good-initial-project-description"
)
class SpecWriter(BaseAgent):
agent_type = "spec-writer"
display_name = "Spec Writer"
async def run(self) -> AgentResponse:
response = await self.ask_question(
"Describe your app in as much detail as possible",
allow_empty=False,
buttons={"example": "Start an example project"},
)
if response.cancelled:
return AgentResponse.error(self, "No project description")
if response.button == "example":
self.prepare_example_project()
return AgentResponse.done(self)
spec = response.text
complexity = await self.check_prompt_complexity(spec)
if len(spec) < ANALYZE_THRESHOLD and complexity != Complexity.SIMPLE:
spec = await self.analyze_spec(spec)
spec = await self.review_spec(spec)
self.next_state.specification = self.current_state.specification.clone()
self.next_state.specification.description = spec
self.next_state.specification.complexity = complexity
telemetry.set("initial_prompt", spec)
telemetry.set("is_complex_app", complexity != Complexity.SIMPLE)
return AgentResponse.done(self)
async def check_prompt_complexity(self, prompt: str) -> str:
await self.send_message("Checking the complexity of the prompt ...")
llm = self.get_llm()
convo = AgentConvo(self).template("prompt_complexity", prompt=prompt)
llm_response: str = await llm(convo, temperature=0, parser=StringParser())
return llm_response.lower()
def prepare_example_project(self):
spec = self.current_state.specification.clone()
spec.description = EXAMPLE_PROJECT_DESCRIPTION
spec.architecture = EXAMPLE_PROJECT_ARCHITECTURE["architecture"]
spec.system_dependencies = EXAMPLE_PROJECT_ARCHITECTURE["system_dependencies"]
spec.package_dependencies = EXAMPLE_PROJECT_ARCHITECTURE["package_dependencies"]
spec.template = EXAMPLE_PROJECT_ARCHITECTURE["template"]
spec.complexity = Complexity.SIMPLE
telemetry.set("initial_prompt", spec.description.strip())
telemetry.set("is_complex_app", False)
telemetry.set("template", spec.template)
telemetry.set(
"architecture",
{
"architecture": spec.architecture,
"system_dependencies": spec.system_dependencies,
"package_dependencies": spec.package_dependencies,
},
)
self.next_state.specification = spec
self.next_state.epics = [
{
"name": "Initial Project",
"description": EXAMPLE_PROJECT_DESCRIPTION,
"completed": False,
"complexity": Complexity.SIMPLE,
}
]
self.next_state.tasks = EXAMPLE_PROJECT_PLAN
async def analyze_spec(self, spec: str) -> str:
msg = (
"Your project description seems a bit short. "
"The better you can describe the project, the better GPT Pilot will understand what you'd like to build.\n\n"
f"Here are some tips on how to better describe the project: {INITIAL_PROJECT_HOWTO_URL}\n\n"
"Let's start by refining your project idea:"
)
await self.send_message(msg)
llm = self.get_llm()
convo = AgentConvo(self).template("ask_questions").user(spec)
while True:
response: str = await llm(convo)
if len(response) > 500:
# The response is too long for it to be a question, assume it's the spec
confirm = await self.ask_question(
(
"Can we proceed with this project description? If so, just press ENTER. "
"Otherwise, please tell me what's missing or what you'd like to add."
),
allow_empty=True,
buttons={"continue": "Continue"},
)
if confirm.cancelled or confirm.button == "continue" or confirm.text == "":
return spec
convo.user(confirm.text)
else:
convo.assistant(response)
user_response = await self.ask_question(
response,
buttons={"skip": "Skip questions"},
)
if user_response.cancelled or user_response.button == "skip":
convo.user(
"This is enough clarification, you have all the information. "
"Please output the spec now, without additional comments or questions."
)
response: str = await llm(convo)
return response
convo.user(user_response.text)
async def review_spec(self, spec: str) -> str:
convo = AgentConvo(self).template("review_spec", spec=spec)
llm = self.get_llm()
llm_response: str = await llm(convo, temperature=0)
additional_info = llm_response.strip()
if additional_info:
spec += "\nAdditional info/examples:\n" + additional_info
return spec

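A quick worked example of the ANALYZE_THRESHOLD gate above; values are illustrative, and it assumes Complexity is a string-valued enum, which the direct comparison against check_prompt_complexity()'s lowercased output suggests:

spec = "A todo app with user accounts"  # ~30 characters, well under 1500
complexity = "hard"                     # anything other than Complexity.SIMPLE
if len(spec) < ANALYZE_THRESHOLD and complexity != Complexity.SIMPLE:
    # Short but non-trivial description: run the interactive analyze_spec()
    # question loop, then review_spec(), before saving the specification.
    pass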
53
core/agents/task_reviewer.py Normal file
View File

@@ -0,0 +1,53 @@
from core.agents.base import BaseAgent
from core.agents.convo import AgentConvo
from core.agents.response import AgentResponse
from core.log import get_logger
log = get_logger(__name__)
class TaskReviewer(BaseAgent):
agent_type = "task-reviewer"
display_name = "Task Reviewer"
async def run(self) -> AgentResponse:
response = await self.review_code_changes()
self.next_state.complete_step()
return response
async def review_code_changes(self) -> AgentResponse:
"""
Review all the code changes during current task.
"""
log.debug(f"Reviewing code changes for task {self.current_state.current_task['description']}")
await self.send_message("Reviewing the task implementation ...")
all_feedbacks = [
iteration["user_feedback"].replace("```", "").strip()
for iteration in self.current_state.iterations
# Some iterations are created by the task reviewer and have no user feedback
if iteration["user_feedback"]
]
files_before_modification = self.current_state.modified_files
files_after_modification = [
(file.path, file.content.content)
for file in self.current_state.files
if (file.path in files_before_modification)
]
llm = self.get_llm()
# TODO instead of sending files before and after maybe add nice way to show diff for multiple files
convo = AgentConvo(self).template(
"review_task",
current_task=self.current_state.current_task,
all_feedbacks=all_feedbacks,
files_before_modification=files_before_modification,
files_after_modification=files_after_modification,
)
llm_response: str = await llm(convo, temperature=0.7)
if llm_response.strip().lower() == "done":
return AgentResponse.done(self)
else:
return AgentResponse.task_review_feedback(self, llm_response)

196
core/agents/tech_lead.py Normal file
View File

@@ -0,0 +1,196 @@
from typing import Optional
from uuid import uuid4
from pydantic import BaseModel, Field
from core.agents.base import BaseAgent
from core.agents.convo import AgentConvo
from core.agents.response import AgentResponse, ResponseType
from core.db.models import Complexity
from core.llm.parser import JSONParser
from core.log import get_logger
from core.templates.registry import apply_project_template, get_template_summary
from core.ui.base import ProjectStage
log = get_logger(__name__)
class Task(BaseModel):
description: str = Field(description=("Very detailed description of a development task."))
class DevelopmentPlan(BaseModel):
plan: list[Task] = Field(description="List of development tasks that need to be done to implement the entire plan.")
class UpdatedDevelopmentPlan(BaseModel):
updated_current_task: Task = Field(
description="Updated detailed description of what was implemented while working on the current development task."
)
plan: list[Task] = Field(description="List of unfinished development tasks.")
class TechLead(BaseAgent):
agent_type = "tech-lead"
display_name = "Tech Lead"
async def run(self) -> AgentResponse:
if self.prev_response and self.prev_response.type == ResponseType.UPDATE_EPIC:
return await self.update_epic()
if len(self.current_state.epics) == 0:
self.create_initial_project_epic()
# Orchestrator will rerun us to break down the initial project epic
return AgentResponse.done(self)
await self.ui.send_project_stage(ProjectStage.CODING)
if self.current_state.specification.template and not self.current_state.files:
await self.apply_project_template()
return AgentResponse.done(self)
unfinished_epics = self.current_state.unfinished_epics
if unfinished_epics:
return await self.plan_epic(unfinished_epics[0])
else:
return await self.ask_for_new_feature()
def create_initial_project_epic(self):
log.debug("Creating initial project epic")
self.next_state.epics = [
{
"id": uuid4().hex,
"name": "Initial Project",
"source": "app",
"description": self.current_state.specification.description,
"summary": None,
"completed": False,
"complexity": self.current_state.specification.complexity,
}
]
async def apply_project_template(self) -> Optional[str]:
state = self.current_state
# Only do this for the initial project and if the template is specified
if len(state.epics) != 1 or not state.specification.template:
return None
log.info(f"Applying project template: {self.current_state.specification.template}")
await self.send_message(f"Applying project template {self.current_state.specification.template} ...")
summary = await apply_project_template(
self.current_state.specification.template,
self.state_manager,
self.process_manager,
)
# Saving template files will fill this in and we want it clear for the
# first task.
self.next_state.relevant_files = []
return summary
async def ask_for_new_feature(self) -> AgentResponse:
log.debug("Asking for new feature")
response = await self.ask_question(
"Do you have a new feature to add to the project? Just write it here",
buttons={"end": "No, I'm done"},
allow_empty=True,
)
if response.cancelled or response.button == "end" or not response.text:
return AgentResponse.exit(self)
self.next_state.epics = self.current_state.epics + [
{
"id": uuid4().hex,
"name": f"Feature #{len(self.current_state.epics)}",
"source": "feature",
"description": response.text,
"summary": None,
"completed": False,
"complexity": Complexity.HARD,
}
]
# Orchestrator will rerun us to break down the new feature epic
return AgentResponse.done(self)
async def plan_epic(self, epic) -> AgentResponse:
log.debug(f"Planning tasks for the epic: {epic['name']}")
await self.send_message("Starting to create the action plan for development ...")
llm = self.get_llm()
convo = (
AgentConvo(self)
.template(
"plan",
epic=epic,
task_type=self.current_state.current_epic.get("source", "app"),
existing_summary=get_template_summary(self.current_state.specification.template),
)
.require_schema(DevelopmentPlan)
)
response: DevelopmentPlan = await llm(convo, parser=JSONParser(DevelopmentPlan))
self.next_state.tasks = self.current_state.tasks + [
{
"id": uuid4().hex,
"description": task.description,
"instructions": None,
"completed": False,
}
for task in response.plan
]
return AgentResponse.done(self)
async def update_epic(self) -> AgentResponse:
"""
Update the development plan for the current epic.
As a side effect, this also marks the current task as complete,
and should only be called by Troubleshooter once the task is done,
if the Troubleshooter decides a plan update is needed.
"""
epic = self.current_state.current_epic
self.next_state.complete_task()
await self.state_manager.log_task_completed()
if not self.next_state.unfinished_tasks:
# There are no tasks after this one, so there's nothing to update
return AgentResponse.done(self)
finished_tasks = [task for task in self.next_state.tasks if task["completed"]]
log.debug(f"Updating development plan for {epic['name']}")
await self.ui.send_message("Updating development plan ...")
llm = self.get_llm()
convo = (
AgentConvo(self)
.template(
"update_plan",
finished_tasks=finished_tasks,
task_type=self.current_state.current_epic.get("source", "app"),
modified_files=[f for f in self.current_state.files if f.path in self.current_state.modified_files],
)
.require_schema(UpdatedDevelopmentPlan)
)
response: UpdatedDevelopmentPlan = await llm(
convo,
parser=JSONParser(UpdatedDevelopmentPlan),
temperature=0,
)
log.debug(f"Reworded last task as: {response.updated_current_task.description}")
finished_tasks[-1]["description"] = response.updated_current_task.description
self.next_state.tasks = finished_tasks + [
{
"id": uuid4().hex,
"description": task.description,
"instructions": None,
"completed": False,
}
for task in response.plan
]
log.debug(f"Updated development plan for {epic['name']}, {len(response.plan)} tasks remaining")
return AgentResponse.done(self)

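For reference, the two state dict shapes TechLead writes, collected from the code above; a descriptive sketch with placeholder descriptions, not a formal schema:

from uuid import uuid4
from core.db.models import Complexity

epic = {
    "id": uuid4().hex,
    "name": "Initial Project",  # or f"Feature #{n}" for follow-up features
    "source": "app",            # "app" for the initial project, "feature" otherwise
    "description": "...",       # project spec or the user's feature request
    "summary": None,
    "completed": False,
    "complexity": Complexity.HARD,  # new features default to HARD above
}
task = {
    "id": uuid4().hex,
    "description": "...",  # from the LLM's DevelopmentPlan
    "instructions": None,  # filled in later when the task is broken down
    "completed": False,
}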
30
core/agents/tech_writer.py Normal file
View File

@@ -0,0 +1,30 @@
from core.agents.base import BaseAgent
from core.agents.convo import AgentConvo
from core.agents.response import AgentResponse
from core.log import get_logger
log = get_logger(__name__)
class TechnicalWriter(BaseAgent):
agent_type = "tech-writer"
display_name = "Technical Writer"
async def run(self) -> AgentResponse:
n_tasks = len(self.current_state.tasks)
n_unfinished = len(self.current_state.unfinished_tasks)
if n_unfinished in [n_tasks // 2, 1]:
# Halfway through the initial project, and at the last task
await self.create_readme()
self.next_state.complete_step()
return AgentResponse.done(self)
async def create_readme(self):
await self.ui.send_message("Creating README ...")
llm = self.get_llm()
convo = AgentConvo(self).template("create_readme")
llm_response: str = await llm(convo)
await self.state_manager.save_file("README.md", llm_response)

281
core/agents/troubleshooter.py Normal file
View File

@@ -0,0 +1,281 @@
from typing import Optional
from uuid import uuid4
from pydantic import BaseModel, Field
from core.agents.base import BaseAgent
from core.agents.convo import AgentConvo
from core.agents.mixins import IterationPromptMixin
from core.agents.response import AgentResponse
from core.llm.parser import JSONParser, OptionalCodeBlockParser
from core.log import get_logger
from core.telemetry import telemetry
log = get_logger(__name__)
LOOP_THRESHOLD = 3 # number of iterations in task to be considered a loop
class BugReportQuestions(BaseModel):
missing_data: list[str] = Field(
description="Very clear question that needs to be answered to have good bug report."
)
class Troubleshooter(IterationPromptMixin, BaseAgent):
agent_type = "troubleshooter"
display_name = "Troubleshooter"
async def run(self) -> AgentResponse:
run_command = await self.get_run_command()
user_instructions = await self.get_user_instructions()
if user_instructions is None:
# LLM decided we don't need to test anything, so we're done with the task
return await self.complete_task()
# Developer sets iteration as "completed" when it generates the step breakdown, so we can't
# use "current_iteration" here
last_iteration = self.current_state.iterations[-1] if self.current_state.iterations else None
should_iterate, is_loop, user_feedback = await self.get_user_feedback(
run_command,
user_instructions,
last_iteration is not None,
)
if not should_iterate:
# User tested and reported no problems, we're done with the task
return await self.complete_task()
user_feedback_qa = await self.generate_bug_report(run_command, user_instructions, user_feedback)
if is_loop:
if last_iteration["alternative_solutions"]:
# If we already have alternative solutions, it means we were already in a loop.
return self.try_next_alternative_solution(user_feedback, user_feedback_qa)
else:
# Newly detected loop, set up an empty new iteration to trigger ProblemSolver
llm_solution = ""
await self.trace_loop("loop-feedback")
else:
llm_solution = await self.find_solution(user_feedback, user_feedback_qa=user_feedback_qa)
self.next_state.iterations = self.current_state.iterations + [
{
"id": uuid4().hex,
"user_feedback": user_feedback,
"user_feedback_qa": user_feedback_qa,
"description": llm_solution,
"alternative_solutions": [],
# FIXME - this is incorrect if this is a new problem; otherwise we could
# just count the iterations
"attempts": 1,
"completed": False,
}
]
if len(self.next_state.iterations) == LOOP_THRESHOLD:
await self.trace_loop("loop-start")
return AgentResponse.done(self)
async def complete_task(self) -> AgentResponse:
"""
Mark the current task as completed.
If there were iterations for the task, instead of marking the task as completed directly,
we ask the TechLead to update the epic (it needs the state of the current task) and then mark
the task as completed.
"""
self.next_state.steps = []
if len(self.current_state.iterations) >= LOOP_THRESHOLD:
await self.trace_loop("loop-end")
if self.current_state.iterations:
return AgentResponse.update_epic(self)
else:
self.next_state.complete_task()
await self.state_manager.log_task_completed()
await self.ui.send_task_progress(
self.current_state.tasks.index(self.current_state.current_task) + 1,
len(self.current_state.tasks),
self.current_state.current_task["description"],
self.current_state.current_epic.get("source", "app"),
"done",
)
return AgentResponse.done(self)
def _get_task_convo(self) -> AgentConvo:
# FIXME: Current prompts reuse conversation from the developer so we have to resort to this
task = self.current_state.current_task
current_task_index = self.current_state.tasks.index(task)
return (
AgentConvo(self)
.template(
"breakdown",
task=task,
iteration=None,
current_task_index=current_task_index,
)
.assistant(self.current_state.current_task["instructions"])
)
async def get_run_command(self) -> Optional[str]:
if self.current_state.run_command:
return self.current_state.run_command
await self.send_message("Figuring out how to run the app ...")
llm = self.get_llm()
convo = self._get_task_convo().template("get_run_command")
# Although the prompt is explicit about not using "```", LLM may still return it
llm_response: str = await llm(convo, temperature=0, parser=OptionalCodeBlockParser())
self.next_state.run_command = llm_response
return llm_response
async def get_user_instructions(self) -> Optional[str]:
await self.send_message("Determining how to test the app ...")
llm = self.get_llm()
convo = self._get_task_convo().template("define_user_review_goal", task=self.current_state.current_task)
user_instructions: str = await llm(convo)
user_instructions = user_instructions.strip()
if user_instructions.lower() == "done":
log.debug(f"Nothing to do for user testing for task {self.current_state.current_task['description']}")
return None
return user_instructions
async def get_user_feedback(
self,
run_command: str,
user_instructions: str,
last_iteration: Optional[dict],
) -> tuple[bool, bool, str]:
"""
Ask the user to test the app and provide feedback.
:return (bool, bool, str): Tuple containing "should_iterate", "is_loop" and
"user_feedback" respectively.
If "should_iterate" is False, the user has confirmed that the app works as expected and there's
nothing for the troubleshooter or problem solver to do.
If "is_loop" is True, Pythagora is stuck in a loop and needs to consider alternative solutions.
The last element in the tuple is the user feedback, which may be empty if the user provided no
feedback (eg. if they just clicked on "Continue" or "I'm stuck in a loop").
"""
test_message = "Can you check if the app works please?"
if user_instructions:
test_message += " Here is a description of what should be working:\n\n" + user_instructions
if run_command:
await self.ui.send_run_command(run_command)
buttons = {"continue": "Everything works, continue"}
if last_iteration:
buttons["loop"] = "I still have the same issue"
user_response = await self.ask_question(
test_message,
buttons=buttons,
default="continue",
)
if user_response.button == "continue" or user_response.cancelled:
return False, False, ""
if user_response.button == "loop":
return True, True, ""
return True, False, user_response.text
def try_next_alternative_solution(self, user_feedback: str, user_feedback_qa: list[str]) -> AgentResponse:
"""
Call the ProblemSolver to try an alternative solution.
Stores the user feedback and sets iteration state (not completed, no description)
so that ProblemSolver will be triggered.
:param user_feedback: User feedback to store in the iteration state.
:param user_feedback_qa: Additional questions/answers about the problem.
:return: Agent response done.
"""
next_state_iteration = self.next_state.iterations[-1]
next_state_iteration["description"] = ""
next_state_iteration["user_feedback"] = user_feedback
next_state_iteration["user_feedback_qa"] = user_feedback_qa
next_state_iteration["attempts"] += 1
next_state_iteration["completed"] = False
self.next_state.flag_iterations_as_modified()
return AgentResponse.done(self)
async def generate_bug_report(
self,
run_command: Optional[str],
user_instructions: str,
user_feedback: str,
) -> list[str]:
"""
Generate a bug report from the user feedback.
:param run_command: The command to run to test the app.
:param user_instructions: Instructions on how to test the functionality.
:param user_feedback: The user feedback.
:return: Additional questions and answers to generate a better bug report.
"""
additional_qa = []
llm = self.get_llm()
convo = (
AgentConvo(self)
.template(
"bug_report",
user_instructions=user_instructions,
user_feedback=user_feedback,
# TODO: revisit if we again want to run this in a loop, where this is useful
additional_qa=additional_qa,
)
.require_schema(BugReportQuestions)
)
llm_response: BugReportQuestions = await llm(convo, parser=JSONParser(BugReportQuestions))
if not llm_response.missing_data:
return []
for question in llm_response.missing_data:
if run_command:
await self.ui.send_run_command(run_command)
user_response = await self.ask_question(
question,
buttons={
"continue": "Submit answer",
"skip": "Skip this question",
"skip-all": "Skip all questions",
},
allow_empty=False,
)
if user_response.cancelled or user_response.button == "skip-all":
break
elif user_response.button == "skip":
continue
additional_qa.append(
{
"question": question,
"answer": user_response.text,
}
)
return additional_qa
async def trace_loop(self, trace_event: str):
state = self.current_state
task_with_loop = {
"task_description": state.current_task["description"],
"task_number": len([t for t in state.tasks if t["completed"]]) + 1,
"steps": len(state.steps),
"iterations": len(state.iterations),
}
await telemetry.trace_loop(trace_event, task_with_loop)

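A short sketch of consuming the three-element tuple from get_user_feedback(), matching the docstring above; a fragment from inside an async agent method, not a standalone program:

# Inside an async method of the Troubleshooter (sketch):
should_iterate, is_loop, user_feedback = await self.get_user_feedback(
    run_command, user_instructions, last_iteration is not None
)
if not should_iterate:
    pass  # user confirmed the app works; complete the task
elif is_loop:
    pass  # "I still have the same issue" -> alternative-solutions path
else:
    pass  # user_feedback describes the problem to debug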
0
core/cli/__init__.py Normal file
View File

319
core/cli/helpers.py Normal file
View File

@@ -0,0 +1,319 @@
import json
import os
import os.path
import sys
from argparse import ArgumentParser, ArgumentTypeError, Namespace
from typing import Optional
from urllib.parse import urlparse
from uuid import UUID
from core.config import Config, LLMProvider, LocalIPCConfig, ProviderConfig, UIAdapter, get_config, loader
from core.config.env_importer import import_from_dotenv
from core.config.version import get_version
from core.db.session import SessionManager
from core.db.setup import run_migrations
from core.log import setup
from core.state.state_manager import StateManager
from core.ui.base import UIBase
from core.ui.console import PlainConsoleUI
from core.ui.ipc_client import IPCClientUI
def parse_llm_endpoint(value: str) -> Optional[tuple[LLMProvider, str]]:
"""
Parse --llm-endpoint command-line option.
Option syntax is: --llm-endpoint <provider>:<url>
:param value: Argument value.
:return: Tuple with LLM provider and URL, or None if the option wasn't provided.
"""
if not value:
return None
parts = value.split(":", 1)
if len(parts) != 2:
raise ArgumentTypeError("Invalid LLM endpoint format; expected 'provider:url'")
try:
provider = LLMProvider(parts[0])
except ValueError as err:
raise ArgumentTypeError(f"Unsupported LLM provider: {err}")
url = urlparse(parts[1])
if url.scheme not in ("http", "https"):
raise ArgumentTypeError(f"Invalid LLM endpoint URL: {parts[1]}")
return provider, url.geturl()
def parse_llm_key(value: str) -> Optional[tuple[LLMProvider, str]]:
"""
Parse --llm-key command-line option.
Option syntax is: --llm-key <provider>:<key>
:param value: Argument value.
:return: Tuple with LLM provider and key, or None if the option wasn't provided.
"""
if not value:
return None
parts = value.split(":", 1)
if len(parts) != 2:
raise ArgumentTypeError("Invalid LLM endpoint format; expected 'provider:key'")
try:
provider = LLMProvider(parts[0])
except ValueError as err:
raise ArgumentTypeError(f"Unsupported LLM provider: {err}")
return provider, parts[1]
def parse_arguments() -> Namespace:
"""
Parse command-line arguments.
Available arguments:
--help: Show the help message
--config: Path to the configuration file
--show-config: Output the default configuration to stdout
--level: Log level (debug,info,warning,error,critical)
--database: Database URL
--local-ipc-port: Local IPC port to connect to
--local-ipc-host: Local IPC host to connect to
--version: Show the version and exit
--list: List all projects
--list-json: List all projects in JSON format
--project: Load a specific project
--branch: Load a specific branch
--step: Load a specific step in a project/branch
--delete: Delete a specific project
--llm-endpoint: Use specific API endpoint for the given provider
--llm-key: Use specific LLM key for the given provider
--import-v0: Import data from a v0 (gpt-pilot) database with the given path
--email: User's email address, if provided
--extension-version: Version of the VSCode extension, if used
:return: Parsed arguments object.
"""
version = get_version()
parser = ArgumentParser()
parser.add_argument("--config", help="Path to the configuration file", default="config.json")
parser.add_argument("--show-config", help="Output the default configuration to stdout", action="store_true")
parser.add_argument("--level", help="Log level (debug,info,warning,error,critical)", required=False)
parser.add_argument("--database", help="Database URL", required=False)
parser.add_argument("--local-ipc-port", help="Local IPC port to connect to", type=int, required=False)
parser.add_argument("--local-ipc-host", help="Local IPC host to connect to", default="localhost", required=False)
parser.add_argument("--version", action="version", version=version)
parser.add_argument("--list", help="List all projects", action="store_true")
parser.add_argument("--list-json", help="List all projects in JSON format", action="store_true")
parser.add_argument("--project", help="Load a specific project", type=UUID, required=False)
parser.add_argument("--branch", help="Load a specific branch", type=UUID, required=False)
parser.add_argument("--step", help="Load a specific step in a project/branch", type=int, required=False)
parser.add_argument("--delete", help="Delete a specific project", type=UUID, required=False)
parser.add_argument(
"--llm-endpoint",
help="Use specific API endpoint for the given provider",
type=parse_llm_endpoint,
action="append",
required=False,
)
parser.add_argument(
"--llm-key",
help="Use specific LLM key for the given provider",
type=parse_llm_key,
action="append",
required=False,
)
parser.add_argument(
"--import-v0",
help="Import data from a v0 (gpt-pilot) database with the given path",
required=False,
)
parser.add_argument("--email", help="User's email address", required=False)
parser.add_argument("--extension-version", help="Version of the VSCode extension", required=False)
return parser.parse_args()
def load_config(args: Namespace) -> Optional[Config]:
"""
Load Pythagora JSON configuration file and apply command-line arguments.
:param args: Command-line arguments (at least `config` must be present).
:return: Configuration object, or None if config couldn't be loaded.
"""
if not os.path.isfile(args.config):
imported = import_from_dotenv(args.config)
if not imported:
print(f"Configuration file not found: {args.config}; using default", file=sys.stderr)
return get_config()
try:
config = loader.load(args.config)
except ValueError as err:
print(f"Error parsing config file {args.config}: {err}", file=sys.stderr)
return None
if args.level:
config.log.level = args.level.upper()
if args.database:
config.db.url = args.database
if args.local_ipc_port:
config.ui = LocalIPCConfig(port=args.local_ipc_port, host=args.local_ipc_host)
if args.llm_endpoint:
for provider, endpoint in args.llm_endpoint:
if provider not in config.llm:
config.llm[provider] = ProviderConfig()
config.llm[provider].base_url = endpoint
if args.llm_key:
for provider, key in args.llm_key:
if provider not in config.llm:
config.llm[provider] = ProviderConfig()
config.llm[provider].api_key = key
try:
Config.model_validate(config)
except ValueError as err:
print(f"Configuration error: {err}", file=sys.stderr)
return None
return config
async def list_projects_json(db: SessionManager):
"""
List all projects in the database in JSON format.
"""
sm = StateManager(db)
projects = await sm.list_projects()
data = []
for project in projects:
p = {
"name": project.name,
"id": project.id.hex,
"branches": [],
}
for branch in project.branches:
b = {
"name": branch.name,
"id": branch.id.hex,
"steps": [],
}
for state in branch.states:
s = {
"name": f"Step #{state.step_index}",
"step": state.step_index,
}
b["steps"].append(s)
if b["steps"]:
b["steps"][-1]["name"] = "Latest step"
p["branches"].append(b)
data.append(p)
print(json.dumps(data, indent=2))
async def list_projects(db: SessionManager):
"""
List all projects in the database.
"""
sm = StateManager(db)
projects = await sm.list_projects()
print(f"Available projects ({len(projects)}):")
for project in projects:
print(f"* {project.name} ({project.id})")
for branch in project.branches:
last_step = max(state.step_index for state in branch.states)
print(f" - {branch.name} ({branch.id}) - last step: {last_step}")
async def load_project(
sm: StateManager,
project_id: Optional[UUID] = None,
branch_id: Optional[UUID] = None,
step_index: Optional[int] = None,
) -> bool:
"""
Load a project from the database.
:param sm: State manager.
:param project_id: Project ID (optional, loads the last step in the main branch).
:param branch_id: Branch ID (optional, loads the last step in the branch).
:param step_index: Step index (optional, loads the state at the given step).
:return: True if the project was loaded successfully, False otherwise.
"""
step_txt = f" step {step_index}" if step_index else ""
if branch_id:
project_state = await sm.load_project(branch_id=branch_id, step_index=step_index)
if project_state:
return True
else:
print(f"Branch {branch_id}{step_txt} not found; use --list to list all projects", file=sys.stderr)
return False
elif project_id:
project_state = await sm.load_project(project_id=project_id, step_index=step_index)
if project_state:
return True
else:
print(f"Project {project_id}{step_txt} not found; use --list to list all projects", file=sys.stderr)
return False
return False
async def delete_project(sm: StateManager, project_id: UUID) -> bool:
"""
Delete a project from a database.
:param sm: State manager.
:param project_id: Project ID.
:return: True if project was deleted, False otherwise.
"""
return await sm.delete_project(project_id)
def show_config():
"""
Print the current configuration to stdout.
"""
cfg = get_config()
print(cfg.model_dump_json(indent=2))
def init() -> tuple[Optional[UIBase], Optional[SessionManager], Namespace]:
"""
Initialize the application.
Loads configuration, sets up logging and UI, initializes the database
and runs database migrations.
:return: Tuple with UI, db session manager, and command-line arguments.
"""
args = parse_arguments()
config = load_config(args)
if not config:
return (None, None, args)
setup(config.log, force=True)
if config.ui.type == UIAdapter.IPC_CLIENT:
ui = IPCClientUI(config.ui)
else:
ui = PlainConsoleUI()
run_migrations(config.db)
db = SessionManager(config.db)
return (ui, db, args)
__all__ = ["parse_arguments", "load_config", "list_projects_json", "list_projects", "load_project", "init"]

145
core/cli/main.py Normal file
View File

@@ -0,0 +1,145 @@
import sys
from argparse import Namespace
from asyncio import run
from core.agents.orchestrator import Orchestrator
from core.cli.helpers import delete_project, init, list_projects, list_projects_json, load_project, show_config
from core.db.session import SessionManager
from core.db.v0importer import LegacyDatabaseImporter
from core.llm.base import APIError
from core.log import get_logger
from core.state.state_manager import StateManager
from core.telemetry import telemetry
from core.ui.base import UIBase
log = get_logger(__name__)
async def run_project(sm: StateManager, ui: UIBase) -> bool:
"""
Work on the project.
Starts the orchestrator agent with the newly loaded/created project
and runs it until the orchestrator decides to exit.
:param sm: State manager.
:param ui: User interface.
:return: True if the orchestrator exited successfully, False otherwise.
"""
telemetry.start()
telemetry.set("app_id", str(sm.project.id))
telemetry.set("initial_prompt", sm.current_state.specification.description)
orca = Orchestrator(sm, ui)
success = False
try:
success = await orca.run()
except KeyboardInterrupt:
log.info("Interrupted by user")
telemetry.set("end_result", "interrupt")
await sm.rollback()
except APIError as err:
log.warning(f"LLM API error occurred: {err.message}")
await ui.send_message(f"LLM API error occurred: {err.message}")
await ui.send_message("Stopping Pythagora due to previous error.")
telemetry.set("end_result", "failure:api-error")
await sm.rollback()
except Exception as err:
telemetry.record_crash(err)
await sm.rollback()
log.error(f"Uncaught exception: {err}", exc_info=True)
await ui.send_message(f"Unrecoverable error occurred: {err}")
if success:
telemetry.set("end_result", "success:exit")
else:
# We assume unsuccessful exit (but not an exception) is a result
# of an API error that the user didn't retry.
telemetry.set("end_result", "failure:api-error")
await telemetry.send()
return success
async def start_new_project(sm: StateManager, ui: UIBase) -> bool:
"""
Start a new project.
:param sm: State manager.
:param ui: User interface.
:return: True if the project was created successfully, False otherwise.
"""
user_input = await ui.ask_question("What is the name of the project", allow_empty=False)
if user_input.cancelled:
return False
project_state = await sm.create_project(user_input.text)
return project_state is not None
async def async_main(
ui: UIBase,
db: SessionManager,
args: Namespace,
) -> bool:
"""
Main application coroutine.
:param ui: User interface.
:param db: Database session manager.
:param args: Command-line arguments.
:return: True if the application ran successfully, False otherwise.
"""
if args.list:
await list_projects(db)
return True
elif args.list_json:
await list_projects_json(db)
return True
if args.show_config:
show_config()
return True
elif args.import_v0:
importer = LegacyDatabaseImporter(db, args.import_v0)
await importer.import_database()
return True
telemetry.set("user_contact", args.email)
if args.extension_version:
telemetry.set("is_extension", True)
telemetry.set("extension_version", args.extension_version)
sm = StateManager(db, ui)
ui_started = await ui.start()
if not ui_started:
return False
if args.project or args.branch or args.step:
telemetry.set("is_continuation", True)
# FIXME: we should send the project stage and other runtime info to the UI
success = await load_project(sm, args.project, args.branch, args.step)
if not success:
return False
elif args.delete:
success = await delete_project(sm, args.delete)
return success
else:
success = await start_new_project(sm, ui)
if not success:
return False
return await run_project(sm, ui)
def run_pythagora():
ui, db, args = init()
if not ui or not db:
return -1
success = run(async_main(ui, db, args))
return 0 if success else -1
if __name__ == "__main__":
sys.exit(run_pythagora())

375
core/config/__init__.py Normal file
View File

@@ -0,0 +1,375 @@
from enum import Enum
from os.path import abspath, dirname, isdir, join
from typing import Literal, Optional, Union
from pydantic import BaseModel, ConfigDict, Field, field_validator
from typing_extensions import Annotated
ROOT_DIR = abspath(join(dirname(__file__), "..", ".."))
DEFAULT_IGNORE_PATHS = [
".git",
".gpt-pilot",
".idea",
".vscode",
".next",
".DS_Store",
"__pycache__",
"site-packages",
"node_modules",
"package-lock.json",
"venv",
"dist",
"build",
"target",
"*.min.js",
"*.min.css",
"*.svg",
"*.csv",
"*.log",
"go.sum",
]
IGNORE_SIZE_THRESHOLD = 50000  # files larger than 50 KB are ignored by default
# Agents with sane setup in the default configuration
DEFAULT_AGENT_NAME = "default"
DESCRIBE_FILES_AGENT_NAME = "CodeMonkey.describe_files"
class _StrictModel(BaseModel):
"""
Pydantic parser configuration options.
"""
model_config = ConfigDict(
extra="forbid",
)
class LLMProvider(str, Enum):
"""
Supported LLM providers.
"""
OPENAI = "openai"
ANTHROPIC = "anthropic"
GROQ = "groq"
LM_STUDIO = "lm-studio"
class UIAdapter(str, Enum):
"""
Supported UI adapters.
"""
PLAIN = "plain"
IPC_CLIENT = "ipc-client"
class ProviderConfig(_StrictModel):
"""
LLM provider configuration.
"""
base_url: Optional[str] = Field(
None,
description="Base URL for the provider's API (if different from the provider default)",
)
api_key: Optional[str] = Field(
None,
description="API key to use for authentication (if not set, provider uses default from environment variable)",
)
connect_timeout: float = Field(
default=60.0,
description="Timeout (in seconds) for connecting to the provider's API",
ge=0.0,
)
read_timeout: float = Field(
default=10.0,
description="Timeout (in seconds) for receiving a new chunk of data from the response stream",
ge=0.0,
)
class AgentLLMConfig(_StrictModel):
"""
Configuration for the various LLMs used by Pythagora.
Each agent uses an LLM provider from the LLMProvider enum. If no
agent-specific AgentLLMConfig is given, the default configuration is used.
"""
provider: LLMProvider = LLMProvider.OPENAI
model: str = Field(description="Model to use", default="gpt-4-turbo")
temperature: float = Field(
default=0.5,
description="Temperature to use for sampling",
ge=0.0,
le=1.0,
)
class LLMConfig(_StrictModel):
"""
Complete agent-specific configuration for an LLM.
"""
provider: LLMProvider = LLMProvider.OPENAI
model: str = Field(description="Model to use")
base_url: Optional[str] = Field(
None,
description="Base URL for the provider's API (if different from the provider default)",
)
api_key: Optional[str] = Field(
None,
description="API key to use for authentication (if not set, provider uses default from environment variable)",
)
temperature: float = Field(
default=0.5,
description="Temperature to use for sampling",
ge=0.0,
le=1.0,
)
connect_timeout: float = Field(
default=60.0,
description="Timeout (in seconds) for connecting to the provider's API",
ge=0.0,
)
read_timeout: float = Field(
default=10.0,
description="Timeout (in seconds) for receiving a new chunk of data from the response stream",
ge=0.0,
)
@classmethod
def from_provider_and_agent_configs(cls, provider: ProviderConfig, agent: AgentLLMConfig):
return cls(
provider=agent.provider,
model=agent.model,
base_url=provider.base_url,
api_key=provider.api_key,
temperature=agent.temperature,
connect_timeout=provider.connect_timeout,
read_timeout=provider.read_timeout,
)
class PromptConfig(_StrictModel):
"""
Configuration for prompt templates.
"""
paths: list[str] = Field(
[join(ROOT_DIR, "core", "prompts")],
description="List of directories to search for prompt templates",
)
@field_validator("paths")
@classmethod
def validate_paths(cls, v: list[str]) -> list[str]:
for path in v:
if not isdir(path):
raise ValueError(f"Invalid prompt path: {path}")
return v
class LogConfig(_StrictModel):
"""
Configuration for logging.
"""
level: str = Field(
"DEBUG",
description="Logging level",
pattern=r"^(DEBUG|INFO|WARNING|ERROR|CRITICAL)$",
)
format: str = Field(
"%(asctime)s %(levelname)s [%(name)s] %(message)s",
description="Logging format",
)
output: Optional[str] = Field(
"pythagora.log",
description="Output file for logs (if not specified, logs are printed to stderr)",
)
class DBConfig(_StrictModel):
"""
Configuration for database connections.
Supported URL schemes:
* sqlite+aiosqlite: SQLite database using the aiosqlite driver
"""
url: str = Field(
"sqlite+aiosqlite:///pythagora.db",
description="Database connection URL",
)
debug_sql: bool = Field(False, description="Log all SQL queries to the console")
@field_validator("url")
@classmethod
def validate_url_scheme(cls, v: str) -> str:
if v.startswith("sqlite+aiosqlite://"):
return v
raise ValueError(f"Unsupported database URL scheme in: {v}")
class PlainUIConfig(_StrictModel):
"""
Configuration for plaintext console UI.
"""
type: Literal[UIAdapter.PLAIN] = UIAdapter.PLAIN
class LocalIPCConfig(_StrictModel):
"""
Configuration for VSCode extension IPC client.
"""
type: Literal[UIAdapter.IPC_CLIENT] = UIAdapter.IPC_CLIENT
host: str = "localhost"
port: int = 8125
UIConfig = Annotated[
Union[PlainUIConfig, LocalIPCConfig],
Field(discriminator="type"),
]
class FileSystemType(str, Enum):
"""
Supported filesystem types.
"""
MEMORY = "memory"
LOCAL = "local"
class FileSystemConfig(_StrictModel):
"""
Configuration for project workspace.
"""
type: Literal[FileSystemType.LOCAL] = FileSystemType.LOCAL
workspace_root: str = Field(
join(ROOT_DIR, "workspace"),
description="Workspace directory containing all the projects",
)
ignore_paths: list[str] = Field(
DEFAULT_IGNORE_PATHS,
description="List of paths to ignore when scanning for files and folders",
)
ignore_size_threshold: int = Field(
IGNORE_SIZE_THRESHOLD,
description="Files larger than this size should be ignored",
)
class Config(_StrictModel):
"""
Pythagora Core configuration
"""
llm: dict[LLMProvider, ProviderConfig] = Field(default={LLMProvider.OPENAI: ProviderConfig()})
agent: dict[str, AgentLLMConfig] = Field(
default={
DEFAULT_AGENT_NAME: AgentLLMConfig(),
DESCRIBE_FILES_AGENT_NAME: AgentLLMConfig(model="gpt-3.5-turbo", temperature=0.0),
}
)
prompt: PromptConfig = PromptConfig()
log: LogConfig = LogConfig()
db: DBConfig = DBConfig()
ui: UIConfig = PlainUIConfig()
fs: FileSystemConfig = FileSystemConfig()
def llm_for_agent(self, agent_name: str = "default") -> LLMConfig:
"""
Fetch an LLM configuration for a given agent.
If the agent-specific configuration doesn't exist, returns the configuration
for the 'default' agent.
"""
agent_name = agent_name if agent_name in self.agent else "default"
agent_config = self.agent[agent_name]
provider_config = self.llm[agent_config.provider]
return LLMConfig.from_provider_and_agent_configs(provider_config, agent_config)
class ConfigLoader:
"""
Configuration loader takes care of loading and parsing configuration files.
The default loader is already initialized as `core.config.loader`. To
load the configuration from a file, use `core.config.loader.load(path)`.
To get the current configuration, use `core.config.get_config()`.
"""
config: Config
config_path: Optional[str]
def __init__(self):
self.config_path = None
self.config = Config()
@staticmethod
def _remove_json_comments(json_str: str) -> str:
"""
Remove comments from a JSON string.
Removes all lines that start with "//" from the JSON string.
:param json_str: JSON string with comments.
:return: JSON string without comments.
"""
return "\n".join([line for line in json_str.splitlines() if not line.strip().startswith("//")])
@classmethod
def from_json(cls, config: str) -> Config:
"""
Parse JSON into a Config object.
:param config: JSON string to parse.
:return: Config object.
"""
return Config.model_validate_json(cls._remove_json_comments(config), strict=True)
def load(self, path: str) -> Config:
"""
Load a configuration from a file.
:param path: Path to the configuration file.
:return: Config object.
"""
with open(path, "rb") as f:
raw_config = f.read()
if b"\x00" in raw_config:
encoding = "utf-16"
else:
encoding = "utf-8"
text_config = raw_config.decode(encoding)
self.config = self.from_json(text_config)
self.config_path = path
return self.config
loader = ConfigLoader()
def get_config() -> Config:
"""
Return current configuration.
:return: Current configuration object.
"""
return loader.config
__all__ = ["loader", "get_config"]

View File

@@ -0,0 +1,90 @@
from os.path import dirname, exists, join
from dotenv import dotenv_values
from core.config import Config, LLMProvider, ProviderConfig, loader
def import_from_dotenv(new_config_path: str) -> bool:
"""
Import configuration from an old gpt-pilot .env file and save it in the new format.
If the configuration is already loaded, or if the target file already exists,
nothing is done (the existing file is not overwritten).
Otherwise, loads the values from the `pilot/.env` file and creates a new
configuration with the relevant settings.
This intentionally DOES NOT load the .env variables into the current process
environment, to avoid polluting it with old settings.
:param new_config_path: Path to save the new configuration file.
:return: True if the configuration was imported, False otherwise.
"""
if loader.config_path or exists(new_config_path):
# Config already exists, nothing to do
return True
env_path = join(dirname(__file__), "..", "..", "pilot", ".env")
if not exists(env_path):
return False
values = dotenv_values(env_path)
if not values:
return False
config = convert_config(values)
with open(new_config_path, "w", encoding="utf-8") as fp:
fp.write(config.model_dump_json(indent=2))
return True
def convert_config(values: dict) -> Config:
config = Config()
for provider in LLMProvider:
endpoint = values.get(f"{provider.value.upper()}_ENDPOINT")
key = values.get(f"{provider.value.upper()}_API_KEY")
if provider == LLMProvider.OPENAI:
# The OpenAI provider entry is also used for Azure, OpenRouter, and local LLMs
if endpoint is None:
endpoint = values.get("AZURE_ENDPOINT")
if endpoint is None:
endpoint = values.get("OPENROUTER_ENDPOINT")
if key is None:
key = values.get("AZURE_API_KEY")
if key is None:
key = values.get("OPENROUTER_API_KEY")
if key and endpoint is None:
endpoint = "https://openrouter.ai/api/v1/chat/completions"
if (endpoint or key) and provider not in config.llm:
config.llm[provider] = ProviderConfig()
if endpoint:
endpoint = endpoint.replace("chat/completions", "")
config.llm[provider].base_url = endpoint
if key:
config.llm[provider].api_key = key
provider = "openai"
model = values.get("MODEL_NAME", "gpt-4-turbo")
if "/" in model:
provider, model = model.split("/", 1)
try:
agent_provider = LLMProvider(provider.lower())
except ValueError:
agent_provider = LLMProvider.OPENAI
config.agent["default"].model = model
config.agent["default"].provider = agent_provider
ignore_paths = [p for p in values.get("IGNORE_PATHS", "").split(",") if p]
if ignore_paths:
config.fs.ignore_paths += ignore_paths
return config
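As an illustration (hypothetical values), a legacy pilot/.env like:
OPENAI_API_KEY=sk-OLD-KEY
MODEL_NAME=anthropic/claude-3-opus
IGNORE_PATHS=tmp,cache
would set api_key on the openai provider entry, switch the default agent to provider "anthropic" with model "claude-3-opus", and append "tmp" and "cache" to fs.ignore_paths.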

View File

@@ -0,0 +1,94 @@
import sys
from os import getenv, makedirs
from pathlib import Path
from uuid import uuid4
from pydantic import BaseModel, Field, PrivateAttr
from core.log import get_logger
log = get_logger(__name__)
SETTINGS_APP_NAME = "GPT Pilot"
DEFAULT_TELEMETRY_ENDPOINT = "https://api.pythagora.io/telemetry"
class TelemetrySettings(BaseModel):
id: str = Field(default_factory=lambda: uuid4().hex, description="Unique telemetry ID")
enabled: bool = Field(True, description="Whether telemetry should send stats to the server")
endpoint: str = Field(DEFAULT_TELEMETRY_ENDPOINT, description="Telemetry server endpoint")
def resolve_config_dir() -> Path:
"""
Figure out where to store the global config file(s).
See the UserSettings docstring for details on how the config directory is
determined.
:return: Path to the desired config directory.
"""
posix_app_name = SETTINGS_APP_NAME.replace(" ", "-").lower()
xdg_config_home = getenv("XDG_CONFIG_HOME")
if xdg_config_home:
return Path(xdg_config_home) / Path(posix_app_name)
if sys.platform == "win32" and getenv("APPDATA"):
return Path(getenv("APPDATA")) / Path(SETTINGS_APP_NAME)
return Path("~").expanduser() / Path(f".{posix_app_name}")
class UserSettings(BaseModel):
"""
This object holds all the global user settings that are applicable to
all Pythagora/GPT-Pilot installations.
The user settings are stored in a JSON file in the config directory.
The config directory is determined by the following rules:
* If the XDG_CONFIG_HOME environment variable is set (desktop Linux), use that.
* If the APPDATA environment variable is set (Windows), use that.
* Otherwise, use the POSIX default ~/.<app-name> (macOS, server Linux).
This is a singleton object, use it by importing the instance directly
from the module:
>>> from config.user_settings import settings
>>> print(settings.telemetry.id)
>>> print(settings.config_path)
"""
telemetry: TelemetrySettings = TelemetrySettings()
_config_path: str = PrivateAttr("")
@staticmethod
def load():
config_path = resolve_config_dir() / "config.json"
if not config_path.exists():
default = UserSettings()
default._config_path = str(config_path)
default.save()
with open(config_path, "r", encoding="utf-8") as fp:
settings = UserSettings.model_validate_json(fp.read())
settings._config_path = str(config_path)
return settings
def save(self):
makedirs(Path(self._config_path).parent, exist_ok=True)
with open(self._config_path, "w", encoding="utf-8") as fp:
fp.write(self.model_dump_json(indent=2))
@property
def config_path(self):
return self._config_path
settings = UserSettings.load()
__all__ = ["settings"]

86
core/config/version.py Normal file
View File

@@ -0,0 +1,86 @@
import re
from os.path import abspath, basename, dirname, isdir, isfile, join
from typing import Optional
GIT_DIR_PATH = abspath(join(dirname(__file__), "..", "..", ".git"))
def get_git_commit() -> Optional[str]:
"""
Return the current git commit (if running from a repo).
:return: commit hash or None if not running from a git repo
"""
if not isdir(GIT_DIR_PATH):
return None
git_head = join(GIT_DIR_PATH, "HEAD")
if not isfile(git_head):
return None
with open(git_head, "r", encoding="utf-8") as f:
ref = f.read().strip()
# Direct reference to commit hash
if not ref.startswith("ref: "):
return ref
# Follow the reference
ref = ref[5:]
ref_path = join(GIT_DIR_PATH, ref)
# Dangling reference, return the reference name
if not isfile(ref_path):
return basename(ref_path)
# Return the reference commit hash
with open(ref_path, "r", encoding="utf-8") as f:
return f.read().strip()
def get_package_version() -> str:
"""
Get the package version as defined in pyproject.toml.
If not found, returns "0.0.0".
:return: package version as defined in pyproject.toml
"""
UNKNOWN = "0.0.0"
PYPOETRY_VERSION_PATTERN = re.compile(r'^\s*version\s*=\s*"(.*)"\s*(#.*)?$')
pyproject_path = join(dirname(__file__), "..", "..", "pyproject.toml")
if not isfile(pyproject_path):
return UNKNOWN
with open(pyproject_path, "r", encoding="utf-8") as fp:
for line in fp:
m = PYPOETRY_VERSION_PATTERN.match(line)
if m:
return m.group(1)
return UNKNOWN
def get_version() -> str:
"""
Find and return the current version of Pythagora Core.
The version string is built from the package version and the current
git commit hash (if running from a git repo).
Example: 0.0.0-gitbf01c19
:return: version string
"""
version = get_package_version()
commit = get_git_commit()
if commit:
version = version + "-git" + commit[:7]
return version
__all__ = ["get_version"]

0
core/db/__init__.py Normal file
View File

116
core/db/alembic.ini Normal file
View File

@@ -0,0 +1,116 @@
# A generic, single database configuration.
[alembic]
# path to migration scripts
script_location = core/db/migrations
# template used to generate migration file names; The default value is %%(rev)s_%%(slug)s
# Uncomment the line below if you want the files to be prepended with date and time
# see https://alembic.sqlalchemy.org/en/latest/tutorial.html#editing-the-ini-file
# for all available tokens
# file_template = %%(year)d_%%(month).2d_%%(day).2d_%%(hour).2d%%(minute).2d-%%(rev)s_%%(slug)s
# sys.path path, will be prepended to sys.path if present.
# defaults to the current working directory.
prepend_sys_path = .
# timezone to use when rendering the date within the migration file
# as well as the filename.
# If specified, requires the python>=3.9 or backports.zoneinfo library.
# Any required deps can installed by adding `alembic[tz]` to the pip requirements
# string value is passed to ZoneInfo()
# leave blank for localtime
# timezone =
# max length of characters to apply to the
# "slug" field
# truncate_slug_length = 40
# set to 'true' to run the environment during
# the 'revision' command, regardless of autogenerate
# revision_environment = false
# set to 'true' to allow .pyc and .pyo files without
# a source .py file to be detected as revisions in the
# versions/ directory
# sourceless = false
# version location specification; This defaults
# to migrations/versions. When using multiple version
# directories, initial revisions must be specified with --version-path.
# The path separator used here should be the separator specified by "version_path_separator" below.
version_locations = core/db/migrations/versions
# version path separator; As mentioned above, this is the character used to split
# version_locations. The default within new alembic.ini files is "os", which uses os.pathsep.
# If this key is omitted entirely, it falls back to the legacy behavior of splitting on spaces and/or commas.
# Valid values for version_path_separator are:
#
# version_path_separator = :
# version_path_separator = ;
# version_path_separator = space
# Use os.pathsep. Default configuration used for new projects.
version_path_separator = os
# set to 'true' to search source files recursively
# in each "version_locations" directory
# new in Alembic version 1.10
# recursive_version_locations = false
# the output encoding used when revision files
# are written from script.py.mako
# output_encoding = utf-8
sqlalchemy.url = sqlite:///pythagora.db
[post_write_hooks]
# post_write_hooks defines scripts or Python functions that are run
# on newly generated revision scripts. See the documentation for further
# detail and examples
# format using "black" - use the console_scripts runner, against the "black" entrypoint
# hooks = black
# black.type = console_scripts
# black.entrypoint = black
# black.options = -l 79 REVISION_SCRIPT_FILENAME
# lint with attempts to fix using "ruff" - use the exec runner, execute a binary
hooks = ruff
ruff.type = exec
ruff.executable = ruff
ruff.options = check --fix REVISION_SCRIPT_FILENAME
# Logging configuration
[loggers]
keys = root,sqlalchemy,alembic
[handlers]
keys = console
[formatters]
keys = generic
[logger_root]
level = WARN
handlers = console
qualname =
[logger_sqlalchemy]
level = WARN
handlers =
qualname = sqlalchemy.engine
[logger_alembic]
level = INFO
handlers =
qualname = alembic
[handler_console]
class = StreamHandler
args = (sys.stderr,)
level = NOTSET
formatter = generic
[formatter_generic]
format = %(levelname)-5.5s [%(name)s] %(message)s
datefmt = %H:%M:%S

1
core/db/migrations/README Normal file
View File

@@ -0,0 +1 @@
Generic single-database configuration.

83
core/db/migrations/env.py Normal file
View File

@@ -0,0 +1,83 @@
from logging.config import fileConfig
from alembic import context
from sqlalchemy import engine_from_config, pool
from core.db.models import Base
# this is the Alembic Config object, which provides
# access to the values within the .ini file in use.
config = context.config
# Interpret the config file for Python logging.
# This line sets up loggers basically.
if config.config_file_name is not None and not config.get_main_option("pythagora_runtime"):
fileConfig(config.config_file_name)
# Set database URL from environment
# config.set_main_option("sqlalchemy.url", getenv("DATABASE_URL"))
# add your model's MetaData object here
# for 'autogenerate' support
target_metadata = Base.metadata
# other values from the config, defined by the needs of env.py,
# can be acquired:
# my_important_option = config.get_main_option("my_important_option")
# ... etc.
def run_migrations_offline() -> None:
"""Run migrations in 'offline' mode.
This configures the context with just a URL
and not an Engine, though an Engine is acceptable
here as well. By skipping the Engine creation
we don't even need a DBAPI to be available.
Calls to context.execute() here emit the given string to the
script output.
"""
url = config.get_main_option("sqlalchemy.url")
context.configure(
url=url,
target_metadata=target_metadata,
literal_binds=True,
dialect_opts={"paramstyle": "named"},
render_as_batch="sqlite://" in url,
)
with context.begin_transaction():
context.run_migrations()
def run_migrations_online() -> None:
"""Run migrations in 'online' mode.
In this scenario we need to create an Engine
and associate a connection with the context.
"""
url = config.get_main_option("sqlalchemy.url")
connectable = engine_from_config(
config.get_section(config.config_ini_section, {}),
prefix="sqlalchemy.",
poolclass=pool.NullPool,
)
with connectable.connect() as connection:
context.configure(
connection=connection,
target_metadata=target_metadata,
render_as_batch="sqlite://" in url,
)
with context.begin_transaction():
context.run_migrations()
if context.is_offline_mode():
run_migrations_offline()
else:
run_migrations_online()

26
core/db/migrations/script.py.mako Normal file
View File

@@ -0,0 +1,26 @@
"""${message}
Revision ID: ${up_revision}
Revises: ${down_revision | comma,n}
Create Date: ${create_date}
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
${imports if imports else ""}
# revision identifiers, used by Alembic.
revision: str = ${repr(up_revision)}
down_revision: Union[str, None] = ${repr(down_revision)}
branch_labels: Union[str, Sequence[str], None] = ${repr(branch_labels)}
depends_on: Union[str, Sequence[str], None] = ${repr(depends_on)}
def upgrade() -> None:
${upgrades if upgrades else "pass"}
def downgrade() -> None:
${downgrades if downgrades else "pass"}

View File

@@ -0,0 +1,34 @@
"""added complexity to specification
Revision ID: 4f79e6952354
Revises: 5b04ea6afce5
Create Date: 2024-05-16 18:01:49.024811
"""
from typing import Sequence, Union
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "4f79e6952354"
down_revision: Union[str, None] = "5b04ea6afce5"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
with op.batch_alter_table("specifications", schema=None) as batch_op:
batch_op.add_column(sa.Column("complexity", sa.String(), server_default="hard", nullable=False))
# ### end Alembic commands ###
def downgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
with op.batch_alter_table("specifications", schema=None) as batch_op:
batch_op.drop_column("complexity")
# ### end Alembic commands ###

View File

@@ -0,0 +1,34 @@
"""add agent info to llm request log
Revision ID: 5b04ea6afce5
Revises: fd206d3095d0
Create Date: 2024-05-12 11:07:40.271217
"""
from typing import Sequence, Union
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "5b04ea6afce5"
down_revision: Union[str, None] = "fd206d3095d0"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
with op.batch_alter_table("llm_requests", schema=None) as batch_op:
batch_op.add_column(sa.Column("agent", sa.String(), nullable=True))
# ### end Alembic commands ###
def downgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
with op.batch_alter_table("llm_requests", schema=None) as batch_op:
batch_op.drop_column("agent")
# ### end Alembic commands ###

View File

@@ -0,0 +1,120 @@
"""initial
Revision ID: e7b54beadf8f
Revises:
Create Date: 2024-05-06 09:38:05.391674
"""
from typing import Sequence, Union
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "e7b54beadf8f"
down_revision: Union[str, None] = None
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
op.create_table(
"file_contents",
sa.Column("id", sa.String(), nullable=False),
sa.Column("content", sa.String(), nullable=False),
sa.PrimaryKeyConstraint("id", name=op.f("pk_file_contents")),
)
op.create_table(
"projects",
sa.Column("id", sa.Uuid(), nullable=False),
sa.Column("name", sa.String(), nullable=False),
sa.Column("created_at", sa.DateTime(), server_default=sa.text("(CURRENT_TIMESTAMP)"), nullable=False),
sa.Column("folder_name", sa.String(), nullable=False),
sa.PrimaryKeyConstraint("id", name=op.f("pk_projects")),
)
op.create_table(
"specifications",
sa.Column("id", sa.Integer(), autoincrement=True, nullable=False),
sa.Column("description", sa.String(), nullable=False),
sa.Column("architecture", sa.String(), nullable=False),
sa.Column("system_dependencies", sa.JSON(), nullable=False),
sa.Column("package_dependencies", sa.JSON(), nullable=False),
sa.Column("template", sa.String(), nullable=True),
sa.PrimaryKeyConstraint("id", name=op.f("pk_specifications")),
)
op.create_table(
"branches",
sa.Column("id", sa.Uuid(), nullable=False),
sa.Column("project_id", sa.Uuid(), nullable=False),
sa.Column("created_at", sa.DateTime(), server_default=sa.text("(CURRENT_TIMESTAMP)"), nullable=False),
sa.Column("name", sa.String(), nullable=False),
sa.ForeignKeyConstraint(
["project_id"], ["projects.id"], name=op.f("fk_branches_project_id_projects"), ondelete="CASCADE"
),
sa.PrimaryKeyConstraint("id", name=op.f("pk_branches")),
)
op.create_table(
"project_states",
sa.Column("id", sa.Uuid(), nullable=False),
sa.Column("branch_id", sa.Uuid(), nullable=False),
sa.Column("prev_state_id", sa.Uuid(), nullable=True),
sa.Column("specification_id", sa.Integer(), nullable=False),
sa.Column("created_at", sa.DateTime(), server_default=sa.text("(CURRENT_TIMESTAMP)"), nullable=False),
sa.Column("step_index", sa.Integer(), server_default="1", nullable=False),
sa.Column("epics", sa.JSON(), nullable=False),
sa.Column("tasks", sa.JSON(), nullable=False),
sa.Column("steps", sa.JSON(), nullable=False),
sa.Column("iterations", sa.JSON(), nullable=False),
sa.Column("relevant_files", sa.JSON(), nullable=False),
sa.Column("modified_files", sa.JSON(), nullable=False),
sa.Column("run_command", sa.String(), nullable=True),
sa.ForeignKeyConstraint(
["branch_id"], ["branches.id"], name=op.f("fk_project_states_branch_id_branches"), ondelete="CASCADE"
),
sa.ForeignKeyConstraint(
["prev_state_id"],
["project_states.id"],
name=op.f("fk_project_states_prev_state_id_project_states"),
ondelete="CASCADE",
),
sa.ForeignKeyConstraint(
["specification_id"], ["specifications.id"], name=op.f("fk_project_states_specification_id_specifications")
),
sa.PrimaryKeyConstraint("id", name=op.f("pk_project_states")),
sa.UniqueConstraint("branch_id", "step_index", name=op.f("uq_project_states_branch_id")),
sa.UniqueConstraint("prev_state_id", name=op.f("uq_project_states_prev_state_id")),
sqlite_autoincrement=True,
)
op.create_table(
"files",
sa.Column("id", sa.Integer(), nullable=False),
sa.Column("project_state_id", sa.Uuid(), nullable=False),
sa.Column("content_id", sa.String(), nullable=False),
sa.Column("path", sa.String(), nullable=False),
sa.Column("meta", sa.JSON(), server_default="{}", nullable=False),
sa.ForeignKeyConstraint(
["content_id"], ["file_contents.id"], name=op.f("fk_files_content_id_file_contents"), ondelete="RESTRICT"
),
sa.ForeignKeyConstraint(
["project_state_id"],
["project_states.id"],
name=op.f("fk_files_project_state_id_project_states"),
ondelete="CASCADE",
),
sa.PrimaryKeyConstraint("id", name=op.f("pk_files")),
sa.UniqueConstraint("project_state_id", "path", name=op.f("uq_files_project_state_id")),
)
# ### end Alembic commands ###
def downgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
op.drop_table("files")
op.drop_table("project_states")
op.drop_table("branches")
op.drop_table("specifications")
op.drop_table("projects")
op.drop_table("file_contents")
# ### end Alembic commands ###

View File

@@ -0,0 +1,106 @@
"""store request input exec logs to db
Revision ID: fd206d3095d0
Revises: e7b54beadf8f
Create Date: 2024-05-09 08:25:10.698607
"""
from typing import Sequence, Union
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "fd206d3095d0"
down_revision: Union[str, None] = "e7b54beadf8f"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
op.create_table(
"exec_logs",
sa.Column("id", sa.Integer(), autoincrement=True, nullable=False),
sa.Column("branch_id", sa.Uuid(), nullable=False),
sa.Column("project_state_id", sa.Uuid(), nullable=True),
sa.Column("started_at", sa.DateTime(), nullable=False),
sa.Column("duration", sa.Float(), nullable=False),
sa.Column("cmd", sa.String(), nullable=False),
sa.Column("cwd", sa.String(), nullable=False),
sa.Column("env", sa.JSON(), nullable=False),
sa.Column("timeout", sa.Float(), nullable=True),
sa.Column("status_code", sa.Integer(), nullable=True),
sa.Column("stdout", sa.String(), nullable=False),
sa.Column("stderr", sa.String(), nullable=False),
sa.Column("analysis", sa.String(), nullable=False),
sa.Column("success", sa.Boolean(), nullable=False),
sa.ForeignKeyConstraint(
["branch_id"], ["branches.id"], name=op.f("fk_exec_logs_branch_id_branches"), ondelete="CASCADE"
),
sa.ForeignKeyConstraint(
["project_state_id"],
["project_states.id"],
name=op.f("fk_exec_logs_project_state_id_project_states"),
ondelete="SET NULL",
),
sa.PrimaryKeyConstraint("id", name=op.f("pk_exec_logs")),
)
op.create_table(
"llm_requests",
sa.Column("id", sa.Integer(), autoincrement=True, nullable=False),
sa.Column("branch_id", sa.Uuid(), nullable=False),
sa.Column("project_state_id", sa.Uuid(), nullable=True),
sa.Column("started_at", sa.DateTime(), server_default=sa.text("(CURRENT_TIMESTAMP)"), nullable=False),
sa.Column("provider", sa.String(), nullable=False),
sa.Column("model", sa.String(), nullable=False),
sa.Column("temperature", sa.Float(), nullable=False),
sa.Column("messages", sa.JSON(), nullable=False),
sa.Column("response", sa.String(), nullable=True),
sa.Column("prompt_tokens", sa.Integer(), nullable=False),
sa.Column("completion_tokens", sa.Integer(), nullable=False),
sa.Column("duration", sa.Float(), nullable=False),
sa.Column("status", sa.String(), nullable=False),
sa.Column("error", sa.String(), nullable=True),
sa.ForeignKeyConstraint(
["branch_id"], ["branches.id"], name=op.f("fk_llm_requests_branch_id_branches"), ondelete="CASCADE"
),
sa.ForeignKeyConstraint(
["project_state_id"],
["project_states.id"],
name=op.f("fk_llm_requests_project_state_id_project_states"),
ondelete="SET NULL",
),
sa.PrimaryKeyConstraint("id", name=op.f("pk_llm_requests")),
)
op.create_table(
"user_inputs",
sa.Column("id", sa.Integer(), autoincrement=True, nullable=False),
sa.Column("branch_id", sa.Uuid(), nullable=False),
sa.Column("project_state_id", sa.Uuid(), nullable=True),
sa.Column("created_at", sa.DateTime(), server_default=sa.text("(CURRENT_TIMESTAMP)"), nullable=False),
sa.Column("question", sa.String(), nullable=False),
sa.Column("answer_text", sa.String(), nullable=True),
sa.Column("answer_button", sa.String(), nullable=True),
sa.Column("cancelled", sa.Boolean(), nullable=False),
sa.ForeignKeyConstraint(
["branch_id"], ["branches.id"], name=op.f("fk_user_inputs_branch_id_branches"), ondelete="CASCADE"
),
sa.ForeignKeyConstraint(
["project_state_id"],
["project_states.id"],
name=op.f("fk_user_inputs_project_state_id_project_states"),
ondelete="SET NULL",
),
sa.PrimaryKeyConstraint("id", name=op.f("pk_user_inputs")),
)
# ### end Alembic commands ###
def downgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
op.drop_table("user_inputs")
op.drop_table("llm_requests")
op.drop_table("exec_logs")
# ### end Alembic commands ###

29
core/db/models/__init__.py Normal file
View File

@@ -0,0 +1,29 @@
# Pythagora database models
#
# Always import models from this module to ensure the SQLAlchemy registry
# is correctly populated.
from .base import Base
from .branch import Branch
from .exec_log import ExecLog
from .file import File
from .file_content import FileContent
from .llm_request import LLMRequest
from .project import Project
from .project_state import ProjectState
from .specification import Complexity, Specification
from .user_input import UserInput
__all__ = [
"Base",
"Branch",
"Complexity",
"ExecLog",
"File",
"FileContent",
"LLMRequest",
"Project",
"ProjectState",
"Specification",
"UserInput",
]

45
core/db/models/base.py Normal file
View File

@@ -0,0 +1,45 @@
# DeclarativeBase enables declarative configuration of
# database models within SQLAlchemy.
#
# It also sets up a registry for the classes that inherit from it,
# so that SQLAlchemy understands how they map to database tables.
from sqlalchemy import MetaData
from sqlalchemy.ext.asyncio import AsyncAttrs
from sqlalchemy.orm import DeclarativeBase
from sqlalchemy.types import JSON
class Base(AsyncAttrs, DeclarativeBase):
"""Base class for all SQL database models."""
# Mapping of Python types to SQLAlchemy types.
type_annotation_map = {
list[dict]: JSON,
list[str]: JSON,
dict: JSON,
}
metadata = MetaData(
# Naming conventions for constraints, foreign keys, etc.
naming_convention={
"ix": "ix_%(column_0_label)s",
"uq": "uq_%(table_name)s_%(column_0_name)s",
"ck": "ck_%(table_name)s_`%(constraint_name)s`",
"fk": "fk_%(table_name)s_%(column_0_name)s_%(referred_table_name)s",
"pk": "pk_%(table_name)s",
}
)
def __eq__(self, other) -> bool:
"""
Two instances of the same model class are the same if their
IDs are the same.
This allows comparison of models bound to different sessions.
"""
return isinstance(other, self.__class__) and self.id == other.id
def __repr__(self) -> str:
"""Return a string representation of the model."""
return f"<{self.__class__.__name__}(id={self.id})>"

89
core/db/models/branch.py Normal file
View File

@@ -0,0 +1,89 @@
from datetime import datetime
from typing import TYPE_CHECKING, Optional, Union
from uuid import UUID, uuid4
from sqlalchemy import ForeignKey, inspect, select
from sqlalchemy.orm import Mapped, mapped_column, relationship
from sqlalchemy.sql import func
from core.db.models import Base
if TYPE_CHECKING:
from sqlalchemy.ext.asyncio import AsyncSession
from core.db.models import ExecLog, LLMRequest, Project, ProjectState, UserInput
class Branch(Base):
__tablename__ = "branches"
DEFAULT = "main"
# ID and parent FKs
id: Mapped[UUID] = mapped_column(primary_key=True, default=uuid4)
project_id: Mapped[UUID] = mapped_column(ForeignKey("projects.id", ondelete="CASCADE"))
# Attributes
created_at: Mapped[datetime] = mapped_column(server_default=func.now())
name: Mapped[str] = mapped_column(default=DEFAULT)
# Relationships
project: Mapped["Project"] = relationship(back_populates="branches", lazy="selectin")
states: Mapped[list["ProjectState"]] = relationship(back_populates="branch", cascade="all")
llm_requests: Mapped[list["LLMRequest"]] = relationship(back_populates="branch", cascade="all")
user_inputs: Mapped[list["UserInput"]] = relationship(back_populates="branch", cascade="all")
exec_logs: Mapped[list["ExecLog"]] = relationship(back_populates="branch", cascade="all")
@staticmethod
async def get_by_id(session: "AsyncSession", branch_id: Union[str, UUID]) -> Optional["Branch"]:
"""
Get a branch by ID.
:param session: The SQLAlchemy session.
:param branch_id: The branch ID (as str or UUID value).
:return: The Branch object if found, None otherwise.
"""
if not isinstance(branch_id, UUID):
branch_id = UUID(branch_id)
result = await session.execute(select(Branch).where(Branch.id == branch_id))
return result.scalar_one_or_none()
async def get_last_state(self) -> Optional["ProjectState"]:
"""
Get the last project state of the branch.
:return: The last project state of the branch, or None if the branch has no states.
"""
from core.db.models import ProjectState
session = inspect(self).async_session
if session is None:
raise ValueError("Branch instance not associated with a DB session.")
result = await session.execute(
select(ProjectState)
.where(ProjectState.branch_id == self.id)
.order_by(ProjectState.step_index.desc())
.limit(1)
)
return result.scalar_one_or_none()
async def get_state_at_step(self, step_index: int) -> Optional["ProjectState"]:
"""
Get the project state at the given step index for the branch.
:return: The project state at the given step index, or None if there's no such step.
"""
from core.db.models import ProjectState
session = inspect(self).async_session
if session is None:
raise ValueError("Branch instance not associated with a DB session.")
result = await session.execute(
select(ProjectState).where((ProjectState.branch_id == self.id) & (ProjectState.step_index == step_index))
)
return result.scalar_one_or_none()
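A hedged usage sketch (assumes an open AsyncSession bound to these models; the ID is hypothetical):
branch = await Branch.get_by_id(session, "1b9d6bcd-bbfd-4b2d-9b5d-ab8dfbbd4bed")
if branch is not None:
    latest = await branch.get_last_state()
    fifth = await branch.get_state_at_step(5)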

71
core/db/models/exec_log.py Normal file
View File

@@ -0,0 +1,71 @@
from datetime import datetime
from typing import TYPE_CHECKING, Optional
from uuid import UUID
from sqlalchemy import ForeignKey, inspect
from sqlalchemy.orm import Mapped, mapped_column, relationship
from core.db.models import Base
from core.proc.exec_log import ExecLog as ExecLogData
if TYPE_CHECKING:
from core.db.models import Branch, ProjectState
class ExecLog(Base):
__tablename__ = "exec_logs"
# ID and parent FKs
id: Mapped[int] = mapped_column(primary_key=True, autoincrement=True)
branch_id: Mapped[UUID] = mapped_column(ForeignKey("branches.id", ondelete="CASCADE"))
project_state_id: Mapped[Optional[UUID]] = mapped_column(ForeignKey("project_states.id", ondelete="SET NULL"))
# Attributes
started_at: Mapped[datetime] = mapped_column()
duration: Mapped[float] = mapped_column()
cmd: Mapped[str] = mapped_column()
cwd: Mapped[str] = mapped_column()
env: Mapped[dict] = mapped_column()
timeout: Mapped[Optional[float]] = mapped_column()
status_code: Mapped[Optional[int]] = mapped_column()
stdout: Mapped[str] = mapped_column()
stderr: Mapped[str] = mapped_column()
analysis: Mapped[str] = mapped_column()
success: Mapped[bool] = mapped_column()
# Relationships
branch: Mapped["Branch"] = relationship(back_populates="exec_logs")
project_state: Mapped["ProjectState"] = relationship(back_populates="exec_logs")
@classmethod
def from_exec_log(cls, project_state: "ProjectState", exec_log: ExecLogData) -> "ExecLog":
"""
Store the execution log in the database.
Note this just creates the ExecLog object. It is committed to the
database only when the DB session itself is committed.
:param project_state: Project state to associate the execution log with.
:param exec_log: Execution log data.
:return: Newly created ExecLog object.
"""
session = inspect(project_state).async_session
obj = cls(
project_state=project_state,
branch=project_state.branch,
started_at=exec_log.started_at,
duration=exec_log.duration,
cmd=exec_log.cmd,
cwd=exec_log.cwd,
env=exec_log.env,
timeout=exec_log.timeout,
status_code=exec_log.status_code,
stdout=exec_log.stdout,
stderr=exec_log.stderr,
analysis=exec_log.analysis,
success=exec_log.success,
)
session.add(obj)
return obj
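A hedged sketch of recording a command run; the ExecLogData field names follow the mapping above, but its exact constructor signature is an assumption:
from datetime import datetime
from core.proc.exec_log import ExecLog as ExecLogData

data = ExecLogData(
    started_at=datetime.utcnow(), duration=1.2,
    cmd="npm test", cwd="/workspace/app", env={}, timeout=None,
    status_code=0, stdout="ok", stderr="", analysis="", success=True,
)
ExecLog.from_exec_log(current_state, data)  # persisted when the session commits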

43
core/db/models/file.py Normal file
View File

@@ -0,0 +1,43 @@
from typing import TYPE_CHECKING, Optional
from uuid import UUID
from sqlalchemy import ForeignKey, UniqueConstraint
from sqlalchemy.orm import Mapped, mapped_column, relationship
from core.db.models import Base
if TYPE_CHECKING:
from core.db.models import FileContent, ProjectState
class File(Base):
__tablename__ = "files"
__table_args__ = (UniqueConstraint("project_state_id", "path"),)
# ID and parent FKs
id: Mapped[int] = mapped_column(primary_key=True)
project_state_id: Mapped[UUID] = mapped_column(ForeignKey("project_states.id", ondelete="CASCADE"))
content_id: Mapped[str] = mapped_column(ForeignKey("file_contents.id", ondelete="RESTRICT"))
# Attributes
path: Mapped[str] = mapped_column()
meta: Mapped[dict] = mapped_column(default=dict, server_default="{}")
# Relationships
project_state: Mapped[Optional["ProjectState"]] = relationship(back_populates="files")
content: Mapped["FileContent"] = relationship(back_populates="files", lazy="selectin")
def clone(self) -> "File":
"""
Clone the file object, to be used in a new project state.
The clone references the same file content object as the original.
:return: The cloned file object.
"""
return File(
project_state=None,
content_id=self.content_id,
path=self.path,
meta=self.meta,
)

47
core/db/models/file_content.py Normal file
View File

@@ -0,0 +1,47 @@
from typing import TYPE_CHECKING
from sqlalchemy import select
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy.orm import Mapped, mapped_column, relationship
from core.db.models import Base
if TYPE_CHECKING:
from core.db.models import File
class FileContent(Base):
__tablename__ = "file_contents"
# ID and parent FKs
id: Mapped[str] = mapped_column(primary_key=True)
# Attributes
content: Mapped[str] = mapped_column()
# Relationships
files: Mapped[list["File"]] = relationship(back_populates="content")
@classmethod
async def store(cls, session: AsyncSession, hash: str, content: str) -> "FileContent":
"""
Store the file content in the database.
If the content is already stored, returns the reference to the existing
content object. Otherwise stores it to the database and returns the newly
created content object.
:param session: The database session.
:param hash: The hash of the file content, used as a unique ID.
:param content: The file content as unicode string.
:return: The file content object.
"""
result = await session.execute(select(FileContent).where(FileContent.id == hash))
fc = result.scalar_one_or_none()
if fc is not None:
return fc
fc = cls(id=hash, content=content)
session.add(fc)
return fc
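The hash is computed by the caller and acts as the content address; the hashing scheme isn't defined in this file, so the SHA-256 below is an assumption for illustration:
import hashlib

content = "print('hello')\n"
digest = hashlib.sha256(content.encode("utf-8")).hexdigest()  # assumed scheme
fc = await FileContent.store(session, digest, content)  # dedupes identical content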

79
core/db/models/llm_request.py Normal file
View File

@@ -0,0 +1,79 @@
from datetime import datetime
from typing import TYPE_CHECKING, Optional
from uuid import UUID
from sqlalchemy import ForeignKey, inspect
from sqlalchemy.orm import Mapped, mapped_column, relationship
from sqlalchemy.sql import func
from core.db.models import Base
from core.llm.request_log import LLMRequestLog
if TYPE_CHECKING:
from core.agents.base import BaseAgent
from core.db.models import Branch, ProjectState
class LLMRequest(Base):
__tablename__ = "llm_requests"
# ID and parent FKs
id: Mapped[int] = mapped_column(primary_key=True, autoincrement=True)
branch_id: Mapped[UUID] = mapped_column(ForeignKey("branches.id", ondelete="CASCADE"))
project_state_id: Mapped[Optional[UUID]] = mapped_column(ForeignKey("project_states.id", ondelete="SET NULL"))
# Attributes
started_at: Mapped[datetime] = mapped_column(server_default=func.now())
agent: Mapped[Optional[str]] = mapped_column()
provider: Mapped[str] = mapped_column()
model: Mapped[str] = mapped_column()
temperature: Mapped[float] = mapped_column()
messages: Mapped[list[dict]] = mapped_column()
response: Mapped[Optional[str]] = mapped_column()
prompt_tokens: Mapped[int] = mapped_column()
completion_tokens: Mapped[int] = mapped_column()
duration: Mapped[float] = mapped_column()
status: Mapped[str] = mapped_column()
error: Mapped[Optional[str]] = mapped_column()
# Relationships
branch: Mapped["Branch"] = relationship(back_populates="llm_requests")
project_state: Mapped["ProjectState"] = relationship(back_populates="llm_requests")
@classmethod
def from_request_log(
cls,
project_state: "ProjectState",
agent: Optional["BaseAgent"],
request_log: LLMRequestLog,
) -> "LLMRequest":
"""
Store the request log in the database.
Note this just creates the request log object. It is committed to the
database only when the DB session itself is committed.
:param project_state: Project state to associate the request log with.
:param agent: Agent that made the request (if the caller was an agent).
:param request_log: Request log.
:return: Newly created LLM request log in the database.
"""
session = inspect(project_state).async_session
obj = cls(
project_state=project_state,
branch=project_state.branch,
agent=agent.agent_type if agent else None,  # agent may be None for non-agent callers
provider=request_log.provider,
model=request_log.model,
temperature=request_log.temperature,
messages=request_log.messages,
response=request_log.response,
prompt_tokens=request_log.prompt_tokens,
completion_tokens=request_log.completion_tokens,
duration=request_log.duration,
status=request_log.status,
error=request_log.error,
)
session.add(obj)
return obj

124
core/db/models/project.py Normal file
View File

@@ -0,0 +1,124 @@
import re
from datetime import datetime
from typing import TYPE_CHECKING, Optional, Union
from unicodedata import normalize
from uuid import UUID, uuid4
from sqlalchemy import delete, inspect, select
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy.orm import Mapped, mapped_column, relationship, selectinload
from sqlalchemy.sql import func
from core.db.models import Base
if TYPE_CHECKING:
from core.db.models import Branch
class Project(Base):
__tablename__ = "projects"
# ID and parent FKs
id: Mapped[UUID] = mapped_column(primary_key=True, default=uuid4)
# Attributes
name: Mapped[str] = mapped_column()
created_at: Mapped[datetime] = mapped_column(server_default=func.now())
folder_name: Mapped[str] = mapped_column(
default=lambda context: Project.get_folder_from_project_name(context.get_current_parameters()["name"])
)
# Relationships
branches: Mapped[list["Branch"]] = relationship(back_populates="project", cascade="all")
@staticmethod
async def get_by_id(session: "AsyncSession", project_id: Union[str, UUID]) -> Optional["Project"]:
"""
Get a project by ID.
:param session: The SQLAlchemy session.
:param project_id: The project ID (as str or UUID value).
:return: The Project object if found, None otherwise.
"""
if not isinstance(project_id, UUID):
project_id = UUID(project_id)
result = await session.execute(select(Project).where(Project.id == project_id))
return result.scalar_one_or_none()
async def get_branch(self, name: Optional[str] = None) -> Optional["Branch"]:
"""
Get a project branch by name.
:param name: The name of the branch (defaults to "main").
:return: The Branch object if found, None otherwise.
"""
from core.db.models import Branch
session = inspect(self).async_session
if session is None:
raise ValueError("Project instance not associated with a DB session.")
if name is None:
name = Branch.DEFAULT
result = await session.execute(select(Branch).where(Branch.project_id == self.id, Branch.name == name))
return result.scalar_one_or_none()
@staticmethod
async def get_all_projects(session: "AsyncSession") -> list["Project"]:
"""
Get all projects.
This assumes the projects have at least one branch and one state.
:param session: The SQLAlchemy session.
:return: List of Project objects.
"""
from core.db.models import Branch, ProjectState
latest_state_query = (
select(ProjectState.branch_id, func.max(ProjectState.id).label("max_id"))
.group_by(ProjectState.branch_id)
.subquery()
)
query = (
select(Project, Branch, ProjectState)
.join(Branch, Project.branches)
.join(ProjectState, Branch.states)
.join(latest_state_query, ProjectState.id == latest_state_query.columns.max_id)
.options(selectinload(Project.branches), selectinload(Branch.states))
.order_by(Project.name, Branch.name)
)
results = await session.execute(query)
return results.scalars().all()
@staticmethod
def get_folder_from_project_name(name: str):
"""
Get the folder name from the project name.
:param name: Project name.
:return: Folder name.
"""
# replace accented unicode characters with their base characters (e.g. "šašavi" → "sasavi")
name = normalize("NFKD", name).encode("ascii", "ignore").decode("utf-8")
# replace spaces/interpunction with a single dash
return re.sub(r"[^a-zA-Z0-9]+", "-", name).lower().strip("-")
@staticmethod
async def delete_by_id(session: "AsyncSession", project_id: UUID) -> int:
"""
Delete a project by ID.
:param session: The SQLAlchemy session.
:param project_id: The project ID
:return: Number of rows deleted.
"""
result = await session.execute(delete(Project).where(Project.id == project_id))
return result.rowcount
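The slugification is easy to illustrate:
>>> Project.get_folder_from_project_name("Šašavi Projekt!")
'sasavi-projekt'
>>> Project.get_folder_from_project_name("My App 2.0")
'my-app-2-0'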

338
core/db/models/project_state.py Normal file
View File

@@ -0,0 +1,338 @@
from copy import deepcopy
from datetime import datetime
from typing import TYPE_CHECKING, Optional
from uuid import UUID, uuid4
from sqlalchemy import ForeignKey, UniqueConstraint, delete, inspect
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy.orm import Mapped, mapped_column, relationship
from sqlalchemy.orm.attributes import flag_modified
from sqlalchemy.sql import func
from core.db.models import Base
from core.log import get_logger
if TYPE_CHECKING:
from core.db.models import Branch, ExecLog, File, FileContent, LLMRequest, Specification, UserInput
log = get_logger(__name__)
class ProjectState(Base):
__tablename__ = "project_states"
__table_args__ = (
UniqueConstraint("prev_state_id"),
UniqueConstraint("branch_id", "step_index"),
{"sqlite_autoincrement": True},
)
# ID and parent FKs
id: Mapped[UUID] = mapped_column(primary_key=True, default=uuid4)
branch_id: Mapped[UUID] = mapped_column(ForeignKey("branches.id", ondelete="CASCADE"))
prev_state_id: Mapped[Optional[UUID]] = mapped_column(ForeignKey("project_states.id", ondelete="CASCADE"))
specification_id: Mapped[int] = mapped_column(ForeignKey("specifications.id"))
# Attributes
created_at: Mapped[datetime] = mapped_column(server_default=func.now())
step_index: Mapped[int] = mapped_column(default=1, server_default="1")
epics: Mapped[list[dict]] = mapped_column(default=list)
tasks: Mapped[list[dict]] = mapped_column(default=list)
steps: Mapped[list[dict]] = mapped_column(default=list)
iterations: Mapped[list[dict]] = mapped_column(default=list)
relevant_files: Mapped[list[str]] = mapped_column(default=list)
modified_files: Mapped[dict] = mapped_column(default=dict)
run_command: Mapped[Optional[str]] = mapped_column()
# Relationships
branch: Mapped["Branch"] = relationship(back_populates="states", lazy="selectin")
prev_state: Mapped[Optional["ProjectState"]] = relationship(
back_populates="next_state",
remote_side=[id],
single_parent=True,
)
next_state: Mapped[Optional["ProjectState"]] = relationship(back_populates="prev_state")
files: Mapped[list["File"]] = relationship(
back_populates="project_state",
lazy="selectin",
cascade="all,delete-orphan",
)
specification: Mapped["Specification"] = relationship(back_populates="project_states", lazy="selectin")
llm_requests: Mapped[list["LLMRequest"]] = relationship(back_populates="project_state", cascade="all")
user_inputs: Mapped[list["UserInput"]] = relationship(back_populates="project_state", cascade="all")
exec_logs: Mapped[list["ExecLog"]] = relationship(back_populates="project_state", cascade="all")
@property
def unfinished_steps(self) -> list[dict]:
"""
Get the list of unfinished steps.
:return: List of unfinished steps.
"""
return [step for step in self.steps if not step.get("completed")]
@property
def current_step(self) -> Optional[dict]:
"""
Get the current step.
Current step is always the first step that's not finished yet.
:return: The current step, or None if there are no more unfinished steps.
"""
li = self.unfinished_steps
return li[0] if li else None
@property
def unfinished_iterations(self) -> list[dict]:
"""
Get the list of unfinished iterations.
:return: List of unfinished iterations.
"""
return [iteration for iteration in self.iterations if not iteration.get("completed")]
@property
def current_iteration(self) -> Optional[dict]:
"""
Get the current iteration.
Current iteration is always the first iteration that's not finished yet.
:return: The current iteration, or None if there are no unfinished iterations.
"""
li = self.unfinished_iterations
return li[0] if li else None
@property
def unfinished_tasks(self) -> list[dict]:
"""
Get the list of unfinished tasks.
:return: List of unfinished tasks.
"""
return [task for task in self.tasks if not task.get("completed")]
@property
def current_task(self) -> Optional[dict]:
"""
Get the current task.
Current task is always the first task that's not finished yet.
:return: The current task, or None if there are no unfinished tasks.
"""
li = self.unfinished_tasks
return li[0] if li else None
@property
def unfinished_epics(self) -> list[dict]:
"""
Get the list of unfinished epics.
:return: List of unfinished epics.
"""
return [epic for epic in self.epics if not epic.get("completed")]
@property
def current_epic(self) -> Optional[dict]:
"""
Get the current epic.
Current epic is always the first epic that's not finished yet.
:return: The current epic, or None if there are no unfinished epics.
"""
li = self.unfinished_epics
return li[0] if li else None
@property
def relevant_file_objects(self):
"""
Get the relevant files with their content.
:return: List of File objects for the relevant files.
"""
return [file for file in self.files if file.path in self.relevant_files]
@staticmethod
def create_initial_state(branch: "Branch") -> "ProjectState":
"""
Create the initial project state for a new branch.
This does *not* commit the new state to the database.
No checks are made to ensure that the branch does not
already have a state.
:param branch: The branch to create the state for.
:return: The new ProjectState object.
"""
from core.db.models import Specification
return ProjectState(
branch=branch,
specification=Specification(),
step_index=1,
)
async def create_next_state(self) -> "ProjectState":
"""
Create the next project state for the branch.
This does NOT insert the new state and the associated objects (spec,
files, ...) to the database.
:return: The new ProjectState object.
"""
if not self.id:
raise ValueError("Cannot create next state for unsaved state.")
if "next_state" in self.__dict__:
raise ValueError(f"Next state already exists for state with id={self.id}.")
new_state = ProjectState(
branch=self.branch,
prev_state=self,
step_index=self.step_index + 1,
specification=self.specification,
epics=deepcopy(self.epics),
tasks=deepcopy(self.tasks),
steps=deepcopy(self.steps),
iterations=deepcopy(self.iterations),
files=[],
relevant_files=deepcopy(self.relevant_files),
modified_files=deepcopy(self.modified_files),
)
session: AsyncSession = inspect(self).async_session
session.add(new_state)
for file in await self.awaitable_attrs.files:
clone = file.clone()
new_state.files.append(clone)
return new_state
def complete_step(self):
if not self.unfinished_steps:
raise ValueError("There are no unfinished steps to complete")
if "next_state" in self.__dict__:
raise ValueError("Current state is read-only (already has a next state).")
log.debug(f"Completing step {self.unfinished_steps[0]['type']}")
self.unfinished_steps[0]["completed"] = True
flag_modified(self, "steps")
def complete_task(self):
if not self.unfinished_tasks:
raise ValueError("There are no unfinished tasks to complete")
if "next_state" in self.__dict__:
raise ValueError("Current state is read-only (already has a next state).")
log.debug(f"Completing task {self.unfinished_tasks[0]['description']}")
self.unfinished_tasks[0]["completed"] = True
self.steps = []
self.iterations = []
self.relevant_files = []
self.modified_files = {}
flag_modified(self, "tasks")
if not self.unfinished_tasks and self.unfinished_epics:
self.complete_epic()
def complete_epic(self):
if not self.unfinished_epics:
raise ValueError("There are no unfinished epics to complete")
if "next_state" in self.__dict__:
raise ValueError("Current state is read-only (already has a next state).")
log.debug(f"Completing epic {self.unfinished_epics[0]['name']}")
self.unfinished_epics[0]["completed"] = True
flag_modified(self, "epics")
def complete_iteration(self):
if not self.unfinished_iterations:
raise ValueError("There are no unfinished iterations to complete")
if "next_state" in self.__dict__:
raise ValueError("Current state is read-only (already has a next state).")
log.debug(f"Completing iteration {self.unfinished_iterations[0]}")
self.unfinished_iterations[0]["completed"] = True
self.flag_iterations_as_modified()
def flag_iterations_as_modified(self):
"""
Flag the iteration field as having been modified
Used by Agents that perform modifications within the mutable iterations field,
to tell the database that it was modified and should get saved (as SQLalchemy
can't detect changes in mutable fields by itself).
"""
flag_modified(self, "iterations")
def get_file_by_path(self, path: str) -> Optional["File"]:
"""
Get a file from the current project state, by the file path.
:param path: The file path.
:return: The file object, or None if not found.
"""
for file in self.files:
if file.path == path:
return file
return None
def save_file(self, path: str, content: "FileContent", external: bool = False) -> "File":
"""
Save a file to the project state.
This either creates a new file pointing at the given content,
or updates the content of an existing file. This method
doesn't actually commit the file to the database, just attaches
it to the project state.
If the file was created by Pythagora (not externally by user or template import),
mark it as relevant for the current task.
:param path: The file path.
:param content: The file content.
:param external: Whether the file was added externally (e.g. by a user).
:return: The (unsaved) file object.
"""
from core.db.models import File
if "next_state" in self.__dict__:
raise ValueError("Current state is read-only (already has a next state).")
file = self.get_file_by_path(path)
if file:
original_content = file.content.content
file.content = content
else:
original_content = ""
file = File(path=path, content=content)
self.files.append(file)
if path not in self.modified_files and not external:
self.modified_files[path] = original_content
if path not in self.relevant_files:
self.relevant_files.append(path)
return file
async def delete_after(self):
"""
Delete all states in the branch after this one.
"""
session: AsyncSession = inspect(self).async_session
log.debug(f"Deleting all project states in branch {self.branch_id} after {self.id}")
await session.execute(
delete(ProjectState).where(
ProjectState.branch_id == self.branch_id,
ProjectState.step_index > self.step_index,
)
)
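To make the state-machine flow concrete, here is a minimal sketch of advancing a project by one step (assumes `state` is the current, saved ProjectState with at least one unfinished step, and `session` is its async session):
state.complete_step()  # mark the first unfinished step as done
next_state = await state.create_next_state()  # clone spec/epics/tasks/files into a new state
await session.commit()  # persist both the mutation and the new state row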

View File

@@ -0,0 +1,48 @@
from typing import TYPE_CHECKING, Optional
from sqlalchemy.orm import Mapped, mapped_column, relationship
from core.db.models import Base
if TYPE_CHECKING:
from core.db.models import ProjectState
class Complexity:
"""Estimate of the project or feature complexity."""
SIMPLE = "simple"
MODERATE = "moderate"
HARD = "hard"
class Specification(Base):
__tablename__ = "specifications"
# ID and parent FKs
id: Mapped[int] = mapped_column(primary_key=True, autoincrement=True)
# Attributes
description: Mapped[str] = mapped_column(default="")
architecture: Mapped[str] = mapped_column(default="")
system_dependencies: Mapped[list[dict]] = mapped_column(default=list)
package_dependencies: Mapped[list[dict]] = mapped_column(default=list)
template: Mapped[Optional[str]] = mapped_column()
complexity: Mapped[str] = mapped_column(server_default=Complexity.HARD)
# Relationships
project_states: Mapped[list["ProjectState"]] = relationship(back_populates="specification")
def clone(self) -> "Specification":
"""
Clone the specification.
"""
clone = Specification(
description=self.description,
architecture=self.architecture,
system_dependencies=self.system_dependencies,
package_dependencies=self.package_dependencies,
template=self.template,
complexity=self.complexity,
)
return clone

View File

@@ -0,0 +1,59 @@
from datetime import datetime
from typing import TYPE_CHECKING, Optional
from uuid import UUID
from sqlalchemy import ForeignKey, inspect
from sqlalchemy.orm import Mapped, mapped_column, relationship
from sqlalchemy.sql import func
from core.db.models import Base
from core.ui.base import UserInput as UserInputData
if TYPE_CHECKING:
from core.db.models import Branch, ProjectState
class UserInput(Base):
__tablename__ = "user_inputs"
# ID and parent FKs
id: Mapped[int] = mapped_column(primary_key=True, autoincrement=True)
branch_id: Mapped[UUID] = mapped_column(ForeignKey("branches.id", ondelete="CASCADE"))
project_state_id: Mapped[Optional[UUID]] = mapped_column(ForeignKey("project_states.id", ondelete="SET NULL"))
# Attributes
created_at: Mapped[datetime] = mapped_column(server_default=func.now())
question: Mapped[str] = mapped_column()
answer_text: Mapped[Optional[str]] = mapped_column()
answer_button: Mapped[Optional[str]] = mapped_column()
cancelled: Mapped[bool] = mapped_column()
# Relationships
branch: Mapped["Branch"] = relationship(back_populates="user_inputs")
project_state: Mapped["ProjectState"] = relationship(back_populates="user_inputs")
@classmethod
def from_user_input(cls, project_state: "ProjectState", question: str, user_input: UserInputData) -> "UserInput":
"""
Store the user input in the database.
Note this just creates the UserInput object. It is committed to the
database only when the DB session itself is committed.
:param project_state: Project state to associate the request log with.
:param question: Question the user was asked.
:param user_input: User input.
:return: Newly created User input in the database.
"""
session = inspect(project_state).async_session
obj = cls(
project_state=project_state,
branch=project_state.branch,
question=question,
answer_text=user_input.text,
answer_button=user_input.button,
cancelled=user_input.cancelled,
)
session.add(obj)
return obj
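A minimal sketch of recording an answer (the UserInputData constructor kwargs are assumed from the field usage above; values are illustrative):
answer = UserInputData(text="Yes, continue", button=None, cancelled=False)  # assumed kwargs
user_input = UserInput.from_user_input(project_state, "Continue with the next task?", answer)
await session.commit()  # the row is only written when the session is committed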

75
core/db/session.py Normal file
View File

@@ -0,0 +1,75 @@
from sqlalchemy import event
from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine
from core.config import DBConfig
from core.log import get_logger
log = get_logger(__name__)
class SessionManager:
"""
Async-aware context manager for database session.
Usage:
>>> config = DBConfig(url="sqlite+aiosqlite:///test.db")
>>> async with SessionManager(config) as session:
... # Do something with the session
"""
def __init__(self, config: DBConfig):
"""
Initialize the session manager with the given configuration.
:param config: Database configuration.
"""
self.config = config
self.engine = create_async_engine(
self.config.url, echo=config.debug_sql, echo_pool="debug" if config.debug_sql else None
)
self.SessionClass = async_sessionmaker(self.engine, expire_on_commit=False)
self.session = None
self.recursion_depth = 0
event.listen(self.engine.sync_engine, "connect", self._on_connect)
def _on_connect(self, dbapi_connection, _):
"""Connection event handler"""
log.debug(f"Connected to database {self.config.url}")
if self.config.url.startswith("sqlite"):
# Note that SQLite uses NullPool by default, meaning every session creates a
# database "connection". This is fine and preferred for SQLite because
# it's a local file. PostgreSQL and other databases use a real connection pool
# by default.
dbapi_connection.execute("pragma foreign_keys=on")
async def start(self) -> AsyncSession:
if self.session is not None:
self.recursion_depth += 1
log.warning(f"Re-entering database session (depth: {self.recursion_depth}), potential bug", stack_info=True)
return self.session
self.session = self.SessionClass()
return self.session
async def close(self):
if self.session is None:
log.warning("Closing database session that was never opened", stack_info=True)
return
if self.recursion_depth > 0:
self.recursion_depth -= 1
return
await self.session.close()
self.session = None
async def __aenter__(self) -> AsyncSession:
return await self.start()
async def __aexit__(self, exc_type, exc_val, exc_tb):
return await self.close()
__all__ = ["SessionManager"]

49
core/db/setup.py Normal file
View File

@@ -0,0 +1,49 @@
from os.path import dirname, join
from alembic import command
from alembic.config import Config
from core.config import DBConfig
from core.log import get_logger
log = get_logger(__name__)
def _async_to_sync_db_scheme(url: str) -> str:
"""
Convert an async database URL to a synchronous one.
This is needed because Alembic does not support async database
connections.
:param url: Asynchronous database URL.
:return: Synchronous database URL.
"""
if url.startswith("postgresql+asyncpg://"):
return url.replace("postgresql+asyncpg://", "postgresql://")
elif url.startswith("sqlite+aiosqlite://"):
return url.replace("sqlite+aiosqlite://", "sqlite://")
return url
def run_migrations(config: DBConfig):
"""
Run database migrations using Alembic.
This needs to happen synchronously, before the asyncio
mainloop is started, and before any database access.
:param config: Database configuration.
"""
url = _async_to_sync_db_scheme(config.url)
ini_location = join(dirname(__file__), "alembic.ini")
log.debug(f"Running database migrations for {url} (config: {ini_location})")
alembic_cfg = Config(ini_location)
alembic_cfg.set_main_option("sqlalchemy.url", url)
alembic_cfg.set_main_option("pythagora_runtime", "true")
command.upgrade(alembic_cfg, "head")
__all__ = ["run_migrations"]

246
core/db/v0importer.py Normal file
View File

@@ -0,0 +1,246 @@
from json import loads
from os.path import exists
from pathlib import Path
from uuid import UUID, uuid4
import aiosqlite
from core.db.models import Branch, Project, ProjectState
from core.db.session import SessionManager
from core.log import get_logger
from core.state.state_manager import StateManager
log = get_logger(__name__)
class LegacyDatabaseImporter:
def __init__(self, session_manager: SessionManager, dbpath: str):
self.session_manager = session_manager
self.state_manager = StateManager(self.session_manager, None)
self.dbpath = dbpath
self.conn = None
if not exists(dbpath):
raise FileNotFoundError(f"File not found: {dbpath}")
async def import_database(self):
info = await self.load_legacy_database()
await self.save_to_new_database(info)
async def load_legacy_database(self):
async with aiosqlite.connect(self.dbpath) as conn:
self.conn = conn
is_valid = await self.verify_schema()
if not is_valid:
raise ValueError(f"Database {self.dbpath} doesn't look like a GPT-Pilot database")
apps = await self.get_apps()
info = {}
for app_id in apps:
app_info = await self.get_app_info(app_id)
info[app_id] = {
"name": apps[app_id],
**app_info,
}
return info
async def verify_schema(self) -> bool:
tables = set()
async with self.conn.execute("select name from sqlite_master where type = 'table'") as cursor:
async for row in cursor:
tables.add(row[0])
return "app" in tables and "development_steps" in tables
async def get_apps(self) -> dict[str, str]:
apps = {}
async with self.conn.execute("select id, name, status from app") as cursor:
async for id, name, status in cursor:
if status == "coding":
apps[id] = name
return apps
async def get_app_info(self, app_id: str) -> dict:
app_info = {
"initial_prompt": None,
"architecture": None,
"tasks": [],
}
async with self.conn.execute("select architecture from architecture where app_id = ?", (app_id,)) as cursor:
row = await cursor.fetchone()
if row:
app_info["architecture"] = loads(row[0])
async with self.conn.execute("select prompt from project_description where app_id = ?", (app_id,)) as cursor:
row = await cursor.fetchone()
if row:
app_info["initial_prompt"] = row[0]
async with self.conn.execute(
"select id, prompt_path, prompt_data, messages, llm_response from development_steps "
"where app_id = ? order by created_at asc",
(app_id,),
) as cursor:
async for row in cursor:
dev_step_id, prompt_path, prompt_data, messages, llm_response = row
if prompt_path == "development/task/breakdown.prompt":
task_info = await self.get_task_info(dev_step_id, prompt_data, llm_response)
app_info["tasks"].append(task_info)
return app_info
async def get_task_info(self, dev_step_id, prompt_data_json: str, llm_response: dict) -> dict:
prompt_data = loads(prompt_data_json)
current_feature = prompt_data.get("current_feature")
previous_features = prompt_data.get("previous_features") or []
tasks = prompt_data["development_tasks"]
current_task_index = prompt_data["current_task_index"]
current_task = tasks[current_task_index]
instructions = llm_response
files = await self.get_task_files(dev_step_id)
return {
"current_feature": current_feature,
"previous_features": previous_features,
"tasks": tasks,
"current_task_index": current_task_index,
"current_task": current_task,
"instructions": instructions,
"files": files,
}
async def get_task_files(self, dev_step_id: int):
files = {}
async with self.conn.execute(
"select content, path, name, description from file_snapshot "
"inner join file on file_snapshot.file_id = file.id "
"where file_snapshot.development_step_id = ?",
(dev_step_id,),
) as cursor:
async for row in cursor:
content, path, name, description = row
file_path = Path(path + "/" + name).as_posix() if path else name
try:
if isinstance(content, bytes):
content = content.decode("utf-8")
except: # noqa
# skip binary file
continue
files[file_path] = {
"description": description or None,
"content": content,
}
return files
async def save_to_new_database(self, info: dict):
async with self.session_manager as session:
projects = await Project.get_all_projects(session)
for project in projects:
imported_app = info.pop(project.id.hex, None)
if imported_app:
log.info(f"Project {project.name} already exists in the new database, skipping")
for app_id, app_info in info.items():
await self.save_app(app_id, app_info)
async def save_app(self, app_id: str, app_info: dict):
log.info(f"Importing app {app_info['name']} (id={app_id}) ...")
async with self.session_manager as session:
project = Project(id=UUID(app_id), name=app_info["name"])
branch = Branch(project=project)
state = ProjectState.create_initial_state(branch)
spec = state.specification
spec.description = app_info["initial_prompt"]
spec.architecture = app_info["architecture"]["architecture"]
spec.system_dependencies = app_info["architecture"]["system_dependencies"]
spec.package_dependencies = app_info["architecture"]["package_dependencies"]
spec.template = app_info["architecture"].get("template")
session.add(project)
await session.commit()
project = await self.state_manager.load_project(project_id=app_id)
# It is much harder to import all tasks and keep features/tasks lists in sync, so
# we only support importing the latest task.
if app_info["tasks"]:
await self.save_latest_task(app_info["tasks"][-1])
# This just closes the session and removes the last (incomplete) state.
# Everything else should already be safely committed.
await self.state_manager.rollback()
async def save_latest_task(self, task: dict):
sm = self.state_manager
state = sm.current_state
state.epics = [
{
"id": uuid4().hex,
"name": "Initial Project",
"description": state.specification.description,
"summary": None,
"completed": bool(task["previous_features"]) or (task["current_feature"] is not None),
"complexity": "hard",
}
]
for i, feature in enumerate(task["previous_features"]):
state.epics += [
{
"id": uuid4().hex,
"name": f"Feature #{i + 1}",
"description": feature["summary"], # FIXME: is this good enough
"summary": None,
"completed": True,
"complexity": "hard",
}
]
if task["current_feature"]:
state.epics = state.epics + [
{
"id": uuid4().hex,
"name": f"Feature #{len(state.epics)}",
"description": task["current_feature"],
"summary": None,
"completed": False,
"complexity": "hard",
}
]
current_task_index = task["current_task_index"]
state.tasks = [
{
"id": uuid4().hex,
"description": task_info["description"],
"instructions": None,
"completed": current_task_index > i,
}
for i, task_info in enumerate(task["tasks"])
]
state.tasks[current_task_index]["instructions"] = task["instructions"]
await sm.current_session.commit()
# Reload project at the initialized state to reinitialize the next state
await self.state_manager.load_project(project_id=state.branch.project.id, step_index=state.step_index)
await self.save_task_files(task["files"])
await self.state_manager.commit()
async def save_task_files(self, files: dict):
for path, file_info in files.items():
await self.state_manager.save_file(
path,
file_info["content"],
metadata={
"description": file_info["description"],
"references": [],
},
)
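A minimal usage sketch for the importer (the legacy database path is an example):
from core.config import DBConfig
from core.db.session import SessionManager
from core.db.v0importer import LegacyDatabaseImporter

session_manager = SessionManager(DBConfig(url="sqlite+aiosqlite:///pythagora.db"))
importer = LegacyDatabaseImporter(session_manager, "gpt-pilot.db")
await importer.import_database()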

0
core/disk/__init__.py Normal file
View File

125
core/disk/ignore.py Normal file
View File

@@ -0,0 +1,125 @@
import os.path
from fnmatch import fnmatch
from typing import Optional
class IgnoreMatcher:
"""
A class to match paths against a list of ignore patterns or
file attributes (size, type).
"""
def __init__(
self,
root_path: str,
ignore_paths: list[str],
*,
ignore_size_threshold: Optional[int] = None,
):
"""
Initialize the IgnoreMatcher object.
Ignore paths are matched against the file name and the full path,
and may include shell-like wildcards ("*" for any number of characters,
"?" for a single character). Paths are normalized, so "/" works on both
Unix and Windows, and Windows matching is case insensitive.
:param root_path: Root path to use when checking files on disk.
:param ignore_paths: List of patterns to ignore.
:param ignore_size_threshold: Files larger than this size will be ignored.
"""
self.root_path = root_path
self.ignore_paths = ignore_paths
self.ignore_size_threshold = ignore_size_threshold
def ignore(self, path: str) -> bool:
"""
Check if the given path matches any of the ignore patterns.
:param path: (Relative) path to the file or directory to check
:return: True if the path matches any of the ignore patterns, False otherwise
"""
full_path = os.path.normpath(os.path.join(self.root_path, path))
if self._is_in_ignore_list(path):
return True
if self._is_large_file(full_path):
return True
# Binary files are always ignored
if self._is_binary(full_path):
return True
return False
def _is_in_ignore_list(self, path: str) -> bool:
"""
Check if the given path matches any of the ignore patterns.
Both the (relative) file path and the file (base) name are matched.
:param path: The path to the file or directory to check
:return: True if the path matches any of the ignore patterns, False otherwise.
"""
name = os.path.basename(path)
for pattern in self.ignore_paths:
if fnmatch(name, pattern) or fnmatch(path, pattern):
return True
return False
def _is_large_file(self, full_path: str) -> bool:
"""
Check if the given file is larger than the threshold.
This also returns True if the file doesn't exist or is not a regular file (eg.
it's a symlink), since we want to ignore those kinds of files as well.
:param full_path: Full path to the file to check.
:return: True if the file is larger than the threshold, False otherwise.
"""
if self.ignore_size_threshold is None:
return False
# We don't handle directories here
if os.path.isdir(full_path):
return False
if not os.path.isfile(full_path):
return True
try:
return bool(os.path.getsize(full_path) > self.ignore_size_threshold)
except: # noqa
return True
def _is_binary(self, full_path: str) -> bool:
"""
Check if the given file is binary and should be ignored.
This also returns True if the file doesn't exist or is not a regular file (eg.
it's a symlink), or can't be opened, since we want to ignore those too.
:param full_path: Full path to the file to check.
:return: True if the file should be ignored, False otherwise.
"""
# We don't handle directories here
if os.path.isdir(full_path):
return False
if not os.path.isfile(full_path):
return True
try:
with open(full_path, "r", encoding="utf-8") as f:
f.read(128 * 1024)
return False
except: # noqa
# If we can't open the file for any reason (eg. PermissionError), it's
# best to ignore it anyway
return True
__all__ = ["IgnoreMatcher"]

188
core/disk/vfs.py Normal file
View File

@@ -0,0 +1,188 @@
import os
import os.path
from hashlib import sha1
from pathlib import Path
from core.disk.ignore import IgnoreMatcher
from core.log import get_logger
log = get_logger(__name__)
class VirtualFileSystem:
def save(self, path: str, content: str):
"""
Save content to a file. Use for both new and updated files.
:param path: Path to the file, relative to project root.
:param content: Content to save.
"""
raise NotImplementedError()
def read(self, path: str) -> str:
"""
Read file contents.
:param path: Path to the file, relative to project root.
:return: File contents.
"""
raise NotImplementedError()
def remove(self, path: str):
"""
Remove a file.
If file doesn't exist or is a directory, or if the file is ignored,
do nothing.
:param path: Path to the file, relative to project root.
"""
raise NotImplementedError()
def get_full_path(self, path: str) -> str:
"""
Get the full path to a file.
This should be used to check the full path of the file on whichever
file system it locally is stored. For example, getting a full path
to a file and then passing it to an external program via run_command
should work.
:param path: Path to the file, relative to project root.
:return: Full path to the file.
"""
raise NotImplementedError()
def _filter_by_prefix(self, file_list: list[str], prefix: str) -> list[str]:
# We use "/" internally on all platforms, including win32
if not prefix.endswith("/"):
prefix = prefix + "/"
return [f for f in file_list if f.startswith(prefix)]
def _get_file_list(self) -> list[str]:
raise NotImplementedError()
def list(self, prefix: str = None) -> list[str]:
"""
Return a list of files in the project.
File paths are relative to the project root.
:param prefix: Optional prefix to filter files for.
:return: List of file paths.
"""
retval = sorted(self._get_file_list())
if prefix:
retval = self._filter_by_prefix(retval, prefix)
return retval
def hash(self, path: str) -> str:
content = self.read(path)
return self.hash_string(content)
@staticmethod
def hash_string(content: str) -> str:
return sha1(content.encode("utf-8")).hexdigest()
class MemoryVFS(VirtualFileSystem):
files: dict[str, str]
def __init__(self):
self.files = {}
def save(self, path: str, content: str):
self.files[path] = content
def read(self, path: str) -> str:
try:
return self.files[path]
except KeyError:
raise ValueError(f"File not found: {path}")
def remove(self, path: str):
if path in self.files:
del self.files[path]
def get_full_path(self, path: str) -> str:
# We use "/" internally on all platforms, including win32
return "/" + path
def _get_file_list(self) -> list[str]:
return list(self.files.keys())
class LocalDiskVFS(VirtualFileSystem):
def __init__(
self,
root: str,
create: bool = True,
allow_existing: bool = True,
ignore_matcher: IgnoreMatcher = None,
):
if not os.path.isdir(root):
if create:
os.makedirs(root)
else:
raise ValueError(f"Root directory does not exist: {root}")
else:
if not allow_existing:
raise FileExistsError(f"Root directory already exists: {root}")
if ignore_matcher is None:
ignore_matcher = IgnoreMatcher(root, [])
self.root = root
self.ignore_matcher = ignore_matcher
def get_full_path(self, path: str) -> str:
return os.path.normpath(os.path.join(self.root, path))
def save(self, path: str, content: str):
full_path = self.get_full_path(path)
os.makedirs(os.path.dirname(full_path), exist_ok=True)
with open(full_path, "w", encoding="utf-8") as f:
f.write(content)
log.debug(f"Saved file {path} ({len(content)} bytes) to {full_path}")
def read(self, path: str) -> str:
full_path = self.get_full_path(path)
if not os.path.isfile(full_path):
raise ValueError(f"File not found: {path}")
# TODO: do we want error handling here?
with open(full_path, "r", encoding="utf-8") as f:
return f.read()
def remove(self, path: str):
if self.ignore_matcher.ignore(path):
return
full_path = self.get_full_path(path)
if os.path.isfile(full_path):
try:
os.remove(full_path)
log.debug(f"Removed file {path} from {full_path}")
except Exception as err: # noqa
log.error(f"Failed to remove file {path}: {err}", exc_info=True)
def _get_file_list(self) -> list[str]:
files = []
for dpath, dirnames, filenames in os.walk(self.root):
# Modify in place to prevent recursing into ignored directories
dirnames[:] = [
d
for d in dirnames
if not self.ignore_matcher.ignore(os.path.relpath(os.path.join(dpath, d), self.root))
]
for filename in filenames:
path = os.path.relpath(os.path.join(dpath, filename), self.root)
if not self.ignore_matcher.ignore(path):
# We use "/" internally on all platforms, including win32
files.append(Path(path).as_posix())
return files
__all__ = ["VirtualFileSystem", "MemoryVFS", "LocalDiskVFS"]

0
core/llm/__init__.py Normal file
View File

123
core/llm/anthropic_client.py Normal file
View File

@@ -0,0 +1,123 @@
import datetime
import zoneinfo
from typing import Optional
from anthropic import AsyncAnthropic, RateLimitError
from httpx import Timeout
from core.config import LLMProvider
from core.llm.convo import Convo
from core.log import get_logger
from .base import BaseLLMClient
log = get_logger(__name__)
# Maximum number of output tokens for Anthropic Claude 3
MAX_TOKENS = 4096
class AnthropicClient(BaseLLMClient):
provider = LLMProvider.ANTHROPIC
def _init_client(self):
self.client = AsyncAnthropic(
api_key=self.config.api_key,
base_url=self.config.base_url,
timeout=Timeout(
max(self.config.connect_timeout, self.config.read_timeout),
connect=self.config.connect_timeout,
read=self.config.read_timeout,
),
)
def _adapt_messages(self, convo: Convo) -> list[dict[str, str]]:
"""
Adapt the conversation messages to the format expected by the Anthropic Claude model.
Claude only recognizes "user" and "assistant" roles, and requires them to alternate
(ie. no consecutive messages from the same role).
:param convo: Conversation to adapt.
:return: Adapted conversation messages.
"""
messages = []
for msg in convo.messages:
if msg["role"] == "function":
raise ValueError("Anthropic Claude doesn't support function calling")
role = "user" if msg["role"] in ["user", "system"] else "assistant"
if messages and messages[-1]["role"] == role:
messages[-1]["content"] += "\n\n" + msg["content"]
else:
messages.append(
{
"role": role,
"content": msg["content"],
}
)
return messages
async def _make_request(
self,
convo: Convo,
temperature: Optional[float] = None,
json_mode: bool = False,
) -> tuple[str, int, int]:
messages = self._adapt_messages(convo)
completion_kwargs = {
"max_tokens": MAX_TOKENS,
"model": self.config.model,
"messages": messages,
"temperature": self.config.temperature if temperature is None else temperature,
}
# Note: Anthropic's Messages API has no JSON response format parameter;
# when json_mode is requested, correct output is enforced by the response parser.
response = []
async with self.client.messages.stream(**completion_kwargs) as stream:
async for content in stream.text_stream:
response.append(content)
if self.stream_handler:
await self.stream_handler(content)
# Token usage is read from the final message below
final_message = await stream.get_final_message()
response_str = "".join(response)
# Tell the stream handler we're done
if self.stream_handler:
await self.stream_handler(None)
return response_str, final_message.usage.input_tokens, final_message.usage.output_tokens
def rate_limit_sleep(self, err: RateLimitError) -> Optional[datetime.timedelta]:
"""
Anthropic rate limits docs:
https://docs.anthropic.com/en/api/rate-limits#response-headers
Limit reset times are in RFC 3339 format.
"""
headers = err.response.headers
if "anthropic-ratelimit-tokens-remaining" not in headers:
return None
remaining_tokens = headers["anthropic-ratelimit-tokens-remaining"]
if int(remaining_tokens) == 0:
relevant_dt = headers["anthropic-ratelimit-tokens-reset"]
else:
relevant_dt = headers["anthropic-ratelimit-requests-reset"]
try:
reset_time = datetime.datetime.fromisoformat(relevant_dt)
except ValueError:
return datetime.timedelta(seconds=5)
now = datetime.datetime.now(tz=zoneinfo.ZoneInfo("UTC"))
return reset_time - now
__all__ = ["AnthropicClient"]

306
core/llm/base.py Normal file
View File

@@ -0,0 +1,306 @@
import asyncio
import datetime
import json
from enum import Enum
from time import time
from typing import Any, Callable, Optional, Tuple
import httpx
from core.config import LLMConfig, LLMProvider
from core.llm.convo import Convo
from core.llm.request_log import LLMRequestLog, LLMRequestStatus
from core.log import get_logger
log = get_logger(__name__)
class LLMError(str, Enum):
KEY_EXPIRED = "key_expired"
RATE_LIMITED = "rate_limited"
class APIError(Exception):
def __init__(self, message: str):
self.message = message
class BaseLLMClient:
"""
Base asynchronous streaming client for language models.
Example usage:
>>> async def stream_handler(content: str):
... print(content)
...
>>> def parser(content: str) -> dict:
... return json.loads(content)
...
>>> client_class = BaseLLMClient.for_provider(provider)
>>> client = client_class(config, stream_handler=stream_handler)
>>> response, request_log = await client(convo, parser=parser)
"""
provider: LLMProvider
def __init__(
self,
config: LLMConfig,
*,
stream_handler: Optional[Callable] = None,
error_handler: Optional[Callable] = None,
):
"""
Initialize the client with the given configuration.
:param config: Configuration for the client.
:param stream_handler: Optional handler for streamed responses.
"""
self.config = config
self.stream_handler = stream_handler
self.error_handler = error_handler
self._init_client()
def _init_client(self):
raise NotImplementedError()
async def _make_request(
self,
convo: Convo,
temperature: Optional[float] = None,
json_mode: bool = False,
) -> tuple[str, int, int]:
"""
Call the LLM with the given conversation.
Low-level method that streams the response chunks.
Use `__call__` instead of this method.
:param convo: Conversation to send to the LLM.
:param json_mode: If True, the response is expected to be JSON.
:return: Tuple containing the full response content, number of input tokens, and number of output tokens.
"""
raise NotImplementedError()
async def _adapt_messages(self, convo: Convo) -> list[dict[str, str]]:
"""
Adapt the conversation messages to the format expected by the LLM.
The default implementation merges consecutive messages from the same
role, since some providers (eg. Anthropic Claude) only recognize
alternating "user" and "assistant" roles.
:param convo: Conversation to adapt.
:return: Adapted conversation messages.
"""
messages = []
for msg in convo.messages:
if msg["role"] == "function":
raise ValueError("Anthropic Claude doesn't support function calling")
role = "user" if msg["role"] in ["user", "system"] else "assistant"
if messages and messages[-1]["role"] == role:
messages[-1]["content"] += "\n\n" + msg["content"]
else:
messages.append(
{
"role": role,
"content": msg["content"],
}
)
return messages
async def __call__(
self,
convo: Convo,
*,
temperature: Optional[float] = None,
parser: Optional[Callable] = None,
max_retries: int = 3,
json_mode: bool = False,
) -> Tuple[Any, LLMRequestLog]:
"""
Invoke the LLM with the given conversation.
Stream handler, if provided, should be an async function
that takes a single argument, the response content (str).
It will be called for each response chunk.
Parser, if provided, should be a function that takes the
response content (str) and returns the parsed response.
On parse error, the parser should raise a ValueError with
a descriptive error message that will be sent back to the LLM
to retry, up to max_retries.
:param convo: Conversation to send to the LLM.
:param parser: Optional parser for the response.
:param max_retries: Maximum number of retries for parsing the response.
:param json_mode: If True, the response is expected to be JSON.
:return: Tuple of the (parsed) response and request log entry.
"""
import anthropic
import groq
import openai
if temperature is None:
temperature = self.config.temperature
convo = convo.fork()
request_log = LLMRequestLog(
provider=self.provider,
model=self.config.model,
temperature=temperature,
)
prompt_length_kb = len(json.dumps(convo.messages).encode("utf-8")) / 1024
log.debug(
f"Calling {self.provider.value} model {self.config.model} (temp={temperature}), prompt length: {prompt_length_kb:.1f} KB"
)
t0 = time()
for _ in range(max_retries):
request_log.messages = convo.messages[:]
request_log.response = None
request_log.error = None
response = None
try:
response, prompt_tokens, completion_tokens = await self._make_request(
convo,
temperature=temperature,
json_mode=json_mode,
)
except (openai.APIConnectionError, anthropic.APIConnectionError, groq.APIConnectionError) as err:
log.warning(f"API connection error: {err}", exc_info=True)
request_log.error = str(f"API connection error: {err}")
request_log.status = LLMRequestStatus.ERROR
continue
except httpx.ReadTimeout as err:
log.warning(f"Read timeout (set to {self.config.read_timeout}s): {err}", exc_info=True)
request_log.error = str(f"Read timeout: {err}")
request_log.status = LLMRequestStatus.ERROR
continue
except httpx.ReadError as err:
log.warning(f"Read error: {err}", exc_info=True)
request_log.error = str(f"Read error: {err}")
request_log.status = LLMRequestStatus.ERROR
continue
except (openai.RateLimitError, anthropic.RateLimitError, groq.RateLimitError) as err:
log.warning(f"Rate limit error: {err}", exc_info=True)
request_log.error = str(f"Rate limit error: {err}")
request_log.status = LLMRequestStatus.ERROR
wait_time = self.rate_limit_sleep(err)
if wait_time:
message = f"We've hit {self.config.provider.value} rate limit. Sleeping for {wait_time.seconds} seconds..."
await self.error_handler(LLMError.RATE_LIMITED, message)
await asyncio.sleep(wait_time.seconds)
continue
else:
# RateLimitError that shouldn't be retried, eg. insufficient funds
err_msg = err.response.json().get("error", {}).get("message", "Rate limiting error.")
raise APIError(err_msg) from err
except (openai.NotFoundError, anthropic.NotFoundError, groq.NotFoundError) as err:
err_msg = err.response.json().get("error", {}).get("message", f"Model not found: {self.config.model}")
raise APIError(err_msg) from err
except (openai.AuthenticationError, anthropic.AuthenticationError, groq.AuthenticationError) as err:
log.warning(f"Key expired: {err}", exc_info=True)
err_msg = err.response.json().get("error", {}).get("message", "Incorrect API key")
if "[BricksLLM]" in err_msg:
# We only want to show the key expired message if it's from Bricks
await self.error_handler(LLMError.KEY_EXPIRED)
raise APIError(err_msg) from err
except (openai.APIStatusError, anthropic.APIStatusError, groq.APIStatusError) as err:
# Token limit exceeded (in original gpt-pilot handled as
# TokenLimitError) is thrown as 400 (OpenAI, Anthropic) or 413 (Groq).
# All providers throw an exception that is caught here.
# OpenAI and Groq return a `code` field in the error JSON that lets
# us confirm that we've breached the token limit, but Anthropic doesn't,
# so we can't be certain that's the problem in Anthropic case.
# Here we try to detect that and tell the user what happened.
err_code = err.response.json().get("error", {}).get("code", "")
if err_code in ("request_too_large", "context_length_exceeded", "string_above_max_length"):
# Handle OpenAI and Groq token limit exceeded
# OpenAI will return `string_above_max_length` for prompts more than 1M characters
message = "".join(
[
"We sent too large request to the LLM, resulting in an error. ",
"This is usually caused by including framework files in an LLM request. ",
"Here's how you can get GPT Pilot to ignore those extra files: ",
"https://bit.ly/faq-token-limit-error",
]
)
raise APIError(message) from err
log.warning(f"API error: {err}", exc_info=True)
request_log.error = str(f"API error: {err}")
request_log.status = LLMRequestStatus.ERROR
return None, request_log
request_log.response = response
request_log.prompt_tokens += prompt_tokens
request_log.completion_tokens += completion_tokens
if parser:
try:
response = parser(response)
break
except ValueError as err:
log.debug(f"Error parsing GPT response: {err}, asking LLM to retry", exc_info=True)
convo.assistant(response)
convo.user(f"Error parsing response: {err}. Please output your response EXACTLY as requested.")
continue
else:
break
else:
log.warning(f"Failed to parse response after {max_retries} retries")
response = None
request_log.status = LLMRequestStatus.ERROR
t1 = time()
request_log.duration = t1 - t0
log.debug(
f"Total {self.provider.value} response time {request_log.duration:.2f}s, {request_log.prompt_tokens} prompt tokens, {request_log.completion_tokens} completion tokens used"
)
return response, request_log
@staticmethod
def for_provider(provider: LLMProvider) -> type["BaseLLMClient"]:
"""
Return LLM client for the specified provider.
:param provider: Provider to return the client for.
:return: Client class for the specified provider.
"""
from .anthropic_client import AnthropicClient
from .groq_client import GroqClient
from .openai_client import OpenAIClient
if provider == LLMProvider.OPENAI:
return OpenAIClient
elif provider == LLMProvider.ANTHROPIC:
return AnthropicClient
elif provider == LLMProvider.GROQ:
return GroqClient
else:
raise ValueError(f"Unsupported LLM provider: {provider.value}")
def rate_limit_sleep(self, err: Exception) -> Optional[datetime.timedelta]:
"""
Return how long we need to sleep because of rate limiting.
These are computed from the response headers that each LLM returns.
For details, check the implementation for the specific LLM. If there
are no rate limiting headers, we assume that the request should not
be retried and return None (this will be the case for insufficient
quota/funds in the account).
:param err: RateLimitError that was raised by the LLM client.
:return: optional timedelta to wait before trying again
"""
raise NotImplementedError()
__all__ = ["BaseLLMClient"]

163
core/llm/convo.py Normal file
View File

@@ -0,0 +1,163 @@
from copy import deepcopy
from typing import Iterator, Optional
class Convo:
"""
A conversation between a user and a Large Language Model (LLM) assistant.
"""
ROLES = ["system", "user", "assistant", "function"]
messages: list[dict[str, str]]
def __init__(self, content: Optional[str] = None):
"""
Initialize a new conversation.
:param content: Initial system message (optional).
"""
self.messages = []
if content is not None:
self.system(content)
@staticmethod
def _dedent(text: str) -> str:
"""
Remove common leading whitespace from every line of text.
:param text: Text to dedent.
:return: Dedented text.
"""
indent = len(text)
lines = text.splitlines()
for line in lines:
if line.strip():
indent = min(indent, len(line) - len(line.lstrip()))
dedented_lines = [line[indent:].rstrip() for line in lines]
return "\n".join(line for line in dedented_lines)
def add(self, role: str, content: str, name: Optional[str] = None) -> "Convo":
"""
Add a message to the conversation.
In most cases, you should use the convenience methods instead.
:param role: Role of the message (system, user, assistant, function).
:param content: Content of the message.
:param name: Name of the message sender (optional).
:return: The convo object.
"""
if role not in self.ROLES:
raise ValueError(f"Unknown role: {role}")
if not content:
raise ValueError("Empty message content")
if not isinstance(content, str) and not isinstance(content, dict):
raise TypeError(f"Invalid message content: {type(content).__name__}")
message = {
"role": role,
"content": self._dedent(content) if isinstance(content, str) else content,
}
if name is not None:
message["name"] = name
self.messages.append(message)
return self
def system(self, content: str, name: Optional[str] = None) -> "Convo":
"""
Add a system message to the conversation.
System messages can use `name` for showing example conversations
between an example user and an example assistant.
:param content: Content of the message.
:param name: Name of the message sender (optional).
:return: The convo object.
"""
return self.add("system", content, name)
def user(self, content: str, name: Optional[str] = None) -> "Convo":
"""
Add a user message to the conversation.
:param content: Content of the message.
:param name: User name (optional).
:return: The convo object.
"""
return self.add("user", content, name)
def assistant(self, content: str, name: Optional[str] = None) -> "Convo":
"""
Add an assistant message to the conversation.
:param content: Content of the message.
:param name: Assistant name (optional).
:return: The convo object.
"""
return self.add("assistant", content, name)
def function(self, content: str, name: Optional[str] = None) -> "Convo":
"""
Add a function (tool) response to the conversation.
:param content: Content of the message.
:param name: Function/tool name (optional).
:return: The convo object.
"""
return self.add("function", content, name)
def fork(self) -> "Convo":
"""
Create an identical copy of the conversation.
This performs a deep copy of all the message
contents, so you can safely modify both the
parent and the child conversation.
:return: A copy of the conversation.
"""
child = Convo()
child.messages = deepcopy(self.messages)
return child
def after(self, parent: "Convo") -> "Convo":
"""
Create a chat with only messages after the last common
message (that appears in both parent conversation and
this one).
:param parent: Parent conversation.
:return: A new conversation with only new messages.
"""
index = 0
while index < min(len(self.messages), len(parent.messages)) and self.messages[index] == parent.messages[index]:
index += 1
child = Convo()
child.messages = [deepcopy(msg) for msg in self.messages[index:]]
return child
def last(self) -> Optional[dict[str, str]]:
"""
Get the last message in the conversation.
:return: The last message, or None if the conversation is empty.
"""
return self.messages[-1] if self.messages else None
def __iter__(self) -> Iterator[dict[str, str]]:
"""
Iterate over the messages in the conversation.
:return: An iterator over the messages.
"""
return iter(self.messages)
def __repr__(self) -> str:
return f"<Convo({self.messages})>"
__all__ = ["Convo"]

93
core/llm/groq_client.py Normal file
View File

@@ -0,0 +1,93 @@
import datetime
from typing import Optional
import tiktoken
from groq import AsyncGroq, RateLimitError
from httpx import Timeout
from core.config import LLMProvider
from core.llm.base import BaseLLMClient
from core.llm.convo import Convo
from core.log import get_logger
log = get_logger(__name__)
tokenizer = tiktoken.get_encoding("cl100k_base")
class GroqClient(BaseLLMClient):
provider = LLMProvider.GROQ
def _init_client(self):
self.client = AsyncGroq(
api_key=self.config.api_key,
base_url=self.config.base_url,
timeout=Timeout(
max(self.config.connect_timeout, self.config.read_timeout),
connect=self.config.connect_timeout,
read=self.config.read_timeout,
),
)
async def _make_request(
self,
convo: Convo,
temperature: Optional[float] = None,
json_mode: bool = False,
) -> tuple[str, int, int]:
completion_kwargs = {
"model": self.config.model,
"messages": convo.messages,
"temperature": self.config.temperature if temperature is None else temperature,
"stream": True,
}
if json_mode:
completion_kwargs["response_format"] = {"type": "json_object"}
stream = await self.client.chat.completions.create(**completion_kwargs)
response = []
prompt_tokens = 0
completion_tokens = 0
async for chunk in stream:
if not chunk.choices:
continue
content = chunk.choices[0].delta.content
if not content:
continue
response.append(content)
if self.stream_handler:
await self.stream_handler(content)
response_str = "".join(response)
# Tell the stream handler we're done
if self.stream_handler:
await self.stream_handler(None)
if prompt_tokens == 0 and completion_tokens == 0:
# FIXME: Here we estimate Groq tokens using the same method as for OpenAI...
# See https://cookbook.openai.com/examples/how_to_count_tokens_with_tiktoken
prompt_tokens = sum(3 + len(tokenizer.encode(msg["content"])) for msg in convo.messages)
completion_tokens = len(tokenizer.encode(response_str))
return response_str, prompt_tokens, completion_tokens
def rate_limit_sleep(self, err: RateLimitError) -> Optional[datetime.timedelta]:
"""
Groq rate limits docs: https://console.groq.com/docs/rate-limits
Groq includes `retry-after` header when 429 RateLimitError is
thrown, so we use that instead of calculating our own backoff time.
"""
headers = err.response.headers
if "retry-after" not in headers:
return None
retry_after = int(err.response.headers["retry-after"])
return datetime.timedelta(seconds=retry_after)
__all__ = ["GroqClient"]

116
core/llm/openai_client.py Normal file
View File

@@ -0,0 +1,116 @@
import datetime
import re
from typing import Optional
import tiktoken
from httpx import Timeout
from openai import AsyncOpenAI, RateLimitError
from core.config import LLMProvider
from core.llm.base import BaseLLMClient
from core.llm.convo import Convo
from core.log import get_logger
log = get_logger(__name__)
tokenizer = tiktoken.get_encoding("cl100k_base")
class OpenAIClient(BaseLLMClient):
provider = LLMProvider.OPENAI
def _init_client(self):
self.client = AsyncOpenAI(
api_key=self.config.api_key,
base_url=self.config.base_url,
timeout=Timeout(
max(self.config.connect_timeout, self.config.read_timeout),
connect=self.config.connect_timeout,
read=self.config.read_timeout,
),
)
async def _make_request(
self,
convo: Convo,
temperature: Optional[float] = None,
json_mode: bool = False,
) -> tuple[str, int, int]:
completion_kwargs = {
"model": self.config.model,
"messages": convo.messages,
"temperature": self.config.temperature if temperature is None else temperature,
"stream": True,
"stream_options": {
"include_usage": True,
},
}
if json_mode:
completion_kwargs["response_format"] = {"type": "json_object"}
stream = await self.client.chat.completions.create(**completion_kwargs)
response = []
prompt_tokens = 0
completion_tokens = 0
async for chunk in stream:
if chunk.usage:
prompt_tokens += chunk.usage.prompt_tokens
completion_tokens += chunk.usage.completion_tokens
if not chunk.choices:
continue
content = chunk.choices[0].delta.content
if not content:
continue
response.append(content)
if self.stream_handler:
await self.stream_handler(content)
response_str = "".join(response)
# Tell the stream handler we're done
if self.stream_handler:
await self.stream_handler(None)
if prompt_tokens == 0 and completion_tokens == 0:
# See https://cookbook.openai.com/examples/how_to_count_tokens_with_tiktoken
prompt_tokens = sum(3 + len(tokenizer.encode(msg["content"])) for msg in convo.messages)
completion_tokens = len(tokenizer.encode(response_str))
log.warning(
"OpenAI response did not include token counts, estimating with tiktoken: "
f"{prompt_tokens} input tokens, {completion_tokens} output tokens"
)
return response_str, prompt_tokens, completion_tokens
def rate_limit_sleep(self, err: RateLimitError) -> Optional[datetime.timedelta]:
"""
OpenAI rate limits docs:
https://platform.openai.com/docs/guides/rate-limits/error-mitigation
Limit reset times are in "2h32m54s" format.
"""
headers = err.response.headers
if "x-ratelimit-remaining-tokens" not in headers:
return None
remaining_tokens = headers["x-ratelimit-remaining-tokens"]
time_regex = r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?"
if int(remaining_tokens) == 0:
match = re.search(time_regex, headers["x-ratelimit-reset-tokens"])
else:
match = re.search(time_regex, headers["x-ratelimit-reset-requests"])
if match:
# Each group in the "2h32m54s" format is optional, so treat missing parts as 0
seconds = int(match.group(1) or 0) * 3600 + int(match.group(2) or 0) * 60 + int(match.group(3) or 0)
else:
# Not sure how this would happen: we'd have to get a RateLimitError but
# nothing (or an invalid entry) in the `reset` field. Use a sane default.
seconds = 5
return datetime.timedelta(seconds=seconds)
__all__ = ["OpenAIClient"]

161
core/llm/parser.py Normal file
View File

@@ -0,0 +1,161 @@
import json
import re
from enum import Enum
from typing import Optional, Union
from pydantic import BaseModel, ValidationError
class MultiCodeBlockParser:
"""
Parse multiple Markdown code blocks from a string.
Expects zero or more blocks, and ignores any text
outside of the code blocks.
Example usage:
>>> parser = MultiCodeBlockParser()
>>> text = '''
... text outside block
...
... ```python
... first block
... ```
... some text between blocks
... ```js
... more
... code
... ```
... some text after blocks
... '''
>>> assert parser(text) == ["first block", "more\ncode"]
If no code blocks are found, an empty list is returned.
"""
def __init__(self):
# FIXME: ``` should be the only content on the line
self.pattern = re.compile(r"```([a-z0-9]+\n)?(.*?)```\s*", re.DOTALL)
def __call__(self, text: str) -> list[str]:
blocks = []
for block in self.pattern.findall(text):
blocks.append(block[1].strip())
return blocks
class CodeBlockParser(MultiCodeBlockParser):
"""
Parse a Markdown code block from a string.
Expects exactly one code block, and ignores
any text before or after it.
Usage:
>>> parser = CodeBlockParser()
>>> text = "text\n```py\ncodeblock\n'''\nmore text"
>>> assert parser(text) == "codeblock"
This is a special case of MultiCodeBlockParser,
checking that there's exactly one block.
"""
def __call__(self, text: str) -> str:
blocks = super().__call__(text)
# FIXME: if there are more than 1 code block, this means the output actually contains ```,
# so re-parse this with that in mind
if len(blocks) != 1:
raise ValueError(f"Expected a single code block, got {len(blocks)}")
return blocks[0]
class OptionalCodeBlockParser:
def __call__(self, text: str) -> str:
text = text.strip()
if text.startswith("```") and text.endswith("\n```"):
# Remove the first and last line. Note the first line may include syntax
# highlighting, so we can't just remove the first 3 characters.
text = "\n".join(text.splitlines()[1:-1]).strip()
return text
class JSONParser:
def __init__(self, spec: Optional[BaseModel] = None, strict: bool = True):
self.spec = spec
self.strict = strict or (spec is not None)
@property
def schema(self):
return self.spec.model_json_schema() if self.spec else None
@staticmethod
def errors_to_markdown(errors: list) -> str:
error_txt = []
for error in errors:
loc = ".".join(str(loc) for loc in error["loc"])
etype = error["type"]
msg = error["msg"]
error_txt.append(f"- `{loc}`: {etype} ({msg})")
return "\n".join(error_txt)
def __call__(self, text: str) -> Union[BaseModel, dict, None]:
text = text.strip()
if text.startswith("```"):
try:
text = CodeBlockParser()(text)
except ValueError:
if self.strict:
raise
else:
return None
try:
data = json.loads(text.strip())
except json.JSONDecodeError as e:
if self.strict:
raise ValueError(f"JSON is not valid: {e}") from e
else:
return None
if self.spec is None:
return data
try:
model = self.spec(**data)
except ValidationError as err:
errtxt = self.errors_to_markdown(err.errors())
raise ValueError(f"Invalid JSON format:\n{errtxt}") from err
except Exception as err:
raise ValueError(f"Error parsing JSON: {err}") from err
return model
class EnumParser:
def __init__(self, spec: Enum, ignore_case: bool = True):
self.spec = spec
self.ignore_case = ignore_case
def __call__(self, text: str) -> Enum:
text = text.strip()
if self.ignore_case:
text = text.lower()
try:
return self.spec(text)
except ValueError as e:
options = ", ".join([str(v) for v in self.spec])
raise ValueError(f"Invalid option '{text}'; valid options: {options}") from e
class StringParser:
def __call__(self, text: str) -> str:
# Strip any leading and trailing whitespace
text = text.strip()
# Check and remove quotes at the start and end if they match
if text.startswith(("'", '"')) and text.endswith(("'", '"')) and len(text) > 1:
# Remove the first and last character if they are both quotes
if text[0] == text[-1]:
text = text[1:-1]
return text
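A minimal sketch of parsing an LLM response wrapped in a code fence (passing a pydantic model as `spec` would additionally validate the fields and return a model instance):
parser = JSONParser()
parser('```json\n{"description": "Set up CI"}\n```')  # -> {"description": "Set up CI"}
# A malformed response raises ValueError, which feeds the LLM retry loop in BaseLLMClient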

48
core/llm/prompt.py Normal file
View File

@@ -0,0 +1,48 @@
from os.path import isdir
from typing import Any, Optional
from jinja2 import BaseLoader, Environment, FileSystemLoader, StrictUndefined, TemplateNotFound
class FormatTemplate:
def __call__(self, template: str, **kwargs: dict[str, Any]) -> str:
return template.format(**kwargs)
class BaseJinjaTemplate:
def __init__(self, loader: Optional[BaseLoader]):
self.env = Environment(
loader=loader,
autoescape=False,
lstrip_blocks=True,
trim_blocks=True,
keep_trailing_newline=True,
undefined=StrictUndefined,
)
class JinjaStringTemplate(BaseJinjaTemplate):
def __init__(self):
super().__init__(None)
def __call__(self, template: str, **kwargs: dict[str, Any]) -> str:
tpl = self.env.from_string(template)
return tpl.render(**kwargs)
class JinjaFileTemplate(BaseJinjaTemplate):
def __init__(self, template_dirs: list[str]):
for td in template_dirs:
if not isdir(td):
raise ValueError(f"Template directory does not exist: {td}")
super().__init__(FileSystemLoader(template_dirs))
def __call__(self, template: str, **kwargs: dict[str, Any]) -> str:
try:
tpl = self.env.get_template(template)
except TemplateNotFound as err:
raise ValueError(f"Template not found: {template}") from err
return tpl.render(**kwargs)
__all__ = ["FormatTemplate", "JinjaStringTemplate", "JinjaFileTemplate"]

28
core/llm/request_log.py Normal file
View File

@@ -0,0 +1,28 @@
from datetime import datetime
from enum import Enum
from pydantic import BaseModel, Field
from core.config import LLMProvider
class LLMRequestStatus(str, Enum):
SUCCESS = "success"
ERROR = "error"
class LLMRequestLog(BaseModel):
provider: LLMProvider
model: str
temperature: float
messages: list[dict[str, str]] = Field(default_factory=list)
response: str = ""
prompt_tokens: int = 0
completion_tokens: int = 0
started_at: datetime = Field(default_factory=datetime.now)
duration: float = 0.0
status: LLMRequestStatus = LLMRequestStatus.SUCCESS
error: str = ""
__all__ = ["LLMRequestLog", "LLMRequestStatus"]

50
core/log/__init__.py Normal file
View File

@@ -0,0 +1,50 @@
from logging import FileHandler, Formatter, Logger, StreamHandler, getLogger
from core.config import LogConfig
def setup(config: LogConfig, force: bool = False):
"""
Set up logging based on the current configuration.
This function is idempotent unless `force` is set to True,
in which case it will reconfigure the logging.
"""
root = getLogger()
logger = getLogger("pythagora")
# Only clear/remove existing log handlers if we're forcing a new setup
if not force and (root.handlers or logger.handlers):
return
while force and root.handlers:
root.removeHandler(root.handlers[0])
while force and logger.handlers:
logger.removeHandler(logger.handlers[0])
level = config.level
formatter = Formatter(config.format)
if config.output:
handler = FileHandler(config.output, encoding="utf-8")
else:
handler = StreamHandler()
handler.setFormatter(formatter)
handler.setLevel(level)
logger.setLevel(level)
logger.addHandler(handler)
def get_logger(name) -> Logger:
"""
Get a logger for the given (module) name
:return: Logger instance
"""
return getLogger(name)
__all__ = ["setup", "get_logger"]

0
core/proc/__init__.py Normal file
View File

21
core/proc/exec_log.py Normal file
View File

@@ -0,0 +1,21 @@
from datetime import datetime
from typing import Optional
from pydantic import BaseModel, Field
class ExecLog(BaseModel):
started_at: datetime = Field(default_factory=datetime.now)
duration: float = Field(description="The duration of the command/process run in seconds")
cmd: str = Field(description="The full command (as executed in the shell)")
cwd: str = Field(description="The working directory for the command (relative to project root)")
env: dict = Field(description="The environment variables for the command")
timeout: Optional[float] = Field(description="The command timeout in seconds (or None if no timeout)")
status_code: Optional[int] = Field(description="The command return code, or None if there was a timeout")
stdout: str = Field(description="The command standard output")
stderr: str = Field(description="The command standard error")
analysis: str = Field(description="The result analysis as performed by the LLM")
success: bool = Field(description="Whether the command was successful")
__all__ = ["ExecLog"]

View File

@@ -0,0 +1,278 @@
import asyncio
import signal
import sys
import time
from dataclasses import dataclass
from os import getenv
from os.path import abspath, join
from typing import Callable, Optional
from uuid import UUID, uuid4
import psutil
from core.log import get_logger
log = get_logger(__name__)
NONBLOCK_READ_TIMEOUT = 0.01
BUSY_WAIT_INTERVAL = 0.1
WATCHER_IDLE_INTERVAL = 1.0
MAX_COMMAND_TIMEOUT = 180
@dataclass
class LocalProcess:
id: UUID
cmd: str
cwd: str
env: dict[str, str]
stdout: str
stderr: str
_process: asyncio.subprocess.Process
def __hash__(self) -> int:
return hash(self.id)
@staticmethod
async def start(
cmd: str,
*,
cwd: str = ".",
env: dict[str, str],
bg: bool = False,
) -> "LocalProcess":
log.debug(f"Starting process: {cmd} (cwd={cwd}, env={env})")
_process = await asyncio.create_subprocess_shell(
cmd,
cwd=cwd,
env=env,
start_new_session=bg,
stdin=asyncio.subprocess.PIPE,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE,
)
if bg:
_process.stdin.close()
return LocalProcess(
id=uuid4(),
cmd=cmd,
cwd=cwd,
env=env,
stdout="",
stderr="",
_process=_process,
)
async def wait(self, timeout: Optional[float] = None) -> int:
try:
future = self._process.wait()
if timeout:
future = asyncio.wait_for(future, timeout)
retcode = await future
except asyncio.TimeoutError:
log.debug(f"Process {self.cmd} still running after {timeout}s, terminating")
await self.terminate()
# FIXME: this may still hang if we don't manage to kill the process.
retcode = await self._process.wait()
await self.read_output()
return retcode
@staticmethod
async def _nonblock_read(reader: asyncio.StreamReader, timeout: float) -> str:
"""
Reads data from a stream reader without blocking (for long).
This wraps the read in a (short) timeout to avoid blocking the event loop for too long.
:param reader: Async stream reader to read from.
:param timeout: Timeout for the read operation (should not be too long).
:return: Data read from the stream reader, or empty string.
"""
try:
data = await asyncio.wait_for(reader.read(), timeout)
return data.decode("utf-8", errors="ignore")
except asyncio.TimeoutError:
return ""
async def read_output(self, timeout: float = NONBLOCK_READ_TIMEOUT) -> tuple[str, str]:
new_stdout = await self._nonblock_read(self._process.stdout, timeout)
new_stderr = await self._nonblock_read(self._process.stderr, timeout)
self.stdout += new_stdout
self.stderr += new_stderr
return (new_stdout, new_stderr)
async def _terminate_process_tree(self, signal: int):
# Terminate the entire process tree rooted at the shell process: list all
# descendant processes (recursively), signal each of them, then signal the
# shell process itself and wait briefly for them to exit.
shell_process = psutil.Process(self._process.pid)
processes = shell_process.children(recursive=True)
processes.append(shell_process)
for proc in processes:
try:
proc.send_signal(signal)
except psutil.NoSuchProcess:
pass
psutil.wait_procs(processes, timeout=1)
async def terminate(self, kill: bool = True):
if kill and sys.platform != "win32":
await self._terminate_process_tree(signal.SIGKILL)
else:
# Windows doesn't have SIGKILL
await self._terminate_process_tree(signal.SIGTERM)
@property
def is_running(self) -> bool:
try:
return psutil.Process(self._process.pid).is_running()
except psutil.NoSuchProcess:
return False
@property
def pid(self) -> int:
return self._process.pid
class ProcessManager:
def __init__(
self,
*,
root_dir: str,
env: Optional[dict[str, str]] = None,
output_handler: Optional[Callable] = None,
exit_handler: Optional[Callable] = None,
):
if env is None:
env = {
"PATH": getenv("PATH"),
}
self.processes: dict[UUID, LocalProcess] = {}
self.default_env = env
self.root_dir = root_dir
self.watcher_should_run = True
self.watcher_task = asyncio.create_task(self.watcher())
self.output_handler = output_handler
self.exit_handler = exit_handler
async def stop_watcher(self):
"""
Stop the process watcher.
This should only be done when the ProcessManager is no longer needed.
"""
if not self.watcher_should_run:
raise ValueError("Process watcher is not running")
self.watcher_should_run = False
await self.watcher_task
async def watcher(self):
"""
Watch over the processes and manage their output and lifecycle.
This is a separate coroutine running independently of the caller
coroutine.
"""
# IDs of processes whose output has been fully read after they finished
complete_processes = set()
while self.watcher_should_run:
procs = [p for p in self.processes.values() if p.id not in complete_processes]
if len(procs) == 0:
await asyncio.sleep(WATCHER_IDLE_INTERVAL)
continue
for process in procs:
out, err = await process.read_output()
if self.output_handler and (out or err):
await self.output_handler(out, err)
if not process.is_running:
# We're not removing the complete process from the self.processes
# list to give time to the rest of the system to read its outputs
complete_processes.add(process.id)
if self.exit_handler:
await self.exit_handler(process)
# Sleep a bit to avoid busy-waiting
await asyncio.sleep(BUSY_WAIT_INTERVAL)
async def start_process(
self,
cmd: str,
*,
cwd: str = ".",
env: Optional[dict[str, str]] = None,
bg: bool = True,
) -> LocalProcess:
env = {**self.default_env, **(env or {})}
abs_cwd = abspath(join(self.root_dir, cwd))
process = await LocalProcess.start(cmd, cwd=abs_cwd, env=env, bg=bg)
if bg:
self.processes[process.id] = process
return process
async def run_command(
self,
cmd: str,
*,
cwd: str = ".",
env: Optional[dict[str, str]] = None,
timeout: float = MAX_COMMAND_TIMEOUT,
) -> tuple[Optional[int], str, str]:
"""
Run command and wait for it to finish.
Status code is an integer representing the process exit code, or
None if the process timed out and was terminated.
:param cmd: Command to run.
:param cwd: Working directory.
:param env: Environment variables.
:param timeout: Timeout in seconds.
:return: Tuple of (status code, stdout, stderr).
"""
timeout = min(timeout, MAX_COMMAND_TIMEOUT)
terminated = False
process = await self.start_process(cmd, cwd=cwd, env=env, bg=False)
t0 = time.time()
while process.is_running and (time.time() - t0) < timeout:
out, err = await process.read_output(BUSY_WAIT_INTERVAL)
if self.output_handler and (out or err):
await self.output_handler(out, err)
if process.is_running:
log.debug(f"Process {cmd} still running after {timeout}s, terminating")
await process.terminate()
terminated = True
else:
await process.wait()
out, err = await process.read_output()
if self.output_handler and (out or err):
await self.output_handler(out, err)
if terminated:
status_code = None
else:
status_code = process._process.returncode or 0
return (status_code, process.stdout, process.stderr)
def list_running_processes(self):
return [p for p in self.processes.values() if p.is_running]
async def terminate_process(self, process_id: UUID) -> tuple[str, str]:
if process_id not in self.processes:
raise ValueError(f"Process {process_id} not found")
process = self.processes[process_id]
await process.terminate(kill=False)
del self.processes[process_id]
return (process.stdout, process.stderr)
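# Minimal usage sketch, illustrative only; the command is an arbitrary example.
# The manager must be created inside a running event loop because its
# constructor schedules the watcher task.
if __name__ == "__main__":

    async def _demo():
        manager = ProcessManager(root_dir=".")
        status_code, stdout, _stderr = await manager.run_command("echo hello")
        print(status_code, stdout.strip())
        await manager.stop_watcher()

    asyncio.run(_demo())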

View File

@@ -0,0 +1,68 @@
You're designing the architecture and technical specifications for a new project.
If the project requirements call for a specific technology, use that. Otherwise, if working on a web app, prefer Node.js for the backend (with Express if a web server is needed, and MongoDB if a database is needed), and Bootstrap for the front-end. You MUST NOT use Docker, Kubernetes, microservices and single-page app frameworks like React, Next.js, Angular, Vue or Svelte unless the project details explicitly require it.
Here are the details for the new project:
-----------------------------
{% include "partials/project_details.prompt" %}
{% include "partials/features_list.prompt" %}
-----------------------------
Based on these details, think step by step to design the architecture for the project and choose technologies to use in building it.
1. First, design and describe the project architecture in general terms
2. Then, list any system dependencies that should be installed on the system prior to the start of development. For each system dependency, output a {{ os }} command to check whether it's installed.
3. Finally, list any other 3rd party packages or libraries that will be used (these will be installed later using a package manager in the project repository/environment).
4. {% if templates %}Optionally, choose a project starter template.{% else %}(for this project there are no available starter/boilerplate templates, so there's no template to choose){% endif %}
{% if templates %}
You have an option to use a project template that implements standard boilerplate/scaffolding so you can start faster and be more productive. To be considered, a template must be compatible with the architecture and technologies you've chosen (it doesn't need to implement everything that will be used in the project, just a useful subset). If multiple templates can be considered, pick the one that's the best match.
If no project templates are a good match, don't pick any! It's better to start from scratch than to use a template that is not a good fit for the project and then spend time reworking it to fit the requirements.
Here are the available project templates:
{% for name, tpl in templates.items() %}
### {{ name }}
{{ tpl.description }}
Contains:
{{ tpl.summary }}
{% endfor %}
{% endif %}
*IMPORTANT*: You must follow these rules while creating your project:
* You must only list *system* dependencies, ie. the ones that need to be installed (typically as admin) to set up the programming language, database, etc. Any packages that will need to be installed via language/platform-specific package managers are *not* system dependencies.
* If there are several popular options (such as Nginx or Apache for web server), pick one that would be more suitable for the app in question.
* DO NOT include text editors, IDEs, shells, OpenSSL, CLI tools such as git, AWS, or Stripe clients, or other utilities in your list; include only direct dependencies required to build and run the project.
* If a dependency (such as a database) has a cloud alternative or can be installed on another computer (ie. isn't required on this computer), you must mark it as `required_locally: false`
Output only your response in JSON format like in this example, without other commentary:
```json
{
"architecture": "Detailed description of the architecture of the application",
"system_dependencies": [
{
"name": "Node.js",
"description": "JavaScript runtime for building apps. This is required to be able to run the app you're building.",
"test": "node --version",
"required_locally": true
},
{
"name": "MongoDB",
"description": "NoSQL database. If you don't want to install MongoDB locally, you can use a cloud version such as MongoDB Atlas.",
"test": "mongosh --version",
"required_locally": false
},
...
],
"package_dependencies": [
{
"name": "express",
"description": "Express web server for Node"
},
...
],
"template": "name of the project template to use" // or null if you decide not to use a project template
}
```

View File

@@ -0,0 +1,2 @@
{# This is the same template as for Developer's breakdown because Code Monkey is reusing it in a conversation #}
{% extends "developer/breakdown.prompt" %}

View File

@@ -0,0 +1,26 @@
Your task is to explain the functionality implemented by a particular source code file.
Given a file path and file contents, your output should contain:
* a detailed explanation of what the file is about;
* a list of all other files referenced (imported) from this file. Note that some languages, frameworks or libraries assume the file extension and don't use it explicitly. For example, "import foo" in Python references "foo.py" without specifying the extension. In your response, use the complete file name including the implied extension (for example "foo.py", not just "foo").
Please analyze file `{{ path }}`, which contains the following content:
```
{{ content }}
```
Output the result in a JSON format with the following structure, as in this example:
Example:
{
"summary": "Describe in detail the functionality being defind o implemented in this file. Be as detailed as possible",
"references": [
"some/file.py",
"some/other/file.js"
],
}
**IMPORTANT** In references, only include references to files that are local to the project. Do not include standard libraries or well-known external dependencies.
Your response must be a valid JSON document, following the example format. Do not add any extra explanation or commentary outside the JSON document.

View File

@@ -0,0 +1,56 @@
{% if rework_feedback is defined %}
You previously made changes to file `{{ file_name }}`, according to the instructions described in the previous message.
The reviewer accepted some of your changes, and the file now looks like this:
```
{{ file_content }}
```
{% elif file_content %}
I need to modify file `{{ file_name }}` that currently looks like this:
```
{{ file_content }}
```
{% else %}
I need to create a new file `{{ file_name }}`:
{% endif %}
**IMPORTANT**
{% if rework_feedback is defined %}
But not all changes were accepted, and the reviewer provided feedback on the changes that you must rework:
{{ rework_feedback }}
Please update the file accordingly and output the full new version of the file.
{% else %}
I want you to implement changes described in previous message, that starts with `{{ " ".join(instructions.split()[:5]) }}` and ends with `{{ " ".join(instructions.split()[-5:]) }}`.
{% endif %}
Make sure you don't make any mistakes, especially ones that could affect the rest of the project. Your changes will {% if rework_feedback is defined %}again {% endif %}be reviewed by a very thorough reviewer. Because of that, it is extremely important that you STRICTLY follow ALL of the following rules while implementing changes:
**IMPORTANT** Output format
You must output the COMPLETE NEW VERSION of this file in the following format:
-----------------------format----------------------------
```
the full contents of the updated file, without skipping over any content
```
------------------------end_of_format---------------------------
**IMPORTANT** Comprehensive Codebase Insight
It's crucial to grasp the full scope of the codebase related to your tasks to avert mistakes. Check the initial conversation message for a list of files. Pay a lot of attention to files that are directly included in the file you are currently modifying or that are importing your file.
Consider these examples to guide your approach and thought process:
-----------------------start_of_examples----------------------------
- UI components or templates: Instead of placing scripts directly on specific pages, integrating them in the <head> section or as reusable partials enhances application-wide consistency and reusability.
- Database operations: Be careful not to execute an action, like password hashing, both in a routing function and a model's pre('save') hook, which could lead to redundancy and errors.
- Adding backend logic: Prior to creating new functions, verify if an equivalent function exists in the codebase that you could import and use, preventing unnecessary code duplication and keeping the project efficient.
-----------------------end_of_examples----------------------------
**IMPORTANT** Coding principles
To write high-quality code, first organize it logically with clear, meaningful names for variables, functions, and classes. Aim for simplicity and adhere to the DRY (Don't Repeat Yourself) principle to avoid code duplication. Ensure your codebase is structured and modular for easy navigation and updates.
**IMPORTANT** If the instructions have comments like `// ..add code here...` or `# placeholder for code`, instead of copying the comment, interpret the instructions and output the relevant code.
**IMPORTANT** Your reply MUST NOT omit any code in the new implementation or substitute anything with comments like `// .. rest of the code goes here ..` or `# insert existing code here`, because I will overwrite the existing file with the content you provide. Output ONLY the content for this file, without additional explanation, suggestions or notes. Your output MUST start with ``` and MUST end with ``` and include only the complete file contents.
**IMPORTANT** For hardcoded configuration values that the user needs to change, mark the line that needs user configuration with `INPUT_REQUIRED {config_description}` comment, where `config_description` is a description of the value that needs to be set by the user. Use appropriate syntax for comments in the file you're saving (for example `// INPUT_REQUIRED {config_description}` in JavaScript). NEVER ask the user to write code or provide implementation, even if the instructions suggest it! If the file type doesn't support comments (eg JSON), don't add any.
**IMPORTANT**: Logging
Whenever you write code, make sure to log code execution so that when a developer looks at the CLI output, they can understand what is happening on the server. If the description above mentions the exact code that needs to be added but doesn't contain enough logs, you need to add the logging statements inside that code yourself.
**IMPORTANT**: Error handling
Whenever you write code, make sure to add error handling for all edge cases you can think of because this app will be used in production so there shouldn't be any crashes. Whenever you log the error, you **MUST** log the entire error message and trace and not only the error message. If the description above mentions the exact code that needs to be added but doesn't contain enough error handlers, you need to add the error handlers inside that code yourself.

View File

@@ -0,0 +1,17 @@
Your changes have been reviewed.
{% if content != original_content %}
The reviewer approved and applied some of your changes, but requested you rework the others.
Here's the file with the approved changes already applied:
```
{{ content }}
```
Here's the reviewer's feedback:
{% else %}
The reviewer requested that you rework your changes, here's the feedback:
{% endif %}
{{ rework_feedback }}
Based on this feedback and the original instructions, think carefully, make the correct changes, and output the entire file again. Remember, output ONLY the content for this file, without additional explanation, suggestions or notes. Your output MUST start with ``` and MUST end with ``` and include only the complete file contents.

View File

@@ -0,0 +1,3 @@
You are a full stack software developer that works in a software development agency.
You write modular, clean, maintainable, production-ready code.
Your job is to implement tasks that your tech lead assigns you.

View File

@@ -0,0 +1,2 @@
{# This is the same template as for Developer's breakdown because Code Reviewer is reusing it in a conversation #}
{% extends "developer/breakdown.prompt" %}

View File

@@ -0,0 +1,29 @@
A developer on your team has been working on the task described in previous message. Based on those instructions, the developer has made changes to file `{{ file_name }}`.
Here is the original content of this file:
```
{{ old_content }}
```
Here is the diff of the changes:
{% for hunk in hunks %}## Hunk {{ loop.index }}
```diff
{{ hunk }}
```
{% endfor %}
As you can see, there {% if hunks|length == 1 %}is only one hunk in this diff, and it{% else %}are {{hunks|length}} hunks in this diff, and each{% endif %} starts with the `@@` header line.
When reviewing the code changes, apply these principles to decide on each hunk:
- Apply: Approve and integrate the hunk into our core codebase if it accurately delivers the intended functionality or enhancement, aligning with our project objectives. This action confirms the change is beneficial and meets our quality standards.
- Ignore: Use this option sparingly, only when you're certain the entire hunk is incorrect or will introduce errors (logical, syntax, etc.) that could negatively impact the project. Ignoring means the hunk will be completely removed. This should be reserved for cases where the inclusion of the code is definitively more harmful than its absence. Emphasize careful consideration before choosing 'Ignore.' It's crucial for situations where the hunk's removal is the only option to prevent significant issues. Otherwise, 'Rework' might be the better choice to ensure the code's integrity and functionality.
- Rework: Suggest this option if the concept behind the change is valid and necessary but is implemented in a way that introduces problems. This indicates a need for a revision of the hunk to refine its integration without fully discarding the underlying idea.
When deciding what should be done with the hunk you are currently reviewing, pick an option that most reviewers of your skill would choose. Your decisions have to be consistent.
Keep in mind you're just reviewing current file. You don't need to consider if other files are created, dependent packages installed, etc. Focus only on reviewing the changes in this file based on the instructions in the previous message.
Note that the developer may add, modify or delete logging (including `gpt_pilot_debugging_log`) or error handling that's not explicitly asked for, but is a part of good development practice. Unless these logging and error handling additions break something, your decision to apply, ignore or rework the hunk should not be based on this. Base your decision only on functional changes - comments or logging are less important. Importantly, don't ask for a rework just because of logging or error handling changes. Also, take into account this is a junior developer and while the approach they take may not be the best practice, if it's not *wrong*, let it pass. Ask for rework only if the change is clearly bad and would break something.
The developer that wrote this is sometimes sloppy and could have deleted parts of the code that contain important functionality and should not be removed. Pay special attention to that in your review.

View File

@@ -0,0 +1,2 @@
You are a world class full stack software developer. You write modular, clean, maintainable, production-ready code.
Your job is to review changes implemented by your junior team members.

View File

@@ -0,0 +1,34 @@
You are working on an app called "{{ state.branch.project.name }}" and you need to write code for the entire {% if state.epics|length > 1 %}feature{% else %}app{% endif %} based on the tasks that the tech lead gives you. So that you understand better what you're working on, you're given other specs for "{{ state.branch.project.name }}" as well.
{% include "partials/project_details.prompt" %}
{% include "partials/features_list.prompt" %}
{% include "partials/files_list.prompt" %}
We've broken the development of this {% if state.epics|length > 1 %}feature{% else %}app{% endif %} down to these tasks:
```
{% for task in state.tasks %}
{{ loop.index }}. {{ task.description }}{% if task.get("completed") %} (completed){% endif %}
{% endfor %}
```
You are currently working on task #{{ current_task_index + 1 }} with the following description:
```
{{ task.description }}
```
{% if current_task_index != 0 %}All previous tasks are finished and you don't have to work on them.{% endif %}
Now, tell me all the code that needs to be written to implement ONLY this task and have it fully working, and all the commands that need to be run to implement this task.
**IMPORTANT**
{%- if state.epics|length == 1 %}
Remember, I created an empty folder where I will start writing files that you tell me and that are needed for this app.
{% endif %}
{% include "partials/relative_paths.prompt" %}
DO NOT specify commands to create any folders or files; they will be created automatically - just specify the relative path to each file that needs to be written.
{% include "partials/file_naming.prompt" %}
{% include "partials/execution_order.prompt" %}
{% include "partials/human_intervention_explanation.prompt" %}
{% include "partials/file_size_limit.prompt" %}
Never use port 5000 to run the app; it's reserved.

View File

@@ -0,0 +1,16 @@
We're starting work on a new task for a project we're working on.
{% include "partials/project_details.prompt" %}
{% include "partials/files_list.prompt" %}
{% include "partials/relative_paths.prompt" %}
We've broken the development of the project down to these tasks:
```
{% for task in state.tasks %}
{{ loop.index }}. {{ task.description }}{% if task.get("completed") %} (completed){% endif %}
{% endfor %}
```
The next task we need to work on is: {{ current_task.description }}
Before we dive into solving this task, we need to determine which files from the above list are relevant to this task. Output the relevant files in a JSON list.

View File

@@ -0,0 +1 @@
{% extends "troubleshooter/iteration.prompt" %}

View File

@@ -0,0 +1,43 @@
Ok, now, take your response and convert it to a list of actionable steps that will be executed by a machine.
Analyze the entire message, think step by step and make sure that you don't omit any information
when converting this message to steps.
Each step can be either:
* `command` - command to run (must be able to run on a {{ os }} machine, assume current working directory is project root folder)
* `save_file` - create or update ONE file
* `human_intervention` - if you need the human to do something, use this type of step and explain in details what you want the human to do. NEVER use `human_intervention` for testing, as testing will be done separately by a dedicated QA after all the steps are done. Also you MUST NOT use `human_intervention` to ask the human to write or review code.
**IMPORTANT**: If multiple changes are required for the same file, you must provide a single `save_file` step for each file.
{% include "partials/file_naming.prompt" %}
{% include "partials/relative_paths.prompt" %}
{% include "partials/execution_order.prompt" %}
{% include "partials/human_intervention_explanation.prompt" %}
**IMPORTANT**: Remember, NEVER output human intervention steps to do manual tests or coding tasks, even if the previous message asks for it! The testing will be done *after* these steps and you MUST NOT include testing in these steps.
Examples:
------------------------example_1---------------------------
```
{
"tasks": [
{
"type": "save_file",
"save_file": {
"path": "server.js"
},
},
{
"type": "command",
"command": {
"command": "mv index.js public/index.js"",
"timeout": 5,
"success_message": "",
"command_id": "move_index_file"
}
}
]
}
```
------------------------end_of_example_1---------------------------

View File

@@ -0,0 +1,5 @@
You are a world class full stack software developer working in a team.
You write modular, well-organized code split across files that are not too big, so that the codebase is maintainable. You include proper error handling and logging for your clean, readable, production-level quality code.
Your job is to implement tasks assigned by your tech lead, following task implementation instructions.

View File

@@ -0,0 +1,58 @@
A coding task has been implemented for the new project we're working on.
{% include "partials/project_details.prompt" %}
{% include "partials/files_list.prompt" %}
We've broken the development of the project down to these tasks:
```
{% for task in state.tasks %}
{{ loop.index }}. {{ task.description }}{% if task.get("completed") %} (completed){% endif %}
{% endfor %}
```
The current task is: {{ current_task.description }}
Here are the detailed instructions for the current task:
```
{{ current_task.instructions }}
```
{# FIXME: the above stands in place of a previous (task breakdown) convo, and is duplicated in define_user_review_goal, review_task and debug prompts #}
{% if task_steps and step_index is not none -%}
The current task has been split into multiple steps, and each step is one of the following:
* `command` - command to run
* `save_file` - create or update a file
* `human_intervention` - if the human needs to do something
{# FIXME: this is copypasted from ran_command #}
Here is the list of all steps in this task (steps that were already completed are marked as COMPLETED, future steps that will be executed once debugging is done are marked as FUTURE, and the current step is marked as CURRENT STEP):
{% for step in task_steps %}
* {% if loop.index0 < step_index %}(COMPLETED){% elif loop.index0 > step_index %}(FUTURE){% else %}(**CURRENT STEP**){% endif %} {{ step.type }}: `{% if step.type == 'command' %}{{ step.command.command }}{% elif step.type == 'save_file' %}{{ step.save_file.path }}{% endif %}`
{% endfor %}
When trying to see if the command ran successfully, take into consideration steps that were previously executed and steps that will be executed after the current step. It can happen that a command seems to have failed, but it will be fixed by subsequent steps. In that case you should consider the command to be successfully executed.
{%- endif %}
I ran the command `{{ cmd }}`, and it {% if status_code is none %}timed out{% else %}exited with status code {{ status_code }}{% endif %}.
{% if stdout %}
Command stdout:
```
{{ stdout }}
```
{% endif %}
{% if stderr %}
Command stderr:
```
{{ stderr }}
```
{% endif %}
{# end copypasted #}
{{ analysis }}
Based on the above, I want you to propose a step by step plan to solve the problem and continue with the current task. I will take your plan and replace the current steps with it, so make sure it contains everything needed to complete this task AND THIS TASK ONLY.
{% include "partials/file_naming.prompt" %}
{% include "partials/execution_order.prompt" %}
{% include "partials/human_intervention_explanation.prompt" %}
{% include "partials/file_size_limit.prompt" %}

View File

@@ -0,0 +1,56 @@
A coding task has been implemented for the new project we're working on.
{% include "partials/project_details.prompt" %}
{% include "partials/files_list.prompt" %}
We've broken the development of the project down to these tasks:
```
{% for task in state.tasks %}
{{ loop.index }}. {{ task.description }}{% if task.get("completed") %} (completed){% endif %}
{% endfor %}
```
The current task is: {{ current_task.description }}
Here are the detailed instructions for the current task:
```
{{ current_task.instructions }}
```
{# FIXME: the above stands in place of a previous (task breakdown) convo, and is duplicated in define_user_review_goal and debug prompts #}
{% if task_steps and step_index is not none -%}
The current task has been split into multiple steps, and each step is one of the following:
* `command` - command to run
* `save_file` - create or update a file
* `human_intervention` - if the human needs to do something
Here is the list of all steps in this task (steps that were already completed are marked as COMPLETED, future steps that will be executed once debugging is done are marked as FUTURE, and the current step is marked as CURRENT STEP):
{% for step in task_steps %}
* {% if loop.index0 < step_index %}(COMPLETED){% elif loop.index0 > step_index %}(FUTURE){% else %}(**CURRENT STEP**){% endif %} {{ step.type }}: `{% if step.type == 'command' %}{{ step.command.command }}{% elif step.type == 'save_file' %}{{ step.save_file.path }}{% endif %}`
{% endfor %}
When trying to see if the command ran successfully, take into consideration steps that were previously executed and steps that will be executed after the current step. It can happen that a command seems to have failed, but it will be fixed by subsequent steps. In that case you should consider the command to be successfully executed.
{%- endif %}
I ran the command `{{ cmd }}`, and it {% if status_code is none %}timed out{% else %}exited with status code {{ status_code }}{% endif %}.
{% if stdout %}
Command stdout:
```
{{ stdout }}
```
{% endif %}
{% if stderr %}
Command stderr:
```
{{ stderr }}
```
{% endif %}
Think about the output and result of this command in the context of the current task and current step. Provide a detailed analysis of the output and determine whether the command was successfully executed.
Output your response in the following JSON format:
```
{
"analysis": "Detailed analysis of the command results. In this error the command was successfully executed because...",
"success": true
}
```

View File

@@ -0,0 +1 @@
All the steps will be executed in the order in which you give them, so it is very important that you think about all the steps before you start listing them. For example, you should never code something before you install its dependencies, and you should never try to access a file before it exists in the project.

View File

@@ -0,0 +1,16 @@
{% if state.epics|length > 2 %}
Here is the list of features that were previously implemented on top of the initial high-level description of "{{ state.branch.project.name }}":
```
{% for feature in state.epics[1:] %}
- {{ loop.index0 }}. {{ feature.summary }}
{% endfor %}
```
{% endif %}
{% if state.epics|length > 1 %}
Here is the feature that you are implementing right now:
```
{{ state.unfinished_epics[0].description }}
```
{% endif %}

View File

@@ -0,0 +1 @@
**IMPORTANT**: When creating and naming new files, ensure the file naming (camelCase, kebab-case, underscore_case, etc) is consistent with the best practices and coding style of the language.

View File

@@ -0,0 +1,2 @@
**IMPORTANT**
When you think about which file the new code should go into, always try to keep files as small as possible, splitting code across multiple smaller files rather than putting it all in one big file.

View File

@@ -0,0 +1,26 @@
{% if state.relevant_files %}
These files are currently implemented in the project:
{% for file in state.files %}
* `{{ file.path }}{% if file.meta.get("description") %}: {{file.meta.description}}{% endif %}`
{% endfor %}
Here are the complete contents of files relevant to this task:
---START_OF_FILES---
{% for file in state.relevant_file_objects %}
File **`{{ file.path }}`** ({{file.content.content.splitlines()|length}} lines of code):
```
{{ file.content.content }}```
{% endfor %}
---END_OF_FILES---
{% elif state.files %}
These files are currently implemented in the project:
---START_OF_FILES---
{% for file in state.files %}
**`{{ file.path }}`** ({{file.content.content.splitlines()|length}} lines of code):
```
{{ file.content.content }}```
{% endfor %}
---END_OF_FILES---
{% endif %}

View File

@@ -0,0 +1,38 @@
**IMPORTANT**
You must not tell me to run a command in the database or anything OS related - only if some dependencies need to be installed. If there is a need to run an OS related command, specifically tell me that this should be labeled as "Human Intervention" and explain what the human needs to do.
Avoid using "Human Intervention" if possible. You should NOT use "Human Intervention" for anything else than steps that you can't execute. Also, you must not use "Human Intervention" to ask user to test that the application works, because this will be done separately after all the steps are finished - no need to ask the user now.
Here are a few examples when and how to use "Human Intervention":
------------------------start_of_example_1---------------------------
Here is an example of good response for the situation where it seems like 3rd party API, in this case Facebook, is not working:
* "Human Intervention"
"1. Check latest Facebook API documentation for updates on endpoints, parameters, or authentication.
2. Verify Facebook API key/authentication and request format to ensure they are current and correctly implemented.
3. Use REST client tools like Postman or cURL to directly test the Facebook API endpoints.
4. Check the Facebook API's status page for any reported downtime or service issues.
5. Try calling the Facebook API from a different environment to isolate the issue."
------------------------end_of_example_1---------------------------
------------------------start_of_example_2---------------------------
Here is an example of good response for the situation where the user needs to enable some settings in their Gmail account:
* "Human Intervention"
"To enable sending emails from your Node.js app via your Gmail, account, you need to do the following:
1. Log in to your Gmail account.
2. Go to 'Manage your Google Account' > Security.
3. Scroll down to 'Less secure app access' and turn it on.
4. Under 'Signing in to Google', select 'App Passwords'. (You may need to sign in again)
5. At the bottom, click 'Select app' and choose the app you're using.
6. Click 'Generate'.
Then, use your Gmail address and the password generated in step #6 and put it into the .env file."
------------------------end_of_example_2---------------------------
------------------------start_of_example_3---------------------------
Here is an example when there are issues with writing to the MongoDB connection:
* "Human Intervention"
"1. Verify the MongoDB credentials provided have write permissions, not just read-only access.
2. Confirm correct database and collection names are used when connecting to database.
3. Update credentials if necessary to include insert document permissions."
------------------------end_of_example_3---------------------------

View File

@@ -0,0 +1,22 @@
Here is a high level description of "{{ state.branch.project.name }}":
```
{{ state.specification.description }}
```
{% if state.specification.architecture %}
Here is a short description of the project architecture:
{{ state.specification.architecture }}
{% endif %}
{% if state.specification.system_dependencies %}
Here are the technologies that should be used for this project:
{% for tech in state.specification.system_dependencies %}
* {{ tech.name }} - {{ tech.description }}
{% endfor %}
{% endif %}
{% if state.specification.package_dependencies %}
{% for tech in state.specification.package_dependencies %}
* {{ tech.name }} - {{ tech.description }}
{% endfor %}
{% endif %}

View File

@@ -0,0 +1,67 @@
Before we go into the coding part, I want you to split the development process of creating this {{ task_type }} into smaller tasks so that it is easier to develop, debug and make the {{ task_type }} work.
Each task needs to be related only to the development of this {{ task_type }} and nothing else - once the {{ task_type }} is fully working, that is it. There shouldn't be a task for researching, deployment, writing documentation, testing or anything that is not writing the actual code.
**IMPORTANT**
As an experienced tech lead you always follow the rules on how to create tasks. Dividing a project into tasks is an extremely important job and you have to do it very carefully.
Now, based on the project details provided{% if task_type == 'feature' %} and new feature description{% endif %}, think task by task and create the entire development plan{% if task_type == 'feature' %} for new feature{% elif task_type == 'app' %}. {% if state.files %}Continue from the existing code listed above{% else %}Start from the project setup{% endif %} and specify each task until the moment when the entire app should be fully working{% if state.files %}. You should not reimplement what's already done - just continue from the implementation already there{% endif %}{% endif %} while strictly following these rules:
Rule #1
There should never be a task that is only testing or ensuring something works; every task must have coding involved. Have this in mind for every task, but it is extremely important for the last task of the project. Testing whether the {{ task_type }} works will be done as part of each task.
Rule #2
This rule applies to the complexity of tasks.
You have to make sure the project is not split into tasks that are too small or simple for no reason, but also not into tasks that are too big or complex, which would make them hard to develop, debug and review.
Keep in mind that the project already has the workspace folder created and only the system dependencies installed. You don't have to create tasks for that.
Here are examples of poorly created tasks:
**too simple tasks**
- Set up a Node.js project and install all necessary dependencies.
- Establish a MongoDB database connection using Mongoose with the IP '127.0.0.1'.
**too complex tasks**
- Set up Node.js project with /home, /profile, /register and /login routes that will have user authentication, connection to MongoDB with user schemas, mailing of new users and frontend with nice design.
You must avoid creating tasks that are too simple or too complex. Aim to create tasks of medium complexity. Here are examples of tasks that are good:
**good tasks**
- Set up a Node.js project, install all necessary dependencies and set up an express server with a simple route to `/ping` that returns the status 200.
- Establish a MongoDB database connection and implement the message schema using Mongoose for persistent storage of messages.
Rule #3
This rule applies to the number of tasks you will create.
Every {{ task_type }} should have a different number of tasks depending on its complexity. Think task by task and create the minimum number of tasks that are relevant for this specific {{ task_type }}.
{% if task_type == 'feature' %} If the feature is small, it is ok to have only 1 task.{% endif %}
Here are some examples of apps with different complexity that can give you guidance on how many tasks you should create:
Example #1:
app description: "I want to create an app that will just say 'Hello World' when I open it on my localhost:3000."
number of tasks: 1
Example #2:
app description: "Create a node.js app that enables users to register and log into the app. On frontend it should have /home (shows user data), /register and /login. It should use sessions to keep user logged in."
number of tasks: 2-4
Example #3:
app description: "A cool online shoe store, with a sleek look. In terms of data models, there are shoes, categories and user profiles. For web pages: product listing, details, shopping cart. It must look cool and jazzy."
number of tasks: 5-15
Rule #4
This rule applies to writing task 'description'.
Every task must have a clear and very detailed 'description' (a minimum of 4 sentences, but it can be more). It must be clear enough that even developers who just moved to this project could execute the task without additional questions. It is not enough to just write something like "Create a route for /home". You have to describe what needs to be done in that route, what data needs to be returned, what the status code should be, etc. Give as many details as possible and make sure no information is missing that could be needed for this task.
Here is an example of good and bad task description:
**bad task**
{
"description": "Create a route for /dashboard"
}
**good task**
{
"description": "In 'route.js' add a route for /dashboard that returns the status 200. Route should be accessible only for logged in users. In 'middlewares.js' there should be a check if user is logged in using session. If user is not logged in, it should redirect to /login. If user is logged in, it should return the user data. User data should be fetched from database in 'users' collection using the user id from session."
}
Rule #5
When creating and naming new files, ensure the file naming (camelCase, kebab-case, underscore_case, etc) is consistent with the best practices and coding style of the language.
Pay attention to file paths: if the command or argument is a file or folder from the project, use paths relative to the project root (for example, use `somedir/somefile` instead of `/somedir/somefile`).

View File

@@ -0,0 +1 @@
**IMPORTANT**: Pay attention to file paths: if the command or argument is a file or folder from the project, use paths relative to the project root (for example, use `somedir/somefile` instead of `/path/to/project/somedir/somefile`).

View File

@@ -0,0 +1,57 @@
You are working on an app called "{{ state.branch.project.name }}" and you need to write code for the entire {% if state.epics|length > 1 %}feature{% else %}app{% endif %} based on the tasks that the tech lead gives you. So that you understand better what you're working on, you're given other specs for "{{ state.branch.project.name }}" as well.
{% include "partials/project_details.prompt" %}
{% include "partials/features_list.prompt" %}
We've broken the development of this {% if state.epics|length > 1 %}feature{% else %}app{% endif %} down to these tasks:
```
{% for task in state.tasks %}
{{ loop.index }}. {{ task.description }}{% if task.get("completed") %} (completed){% endif %}
{% endfor %}
```
{% if state.current_task %}
You are currently working on, and have to focus only on, this task:
```
{{ state.current_task.description }}
```
{% endif %}
A part of the app is already finished.
{% include "partials/files_list.prompt" %}
You are trying to solve an issue that your colleague is reporting.
{% if previous_solutions|length > 0 %}
You tried {{ previous_solutions|length }} times to solve it, but were unsuccessful. In the last few attempts, your colleague gave you these reports:
{% for solution in previous_solutions[-3:] %}
----------------------------start_of_report_{{ loop.index }}----------------------------
{{ solution.user_feedback }}
----------------------------end_of_report_{{ loop.index }}----------------------------
Then, you gave the following proposal (proposal_{{ loop.index }}) of what needs to be done to fix the issue:
----------------------------start_of_proposal_{{ loop.index }}----------------------------
{{ solution.description }}
----------------------------end_of_proposal_{{ loop.index }}----------------------------
{% if not loop.last %}
Then, upon implementing these changes, your colleague came back with the following report:
{% endif %}
{% endfor %}
{% endif %}
{% if user_input != '' %}
Your colleague who is testing the app "{{ name }}" sent you this report now:
```
{{ user_input }}
```
You tried to solve this problem before but your colleague is telling you that you got into a loop where all your tries end up the same way - with an error.
{%- endif -%}
It seems that the solutions you're proposing aren't working.
Now, think step by step about 5 alternative solutions to get this code to work that are most probable to solve this issue.
Every proposed solution needs to be concrete and not vague (eg. it cannot be "Review and change the app's functionality") and based on the code changes. A solution can be complex if it's related to the same part of the code (eg. "Try changing the input variables X, Y and Z to a method N").
Order them by probability of fixing the problem, highest first. A developer will then go through the list item by item, try to implement each one, and check whether it solved the issue, until the end of the list.

View File

@@ -0,0 +1 @@
{% extends "troubleshooter/iteration.prompt" %}

View File

@@ -0,0 +1,76 @@
Your task is to talk to a new client and develop a detailed specification for a new application the client wants to build. This specification will serve as an input to an AI software developer and thus must be very detailed, contain all the project functionality and precisely define behaviour, 3rd-party integrations (if any), etc.
The AI developer prefers working on web apps using the Node/Express/MongoDB/Mongoose/EJS stack, with vanilla JS and Bootstrap on the frontend, unless the client has different requirements.
Try to avoid the use of Docker, Kubernetes, microservices and single-page app frameworks like React, Next.js, Angular, Vue or Svelte unless the brief explicitly requires it.
In your work, follow these important rules:
* In your communication with the client, be straightforward, concise, and focused on the task.
* Ask questions ONE BY ONE. This is very important, as the client is easily confused. If you were to ask multiple questions at once, the client would probably miss some of them, so remember to always ask the questions one by one.
* Ask specific questions, taking into account what you already know about the project. For example, don't ask "what features do you need?" or "describe your idea"; instead ask "what is the most important feature?"
* Pay special attention to any documentation or information that the project might require (such as accessing a custom API, etc). Be sure to ask the user to provide information and examples that the developers will need to build the proof-of-concept. You will need to output all of this in the final specification.
* This is a prototype project, so it is important to have a small and well-defined scope. If the scope seems to grow too large (beyond a week or two of work for one developer), ask the user if they can simplify the project.
* Do not address non-functional requirements (performance, deployment, security, budget, timelines, etc...). We are only concerned with functional and technical specification here.
* Do not address deployment or hosting, including DevOps tasks to set up a CI/CD pipeline
* Don't address or envision any future development (post proof-of-concept); the scope of your task is only to spec the PoC/prototype.
* If the user provided specific information on how to access 3rd party API or how exactly to implement something, you MUST include that in the specification. Remember, the AI developer will only have access to the specification you write.
Ensure that you have all the information about:
* overall description and goals for the app
* all the features of the application
* functional specification
* how the user will use the app
* enumerate all the parts of the application (eg. pages of the application, background processing if any, etc); for each part, explain *in detail* how it should work from the perspective of the user
* identify any constraints, business rules, user flows or other important info that affect how the application works or how it is used
* technical specification
* what kind of an application this is and what platform/technologies will be used
* the architecture of the application (what happens on backend, frontend, mobile, background tasks, integration with 3rd party services, etc)
* detailed description of each component of the application architecture
* integration specification
* any 3rd party apps, services, APIs that will be used (eg. for auth, payments, etc..)
* if a custom API is used, precise definitions, with examples, how to use the custom API or do the custom integration
If you identify any missing information or need clarification on any vague or ambiguous parts of the brief, ask the client about it.
Important note: don't ask trivial questions for obvious or unimportant parts of the app, for example:
* Bad questions example 1:
* Client brief: I want to build a hello world web app
* Bad questions:
* What title do you want for the web page that displays "Hello World"?
* What color and font size would you like for the "Hello World" text to be displayed in?
* Should the "Hello World" message be static text served directly from the server, or would you like it implemented via JavaScript on the client side?
* Explanation: There's no need to micromanage the developer(s) and designer(s), the client would've specified these details if they were important.
If you ask such trivial questions, the client will think you're stupid and will leave. DON'T DO THAT.
Think carefully about what a developer must know to be able to build the app. The specification must address all of this information, otherwise the AI software developer will not be able to build the app.
When you gather all the information from the client, output the complete specification. Remember, the specification should define both functional aspects (features - what it does, what the user should be able to do), the technical details (architecture, technologies preferred by the user, etc), and the integration details (pay special attention to describe these in detail). Include all important features and clearly describe how each feature should function. IMPORTANT: Do not add any preamble (eg. "Here's the specification....") or conclusion/commentary (eg. "Let me know if you have further questions")!
Here's an EXAMPLE initial prompt:
---start-of-example-output---
Online forum similar to Hacker News (news.ycombinator.com), with a simple and clean interface, where people can post links or text posts, and other people can upvote, downvote and comment on. Reading is open to anonymous users, but users must register to post, upvote, downvote or comment. Use simple username+password authentication. The forum should be implemented in Node.js with Express framework, using MongoDB and Mongoose ORM.
The UI should use EJS view engine, Bootstrap for styling and plain vanilla JavaScript. Design should be simple and look like Hacker News, with a top bar for navigation, using a blue color scheme instead of the orange color in HN. The footer in each page should just be "Built using GPT Pilot".
Each story has a title (one-line text), a link (optional, URL to an external article being shared on AI News), and text (text to show in the post). Link and text are mutually exclusive - if the submitter tries to use both, show them an error.
Use the following algorithm to rank top stories, and comments within a story: "score = upvotes - downvotes + comments - sqrt(age)", where "upvotes" and "downvotes" are the number of upvotes and downvotes the story or comment has, "comments" is the number of comments for a story (total), or the number of sub-comments (for a comment), "age" is how old the story is, in minutes, and "sqrt" is the square root function.
Implement the following pages:
* / - shows the top 20 posted stories, ranked using the scoring algorithm, with a "More" link that shows the next 20 (pagination using "p" query parameter), and so on
* /newest - shows the latest 20 posted stories, ranked chronologically (newest first), with a "More" link that shows the next 20 (pagination using "p" query parameter), and so on
* /submit - shows a form to submit a new story, upon submitting the user should get redirected to /newest
* /login - shows a login form (username, password, "login" button, and a link to register page for new users)
* /register - shows a register form (username, password, "register" button, and a link to login page for existing users)
* /item - shows the story (use "id" query parameter to pass the story ID to this route)
* /comment - shows the form to send a comment (just a textarea and "submit" button) - upon commenting, the person should get redirected to the story they commented on
The / and /newest pages should show the story title (link to the external article if "link" is set, otherwise link to the story's /item page), number of points (points = upvotes - downvotes), poster username (no link), how old the story is ("x minutes ago", "y hours ago" or "z days ago"), and "xyz comments" (link to the /item page of the story). This is basically the same way HN shows it.
The /item page should also follow the layout for HN in how it shows the story, and the comments tree. Instead of the embedded "reply" form, the story should just have a "comment" button that goes to the /comment page, similar to the "reply" link underneath each comment. Both should link to the /comment page.
---end-of-example-output---
Remember, this is important: the AI developer will not have access to the client's initial description or the transcript of your conversation. The developer will only see the specification you output at the end. It is very important that the spec captures *all* the details of the project with as much detail and precision as possible.
Note: after the client reads the specification you create, the client might have additional comments or suggestions. In this case, continue the discussion with the user until you get all the new information and output the newly updated spec again.

View File

@@ -0,0 +1,8 @@
```
{{ prompt }}
```
The above is a user prompt for an application/software tool they are trying to develop. Determine the complexity of the user's request. Do NOT respond with thoughts, reasoning, explanations or anything similar; return ONLY a string representation of the complexity level. Use the following scale:
"hard" for high complexity
"moderate" for moderate complexity
"simple" for low complexity

Some files were not shown because too many files have changed in this diff