Merge branch 'main' into debugging_ipc

# Conflicts:
#	README.md
#	pilot/helpers/agents/CodeMonkey.py
#	pilot/helpers/agents/Developer.py
#	pilot/prompts/system_messages/architect.prompt
#	pilot/utils/arguments.py
#	pilot/utils/llm_connection.py
#	pilot/utils/utils.py
This commit is contained in:
Nicholas Albion
2023-09-28 13:53:49 +10:00
13 changed files with 224 additions and 26 deletions

View File

@@ -50,7 +50,6 @@ https://github.com/Pythagora-io/gpt-pilot/assets/10895136/0495631b-511e-451b-93d
# 🔌 Requirements
- **Python 3**
- **PostgreSQL** (optional; the project defaults to SQLite)
- The DB is needed for several reasons: continuing app development if you had to stop or the app crashed, going back to a specific step so you can change later steps in development, and easier debugging. In the future it will also support updating an existing project (changing parts of it, adding new features, and so on).
@@ -88,28 +87,93 @@ All generated code will be stored in the folder `workspace` inside the folder na
This will start two containers, one being a new image built by the `Dockerfile` and a postgres database. The new image also has [ttyd](https://github.com/tsl0922/ttyd) installed so that you can easily interact with gpt-pilot. Node is also installed on the image and port 3000 is exposed.
# 🧑‍💻️ CLI arguments
## `app_type` and `name`
If not provided, the ProductOwner will ask for these values.
`app_type` is used as a hint to the LLM as to which architecture, language options, and conventions apply. If not provided, `prompts.prompts.ask_for_app_type()` will ask for it.
See `const.common.ALL_TYPES`: 'Web App', 'Script', 'Mobile App', 'Chrome Extension'
## `app_id` and `workspace`
Continue working on an existing app using **`app_id`**
```bash
python main.py app_id=<ID_OF_THE_APP>
```
_or_ **`workspace`** path:
```bash
python main.py workspace=<PATH_TO_PROJECT_WORKSPACE>
```
Each user can have their own workspace path for each App.
## `user_id`, `email` and `password`
These values will be saved to the User table in the DB.
```bash
python main.py user_id=me_at_work
```
If not specified, `user_id` defaults to the OS username, but can be provided explicitly if your OS username differs from your GitHub or work username. This value is used to load the `App` config when the `workspace` arg is provided.
If not specified, `email` will be parsed from `~/.gitconfig` if the file exists.
See also [What's the purpose of arguments.password / User.password?](https://github.com/Pythagora-io/gpt-pilot/discussions/55)
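As a rough illustration of how those defaults could be derived, here is a minimal sketch (assuming a standard `[user]` section in `~/.gitconfig`; `default_user_identity` is a hypothetical helper, not the project's actual code):

```python
import configparser
import getpass
from pathlib import Path

def default_user_identity(gitconfig_path=None):
    """Return (user_id, email): the OS username, plus the email from the
    [user] section of ~/.gitconfig when that file exists."""
    user_id = getpass.getuser()
    email = None
    path = Path(gitconfig_path) if gitconfig_path else Path.home() / ".gitconfig"
    if path.exists():
        parser = configparser.ConfigParser(strict=False)
        try:
            # gitconfig indents keys with tabs; strip leading whitespace so
            # configparser does not treat each line as a continuation value
            text = "\n".join(line.strip() for line in path.read_text().splitlines())
            parser.read_string(text)
            email = parser.get("user", "email", fallback=None)
        except configparser.Error:
            pass  # malformed file: behave as if no email was found
    return user_id, email
```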
## `advanced`
The Architect by default favours certain technologies including:
- Node.JS
- MongoDB
- PeeWee ORM
- Jest & PyUnit
- Bootstrap
- Vanilla JavaScript
- Socket.io
If you have your own preferences, you can have a deeper conversation with the Architect.
```bash
python main.py advanced=True
```
## `step`
Continue working on an existing app from a specific **`step`** (eg: `user_tasks`)
```bash
python main.py app_id=<ID_OF_THE_APP> step=<STEP_FROM_CONST_COMMON>
```
## `skip_until_dev_step`
Continue working on an existing app from a specific **development step**
```bash
python main.py app_id=<ID_OF_THE_APP> skip_until_dev_step=<DEV_STEP>
```
This is essentially the same as `step`, but within the actual development process. If you want to play around with gpt-pilot, this is likely the flag you will use most often.
<br>
Erase all development steps previously done and continue working on an existing app from start of development
```bash
python main.py app_id=<ID_OF_THE_APP> skip_until_dev_step=0
```
## `delete_unrelated_steps`
## `update_files_before_start`
# 🔎 Examples
Here are a couple of example apps GPT Pilot created by itself:
@@ -157,7 +221,9 @@ Here are the steps GPT Pilot takes to create an app:
5. **DevOps agent** checks if all technologies are installed on the machine and installs them if they are not
6. **Tech Lead agent** writes up development tasks that Developer will need to implement. This is an important part because, for each step, Tech Lead needs to specify how the user (real world developer) can review if the task is done (e.g. open localhost:3000 and do something)
7. **Developer agent** takes each task and writes up what needs to be done to implement it. The description is in human-readable form.
8. Finally, **Code Monkey agent** takes the Developer's description and the existing file and implements the changes into it. We realized this works much better than giving it to Developer right away to implement changes.
For more details on the roles of agents employed by GPT Pilot refer to [AGENTS.md](https://github.com/Pythagora-io/gpt-pilot/blob/main/pilot/helpers/agents/AGENTS.md)
![GPT Pilot Coding Workflow](https://github.com/Pythagora-io/gpt-pilot/assets/10895136/53ea246c-cefe-401c-8ba0-8e4dd49c987b)

View File

@@ -0,0 +1,64 @@
Roles are defined in `const.common.ROLES`.
Each agent's role is described to the LLM by a prompt in `pilot/prompts/system_messages/{role}.prompt`
## Product Owner
`project_description`, `user_stories`, `user_tasks`
- Talk to client, ask detailed questions about what client wants
- Give specifications to dev team
## Architect
`architecture`
- Scripts: Node.js, MongoDB, PeeWee ORM
- Testing: Node.js -> Jest, Python -> pytest, E2E -> Cypress **(TODO - BDD?)**
- Frontend: Bootstrap, vanilla Javascript **(TODO - TypeScript, Material/Styled, React/Vue/other?)**
- Other: cronjob, Socket.io
TODO:
- README.md
- .gitignore
- .editorconfig
- LICENSE
- CI/CD
- IaC, Dockerfile
## Tech Lead
`development_planning`
- Break down the project into smaller tasks for devs.
- Specify each task as clearly as possible:
- Description
- "Programmatic goal" which determines if the task can be marked as done.
eg: "the server needs to be able to start on port 3000 and accept an API request to the URL `http://localhost:3000/ping`, to which it will return status code 200"
- "User-review goal"
eg: "run `npm run start` and open `http://localhost:3000/ping`, see "Hello World" on the screen"
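The description / programmatic-goal / user-review-goal triple above can be pictured as a small task record; a sketch (field names here are assumptions for illustration, not the project's actual schema):

```python
from dataclasses import dataclass

@dataclass
class DevTask:
    """Illustrative shape of a Tech Lead task."""
    description: str
    programmatic_goal: str   # machine-checkable, e.g. an endpoint returns 200
    user_review_goal: str    # human-checkable, e.g. open a page and look

ping_task = DevTask(
    description="Add a health-check endpoint",
    programmatic_goal="GET http://localhost:3000/ping returns status code 200",
    user_review_goal='run `npm run start`, open /ping, see "Hello World"',
)
```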
## Dev Ops
`environment_setup`
**TODO: no prompt**
`debug` functions: `run_command`, `implement_code_changes`
## Developer (full_stack_developer)
`create_scripts`, `coding` **(TODO: No entry in `STEPS` for `create_scripts`)**
- Implement tasks assigned by tech lead
- Modular code, TDD
- Tasks provided as "programmatic goals" **(TODO: consider BDD)**
## Code Monkey
**TODO: not listed in `ROLES`**
`development/implement_changes` functions: `save_files`
- Implement tasks assigned by tech lead
- Modular code, TDD

View File

@@ -39,6 +39,7 @@ class Architect(Agent):
# 'user_tasks': self.project.user_tasks,
'app_type': self.project.args['app_type']}, ARCHITECTURE)
# TODO: Project.args should be a defined class so that all of the possible args are more obvious
if self.project.args.get('advanced', False):
architecture = get_additional_info_from_user(self.project, architecture, 'architect')

View File

@@ -1,4 +1,4 @@
from const.function_calls import GET_FILES, IMPLEMENT_CHANGES
from helpers.AgentConvo import AgentConvo
from helpers.Agent import Agent
@@ -9,7 +9,7 @@ class CodeMonkey(Agent):
self.developer = developer
def implement_code_changes(self, convo, code_changes_description, step_index=0):
if convo is None:
convo = AgentConvo(self)
# files_needed = convo.send_message('development/task/request_files_for_code_changes.prompt', {

View File

@@ -18,6 +18,7 @@ from utils.utils import get_os_info
ENVIRONMENT_SETUP_STEP = 'environment_setup'
class Developer(Agent):
def __init__(self, project):
super().__init__('full_stack_developer', project)
@@ -31,7 +32,7 @@ class Developer(Agent):
self.project.skip_steps = False if ('skip_until_dev_step' in self.project.args and self.project.args['skip_until_dev_step'] == '0') else True
# DEVELOPMENT
print(green_bold(f"🚀 Now for the actual development...\n"))
logger.info(f"Starting to create the actual code...")
for i, dev_task in enumerate(self.project.development_plan):
@@ -337,7 +338,7 @@ class Developer(Agent):
for cmd in installation_commands:
run_command_until_success(cmd['command'], cmd['timeout'], self.convo_os_specific_tech)
logger.info('The entire tech stack is installed and ready to be used.')
save_progress(self.project.args['app_id'], self.project.current_step, {
"os_specific_technologies": os_specific_technologies,
@@ -421,5 +422,5 @@ class Developer(Agent):
run_command_until_success(cmd['command'], cmd['timeout'], convo)
# elif type == 'CODE_CHANGE':
# code_changes_details = get_step_code_changes()
# # TODO: give to code monkey for implementation
pass

View File

@@ -247,6 +247,7 @@ def build_directory_tree(path, prefix="", ignore=None, is_last=False, files=None
return output
def execute_command_and_check_cli_response(command, timeout, convo):
"""
Execute a command and check its CLI response.
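The command-running half of such a helper might look like this sketch (the real helper also passes the CLI response to the LLM conversation for judgment; `run_with_timeout` is a hypothetical stand-in):

```python
import subprocess

def run_with_timeout(command, timeout_ms):
    """Run a shell command, returning (exit_code, combined_output).
    Returns (None, 'TIMED OUT') if the command exceeds the timeout."""
    try:
        completed = subprocess.run(
            command, shell=True, capture_output=True, text=True,
            timeout=timeout_ms / 1000,
        )
        return completed.returncode, (completed.stdout + completed.stderr).strip()
    except subprocess.TimeoutExpired:
        return None, "TIMED OUT"
```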

View File

@@ -9,7 +9,7 @@ from logger.logger import logger
def ask_for_app_type():
return 'App'
answer = styled_select(
"What type of app do you want to build?",
choices=common.APP_TYPES
@@ -37,7 +37,7 @@ def ask_for_app_type():
def ask_for_main_app_definition(project):
description = styled_text(
project,
"Describe your app in as much detail as possible."
)
if description is None:
@@ -67,9 +67,22 @@ def ask_user(project, question: str, require_some_input=True, hint: str = None):
def get_additional_info_from_openai(project, messages):
"""
Runs the conversation between Product Owner and LLM.
Provides the user's initial description, LLM asks the user clarifying questions and user responds.
Limited by `MAX_QUESTIONS`, exits when LLM responds "EVERYTHING_CLEAR".
:param project: Project
:param messages: [
{ "role": "system", "content": "You are a Product Owner..." },
{ "role": "user", "content": "I want you to create the app {name} that can be described: ```{description}```..." }
]
:return: The updated `messages` list with the entire conversation between user and LLM.
"""
is_complete = False
while not is_complete:
# Obtain clarifications using the OpenAI API
# { 'text': new_code }
response = create_gpt_chat_completion(messages, 'additional_info')
if response is not None:
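The loop the docstring describes can be sketched generically as follows (the `ask_llm`/`ask_user` callables and the `MAX_QUESTIONS` value are stand-ins for the project's real helpers and constants):

```python
MAX_QUESTIONS = 5  # assumed limit; the real value lives in the project's constants
END_RESPONSE = "EVERYTHING_CLEAR"

def clarification_loop(messages, ask_llm, ask_user):
    """Drive the clarifying-question loop: the LLM asks questions, the user
    answers, until the LLM signals it is done or the limit is hit.
    `ask_llm(messages) -> str` and `ask_user(question) -> str`."""
    for _ in range(MAX_QUESTIONS):
        question = ask_llm(messages)
        if question.strip() == END_RESPONSE:
            break
        messages.append({"role": "assistant", "content": question})
        messages.append({"role": "user", "content": ask_user(question)})
    return messages
```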
@@ -92,12 +105,21 @@ def get_additional_info_from_openai(project, messages):
# TODO refactor this to comply with AgentConvo class
def get_additional_info_from_user(project, messages, role):
"""
If the `advanced` CLI arg is set, the Architect offers the user a chance to change the architecture.
Prompts: "Please check this message and say what needs to be changed. If everything is ok just press ENTER"...
Then asks the LLM to update the messages based on the user's feedback.
:param project: Project
:param messages: array<string | { "text": string }>
:param role: 'product_owner', 'architect', 'dev_ops', 'tech_lead', 'full_stack_developer', 'code_monkey'
:return: a list of updated messages - see https://github.com/Pythagora-io/gpt-pilot/issues/78
"""
# TODO process with agent convo
updated_messages = []
for message in messages:
while True:
if isinstance(message, dict) and 'text' in message:
message = message['text']
@@ -114,15 +136,33 @@ def get_additional_info_from_user(project, messages, role):
updated_messages.append(message)
logger.info('Getting additional info from user done')
return updated_messages
def generate_messages_from_description(description, app_type, name):
"""
Called by ProductOwner.get_description().
:param description: "I want to build a cool app that will make me rich"
:param app_type: 'Web App', 'Script', 'Mobile App', 'Chrome Extension' etc
:param name: Project name
:return: [
{ "role": "system", "content": "You are a Product Owner..." },
{ "role": "user", "content": "I want you to create the app {name} that can be described: ```{description}```..." }
]
"""
# "I want you to create the app {name} that can be described: ```{description}```
# Get additional answers
# Break down stories
# Break down user tasks
# Start with Get additional answers
# {prompts/components/no_microservices}
# {prompts/components/single_question}
# "
prompt = get_prompt('high_level_questions/specs.prompt', {
'name': name,
'prompt': description,
'app_type': app_type,
# TODO: MAX_QUESTIONS should be configurable by ENV or CLI arg
'MAX_QUESTIONS': MAX_QUESTIONS
})
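Prompt rendering of this kind boils down to template substitution; a sketch (the template text below is invented for illustration and does not reproduce the contents of `high_level_questions/specs.prompt`):

```python
# Hypothetical template standing in for high_level_questions/specs.prompt
SPECS_TEMPLATE = (
    "I want you to create the app {name} ({app_type}) that can be described "
    "as follows:\n{prompt}\nAsk me at most {MAX_QUESTIONS} clarifying questions."
)

def render_specs_prompt(name, description, app_type, max_questions=5):
    """Fill the spec-questions template with project details."""
    return SPECS_TEMPLATE.format(
        name=name,
        app_type=app_type,
        prompt=description,
        MAX_QUESTIONS=max_questions,
    )
```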
@@ -133,6 +173,20 @@ def generate_messages_from_description(description, app_type, name):
def generate_messages_from_custom_conversation(role, messages, start_role='user'):
"""
:param role: 'product_owner', 'architect', 'dev_ops', 'tech_lead', 'full_stack_developer', 'code_monkey'
:param messages: [
"I will show you some of your message to which I want you to make some updates. Please just modify your last message per my instructions.",
{LLM's previous message},
{user's request for change}
]
:param start_role: 'user'
:return: [
{ "role": "system", "content": "You are a ..., You do ..." },
{ "role": start_role, "content": messages[i + even] },
{ "role": "assistant" (or "user" for other start_role), "content": messages[i + odd] },
... ]
"""
# messages is list of strings
result = [get_sys_message(role)]
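The alternating-role mapping the docstring describes can be sketched as:

```python
def messages_from_custom_conversation(sys_message, raw_messages, start_role="user"):
    """Map a flat list of message strings onto alternating chat roles,
    beginning with `start_role` at even indices (a sketch of the idea,
    not the project's actual implementation)."""
    other_role = "assistant" if start_role == "user" else "user"
    result = [sys_message]
    for i, content in enumerate(raw_messages):
        result.append({"role": start_role if i % 2 == 0 else other_role,
                       "content": content})
    return result
```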

View File

@@ -1,10 +1,10 @@
You are an experienced software architect. Your expertise is in creating an architecture for an MVP (minimum viable product) for {{ app_type }}s that can be developed as fast as possible by using as many ready-made technologies as possible. The technologies that you prefer using when other technologies are not explicitly specified are:
**Scripts**: You prefer using Node.js for writing scripts that are meant to be run just from the CLI.
**Backend**: You prefer using Node.js with a MongoDB database if not explicitly specified otherwise. When you're using Mongo, you always use Mongoose, and when you're using a relational database, you always use PeeWee as an ORM.
**Testing**: To create unit and integration tests, you prefer using Jest for Node.js projects and pytest for Python projects. To create end-to-end tests, you prefer using Cypress.
**Frontend**: You prefer using Bootstrap for creating HTML and CSS, while you use plain (vanilla) JavaScript.
**Other**: If the project requires a periodic script run, you prefer using cron jobs (for automated tasks), and if the project requires real-time communication, you prefer Socket.io for web sockets.

View File

@@ -1 +1 @@
I will show you some of your message to which I want you to make some updates. Please just modify your last message per my instructions.

View File

@@ -52,7 +52,6 @@ def get_arguments():
# Handle the error as needed, possibly exiting the script
else:
arguments['app_id'] = str(uuid.uuid4())
# TODO: This intro is also presented by Project.py. This version is not presented in the VS Code extension
print(colored('\n------------------ STARTING NEW PROJECT ----------------------', 'green', attrs=['bold']))
print("If you wish to continue with this project in future run:")
print(colored(f'python {sys.argv[0]} app_id={arguments["app_id"]}', 'green', attrs=['bold']))

View File

@@ -95,6 +95,7 @@ def create_gpt_chat_completion(messages: List[dict], req_type, min_tokens=MIN_TO
if key in gpt_data:
del gpt_data[key]
# Advise the LLM of the JSON response schema we are expecting
add_function_calls_to_request(gpt_data, function_calls)
try:
@@ -140,8 +141,11 @@ def get_tokens_in_messages_from_openai_error(error_message):
def retry_on_exception(func):
def wrapper(*args, **kwargs):
# spinner = None
while True:
try:
# spinner_stop(spinner)
return func(*args, **kwargs)
except Exception as e:
# Convert exception to string
@@ -154,16 +158,19 @@ def retry_on_exception(func):
args[0]['function_buffer'] = e.doc
continue
if "context_length_exceeded" in err_str:
# spinner_stop(spinner)
raise TokenLimitError(get_tokens_in_messages_from_openai_error(err_str), MAX_GPT_MODEL_TOKENS)
if "rate_limit_exceeded" in err_str:
# Extracting the duration from the error string
match = re.search(r"Please try again in (\d+)ms.", err_str)
if match:
# spinner = spinner_start(colored("Rate limited. Waiting...", 'yellow'))
wait_duration = int(match.group(1)) / 1000
time.sleep(wait_duration)
continue
print(red(f'There was a problem with request to openai API:'))
# spinner_stop(spinner)
print(err_str)
user_message = questionary.text(
@@ -363,7 +370,7 @@ def assert_json_schema(response: str, functions: list[FunctionType]) -> True:
return True
def postprocessing(gpt_response: str, req_type) -> str:
return gpt_response
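The retry logic in `retry_on_exception` above can be sketched in isolation like this (the error-message format is an assumption based on OpenAI's wording; `retry_on_rate_limit` is an illustrative stand-in, not the project's decorator):

```python
import re
import time

def retry_on_rate_limit(func):
    """On a rate-limit error, sleep for the duration the API reports
    (e.g. 'Please try again in 250ms.') and retry; re-raise anything else."""
    def wrapper(*args, **kwargs):
        while True:
            try:
                return func(*args, **kwargs)
            except Exception as e:
                err = str(e)
                match = re.search(r"Please try again in (\d+)ms", err)
                if "rate_limit_exceeded" in err and match:
                    time.sleep(int(match.group(1)) / 1000)
                    continue
                raise  # not a retryable rate-limit error
    return wrapper
```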

View File

@@ -9,4 +9,5 @@ def spinner_start(text="Processing..."):
def spinner_stop(spinner):
if spinner is not None:
spinner.stop()

View File

@@ -85,6 +85,10 @@ def get_prompt_components():
def get_sys_message(role):
"""
:param role: 'product_owner', 'architect', 'dev_ops', 'tech_lead', 'full_stack_developer', 'code_monkey'
:return: { "role": "system", "content": "You are a {role}... You do..." }
"""
content = get_prompt(f'system_messages/{role}.prompt')
return {
@@ -137,7 +141,7 @@ def should_execute_step(arg_step, current_step):
def step_already_finished(args, step):
args.update(step['app_data'])
message = f"{capitalize_first_word_with_underscores(step['step'])} already done for this app_id: {args['app_id']}. Moving to next step..."
print(green(message))
logger.info(message)