Commit Graph

465 Commits

Author SHA1 Message Date
Zamil Majdy
475c5a5cc3 fix(backend): Avoid executing any agent with zero balance (#9901)
### Changes 🏗️

* Avoid executing any agent with a zero balance.
* Make node execution count global across agents for a single user.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  <!-- Put your test plan here: -->
- [x] Run agents by tweaking the `execution_cost_count_threshold` &
`execution_cost_per_threshold` values.
2025-05-01 15:11:38 +00:00
Zamil Majdy
86d5cfe60b feat(backend): Support flexible RPC client (#9842)
Using sync code in the async route often introduces a blocking
event-loop code that impacts stability.

The current RPC system only provides a synchronous client to call the
service endpoints.
The scope of this PR is to provide an entirely decoupled signature
between client and server, allowing the client can mix & match async &
sync options on the client code while not changing the async/sync nature
of the server.

### Changes 🏗️

* Add support for flexible async/sync RPC client.
* Migrate scheduler client to all-async client.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Scheduler route test.
  - [x] Modified service_test.py
  - [x] Run normal agent executions
2025-05-01 04:38:06 +00:00
Zamil Majdy
3526986f98 fix(backend): Failing test on a new Pydantic version (#9897)
```
FAILED test/model_test.py::test_agent_preset_from_db - pydantic_core._pydantic_core.ValidationError: 1 validation error for AgentNodeExecutionInputOutput

E       pydantic_core._pydantic_core.ValidationError: 1 validation error for AgentNodeExecutionInputOutput
E       data
E         JSON input should be string, bytes or bytearray [type=json_type, input_value=Json, input_type=Json]
E           For further information visit https://errors.pydantic.dev/2.11/v/json_type
```

### Changes 🏗️

Manually creating a Prisma model often breaks, and we have such an
instance in the test.
This PR fixes the test to make the new Pydantic happy.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] CI
2025-04-30 13:59:17 +00:00
Nicholas Tindle
04c4340ee3 feat(frontend,backend): user spending admin dashboard (#9751)
<!-- Clearly explain the need for these changes: -->
We need a way to refund people who spend money on agents wihout making
manual db actions

### Changes 🏗️
- Adds a bunch for refunding users
- Adds reasons and admin id for actions
- Add admin to db manager
- Add UI for this for the admin panel
- Clean up pagination controls
<!-- Concisely describe all of the changes made in this pull request:
-->

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  <!-- Put your test plan here: -->
  - [x] Test by importing dev db as baseline
- [x] Add transactions on top for "refund", and make sure all existing
transactions work

---------

Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
2025-04-29 17:39:25 +00:00
Zamil Majdy
9fa62c03f6 feat(backend): Improve cancel execution reliability (#9889)
When an executor dies, an ongoing execution will not be retried and will
just stuck in the running status.
This change avoids such a scenario by allowing an execution of an entry
that is not in QUEUED status with the low-probability risk of double
execution.

### Changes 🏗️

* Allow non-QUEUED status to be re-executed.
* Improve cleanup of node & graph executor.
* Make a cancellation request consumption a separate thread to avoid
being blocked by other messages.
* Remove unused retry loop on the execution manager.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  <!-- Put your test plan here: -->
  - [x] Run agent, kill the server, re-run it, agent restarted.
2025-04-29 17:06:03 +00:00
Mareddy Lohith Reddy
d5dc687484 fix: handle empty 204 responses in SendWebRequestBlock (#9887)
<!-- Clearly explain the need for these changes: -->
This PR fixes [Issue
#9883](https://github.com/Significant-Gravitas/AutoGPT/issues/9883),
where the SendWebRequestBlock crashes when receiving a 204 No Content
response, such as when posting to a Discord webhook. The fix ensures
that empty responses are handled gracefully, and the block does not
crash.

### Changes 🏗️
- Added a check to handle empty HTTP responses (like 204 status) in
SendWebRequestBlock
- Fallback to empty string or None if there is no response content
- Prevents server errors when parsing non-existent response bodies

<!-- Concisely describe all of the changes made in this pull request:
-->

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  <!-- Put your test plan here: -->
  - [x] Send a POST request to an endpoint that returns 204 No Content
  - [x]  Confirm that SendWebRequestBlock handles it without crashing
  - [x] Confirm that regular 200 OK JSON responses still work

---------

Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
Co-authored-by: Lohith-11 <lohithr011@gamil.com>
Co-authored-by: Toran Bruce Richards <toran.richards@gmail.com>
Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
2025-04-28 19:16:04 +00:00
Nicholas Tindle
8fdfd75cc4 feat: allow admins to download agents for review (#9881)
<!-- Clearly explain the need for these changes: -->
for admins to approve agents for the marketplace, we need to be able to
run them. this is a quick workaround for downloading them so you can put
them in your marketplace to check

### Changes 🏗️
- clones various endpoints related to downloading into an admin side
with logging, and admin checks
- adds download button and removes open in builder action
<!-- Concisely describe all of the changes made in this pull request:
-->

### Checklist 📋

#### For code changes:
- [ ] I have clearly listed my changes in the PR description
- [ ] I have made a test plan
- [ ] I have tested my changes according to the test plan:
  <!-- Put your test plan here: -->
  - [ ] Test downloading agents from local marketplace
2025-04-28 17:58:23 +00:00
Nicholas Tindle
7d83f1db05 feat(block): bring back PrintConsoleBlock (#9850)
### Changes 🏗️

Bring back PrintConsoleBlock

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Print console block

---------

Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
2025-04-28 09:55:57 +00:00
Zamil Majdy
f07696e3c1 fix(backend): Fix top-up with zero transaction flow (#9886)
The transaction with zero payment amount will not generate a payment ID,
so the checkout failed for this scenario.

### Changes 🏗️

Don't use payment id as transaction key on top-up with zero payment
amount.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Top-up with stripe coupon
2025-04-27 14:22:08 +00:00
sentry-autofix[bot]
9715ea5313 fix: handle token limits and estimate token count for llm calls (#9880)
👋 Hi there! This PR was automatically generated by Autofix 🤖

This fix was triggered by Toran Bruce Richards.

Fixes
[AUTOGPT-SERVER-1ZY](https://sentry.io/organizations/significant-gravitas/issues/6386687527/).
The issue was that: `llm_call` calculates `max_tokens` without
considering `input_tokens`, causing OpenRouter API errors when the
context window is exceeded.

- Implements a function `estimate_token_count` to estimate the number of
tokens in a list of messages.
- Calculates available tokens based on the context window, estimated
input tokens, and user-defined max tokens.
- Adjusts `max_tokens` for LLM calls to prevent exceeding context window
limits.
- Reduces `max_tokens` by 15% and retries if a token limit error is
encountered during LLM calls.

If you have any questions or feedback for the Sentry team about this
fix, please email [autofix@sentry.io](mailto:autofix@sentry.io) with the
Run ID: 32838.

---------

Co-authored-by: sentry-autofix[bot] <157164994+sentry-autofix[bot]@users.noreply.github.com>
Co-authored-by: Krzysztof Czerwinski <kpczerwinski@gmail.com>
2025-04-25 13:45:47 +00:00
Zamil Majdy
ef022720d5 fix(backend): Avoid releasing lock that is no longer owned by the current thread (#9878)
There are instances of node executions that were failed and end up stuck
in the RUNNING status due to the execution failed to release the lock:
```
2025-04-24 20:53:31,573 INFO  [ExecutionManager|uid:25eba2d1-e9c1-44bc-88c7-43e0f4fbad5a|gid:01f8c315-c163-4dd1-a8a0-d396477c5a9f|nid:f8bf84ae-b1f0-4434-8f04-80f43852bc30]|geid:2e1b35c6-0d2f-4e97-adea-f6fe0d9965d0|neid:590b29ea-63ee-4e24-a429-de5a3e191e72|-] Failed node execution 590b29ea-63ee-4e24-a429-de5a3e191e72: Cannot release a lock that's no longer owned
```

### Changes 🏗️

Check the ownership of the lock before releasing.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  <!-- Put your test plan here: -->
  - [x] Existing CI tests.
2025-04-25 07:39:10 +00:00
Zamil Majdy
91f34966c8 fix(block): Fix Smart Decision Block missing input beads & incompability with input in special characters (#9875)
Smart Decision Block was not able to work with sub agent with custom
name input & the bead were not properly propagated in the execution UI.
The scope of this PR is fixing it.

### Changes 🏗️

* Introduce an easy to parse format of tool edge:
`{tool}_^_{func}_~_{arg}`. Graph using SmartDecisionBlock needs to be
re-saved before execution to work.
* Reduce cluttering on a smart decision block logic.
* Fix beads not being shown for a smart decision block tool calling.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Execute an SDM with some special character input as a tool

<img width="672" alt="image"
src="https://github.com/user-attachments/assets/873556b3-c16a-4dd1-ad84-bc86c636c406"
/>
2025-04-24 19:24:41 +00:00
Krzysztof Czerwinski
0675a41e42 fix(backend): Strip secrets, credentials when forking agent (#9874)
Strip secrets, credentials when forking agent

### Changes 🏗️

<!-- Concisely describe all of the changes made in this pull request:
-->

### Checklist 📋

#### For code changes:
- [ ] I have clearly listed my changes in the PR description
- [ ] I have made a test plan
- [ ] I have tested my changes according to the test plan:
  - [ ] ...
2025-04-24 15:09:41 +00:00
Zamil Majdy
7fbe135ec8 feat(backend): Expose execution prometheus metrics (#9866)
Currently, we have no visibility on the state of the execution manager,
the scope of this PR is to open up the observability of it by exposing
Prometheus metrics.

### Changes 🏗️

Re-use the execution manager port to expose the Prometheus metrics.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Hit /metrics on 8002 port
2025-04-24 07:48:38 +00:00
Zamil Majdy
eb6a0b34e1 feat(backend): Use forkserver on process creation if possible (#9864)
### Changes 🏗️

Set process starting mode to forkserver instead of spawn, if possible,
for performance benefits.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Existing tests
2025-04-24 07:36:36 +00:00
Zamil Majdy
1e3236a041 feat(backend): Add retry on executor process initialization (#9865)
Executor process initialization can fail and cause this error:
```
concurrent.futures.process.BrokenProcessPool: A child process terminated abruptly, the process pool is not usable anymore
```

### Changes 🏗️

Add retry to reduce the chance of the initialization error to happen.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Existing tests
2025-04-23 21:24:17 +00:00
Krzysztof Czerwinski
160a622ba4 feat(platform): Forking agent in Library (#9870)
This PR introduces copying agents feature in the Library. Users can copy
and download their library agents but they can edit only the ones they
own (included copied ones).

### Changes 🏗️

- DB migration: add relation in `AgentGraph`: `forked_from_id` and
`forked_from_version`
- Add `fork_graph` function that makes a hardcopy of agent graph and its
nodes (all with new ids)
- Add `fork_library_agent` that copies library agent and its graph for a
user
- Add endpoint `/library/agents/{libraryAgentId}/fork`
- Add UI to `library/agents/[id]/page.tsx`: `Edit a copy` button with
dialog confirmation

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Agent can be copied, edited and runs
2025-04-23 16:28:42 +00:00
Zamil Majdy
20d39f6d44 fix(platform): Fix Google Maps API Key setting through env (#9848)
Setting the Google Maps API through the API has never worked on the
platform.

### Changes 🏗️

Set the default api key from the environment variable.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  <!-- Put your test plan here: -->
  - [x] Test GoogleMapsBlock
2025-04-22 03:00:47 +07:00
Bently
d5b82c01e0 feat(backend): Adds latest llm models (#9856)
This PR adds the following models:
OpenAI's O3: https://platform.openai.com/docs/models/o3
OpenAI's GPT 4.1: https://platform.openai.com/docs/models/gpt-4.1
Anthropics Claude 3.7: https://www.anthropic.com/news/claude-3-7-sonnet
Googles gemini 2.5 pro:
https://openrouter.ai/google/gemini-2.5-pro-preview-03-25
2025-04-21 19:26:21 +00:00
Krzysztof Czerwinski
67af77e179 fix((backend): Fix migrate llm models in existing agents (#9810)
https://github.com/Significant-Gravitas/AutoGPT/pull/9452 was throwing
`operator does not exist: text ? unknown` on deployed dev and so the
function call was commented as a hotfix.
This PR fixes and re-enables the llm model migration function.

### Changes 🏗️

- Uncomment and fix `migrate_llm_models` function

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Migrate nodes with non-existing models
  - [x] Don't migrate nodes without any model or with correct models

---------

Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
2025-04-19 12:52:36 +00:00
Zamil Majdy
9052ee7b95 fix(backend): Clear RabbitMQ connection cache on execution-manager retry 2025-04-19 07:50:04 +02:00
Zamil Majdy
c783f64b33 fix(backend): Handle add execution API request failure (#9838)
There are cases where the publishing agent execution is failing, making
the agent execution appear to be stuck in a queue, but the execution has
never been in a queue in the first place.

### Changes 🏗️

On publishing failure, we set the graph & starting node execution status
to FAILED and let the UI bubble up the error so the user can try again.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Normal add execution flow
2025-04-18 18:35:43 +00:00
Zamil Majdy
055a231aed feat(backend): Add retry mechanism for pika publish_message (#9839)
For unknown reason publishing message can fail sometimes due to the
connection being broken:
MessageQueue suddenly unavailable, connection simply broke, connection
being reset, etc.

### Changes 🏗️

Adding a tenacity retry on AMQP or ConnectionError, which hopefully can
alleviate the issue.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Simple add execution
2025-04-18 17:56:27 +00:00
Reinier van der Leer
417d7732af feat(platform/library): Add credentials UX on /library/agents/[id] (#9789)
- Resolves #9771
- ... in a non-persistent way, so it won't work for webhook-triggered
agents
    For webhooks: #9541

### Changes 🏗️

Frontend:
- Add credentials inputs in Library "New run" screen (based on
`graph.credentials_input_schema`)
- Refactor `CredentialsInput` and `useCredentials` to not rely on XYFlow
context

- Unsplit lists of saved credentials in `CredentialsProvider` state

- Move logic that was being executed at component render to `useEffect`
hooks in `CredentialsInput`

Backend:
- Implement logic to aggregate credentials input requirements to one per
provider per graph
- Add `BaseGraph.credentials_input_schema` (JSON schema) computed field
    Underlying added logic:
- `BaseGraph._credentials_input_schema` - makes a `BlockSchema` from a
graph's aggregated credentials inputs
- `BaseGraph.aggregate_credentials_inputs()` - aggregates a graph's
nodes' credentials inputs using `CredentialsFieldInfo.combine(..)`
- `BlockSchema.get_credentials_fields_info() -> dict[str,
CredentialsFieldInfo]`
- `CredentialsFieldInfo` model (created from
`_CredentialsFieldSchemaExtra`)

- Implement logic to inject explicitly passed credentials into graph
execution
  - Add `credentials_inputs` parameter to `execute_graph` endpoint
- Add `graph_credentials_input` parameter to
`.executor.utils.add_graph_execution(..)`
  - Implement `.executor.utils.make_node_credentials_input_map(..)`
  - Amend `.executor.utils.construct_node_execution_input`
  - Add `GraphExecutionEntry.node_credentials_input_map` attribute
  - Amend validation to allow injecting credentials
    - Amend `GraphModel._validate_graph(..)`
    - Amend `.executor.utils._validate_node_input_credentials`
- Add `node_credentials_map` parameter to
`ExecutionManager.add_execution(..)`
    - Amend execution validation to handle side-loaded credentials
    - Add `GraphExecutionEntry.node_execution_map` attribute
- Add mechanism to inject passed credentials into node execution data
- Add credentials injection mechanism to node execution queueing logic
in `Executor._on_graph_execution(..)`

- Replace boilerplate logic in `v1.execute_graph` endpoint with call to
existing `.executor.utils.add_graph_execution(..)`
- Replace calls to `.server.routers.v1.execute_graph` with
`add_graph_execution`

Also:
- Address tech debt in `GraphModel._validate_gaph(..)`
- Fix type checking in `BaseGraph._generate_schema(..)`

#### TODO
- [ ] ~~Make "Run again" work with credentials in
`AgentRunDetailsView`~~
- [ ] Prohibit saving a graph if it has nodes with missing discriminator
value for discriminated credentials inputs

### Checklist 📋

#### For code changes:
- [ ] I have clearly listed my changes in the PR description
- [ ] I have made a test plan
- [ ] I have tested my changes according to the test plan:
  <!-- Put your test plan here: -->
  - [ ] ...
2025-04-18 14:27:13 +00:00
Krzysztof Czerwinski
8ea64327a1 fix(backend): Fix array types in database (#9828)
Array fields in `schema.prisma` are non-nullable, but generated
migrations don’t add `NOT NULL` constraints. This causes existing rows
to get `NULL` values when new array columns are added, breaking schema
expectations and leading to bugs.

### Changes 🏗️

- Backfill all `NULL` rows on non-nullable array columns to empty arrays
- Set `NOT NULL` constraint on all array columns

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Existing `NULL` rows are properly backfilled
  - [x] Existing arrays are not set to default empty arrays
  - [x] Affected columns became non-nullable in the db
2025-04-18 07:43:54 +00:00
Zamil Majdy
f6a4b036c7 fix(block): Disable LLM blocks parallel tool calls (#9834)
SmartDecisionBlock sometimes tried to be smart by calling multiple tool
calls and our platform does not support this yet.

### Changes 🏗️

Disable parallel tool calls for OpenAI & OpenRouter LLM provider LLM
blocks.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  <!-- Put your test plan here: -->
  - [x] Tested SmartDecisionBlock & AITextGeneratorBlock
2025-04-17 12:58:05 +00:00
Zamil Majdy
c43924cd4e feat(backend): Add RabbitMQ connection cleanup on executor shutdown hook 2025-04-17 01:28:15 +02:00
Zamil Majdy
e3846c22bd fix(backend): Avoid multithreaded pika access (#9832)
### Changes 🏗️

Avoid other threads accessing the channel within the same process.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Manual agent runs
2025-04-16 22:06:07 +00:00
Toran Bruce Richards
9a7a838418 fix(backend): Change node output logging type from info to debug (#9831)
<!-- Clearly explain the need for these changes: -->

### Changes 🏗️
This PR simply changes the logging type from info to debug of node
outputs in the agent.py file.
<!-- Concisely describe all of the changes made in this pull request:
-->

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [ ] I have tested my changes according to the test plan:
  <!-- Put your test plan here: -->
  - [x] ...

<details>
  <summary>Example test plan</summary>
  
  - [ ] Create from scratch and execute an agent with at least 3 blocks
- [ ] Import an agent from file upload, and confirm it executes
correctly
  - [ ] Upload agent to marketplace
- [ ] Import an agent from marketplace and confirm it executes correctly
  - [ ] Edit an agent from monitor, and confirm it executes correctly
</details>

#### For configuration changes:
- [x] `.env.example` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)

<details>
  <summary>Examples of configuration changes</summary>

  - Changing ports
  - Adding new services that need to communicate with each other
  - Secrets or environment variable changes
  - New or infrastructure changes such as databases
</details>

---------

Co-authored-by: Bentlybro <Github@bentlybro.com>
2025-04-16 20:45:51 +00:00
Toran Bruce Richards
d61d815208 fix(logging): Change node data logging to debug level from info (#9830)
<!-- Clearly explain the need for these changes: -->

### Changes 🏗️
This change simply changes the logging level of node inputs and outputs
to debug level. This change is needed because currently logging all node
data causes logs that are too large for the logger to prevent nodes from
running.

<!-- Concisely describe all of the changes made in this pull request:
-->

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [ ] I have made a test plan
- [ ] I have tested my changes according to the test plan:
  <!-- Put your test plan here: -->
  - [x] ...

<details>
  <summary>Example test plan</summary>
  
  - [ ] Create from scratch and execute an agent with at least 3 blocks
- [ ] Import an agent from file upload, and confirm it executes
correctly
  - [ ] Upload agent to marketplace
- [ ] Import an agent from marketplace and confirm it executes correctly
  - [ ] Edit an agent from monitor, and confirm it executes correctly
</details>

#### For configuration changes:
- [x] `.env.example` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)

<details>
  <summary>Examples of configuration changes</summary>

  - Changing ports
  - Adding new services that need to communicate with each other
  - Secrets or environment variable changes
  - New or infrastructure changes such as databases
</details>
2025-04-16 19:22:52 +00:00
Zamil Majdy
44e3770003 fix(backend): Fix execution manager message consuming pattern (#9829)
We have seen instances where the executor gets stuck in a failing
message-consuming loop due to the upstream RabbitMQ being down. The
current message-consuming pattern is not optimal for handling this.

### Changes 🏗️

* Add a retry limit to the execution loop limit.
* Use `basic_consume` instead of `basic_get` for handling message
consumption.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run agents cancel them
2025-04-16 22:54:26 +07:00
Zamil Majdy
71cdc18674 fix(backend): Fix cancel_execution can only work once (#9825)
### Changes 🏗️

The recent change to the execution cancelation fix turns out to only
work on the first request.
This PR change fixes it by reworking how the thread_cached work on async
functions.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  <!-- Put your test plan here: -->
  - [x] Cancel agent executions multiple times
2025-04-16 10:33:49 +00:00
Nicholas Tindle
3e0742f9c5 Spike/infra pooling (#9812)
<!-- Clearly explain the need for these changes: -->
Swap to pooling supabase connections rather than depending on x number
of max open connections

### Changes 🏗️
Adds direct connect URL to be used throughout the system
<!-- Concisely describe all of the changes made in this pull request:
-->

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  <!-- Put your test plan here: -->
- [x] Test thoroughly all of the endpoints in the dev env with switched
infra matching pr
  - [x] Follow the new release plan tests
  - [x] Follow the old release plan tests

#### For configuration changes:
- [x] `.env.example` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)

<details>
  <summary>configuration changes</summary>

- Change how we connect to the database to use direct when configured
and database URL when not
  - update prisma for this
  - have default matching database and default
</details>
2025-04-15 15:40:15 +00:00
Krzysztof Czerwinski
d791cdea76 feat(platform): Onboarding Phase 2 (#9736)
### Changes 🏗️

- Update onboarding to give user rewards for completing steps
- Remove `canvas-confetti` lib and add `party-js` instead; the former
didn't allow to play confetti from a component
- Add onboarding videos in `frontend/public/onboarding/`
- Remove Balance (`CreditsCard.tsx`) and add openable `Wallet.tsx` (and
accompanying `WalletTaskGroup.tsx`) instead that displays grouped
onboarding tasks with descriptions and short instructional videos
- Further relevant updates to `useOnboarding`, `types.ts`
- Implement onboarding rewards
- Add `onboarding_reward` function in `credit.py` that is used to reward
user for finished onboarding tasks safely - transaction key is
deterministic, so the same user won't be rewarded twice for the same
step.
  - Add `reward_user` in `onboarding.py`
- Update `UserOnboarding` model and add a migration

<img width="464" alt="Screenshot 2025-04-05 at 6 06 29 PM"
src="https://github.com/user-attachments/assets/fca8d09e-0139-466b-b679-d24117ad01f0"
/>

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Onboarding works
  - [x] Tasks can be completed
  - [x] Rewards are added correctly for all completed tasks
2025-04-12 10:56:59 +00:00
Zamil Majdy
bb92226f5d feat(backend): Remove RPC service from Agent Executor (#9804)
Currently the execution task is not properly distributed between
executors because we need to send the execution request to the execution
server.

The execution manager now accepts the execution request from the message
queue. Thus, we can remove the synchronous RPC system from this service,
let the system focus on executing the agent, and not spare any process
for the HTTP API interface.

This will also reduce the risk of the execution service being too busy
and not able to accept any add execution requests.

### Changes 🏗️

* Remove the RPC system in Agent Executor
* Allow the cancellation of the execution that is still waiting in the
queue (by avoiding it from being executed).
* Make a unified helper for adding an execution request to the system
and move other execution-related helper functions into
`executor/utils.py`.
* Remove non-db connections (redis / rabbitmq) in Database Manager and
let the client manage this by themselves.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Existing CI, some agent runs
2025-04-11 19:03:47 +00:00
Zamil Majdy
f7ca5ac1ba feat(backend/executor): Move execution queue + cancel mechanism to RabbitMQ (#9759)
The graph execution queue is not disk-persisted; when the executor dies,
the executions are lost.

The scope of this issue is migrating the execution queue from an
inter-process queue to a RabbitMQ message queue. A sync client should be
used for this.

- Resolves #9746
- Resolves #9714

### Changes 🏗️

Move the execution manager from multiprocess.Queue into persisted
Rabbit-MQ.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Execute agents.

<details>
  <summary>Example test plan</summary>
  
  - [ ] Create from scratch and execute an agent with at least 3 blocks
- [ ] Import an agent from file upload, and confirm it executes
correctly
  - [ ] Upload agent to marketplace
- [ ] Import an agent from marketplace and confirm it executes correctly
  - [ ] Edit an agent from monitor, and confirm it executes correctly
</details>

#### For configuration changes:
- [ ] `.env.example` is updated or already compatible with my changes
- [ ] `docker-compose.yml` is updated or already compatible with my
changes
- [ ] I have included a list of my configuration changes in the PR
description (under **Changes**)

<details>
  <summary>Examples of configuration changes</summary>

  - Changing ports
  - Adding new services that need to communicate with each other
  - Secrets or environment variable changes
  - New or infrastructure changes such as databases
</details>
2025-04-11 14:15:39 +00:00
Reinier van der Leer
8ea3bfabc4 fix(backend/db): Fix unchecked Prisma statements (#9805) 2025-04-10 23:04:42 +02:00
Nicholas Tindle
2ca18d77a4 feat(frontend, backend): track sentry environment on frontend + sentry init in app services (#9773)
<!-- Clearly explain the need for these changes: -->
We want to be able to filter errors according to where they occur in
sentry so we need to track and include that data. We also are not
logging everything from app services correctly so fix that up

### Changes 🏗️

<!-- Concisely describe all of the changes made in this pull request:
-->
- Adds env tracking for frontend
- adds sentry init in app service spawn

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  <!-- Put your test plan here: -->
- [x] Tested by running and making sure all events + logs are inserted
into sentry correctly
2025-04-10 14:34:26 +00:00
Reinier van der Leer
353396110c refactor(backend): Clean up Library & Store DB schema (#9774)
Distilled from #9541 to reduce the scope of that PR.

- Part of #9307

-  Blocks #9786
  -  Blocks #9541

### Changes 🏗️

- Fix `LibraryAgent` schema (for #9786)
- Fix relationships between `LibraryAgent`, `AgentGraph`, and
`AgentPreset`
  - Impose uniqueness constraint on `LibraryAgent`

- Rename things that are called `agent` that actually refer to a
`graph`/`agentGraph`
- Fix singular/plural forms in DB schema
- Simplify reference names of closely related objects (e.g.
`AgentGraph.AgentGraphExecutions` -> `AgentGraph.Executions`)

- Eliminate use of `# type: ignore` in DB statements
  - Add `typed` and `typed_cast` utilities to `backend.util.type`

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] CI static type checking (with all risky `# type: ignore` removed)
  - [x] Check that column references in views are updated
2025-04-10 10:40:25 +00:00
Nicholas Tindle
62361ccc48 feat: deep copy the schema (#9794)
<!-- Clearly explain the need for these changes: -->
We were duplicating placeholder values across all agents 😨 

### Changes 🏗️

<!-- Concisely describe all of the changes made in this pull request:
-->
Deep copies the schema instead

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  <!-- Put your test plan here: -->
  - [x] Test the broken agent in dev
2025-04-09 21:32:50 +00:00
Reinier van der Leer
755a80c87a fix(blocks): Fix block I/O value sharing (#9793)
- Resolves #9792

### Changes 🏗️

- Replace all `default=[]` -> `default_factory=list`
- Replace all `default={}` -> `default_factory=dict`

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [ ] I have tested my changes according to the test plan:
  <!-- Put your test plan here: -->
  - [ ] CI

---------

Co-authored-by: Krzysztof Czerwinski <kpczerwinski@gmail.com>
2025-04-09 19:15:41 +00:00
Reinier van der Leer
5a83b233f8 fix(backend): Add required method cleanup to MainApp
The absence of this method caused type checking errors.
2025-04-09 13:09:21 +02:00
Reinier van der Leer
cb1a3703ad fix(ci): Fix linter exit code on failure (#9777)
The linter currently exits with exit code 0 even if linting fails. This
makes the CI linter permissive which isn't good.

Changes:
- Make linter exit with an error code if a linting step fails
- Fix existing formatting issues
2025-04-09 11:30:04 +02:00
Bently
91f62c47f9 feat(backend): Add new llama 4 maverick & scout models (#9788)
This PR is to add the new [Meta: Llama 4
Maverick](https://openrouter.ai/meta-llama/llama-4-maverick) and [Meta:
Llama 4 Scout](https://openrouter.ai/meta-llama/llama-4-scout) models
via [OpenRouter](https://openrouter.ai/)


### Changes 🏗️

Added the model names to ``llm.py``
```
    META_LLAMA_4_SCOUT = "meta-llama/llama-4-scout"
    META_LLAMA_4_MAVERICK = "meta-llama/llama-4-maverick"
```
and the modela metadata
```
    LlmModel.META_LLAMA_4_SCOUT: ModelMetadata("open_router", 131072, 131072),
    LlmModel.META_LLAMA_4_MAVERICK: ModelMetadata("open_router", 1048576, 1000000),
```

and i have added the model price to ``block_cost_config.py``
```
    LlmModel.META_LLAMA_4_SCOUT: 1,
    LlmModel.META_LLAMA_4_MAVERICK: 1,
```

### Checklist 📋

- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Open the build page and place a ai text block, open the model
select and scroll to the bottom and select either of the 2 models
  - [x] test them with a prompt and wait for a reply!
2025-04-08 21:59:53 +00:00
Zamil Majdy
7fedb5e2fd refactor(backend): Un-share resource initializations from AppService + Remove Pyro (#9750)
This is a prerequisite infra change for
https://github.com/Significant-Gravitas/AutoGPT/issues/9714.

We will need a service where we can maintain our own client (db, redis,
rabbitmq, be it async/sync) and configure our own cadence of
initialization and cleanup.

While refactoring the service.py, an option to use Pyro as an RPC
protocol is also removed.

### Changes 🏗️

* Decouple resource initialization and cleanup from the parent
AppService logic.
* Removed Pyro.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  <!-- Put your test plan here: -->
  - [x] CI
2025-04-08 19:47:22 +00:00
Nicholas Tindle
3c14861d8e fix(backend): reduce log level for retrying connection (#9765)
<!-- Clearly explain the need for these changes: -->
Now that we are trying to use Sentry more, cleaning up some errors ->
warnings is a good idea

### Changes 🏗️
- reduces log level of retry to warning
<!-- Concisely describe all of the changes made in this pull request:
-->

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  <!-- Put your test plan here: -->
  - [x] check it comes through in sentry
2025-04-07 17:00:18 +00:00
Nicholas Tindle
074a00ce86 fix(backend): ProviderName behavior when loading secrets (#9764)
<!-- Clearly explain the need for these changes: -->
We got this error in
sentry:[AUTOGPT-SERVER-33P](https://significant-gravitas.sentry.io/issues/6462614597/events/bb4871d796b04e759ade55197498cff9/)
```
Level: Error
'Secrets' object has no attribute 'ProviderName.GOOGLE_client_id'
```

### Changes 🏗️
- Follows pattern used when accessing these in
`_get_provider_oauth_handler` in the router
<!-- Concisely describe all of the changes made in this pull request:
-->

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  <!-- Put your test plan here: -->
  - [x] Test to make sure getting works
2025-04-07 16:59:48 +00:00
Zamil Majdy
3771a0924c fix(backend): Update deprecated code caused by upgrades (#9758)
This series of upgrades:
https://github.com/significant-gravitas/autogpt/pull/9727
https://github.com/Significant-Gravitas/AutoGPT/pull/9728
https://github.com/Significant-Gravitas/AutoGPT/pull/9560

Caused some code in the repo being deprecated, this PR addresses those.

### Changes 🏗️

Fix pydantic config, usage of field, usage of proper prisma
`CreateInput` type, pytest loop-scope.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] CI, manual test on running some agents.

---------

Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
2025-04-04 16:34:40 +00:00
Nicholas Tindle
4397746a87 feat(backend): baseline sentry logging (#9756)
<!-- Clearly explain the need for these changes: -->
Sentry just released logs so lets enrich our details there too

### Changes 🏗️
- Adds sentry logging
- Adds dependencies tracking all of our sentry integrations
- Adds environment tracking to sentry
<!-- Concisely describe all of the changes made in this pull request:
-->

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  <!-- Put your test plan here: -->
- [x] Tested to make sure events show up in sentry with the correct
environment logging
2025-04-04 15:31:52 +00:00
Reinier van der Leer
1fc984f7fd feat(platform/library): Add real-time "Steps" count to agent run view (#9740)
- Resolves #9731

### Changes 🏗️

- feat: Add "Steps" showing `node_execution_count` to agent run view
  - Add `GraphExecutionMeta.stats.node_exec_count` attribute

- feat(backend/executor): Send graph execution update after *every* node
execution (instead of only I/O node executions)
  - Update graph execution stats after every node execution

- refactor: Move `GraphExecutionMeta` stats into sub-object
(`cost`, `duration`, `total_run_time` -> `stats.cost`, `stats.duration`,
`stats.node_exec_time`)

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - View an agent run with 1+ steps on `/library/agents/[id]`
    - [x] "Info" section layout doesn't break
    - [x] Number of steps is shown
  - Initiate a new agent run
    - [x] "Steps" increments in real time during execution
2025-04-03 18:58:21 +00:00