<!-- Clearly explain the need for these changes: -->
for admins to approve agents for the marketplace, we need to be able to
run them. this is a quick workaround for downloading them so you can put
them in your marketplace to check
### Changes 🏗️
- clones various endpoints related to downloading into an admin side
with logging, and admin checks
- adds download button and removes open in builder action
<!-- Concisely describe all of the changes made in this pull request:
-->
### Checklist 📋
#### For code changes:
- [ ] I have clearly listed my changes in the PR description
- [ ] I have made a test plan
- [ ] I have tested my changes according to the test plan:
<!-- Put your test plan here: -->
- [ ] Test downloading agents from local marketplace
### Changes 🏗️
Bring back PrintConsoleBlock
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Print console block
---------
Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
The transaction with zero payment amount will not generate a payment ID,
so the checkout failed for this scenario.
### Changes 🏗️
Don't use payment id as transaction key on top-up with zero payment
amount.
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Top-up with stripe coupon
👋 Hi there! This PR was automatically generated by Autofix 🤖
This fix was triggered by Toran Bruce Richards.
Fixes
[AUTOGPT-SERVER-1ZY](https://sentry.io/organizations/significant-gravitas/issues/6386687527/).
The issue was that: `llm_call` calculates `max_tokens` without
considering `input_tokens`, causing OpenRouter API errors when the
context window is exceeded.
- Implements a function `estimate_token_count` to estimate the number of
tokens in a list of messages.
- Calculates available tokens based on the context window, estimated
input tokens, and user-defined max tokens.
- Adjusts `max_tokens` for LLM calls to prevent exceeding context window
limits.
- Reduces `max_tokens` by 15% and retries if a token limit error is
encountered during LLM calls.
If you have any questions or feedback for the Sentry team about this
fix, please email [autofix@sentry.io](mailto:autofix@sentry.io) with the
Run ID: 32838.
---------
Co-authored-by: sentry-autofix[bot] <157164994+sentry-autofix[bot]@users.noreply.github.com>
Co-authored-by: Krzysztof Czerwinski <kpczerwinski@gmail.com>
There are instances of node executions that were failed and end up stuck
in the RUNNING status due to the execution failed to release the lock:
```
2025-04-24 20:53:31,573 INFO [ExecutionManager|uid:25eba2d1-e9c1-44bc-88c7-43e0f4fbad5a|gid:01f8c315-c163-4dd1-a8a0-d396477c5a9f|nid:f8bf84ae-b1f0-4434-8f04-80f43852bc30]|geid:2e1b35c6-0d2f-4e97-adea-f6fe0d9965d0|neid:590b29ea-63ee-4e24-a429-de5a3e191e72|-] Failed node execution 590b29ea-63ee-4e24-a429-de5a3e191e72: Cannot release a lock that's no longer owned
```
### Changes 🏗️
Check the ownership of the lock before releasing.
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
<!-- Put your test plan here: -->
- [x] Existing CI tests.
Smart Decision Block was not able to work with sub agent with custom
name input & the bead were not properly propagated in the execution UI.
The scope of this PR is fixing it.
### Changes 🏗️
* Introduce an easy to parse format of tool edge:
`{tool}_^_{func}_~_{arg}`. Graph using SmartDecisionBlock needs to be
re-saved before execution to work.
* Reduce cluttering on a smart decision block logic.
* Fix beads not being shown for a smart decision block tool calling.
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Execute an SDM with some special character input as a tool
<img width="672" alt="image"
src="https://github.com/user-attachments/assets/873556b3-c16a-4dd1-ad84-bc86c636c406"
/>
Strip secrets, credentials when forking agent
### Changes 🏗️
<!-- Concisely describe all of the changes made in this pull request:
-->
### Checklist 📋
#### For code changes:
- [ ] I have clearly listed my changes in the PR description
- [ ] I have made a test plan
- [ ] I have tested my changes according to the test plan:
- [ ] ...
Currently, we have no visibility on the state of the execution manager,
the scope of this PR is to open up the observability of it by exposing
Prometheus metrics.
### Changes 🏗️
Re-use the execution manager port to expose the Prometheus metrics.
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Hit /metrics on 8002 port
### Changes 🏗️
Set process starting mode to forkserver instead of spawn, if possible,
for performance benefits.
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Existing tests
Executor process initialization can fail and cause this error:
```
concurrent.futures.process.BrokenProcessPool: A child process terminated abruptly, the process pool is not usable anymore
```
### Changes 🏗️
Add retry to reduce the chance of the initialization error to happen.
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Existing tests
This PR introduces copying agents feature in the Library. Users can copy
and download their library agents but they can edit only the ones they
own (included copied ones).
### Changes 🏗️
- DB migration: add relation in `AgentGraph`: `forked_from_id` and
`forked_from_version`
- Add `fork_graph` function that makes a hardcopy of agent graph and its
nodes (all with new ids)
- Add `fork_library_agent` that copies library agent and its graph for a
user
- Add endpoint `/library/agents/{libraryAgentId}/fork`
- Add UI to `library/agents/[id]/page.tsx`: `Edit a copy` button with
dialog confirmation
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Agent can be copied, edited and runs
Setting the Google Maps API through the API has never worked on the
platform.
### Changes 🏗️
Set the default api key from the environment variable.
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
<!-- Put your test plan here: -->
- [x] Test GoogleMapsBlock
https://github.com/Significant-Gravitas/AutoGPT/pull/9452 was throwing
`operator does not exist: text ? unknown` on deployed dev and so the
function call was commented as a hotfix.
This PR fixes and re-enables the llm model migration function.
### Changes 🏗️
- Uncomment and fix `migrate_llm_models` function
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Migrate nodes with non-existing models
- [x] Don't migrate nodes without any model or with correct models
---------
Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
There are cases where the publishing agent execution is failing, making
the agent execution appear to be stuck in a queue, but the execution has
never been in a queue in the first place.
### Changes 🏗️
On publishing failure, we set the graph & starting node execution status
to FAILED and let the UI bubble up the error so the user can try again.
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Normal add execution flow
For unknown reason publishing message can fail sometimes due to the
connection being broken:
MessageQueue suddenly unavailable, connection simply broke, connection
being reset, etc.
### Changes 🏗️
Adding a tenacity retry on AMQP or ConnectionError, which hopefully can
alleviate the issue.
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Simple add execution
- Resolves#9771
- ... in a non-persistent way, so it won't work for webhook-triggered
agents
For webhooks: #9541
### Changes 🏗️
Frontend:
- Add credentials inputs in Library "New run" screen (based on
`graph.credentials_input_schema`)
- Refactor `CredentialsInput` and `useCredentials` to not rely on XYFlow
context
- Unsplit lists of saved credentials in `CredentialsProvider` state
- Move logic that was being executed at component render to `useEffect`
hooks in `CredentialsInput`
Backend:
- Implement logic to aggregate credentials input requirements to one per
provider per graph
- Add `BaseGraph.credentials_input_schema` (JSON schema) computed field
Underlying added logic:
- `BaseGraph._credentials_input_schema` - makes a `BlockSchema` from a
graph's aggregated credentials inputs
- `BaseGraph.aggregate_credentials_inputs()` - aggregates a graph's
nodes' credentials inputs using `CredentialsFieldInfo.combine(..)`
- `BlockSchema.get_credentials_fields_info() -> dict[str,
CredentialsFieldInfo]`
- `CredentialsFieldInfo` model (created from
`_CredentialsFieldSchemaExtra`)
- Implement logic to inject explicitly passed credentials into graph
execution
- Add `credentials_inputs` parameter to `execute_graph` endpoint
- Add `graph_credentials_input` parameter to
`.executor.utils.add_graph_execution(..)`
- Implement `.executor.utils.make_node_credentials_input_map(..)`
- Amend `.executor.utils.construct_node_execution_input`
- Add `GraphExecutionEntry.node_credentials_input_map` attribute
- Amend validation to allow injecting credentials
- Amend `GraphModel._validate_graph(..)`
- Amend `.executor.utils._validate_node_input_credentials`
- Add `node_credentials_map` parameter to
`ExecutionManager.add_execution(..)`
- Amend execution validation to handle side-loaded credentials
- Add `GraphExecutionEntry.node_execution_map` attribute
- Add mechanism to inject passed credentials into node execution data
- Add credentials injection mechanism to node execution queueing logic
in `Executor._on_graph_execution(..)`
- Replace boilerplate logic in `v1.execute_graph` endpoint with call to
existing `.executor.utils.add_graph_execution(..)`
- Replace calls to `.server.routers.v1.execute_graph` with
`add_graph_execution`
Also:
- Address tech debt in `GraphModel._validate_gaph(..)`
- Fix type checking in `BaseGraph._generate_schema(..)`
#### TODO
- [ ] ~~Make "Run again" work with credentials in
`AgentRunDetailsView`~~
- [ ] Prohibit saving a graph if it has nodes with missing discriminator
value for discriminated credentials inputs
### Checklist 📋
#### For code changes:
- [ ] I have clearly listed my changes in the PR description
- [ ] I have made a test plan
- [ ] I have tested my changes according to the test plan:
<!-- Put your test plan here: -->
- [ ] ...
Array fields in `schema.prisma` are non-nullable, but generated
migrations don’t add `NOT NULL` constraints. This causes existing rows
to get `NULL` values when new array columns are added, breaking schema
expectations and leading to bugs.
### Changes 🏗️
- Backfill all `NULL` rows on non-nullable array columns to empty arrays
- Set `NOT NULL` constraint on all array columns
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Existing `NULL` rows are properly backfilled
- [x] Existing arrays are not set to default empty arrays
- [x] Affected columns became non-nullable in the db
SmartDecisionBlock sometimes tried to be smart by calling multiple tool
calls and our platform does not support this yet.
### Changes 🏗️
Disable parallel tool calls for OpenAI & OpenRouter LLM provider LLM
blocks.
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
<!-- Put your test plan here: -->
- [x] Tested SmartDecisionBlock & AITextGeneratorBlock
### Changes 🏗️
Avoid other threads accessing the channel within the same process.
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Manual agent runs
<!-- Clearly explain the need for these changes: -->
### Changes 🏗️
This PR simply changes the logging type from info to debug of node
outputs in the agent.py file.
<!-- Concisely describe all of the changes made in this pull request:
-->
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [ ] I have tested my changes according to the test plan:
<!-- Put your test plan here: -->
- [x] ...
<details>
<summary>Example test plan</summary>
- [ ] Create from scratch and execute an agent with at least 3 blocks
- [ ] Import an agent from file upload, and confirm it executes
correctly
- [ ] Upload agent to marketplace
- [ ] Import an agent from marketplace and confirm it executes correctly
- [ ] Edit an agent from monitor, and confirm it executes correctly
</details>
#### For configuration changes:
- [x] `.env.example` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)
<details>
<summary>Examples of configuration changes</summary>
- Changing ports
- Adding new services that need to communicate with each other
- Secrets or environment variable changes
- New or infrastructure changes such as databases
</details>
---------
Co-authored-by: Bentlybro <Github@bentlybro.com>
<!-- Clearly explain the need for these changes: -->
### Changes 🏗️
This change simply changes the logging level of node inputs and outputs
to debug level. This change is needed because currently logging all node
data causes logs that are too large for the logger to prevent nodes from
running.
<!-- Concisely describe all of the changes made in this pull request:
-->
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [ ] I have made a test plan
- [ ] I have tested my changes according to the test plan:
<!-- Put your test plan here: -->
- [x] ...
<details>
<summary>Example test plan</summary>
- [ ] Create from scratch and execute an agent with at least 3 blocks
- [ ] Import an agent from file upload, and confirm it executes
correctly
- [ ] Upload agent to marketplace
- [ ] Import an agent from marketplace and confirm it executes correctly
- [ ] Edit an agent from monitor, and confirm it executes correctly
</details>
#### For configuration changes:
- [x] `.env.example` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)
<details>
<summary>Examples of configuration changes</summary>
- Changing ports
- Adding new services that need to communicate with each other
- Secrets or environment variable changes
- New or infrastructure changes such as databases
</details>
We have seen instances where the executor gets stuck in a failing
message-consuming loop due to the upstream RabbitMQ being down. The
current message-consuming pattern is not optimal for handling this.
### Changes 🏗️
* Add a retry limit to the execution loop limit.
* Use `basic_consume` instead of `basic_get` for handling message
consumption.
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Run agents cancel them
### Changes 🏗️
The recent change to the execution cancelation fix turns out to only
work on the first request.
This PR change fixes it by reworking how the thread_cached work on async
functions.
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
<!-- Put your test plan here: -->
- [x] Cancel agent executions multiple times
<!-- Clearly explain the need for these changes: -->
Swap to pooling supabase connections rather than depending on x number
of max open connections
### Changes 🏗️
Adds direct connect URL to be used throughout the system
<!-- Concisely describe all of the changes made in this pull request:
-->
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
<!-- Put your test plan here: -->
- [x] Test thoroughly all of the endpoints in the dev env with switched
infra matching pr
- [x] Follow the new release plan tests
- [x] Follow the old release plan tests
#### For configuration changes:
- [x] `.env.example` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)
<details>
<summary>configuration changes</summary>
- Change how we connect to the database to use direct when configured
and database URL when not
- update prisma for this
- have default matching database and default
</details>
### Changes 🏗️
- Update onboarding to give user rewards for completing steps
- Remove `canvas-confetti` lib and add `party-js` instead; the former
didn't allow to play confetti from a component
- Add onboarding videos in `frontend/public/onboarding/`
- Remove Balance (`CreditsCard.tsx`) and add openable `Wallet.tsx` (and
accompanying `WalletTaskGroup.tsx`) instead that displays grouped
onboarding tasks with descriptions and short instructional videos
- Further relevant updates to `useOnboarding`, `types.ts`
- Implement onboarding rewards
- Add `onboarding_reward` function in `credit.py` that is used to reward
user for finished onboarding tasks safely - transaction key is
deterministic, so the same user won't be rewarded twice for the same
step.
- Add `reward_user` in `onboarding.py`
- Update `UserOnboarding` model and add a migration
<img width="464" alt="Screenshot 2025-04-05 at 6 06 29 PM"
src="https://github.com/user-attachments/assets/fca8d09e-0139-466b-b679-d24117ad01f0"
/>
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Onboarding works
- [x] Tasks can be completed
- [x] Rewards are added correctly for all completed tasks
Currently the execution task is not properly distributed between
executors because we need to send the execution request to the execution
server.
The execution manager now accepts the execution request from the message
queue. Thus, we can remove the synchronous RPC system from this service,
let the system focus on executing the agent, and not spare any process
for the HTTP API interface.
This will also reduce the risk of the execution service being too busy
and not able to accept any add execution requests.
### Changes 🏗️
* Remove the RPC system in Agent Executor
* Allow the cancellation of the execution that is still waiting in the
queue (by avoiding it from being executed).
* Make a unified helper for adding an execution request to the system
and move other execution-related helper functions into
`executor/utils.py`.
* Remove non-db connections (redis / rabbitmq) in Database Manager and
let the client manage this by themselves.
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Existing CI, some agent runs
The graph execution queue is not disk-persisted; when the executor dies,
the executions are lost.
The scope of this issue is migrating the execution queue from an
inter-process queue to a RabbitMQ message queue. A sync client should be
used for this.
- Resolves#9746
- Resolves#9714
### Changes 🏗️
Move the execution manager from multiprocess.Queue into persisted
Rabbit-MQ.
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Execute agents.
<details>
<summary>Example test plan</summary>
- [ ] Create from scratch and execute an agent with at least 3 blocks
- [ ] Import an agent from file upload, and confirm it executes
correctly
- [ ] Upload agent to marketplace
- [ ] Import an agent from marketplace and confirm it executes correctly
- [ ] Edit an agent from monitor, and confirm it executes correctly
</details>
#### For configuration changes:
- [ ] `.env.example` is updated or already compatible with my changes
- [ ] `docker-compose.yml` is updated or already compatible with my
changes
- [ ] I have included a list of my configuration changes in the PR
description (under **Changes**)
<details>
<summary>Examples of configuration changes</summary>
- Changing ports
- Adding new services that need to communicate with each other
- Secrets or environment variable changes
- New or infrastructure changes such as databases
</details>
<!-- Clearly explain the need for these changes: -->
We want to be able to filter errors according to where they occur in
sentry so we need to track and include that data. We also are not
logging everything from app services correctly so fix that up
### Changes 🏗️
<!-- Concisely describe all of the changes made in this pull request:
-->
- Adds env tracking for frontend
- adds sentry init in app service spawn
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
<!-- Put your test plan here: -->
- [x] Tested by running and making sure all events + logs are inserted
into sentry correctly
Distilled from #9541 to reduce the scope of that PR.
- Part of #9307
- ❗ Blocks #9786
- ❗ Blocks #9541
### Changes 🏗️
- Fix `LibraryAgent` schema (for #9786)
- Fix relationships between `LibraryAgent`, `AgentGraph`, and
`AgentPreset`
- Impose uniqueness constraint on `LibraryAgent`
- Rename things that are called `agent` that actually refer to a
`graph`/`agentGraph`
- Fix singular/plural forms in DB schema
- Simplify reference names of closely related objects (e.g.
`AgentGraph.AgentGraphExecutions` -> `AgentGraph.Executions`)
- Eliminate use of `# type: ignore` in DB statements
- Add `typed` and `typed_cast` utilities to `backend.util.type`
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] CI static type checking (with all risky `# type: ignore` removed)
- [x] Check that column references in views are updated
<!-- Clearly explain the need for these changes: -->
We were duplicating placeholder values across all agents 😨
### Changes 🏗️
<!-- Concisely describe all of the changes made in this pull request:
-->
Deep copies the schema instead
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
<!-- Put your test plan here: -->
- [x] Test the broken agent in dev
- Resolves#9792
### Changes 🏗️
- Replace all `default=[]` -> `default_factory=list`
- Replace all `default={}` -> `default_factory=dict`
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [ ] I have tested my changes according to the test plan:
<!-- Put your test plan here: -->
- [ ] CI
---------
Co-authored-by: Krzysztof Czerwinski <kpczerwinski@gmail.com>
The linter currently exits with exit code 0 even if linting fails. This
makes the CI linter permissive which isn't good.
Changes:
- Make linter exit with an error code if a linting step fails
- Fix existing formatting issues
This PR is to add the new [Meta: Llama 4
Maverick](https://openrouter.ai/meta-llama/llama-4-maverick) and [Meta:
Llama 4 Scout](https://openrouter.ai/meta-llama/llama-4-scout) models
via [OpenRouter](https://openrouter.ai/)
### Changes 🏗️
Added the model names to ``llm.py``
```
META_LLAMA_4_SCOUT = "meta-llama/llama-4-scout"
META_LLAMA_4_MAVERICK = "meta-llama/llama-4-maverick"
```
and the modela metadata
```
LlmModel.META_LLAMA_4_SCOUT: ModelMetadata("open_router", 131072, 131072),
LlmModel.META_LLAMA_4_MAVERICK: ModelMetadata("open_router", 1048576, 1000000),
```
and i have added the model price to ``block_cost_config.py``
```
LlmModel.META_LLAMA_4_SCOUT: 1,
LlmModel.META_LLAMA_4_MAVERICK: 1,
```
### Checklist 📋
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Open the build page and place a ai text block, open the model
select and scroll to the bottom and select either of the 2 models
- [x] test them with a prompt and wait for a reply!
This is a prerequisite infra change for
https://github.com/Significant-Gravitas/AutoGPT/issues/9714.
We will need a service where we can maintain our own client (db, redis,
rabbitmq, be it async/sync) and configure our own cadence of
initialization and cleanup.
While refactoring the service.py, an option to use Pyro as an RPC
protocol is also removed.
### Changes 🏗️
* Decouple resource initialization and cleanup from the parent
AppService logic.
* Removed Pyro.
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
<!-- Put your test plan here: -->
- [x] CI
<!-- Clearly explain the need for these changes: -->
Now that we are trying to use Sentry more, cleaning up some errors ->
warnings is a good idea
### Changes 🏗️
- reduces log level of retry to warning
<!-- Concisely describe all of the changes made in this pull request:
-->
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
<!-- Put your test plan here: -->
- [x] check it comes through in sentry
<!-- Clearly explain the need for these changes: -->
We got this error in
sentry:[AUTOGPT-SERVER-33P](https://significant-gravitas.sentry.io/issues/6462614597/events/bb4871d796b04e759ade55197498cff9/)
```
Level: Error
'Secrets' object has no attribute 'ProviderName.GOOGLE_client_id'
```
### Changes 🏗️
- Follows pattern used when accessing these in
`_get_provider_oauth_handler` in the router
<!-- Concisely describe all of the changes made in this pull request:
-->
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
<!-- Put your test plan here: -->
- [x] Test to make sure getting works
<!-- Clearly explain the need for these changes: -->
Sentry just released logs so lets enrich our details there too
### Changes 🏗️
- Adds sentry logging
- Adds dependencies tracking all of our sentry integrations
- Adds environment tracking to sentry
<!-- Concisely describe all of the changes made in this pull request:
-->
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
<!-- Put your test plan here: -->
- [x] Tested to make sure events show up in sentry with the correct
environment logging
- Resolves#9731
### Changes 🏗️
- feat: Add "Steps" showing `node_execution_count` to agent run view
- Add `GraphExecutionMeta.stats.node_exec_count` attribute
- feat(backend/executor): Send graph execution update after *every* node
execution (instead of only I/O node executions)
- Update graph execution stats after every node execution
- refactor: Move `GraphExecutionMeta` stats into sub-object
(`cost`, `duration`, `total_run_time` -> `stats.cost`, `stats.duration`,
`stats.node_exec_time`)
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- View an agent run with 1+ steps on `/library/agents/[id]`
- [x] "Info" section layout doesn't break
- [x] Number of steps is shown
- Initiate a new agent run
- [x] "Steps" increments in real time during execution
- Resolves#9679
### Changes 🏗️
Frontend:
- Fix crash on `payload` graph input
- Fix crash on object type agent I/O values
- Hide "+ New run" if `graph.webhook_id` is set
Backend:
- Add computed field `webhook_id` to `GraphModel`
- Add computed property `webhook_input_node` to `GraphModel`
- Refactor:
- Move `Node.webhook_id` -> `NodeModel.webhook_id`
- Move `NodeModel.block` -> `Node.block` (computed property)
- Replace `get_block(node.block_id)` with `node.block` where sensible
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Create and run a simple graph
- [x] Create a graph with a webhook trigger and ensure it works
- [x] Check out the runs of a webhook-triggered graph and ensure the
page works
<!-- Clearly explain the need for these changes: -->
Uvicorn and our logs were ending up in different places, this pr enures
uvicorn using our logging config, not their own.
### Changes 🏗️
- Clears uvicorn's loggers for rest, ws
- always log to stdout,stderr and additionally log to gcp is appropriate
<!-- Concisely describe all of the changes made in this pull request:
-->
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
<!-- Put your test plan here: -->
- [x] Test all possible variants of the log cloud vs not and ensure that
uvicorn logs show up in the same place that rest of the system logs do
for all
---------
Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
<!-- Clearly explain the need for these changes: -->
One of the pull request review notes from when these were first made is
that they don't belong in the v2 folder. This pr fixes where they are.
### Changes 🏗️
- Moves from v2 to routers for the postmark tooling
<!-- Concisely describe all of the changes made in this pull request:
-->
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
<!-- Put your test plan here: -->
- [x] Check that linting and tests pass
---------
Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
<!-- Clearly explain the need for these changes: -->
We have logged 272k timeout errors in the past week from the event loop.
Don't raise those as errors.
Also along the way for diagnosing this we found that some items were
inserted into batches with incomplete datasets so handle that too.
### Changes 🏗️
- Handle timeout errors explicitly
- Add better messaging for other error types
- Add filtering for queueing bad mezsaging
- add filtering for reading bad batches
<!-- Concisely describe all of the changes made in this pull request:
-->
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
<!-- Put your test plan here: -->
- [x] Pull dev db
- [x] Test new code to check stability + error reduction
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
The cleanup command was only called on SIGTERM, making it possible for
the service to close without being cleaned. Risking the connection not
being proactively closed when the service is unused.
### Changes 🏗️
Call the cleanup command on the service finally block.
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Run the service, stop it, see the log is printed (locally)
<!-- Clearly explain the need for these changes: -->
I want to be able to insert data into the graph as a webhook from
various services without making a provider specific webhook for things
like discord, slack, uptime bots, etc.
### Changes 🏗️
- Adds a generic webhook block that others can use
<!-- Concisely describe all of the changes made in this pull request:
-->
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
<!-- Put your test plan here: -->
- [x] Test the endpoint that is generated with a graph, making sure to
pass data and consts to it