Compare commits

...

7 Commits

Author SHA1 Message Date
Zamil Majdy
c7056136d2 fix(backend/db): expose /ready endpoint via POST and GET for consistency 2026-03-23 23:00:32 +07:00
Zamil Majdy
5df20cbbf7 fix(backend/db): add memory-aware readiness probe for database-manager
Add /ready endpoint that checks process memory usage against a
configurable threshold (READINESS_MEMORY_LIMIT env var, default 1.5GB).

When memory exceeds the threshold, the readiness probe fails, causing
Kubernetes to remove the pod from Service endpoints. In-flight requests
drain gracefully while the liveness probe (/health_check) continues to
pass. Once the pod is fully drained, Kubernetes terminates and restarts
it with fresh memory.

Base AppService gets a default readiness_check() that delegates to
health_check(), so other services are unaffected.
2026-03-23 22:56:27 +07:00
Zamil Majdy
c141e3cd33 fix(skills): replace test results table placeholder with variable template
Add TEST_RESULTS_TABLE variable with instructions for the agent to populate
it with actual test results before posting the PR comment.
2026-03-23 22:54:43 +07:00
Zamil Majdy
2927dce5c3 fix(skills): use associative array for screenshot explanations instead of placeholder
Replace static SCREENSHOT_EXPLANATION placeholder with a declare -A map that
the agent populates from Step 6 observations before running the loop.
2026-03-23 22:49:48 +07:00
Zamil Majdy
645eb59d60 fix(skills): unify explanation length to 1-2 sentences across Steps 6 and 7 2026-03-23 22:45:23 +07:00
Zamil Majdy
221873ccc0 fix(skills): address PR review comments on /pr-test skill
- Replace HTML placeholder comments with clear instruction to insert real
  screenshot explanations from Step 6 observations
- Make Step 6 summary table consistent with Step 7 (3 columns, no Screenshot col)
- Use --body-file for gh api comment posting to avoid shell interpretation issues
- Add markdown language identifier to fenced code block example
2026-03-23 22:31:37 +07:00
Zamil Majdy
26961444d9 feat(devops): improve /pr-test skill to show screenshots with explanations
Update the pr-test skill to:
- Take screenshots at every significant test step (not just failures)
- Show every screenshot to the user via Read tool with 2-3 sentence explanations
- Post PR comments with inline images and per-screenshot descriptions
- Simplify the GitHub Git API upload flow (single tree build)
- Add clear formatting guidance for the PR comment structure

Previously screenshots were saved locally but not shown to the user or
included in PR comments with context. Now every screenshot gets displayed
inline with an explanation of what it proves.
2026-03-23 21:33:50 +07:00
3 changed files with 128 additions and 68 deletions

View File

@@ -346,59 +346,54 @@ agent-browser --session-name pr-test open 'http://localhost:3000/copilot' --time
# ... fill chat input and press Enter, wait 20-30s for response
```
## Step 5: Record results
## Step 5: Record results and take screenshots
For each test scenario, record in `$RESULTS_DIR/test-report.md`:
**Take a screenshot at every significant test step** — before and after interactions, on success, and on failure. Name them sequentially with descriptive names:
```markdown
# E2E Test Report: PR #{N} — {title}
Date: {date}
Branch: {branch}
Worktree: {path}
## Environment
- Docker services: [list running containers]
- API keys: OpenRouter={present/missing}, E2B={present/missing}
## Test Results
### Scenario 1: {name}
**Steps:**
1. ...
2. ...
**Expected:** ...
**Actual:** ...
**Result:** PASS / FAIL
**Screenshot:** {filename}.png
**Logs:** (if relevant)
### Scenario 2: {name}
...
## Summary
- Total: X scenarios
- Passed: Y
- Failed: Z
- Bugs found: [list]
```
Take screenshots at each significant step:
```bash
agent-browser --session-name pr-test screenshot $RESULTS_DIR/{NN}-{description}.png
# Examples:
# $RESULTS_DIR/01-login-page.png
# $RESULTS_DIR/02-builder-with-block.png
# $RESULTS_DIR/03-copilot-response.png
# $RESULTS_DIR/04-agent-execution-result.png
# $RESULTS_DIR/05-error-state.png
```
## Step 6: Report results
**Aim for at least one screenshot per test scenario.** More is better — screenshots are the primary evidence that tests were actually run.
After all tests complete, output a summary to the user:
## Step 6: Show results to user with screenshots
1. Table of all scenarios with PASS/FAIL
2. Screenshots of failures (read the PNG files to show them)
3. Any bugs found with details
4. Recommendations
**CRITICAL: After all tests complete, you MUST show every screenshot to the user using the Read tool, with an explanation of what each screenshot shows.** This is the most important part of the test report — the user needs to visually verify the results.
### Post test results as PR comment with screenshots
For each screenshot:
1. Use the `Read` tool to display the PNG file (Claude can read images)
2. Write a 1-2 sentence explanation below it describing:
- What page/state is being shown
- What the screenshot proves (which test scenario it validates)
- Any notable details visible in the UI
Upload screenshots to the PR using the GitHub Git API (no local git operations — safe for worktrees).
Format the output like this:
```markdown
### Screenshot 1: {descriptive title}
[Read the PNG file here]
**What it shows:** {1-2 sentence explanation of what this screenshot proves}
---
```
After showing all screenshots, output a summary table:
| # | Scenario | Result |
|---|----------|--------|
| 1 | {name} | PASS/FAIL |
| 2 | ... | ... |
## Step 7: Post test report as PR comment with screenshots
Upload screenshots to the PR using the GitHub Git API (no local git operations — safe for worktrees), then post a comment with inline images and per-screenshot explanations.
```bash
# Upload screenshots via GitHub Git API (creates blobs, tree, commit, and ref remotely)
@@ -406,17 +401,7 @@ REPO="Significant-Gravitas/AutoGPT"
SCREENSHOTS_BRANCH="test-screenshots/pr-${PR_NUMBER}"
SCREENSHOTS_DIR="test-screenshots/PR-${PR_NUMBER}"
# Step 1: Create blobs for each screenshot
declare -a TREE_ENTRIES
for img in $RESULTS_DIR/*.png; do
BASENAME=$(basename "$img")
B64=$(base64 < "$img")
BLOB_SHA=$(gh api "repos/${REPO}/git/blobs" -f content="$B64" -f encoding="base64" --jq '.sha')
TREE_ENTRIES+=("-f" "tree[][path]=${SCREENSHOTS_DIR}/${BASENAME}" "-f" "tree[][mode]=100644" "-f" "tree[][type]=blob" "-f" "tree[][sha]=${BLOB_SHA}")
done
# Step 2: Create a tree with all screenshot blobs
# Build the tree JSON manually since gh api doesn't handle arrays well
# Step 1: Create blobs for each screenshot and build tree JSON
TREE_JSON='['
FIRST=true
for img in $RESULTS_DIR/*.png; do
@@ -428,42 +413,72 @@ for img in $RESULTS_DIR/*.png; do
done
TREE_JSON+=']'
TREE_SHA=$(echo "$TREE_JSON" | gh api "repos/${REPO}/git/trees" --input - -f base_tree="" --jq '.sha' 2>/dev/null \
|| echo "$TREE_JSON" | jq -c '{tree: .}' | gh api "repos/${REPO}/git/trees" --input - --jq '.sha')
# Step 3: Create a commit pointing to that tree
# Step 2: Create tree, commit, and branch ref
TREE_SHA=$(echo "$TREE_JSON" | jq -c '{tree: .}' | gh api "repos/${REPO}/git/trees" --input - --jq '.sha')
COMMIT_SHA=$(gh api "repos/${REPO}/git/commits" \
-f message="test: add E2E test screenshots for PR #${PR_NUMBER}" \
-f tree="$TREE_SHA" \
--jq '.sha')
# Step 4: Create or update the ref (branch) — no local checkout needed
gh api "repos/${REPO}/git/refs" \
-f ref="refs/heads/${SCREENSHOTS_BRANCH}" \
-f sha="$COMMIT_SHA" 2>/dev/null \
|| gh api "repos/${REPO}/git/refs/heads/${SCREENSHOTS_BRANCH}" \
-X PATCH -f sha="$COMMIT_SHA" -f force=true
```
# Step 5: Build image markdown and post the comment
Then post the comment with **inline images AND explanations for each screenshot**:
```bash
REPO_URL="https://raw.githubusercontent.com/${REPO}/${SCREENSHOTS_BRANCH}"
# Build image markdown — each screenshot gets a heading, image, and explanation.
# This loop is a TEMPLATE. Before running it, you must define a SCREENSHOT_EXPLANATIONS
# associative array mapping each basename to a 1-2 sentence explanation from Step 6.
# Example: declare -A SCREENSHOT_EXPLANATIONS=(
# ["01-login-page.png"]="Shows the login page loaded successfully with SSO options visible."
# ["02-builder-with-block.png"]="The builder canvas displays the newly added block connected to the trigger."
# )
declare -A SCREENSHOT_EXPLANATIONS
# ^^^ Populate this with real explanations from Step 6 before running the loop
IMAGE_MARKDOWN=""
for img in $RESULTS_DIR/*.png; do
BASENAME=$(basename "$img")
IMAGE_MARKDOWN="$IMAGE_MARKDOWN
![${BASENAME}](${REPO_URL}/${SCREENSHOTS_DIR}/${BASENAME})"
TITLE=$(echo "${BASENAME%.png}" | sed 's/^[0-9]*-//' | sed 's/-/ /g' | awk '{for(i=1;i<=NF;i++) $i=toupper(substr($i,1,1)) tolower(substr($i,2))}1')
EXPLANATION="${SCREENSHOT_EXPLANATIONS[$BASENAME]}"
IMAGE_MARKDOWN="${IMAGE_MARKDOWN}
### ${TITLE}
![${BASENAME}](${REPO_URL}/${SCREENSHOTS_DIR}/${BASENAME})
${EXPLANATION}
"
done
gh api "repos/${REPO}/issues/$PR_NUMBER/comments" -f body="$(cat <<EOF
# Write comment body to file to avoid shell interpretation issues with special characters.
# IMPORTANT: Before running this, you must replace TEST_RESULTS_TABLE below with actual
# markdown table rows from your test results, e.g.:
# TEST_RESULTS_TABLE="| 1 | Login flow | PASS |
# | 2 | Copilot chat | PASS |
# | 3 | Agent execution | FAIL |"
COMMENT_FILE=$(mktemp)
cat > "$COMMENT_FILE" <<INNEREOF
## 🧪 E2E Test Report
$(cat $RESULTS_DIR/test-report.md)
| # | Scenario | Result |
|---|----------|--------|
${TEST_RESULTS_TABLE}
### Screenshots
${IMAGE_MARKDOWN}
EOF
)"
INNEREOF
gh api "repos/${REPO}/issues/$PR_NUMBER/comments" -F body=@"$COMMENT_FILE"
rm -f "$COMMENT_FILE"
```
**The PR comment MUST include:**
1. A summary table of all scenarios with PASS/FAIL
2. Every screenshot rendered inline (not just linked)
3. A 1-2 sentence explanation below each screenshot describing what it proves
This approach uses the GitHub Git API to create blobs, trees, commits, and refs entirely server-side. No local `git checkout` or `git push` — safe for worktrees and won't interfere with the PR branch.
## Fix mode (--fix flag)

View File

@@ -1,4 +1,6 @@
import logging
import os
import resource
from contextlib import asynccontextmanager
from typing import TYPE_CHECKING, Callable, Concatenate, ParamSpec, TypeVar, cast
@@ -173,6 +175,42 @@ class DatabaseManager(AppService):
logger.info(f"[{self.service_name}] ⏳ Disconnecting Database...")
await db.disconnect()
def _get_memory_usage_bytes(self) -> int:
"""Get current process memory usage in bytes (RSS).
Tries cgroup v2 first (GKE with containerd), then cgroup v1,
then falls back to getrusage.
"""
for path in (
"/sys/fs/cgroup/memory.current", # cgroup v2
"/sys/fs/cgroup/memory/memory.usage_in_bytes", # cgroup v1
):
try:
with open(path) as f:
return int(f.read().strip())
except (FileNotFoundError, PermissionError):
continue
# Fallback: RSS from getrusage (macOS returns bytes, Linux returns KB)
rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
return rss if os.uname().sysname == "Darwin" else rss * 1024
async def readiness_check(self) -> str:
"""Readiness probe: fails when memory exceeds threshold.
When memory is too high, Kubernetes removes the pod from Service
endpoints (no new traffic), letting in-flight requests drain before
the liveness probe eventually kills and restarts the pod.
"""
# Default 1.5GB; configurable via env var
max_memory = int(os.environ.get("READINESS_MEMORY_LIMIT", 1_610_612_736))
current = self._get_memory_usage_bytes()
if current > max_memory:
raise UnhealthyServiceError(
f"Memory {current / 1024**2:.0f}Mi exceeds readiness limit "
f"{max_memory / 1024**2:.0f}Mi"
)
return await self.health_check()
async def health_check(self) -> str:
if not db.is_connected():
raise UnhealthyServiceError("Database is not connected")

View File

@@ -372,6 +372,10 @@ class AppService(BaseAppService, ABC):
"""A method to check the health of the process."""
return "OK"
async def readiness_check(self) -> str:
"""Readiness probe. Override to add memory/resource checks."""
return await self.health_check()
def run(self):
sentry_init()
super().run()
@@ -411,6 +415,9 @@ class AppService(BaseAppService, ABC):
self.fastapi_app.add_api_route(
"/health_check_async", self.health_check, methods=["POST", "GET"]
)
self.fastapi_app.add_api_route(
"/ready", self.readiness_check, methods=["POST", "GET"]
)
self.fastapi_app.add_exception_handler(
ValueError, self._handle_internal_http_error(400)
)