chore: remove test.db from tracking

style(classic): fix code formatting with black
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-20 04:28:09 -05:00 · 2026-01-20 01:24:00 -06:00 · 2026-01-20 01:23:51 -06:00 · 2026-01-20 01:23:33 -06:00 · 2026-01-20 01:21:50 -06:00 · 2026-01-20 01:16:38 -06:00
2750 changed files with 82456 additions and 823398 deletions
--- a/.claude/skills/vercel-react-best-practices/AGENTS.md
+++ b/.claude/skills/vercel-react-best-practices/AGENTS.md
--- a/.claude/skills/vercel-react-best-practices/SKILL.md
+++ b/.claude/skills/vercel-react-best-practices/SKILL.md
@@ -0,0 +1,125 @@
 ---
 name: vercel-react-best-practices
 description: React and Next.js performance optimization guidelines from Vercel Engineering. This skill should be used when writing, reviewing, or refactoring React/Next.js code to ensure optimal performance patterns. Triggers on tasks involving React components, Next.js pages, data fetching, bundle optimization, or performance improvements.
 license: MIT
 metadata:
  author: vercel
  version: "1.0.0"
 ---
 # Vercel React Best Practices
 Comprehensive performance optimization guide for React and Next.js applications, maintained by Vercel. Contains 45 rules across 8 categories, prioritized by impact to guide automated refactoring and code generation.
 ## When to Apply
 Reference these guidelines when:
 - Writing new React components or Next.js pages
 - Implementing data fetching (client or server-side)
 - Reviewing code for performance issues
 - Refactoring existing React/Next.js code
 - Optimizing bundle size or load times
 ## Rule Categories by Priority
 | Priority | Category | Impact | Prefix |
 |----------|----------|--------|--------|
 | 1 | Eliminating Waterfalls | CRITICAL | `async-` |
 | 2 | Bundle Size Optimization | CRITICAL | `bundle-` |
 | 3 | Server-Side Performance | HIGH | `server-` |
 | 4 | Client-Side Data Fetching | MEDIUM-HIGH | `client-` |
 | 5 | Re-render Optimization | MEDIUM | `rerender-` |
 | 6 | Rendering Performance | MEDIUM | `rendering-` |
 | 7 | JavaScript Performance | LOW-MEDIUM | `js-` |
 | 8 | Advanced Patterns | LOW | `advanced-` |
 ## Quick Reference
 ### 1. Eliminating Waterfalls (CRITICAL)
 - `async-defer-await` - Move await into branches where actually used
 - `async-parallel` - Use Promise.all() for independent operations
 - `async-dependencies` - Use better-all for partial dependencies
 - `async-api-routes` - Start promises early, await late in API routes
 - `async-suspense-boundaries` - Use Suspense to stream content
 ### 2. Bundle Size Optimization (CRITICAL)
 - `bundle-barrel-imports` - Import directly, avoid barrel files
 - `bundle-dynamic-imports` - Use next/dynamic for heavy components
 - `bundle-defer-third-party` - Load analytics/logging after hydration
 - `bundle-conditional` - Load modules only when feature is activated
 - `bundle-preload` - Preload on hover/focus for perceived speed
 ### 3. Server-Side Performance (HIGH)
 - `server-cache-react` - Use React.cache() for per-request deduplication
 - `server-cache-lru` - Use LRU cache for cross-request caching
 - `server-serialization` - Minimize data passed to client components
 - `server-parallel-fetching` - Restructure components to parallelize fetches
 - `server-after-nonblocking` - Use after() for non-blocking operations
 ### 4. Client-Side Data Fetching (MEDIUM-HIGH)
 - `client-swr-dedup` - Use SWR for automatic request deduplication
 - `client-event-listeners` - Deduplicate global event listeners
 ### 5. Re-render Optimization (MEDIUM)
 - `rerender-defer-reads` - Don't subscribe to state only used in callbacks
 - `rerender-memo` - Extract expensive work into memoized components
 - `rerender-dependencies` - Use primitive dependencies in effects
 - `rerender-derived-state` - Subscribe to derived booleans, not raw values
 - `rerender-functional-setstate` - Use functional setState for stable callbacks
 - `rerender-lazy-state-init` - Pass function to useState for expensive values
 - `rerender-transitions` - Use startTransition for non-urgent updates
 ### 6. Rendering Performance (MEDIUM)
 - `rendering-animate-svg-wrapper` - Animate div wrapper, not SVG element
 - `rendering-content-visibility` - Use content-visibility for long lists
 - `rendering-hoist-jsx` - Extract static JSX outside components
 - `rendering-svg-precision` - Reduce SVG coordinate precision
 - `rendering-hydration-no-flicker` - Use inline script for client-only data
 - `rendering-activity` - Use Activity component for show/hide
 - `rendering-conditional-render` - Use ternary, not && for conditionals
 ### 7. JavaScript Performance (LOW-MEDIUM)
 - `js-batch-dom-css` - Group CSS changes via classes or cssText
 - `js-index-maps` - Build Map for repeated lookups
 - `js-cache-property-access` - Cache object properties in loops
 - `js-cache-function-results` - Cache function results in module-level Map
 - `js-cache-storage` - Cache localStorage/sessionStorage reads
 - `js-combine-iterations` - Combine multiple filter/map into one loop
 - `js-length-check-first` - Check array length before expensive comparison
 - `js-early-exit` - Return early from functions
 - `js-hoist-regexp` - Hoist RegExp creation outside loops
 - `js-min-max-loop` - Use loop for min/max instead of sort
 - `js-set-map-lookups` - Use Set/Map for O(1) lookups
 - `js-tosorted-immutable` - Use toSorted() for immutability
 ### 8. Advanced Patterns (LOW)
 - `advanced-event-handler-refs` - Store event handlers in refs
 - `advanced-use-latest` - useLatest for stable callback refs
 ## How to Use
 Read individual rule files for detailed explanations and code examples:
 ```
 rules/async-parallel.md
 rules/bundle-barrel-imports.md
 rules/_sections.md
 ```
 Each rule file contains:
 - Brief explanation of why it matters
 - Incorrect code example with explanation
 - Correct code example with explanation
 - Additional context and references
 ## Full Compiled Document
 For the complete guide with all rules expanded: `AGENTS.md`
--- a/.claude/skills/vercel-react-best-practices/rules/advanced-event-handler-refs.md
+++ b/.claude/skills/vercel-react-best-practices/rules/advanced-event-handler-refs.md
@@ -0,0 +1,55 @@
 ---
 title: Store Event Handlers in Refs
 impact: LOW
 impactDescription: stable subscriptions
 tags: advanced, hooks, refs, event-handlers, optimization
 ---
 ## Store Event Handlers in Refs
 Store callbacks in refs when used in effects that shouldn't re-subscribe on callback changes.
 **Incorrect (re-subscribes on every render):**
 ```tsx
 function useWindowEvent(event: string, handler: () => void) {
  useEffect(() => {
    window.addEventListener(event, handler)
    return () => window.removeEventListener(event, handler)
  }, [event, handler])
 }
 ```
 **Correct (stable subscription):**
 ```tsx
 function useWindowEvent(event: string, handler: () => void) {
  const handlerRef = useRef(handler)
  useEffect(() => {
    handlerRef.current = handler
  }, [handler])
  useEffect(() => {
    const listener = () => handlerRef.current()
    window.addEventListener(event, listener)
    return () => window.removeEventListener(event, listener)
  }, [event])
 }
 ```
 **Alternative: use `useEffectEvent` if you're on the latest React:**
 ```tsx
 import { useEffectEvent } from 'react'
 function useWindowEvent(event: string, handler: () => void) {
  const onEvent = useEffectEvent(handler)
  useEffect(() => {
    window.addEventListener(event, onEvent)
    return () => window.removeEventListener(event, onEvent)
  }, [event])
 }
 ```
 `useEffectEvent` provides a cleaner API for the same pattern: it creates a stable function reference that always calls the latest version of the handler.
--- a/.claude/skills/vercel-react-best-practices/rules/advanced-use-latest.md
+++ b/.claude/skills/vercel-react-best-practices/rules/advanced-use-latest.md
@@ -0,0 +1,49 @@
 ---
 title: useLatest for Stable Callback Refs
 impact: LOW
 impactDescription: prevents effect re-runs
 tags: advanced, hooks, useLatest, refs, optimization
 ---
 ## useLatest for Stable Callback Refs
 Access latest values in callbacks without adding them to dependency arrays. Prevents effect re-runs while avoiding stale closures.
 **Implementation:**
 ```typescript
 function useLatest<T>(value: T) {
  const ref = useRef(value)
  useEffect(() => {
    ref.current = value
  }, [value])
  return ref
 }
 ```
 **Incorrect (effect re-runs on every callback change):**
 ```tsx
 function SearchInput({ onSearch }: { onSearch: (q: string) => void }) {
  const [query, setQuery] = useState('')
  useEffect(() => {
    const timeout = setTimeout(() => onSearch(query), 300)
    return () => clearTimeout(timeout)
  }, [query, onSearch])
 }
 ```
 **Correct (stable effect, fresh callback):**
 ```tsx
 function SearchInput({ onSearch }: { onSearch: (q: string) => void }) {
  const [query, setQuery] = useState('')
  const onSearchRef = useLatest(onSearch)
  useEffect(() => {
    const timeout = setTimeout(() => onSearchRef.current(query), 300)
    return () => clearTimeout(timeout)
  }, [query])
 }
 ```
--- a/.claude/skills/vercel-react-best-practices/rules/async-api-routes.md
+++ b/.claude/skills/vercel-react-best-practices/rules/async-api-routes.md
@@ -0,0 +1,38 @@
 ---
 title: Prevent Waterfall Chains in API Routes
 impact: CRITICAL
 impactDescription: 2-10× improvement
 tags: api-routes, server-actions, waterfalls, parallelization
 ---
 ## Prevent Waterfall Chains in API Routes
 In API routes and Server Actions, start independent operations immediately, even if you don't await them yet.
 **Incorrect (config waits for auth, data waits for both):**
 ```typescript
 export async function GET(request: Request) {
  const session = await auth()
  const config = await fetchConfig()
  const data = await fetchData(session.user.id)
  return Response.json({ data, config })
 }
 ```
 **Correct (auth and config start immediately):**
 ```typescript
 export async function GET(request: Request) {
  const sessionPromise = auth()
  const configPromise = fetchConfig()
  const session = await sessionPromise
  const [config, data] = await Promise.all([
    configPromise,
    fetchData(session.user.id)
  ])
  return Response.json({ data, config })
 }
 ```
 For operations with more complex dependency chains, use `better-all` to automatically maximize parallelism (see Dependency-Based Parallelization).
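The ordering described above can be observed without a real server. The following is a minimal sketch, assuming a hypothetical `step` helper that stands in for `auth()`, `fetchConfig()`, and `fetchData()` and logs when each one starts and finishes; none of these names come from Next.js.

```typescript
const events: string[] = []

// Hypothetical async step that logs when it starts and when it finishes.
async function step<T>(name: string, value: T): Promise<T> {
  events.push(`start ${name}`)
  await Promise.resolve() // yield once, as real I/O would
  events.push(`end ${name}`)
  return value
}

async function handle(): Promise<{ config: { theme: string }; data: string }> {
  // Both independent calls start here; nothing is awaited yet.
  const sessionPromise = step('auth', { userId: 'u1' })
  const configPromise = step('config', { theme: 'dark' })

  // Only the dependent call has to wait for the session.
  const session = await sessionPromise
  const [config, data] = await Promise.all([
    configPromise,
    step('data', `rows for ${session.userId}`),
  ])
  return { config, data }
}
```

After `handle()` resolves, `events` begins with `start auth` followed by `start config`, and `start data` appears only after `end auth`.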
--- a/.claude/skills/vercel-react-best-practices/rules/async-defer-await.md
+++ b/.claude/skills/vercel-react-best-practices/rules/async-defer-await.md
@@ -0,0 +1,80 @@
 ---
 title: Defer Await Until Needed
 impact: HIGH
 impactDescription: avoids blocking unused code paths
 tags: async, await, conditional, optimization
 ---
 ## Defer Await Until Needed
 Move `await` operations into the branches where they're actually used to avoid blocking code paths that don't need them.
 **Incorrect (blocks both branches):**
 ```typescript
 async function handleRequest(userId: string, skipProcessing: boolean) {
  const userData = await fetchUserData(userId)
  if (skipProcessing) {
    // Returns immediately but still waited for userData
    return { skipped: true }
  }
  // Only this branch uses userData
  return processUserData(userData)
 }
 ```
 **Correct (only blocks when needed):**
 ```typescript
 async function handleRequest(userId: string, skipProcessing: boolean) {
  if (skipProcessing) {
    // Returns immediately without waiting
    return { skipped: true }
  }
  // Fetch only when needed
  const userData = await fetchUserData(userId)
  return processUserData(userData)
 }
 ```
 **Another example (early return optimization):**
 ```typescript
 // Incorrect: always fetches permissions
 async function updateResource(resourceId: string, userId: string) {
  const permissions = await fetchPermissions(userId)
  const resource = await getResource(resourceId)
  if (!resource) {
    return { error: 'Not found' }
  }
  if (!permissions.canEdit) {
    return { error: 'Forbidden' }
  }
  return await updateResourceData(resource, permissions)
 }
 // Correct: fetches only when needed
 async function updateResource(resourceId: string, userId: string) {
  const resource = await getResource(resourceId)
  if (!resource) {
    return { error: 'Not found' }
  }
  const permissions = await fetchPermissions(userId)
  if (!permissions.canEdit) {
    return { error: 'Forbidden' }
  }
  return await updateResourceData(resource, permissions)
 }
 ```
 This optimization is especially valuable when the skipped branch is frequently taken, or when the deferred operation is expensive.
--- a/.claude/skills/vercel-react-best-practices/rules/async-dependencies.md
+++ b/.claude/skills/vercel-react-best-practices/rules/async-dependencies.md
@@ -0,0 +1,36 @@
 ---
 title: Dependency-Based Parallelization
 impact: CRITICAL
 impactDescription: 2-10× improvement
 tags: async, parallelization, dependencies, better-all
 ---
 ## Dependency-Based Parallelization
 For operations with partial dependencies, use `better-all` to maximize parallelism. It automatically starts each task at the earliest possible moment.
 **Incorrect (profile waits for config unnecessarily):**
 ```typescript
 const [user, config] = await Promise.all([
  fetchUser(),
  fetchConfig()
 ])
 const profile = await fetchProfile(user.id)
 ```
 **Correct (config and profile run in parallel):**
 ```typescript
 import { all } from 'better-all'
 const { user, config, profile } = await all({
  async user() { return fetchUser() },
  async config() { return fetchConfig() },
  async profile() {
    return fetchProfile((await this.$.user).id)
  }
 })
 ```
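If adding a dependency is not an option, the same dependency graph can be approximated with plain promise chaining. This is a sketch of that alternative technique, not of how `better-all` works internally; the `task` helper and its `ticks` parameter are hypothetical stand-ins for `fetchUser`, `fetchConfig`, and `fetchProfile`.

```typescript
const trace: string[] = []

// Hypothetical async task that needs `ticks` microtask turns to finish.
async function task<T>(name: string, value: T, ticks: number): Promise<T> {
  trace.push(`start ${name}`)
  for (let i = 0; i < ticks; i++) {
    await Promise.resolve()
  }
  trace.push(`end ${name}`)
  return value
}

async function load() {
  // user and config are independent, so both start immediately.
  const userPromise = task('user', { id: 7 }, 1)
  const configPromise = task('config', { theme: 'dark' }, 10)

  // profile depends only on user, so it is chained off userPromise alone
  // and does not wait for the slower config task.
  const profilePromise = userPromise.then(user =>
    task('profile', `profile-${user.id}`, 1)
  )

  const [user, config, profile] = await Promise.all([
    userPromise,
    configPromise,
    profilePromise,
  ])
  return { user, config, profile }
}
```

Manual chaining stays readable for a graph this small; with more nodes the bookkeeping grows quickly, which is the case a dedicated helper is meant for.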
 Reference: [https://github.com/shuding/better-all](https://github.com/shuding/better-all)
--- a/.claude/skills/vercel-react-best-practices/rules/async-parallel.md
+++ b/.claude/skills/vercel-react-best-practices/rules/async-parallel.md
@@ -0,0 +1,28 @@
 ---
 title: Promise.all() for Independent Operations
 impact: CRITICAL
 impactDescription: 2-10× improvement
 tags: async, parallelization, promises, waterfalls
 ---
 ## Promise.all() for Independent Operations
 When async operations have no interdependencies, execute them concurrently using `Promise.all()`.
 **Incorrect (sequential execution, 3 round trips):**
 ```typescript
 const user = await fetchUser()
 const posts = await fetchPosts()
 const comments = await fetchComments()
 ```
 **Correct (parallel execution, 1 round trip):**
 ```typescript
 const [user, posts, comments] = await Promise.all([
  fetchUser(),
  fetchPosts(),
  fetchComments()
 ])
 ```
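To make the difference observable, the sketch below counts how many calls overlap in each style. `fakeFetch` is a hypothetical stand-in for `fetchUser()`, `fetchPosts()`, and `fetchComments()`; it is an assumption for illustration, not part of any library.

```typescript
// Hypothetical stand-in for a network request. It tracks how many calls
// are in progress at once and remembers the highest count seen.
let active = 0
let peak = 0

async function fakeFetch<T>(value: T): Promise<T> {
  active++
  peak = Math.max(peak, active)
  await Promise.resolve() // yield once, as real I/O would
  active--
  return value
}

// Sequential: each call starts only after the previous one has settled.
async function loadSequential(): Promise<{ values: string[]; peak: number }> {
  peak = 0
  const user = await fakeFetch('user')
  const posts = await fakeFetch('posts')
  const comments = await fakeFetch('comments')
  return { values: [user, posts, comments], peak }
}

// Parallel: all three calls start before any of them is awaited.
async function loadParallel(): Promise<{ values: string[]; peak: number }> {
  peak = 0
  const values = await Promise.all([
    fakeFetch('user'),
    fakeFetch('posts'),
    fakeFetch('comments'),
  ])
  return { values, peak }
}
```

Both functions return the same values in the same order. Only the overlap differs: at most one call is in progress in the sequential version, and all three overlap in the parallel one.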
--- a/.claude/skills/vercel-react-best-practices/rules/async-suspense-boundaries.md
+++ b/.claude/skills/vercel-react-best-practices/rules/async-suspense-boundaries.md
@@ -0,0 +1,99 @@
 ---
 title: Strategic Suspense Boundaries
 impact: HIGH
 impactDescription: faster initial paint
 tags: async, suspense, streaming, layout-shift
 ---
 ## Strategic Suspense Boundaries
 Instead of awaiting data in async components before returning JSX, use Suspense boundaries to show the wrapper UI faster while data loads.
 **Incorrect (wrapper blocked by data fetching):**
 ```tsx
 async function Page() {
  const data = await fetchData() // Blocks entire page
  return (
    <div>
      <div>Sidebar</div>
      <div>Header</div>
      <div>
        <DataDisplay data={data} />
      </div>
      <div>Footer</div>
    </div>
  )
 }
 ```
 The entire layout waits for data even though only the middle section needs it.
 **Correct (wrapper shows immediately, data streams in):**
 ```tsx
 function Page() {
  return (
    <div>
      <div>Sidebar</div>
      <div>Header</div>
      <div>
        <Suspense fallback={<Skeleton />}>
          <DataDisplay />
        </Suspense>
      </div>
      <div>Footer</div>
    </div>
  )
 }
 async function DataDisplay() {
  const data = await fetchData() // Only blocks this component
  return <div>{data.content}</div>
 }
 ```
 Sidebar, Header, and Footer render immediately. Only DataDisplay waits for data.
 **Alternative (share promise across components):**
 ```tsx
 function Page() {
  // Start fetch immediately, but don't await
  const dataPromise = fetchData()
  return (
    <div>
      <div>Sidebar</div>
      <div>Header</div>
      <Suspense fallback={<Skeleton />}>
        <DataDisplay dataPromise={dataPromise} />
        <DataSummary dataPromise={dataPromise} />
      </Suspense>
      <div>Footer</div>
    </div>
  )
 }
 function DataDisplay({ dataPromise }: { dataPromise: Promise<Data> }) {
  const data = use(dataPromise) // Unwraps the promise
  return <div>{data.content}</div>
 }
 function DataSummary({ dataPromise }: { dataPromise: Promise<Data> }) {
  const data = use(dataPromise) // Reuses the same promise
  return <div>{data.summary}</div>
 }
 ```
 Both components share the same promise, so only one fetch occurs. Layout renders immediately while both components wait together.
 **When NOT to use this pattern:**
 - Critical data needed for layout decisions (affects positioning)
 - SEO-critical content above the fold
 - Small, fast queries where suspense overhead isn't worth it
 - When you want to avoid layout shift (loading → content jump)
 **Trade-off:** Faster initial paint vs potential layout shift. Choose based on your UX priorities.
--- a/.claude/skills/vercel-react-best-practices/rules/bundle-barrel-imports.md
+++ b/.claude/skills/vercel-react-best-practices/rules/bundle-barrel-imports.md
@@ -0,0 +1,59 @@
 ---
 title: Avoid Barrel File Imports
 impact: CRITICAL
 impactDescription: 200-800ms import cost, slow builds
 tags: bundle, imports, tree-shaking, barrel-files, performance
 ---
 ## Avoid Barrel File Imports
 Import directly from source files instead of barrel files to avoid loading thousands of unused modules. **Barrel files** are entry points that re-export multiple modules (e.g., `index.js` that does `export * from './module'`).
 Popular icon and component libraries can have **up to 10,000 re-exports** in their entry file. For many React packages, **it takes 200-800ms just to import them**, affecting both development speed and production cold starts.
 **Why tree-shaking doesn't help:** When a library is marked as external (not bundled), the bundler can't optimize it. If you bundle it to enable tree-shaking, builds become substantially slower analyzing the entire module graph.
 **Incorrect (imports entire library):**
 ```tsx
 import { Check, X, Menu } from 'lucide-react'
 // Loads 1,583 modules, takes ~2.8s extra in dev
 // Runtime cost: 200-800ms on every cold start
 import { Button, TextField } from '@mui/material'
 // Loads 2,225 modules, takes ~4.2s extra in dev
 ```
 **Correct (imports only what you need):**
 ```tsx
 import Check from 'lucide-react/dist/esm/icons/check'
 import X from 'lucide-react/dist/esm/icons/x'
 import Menu from 'lucide-react/dist/esm/icons/menu'
 // Loads only 3 modules (~2KB vs ~1MB)
 import Button from '@mui/material/Button'
 import TextField from '@mui/material/TextField'
 // Loads only what you use
 ```
 **Alternative (Next.js 13.5+):**
 ```js
 // next.config.js - use optimizePackageImports
 module.exports = {
  experimental: {
    optimizePackageImports: ['lucide-react', '@mui/material']
  }
 }
 // Then you can keep the ergonomic barrel imports:
 import { Check, X, Menu } from 'lucide-react'
 // Automatically transformed to direct imports at build time
 ```
 Direct imports provide 15-70% faster dev boot, 28% faster builds, 40% faster cold starts, and significantly faster HMR.
 Libraries commonly affected: `lucide-react`, `@mui/material`, `@mui/icons-material`, `@tabler/icons-react`, `react-icons`, `@headlessui/react`, `@radix-ui/react-*`, `lodash`, `ramda`, `date-fns`, `rxjs`, `react-use`.
 Reference: [How we optimized package imports in Next.js](https://vercel.com/blog/how-we-optimized-package-imports-in-next-js)
--- a/.claude/skills/vercel-react-best-practices/rules/bundle-conditional.md
+++ b/.claude/skills/vercel-react-best-practices/rules/bundle-conditional.md
@@ -0,0 +1,31 @@
 ---
 title: Conditional Module Loading
 impact: HIGH
 impactDescription: loads large data only when needed
 tags: bundle, conditional-loading, lazy-loading
 ---
 ## Conditional Module Loading
 Load large data or modules only when a feature is activated.
 **Example (lazy-load animation frames):**
 ```tsx
 function AnimationPlayer({ enabled }: { enabled: boolean }) {
  const [frames, setFrames] = useState<Frame[] | null>(null)
  useEffect(() => {
    if (enabled && !frames && typeof window !== 'undefined') {
      import('./animation-frames.js')
        .then(mod => setFrames(mod.frames))
         .catch(err => console.error('Failed to load animation frames', err))
    }
  }, [enabled, frames])
  if (!frames) return <Skeleton />
  return <Canvas frames={frames} />
 }
 ```
 The `typeof window !== 'undefined'` check prevents bundling this module for SSR, optimizing server bundle size and build speed.
--- a/.claude/skills/vercel-react-best-practices/rules/bundle-defer-third-party.md
+++ b/.claude/skills/vercel-react-best-practices/rules/bundle-defer-third-party.md
@@ -0,0 +1,49 @@
 ---
 title: Defer Non-Critical Third-Party Libraries
 impact: MEDIUM
 impactDescription: loads after hydration
 tags: bundle, third-party, analytics, defer
 ---
 ## Defer Non-Critical Third-Party Libraries
 Analytics, logging, and error tracking don't block user interaction. Load them after hydration.
 **Incorrect (blocks initial bundle):**
 ```tsx
 import { Analytics } from '@vercel/analytics/react'
 export default function RootLayout({ children }) {
  return (
    <html>
      <body>
        {children}
        <Analytics />
      </body>
    </html>
  )
 }
 ```
 **Correct (loads after hydration):**
 ```tsx
 import dynamic from 'next/dynamic'
 const Analytics = dynamic(
  () => import('@vercel/analytics/react').then(m => m.Analytics),
  { ssr: false }
 )
 export default function RootLayout({ children }) {
  return (
    <html>
      <body>
        {children}
        <Analytics />
      </body>
    </html>
  )
 }
 ```
--- a/.claude/skills/vercel-react-best-practices/rules/bundle-dynamic-imports.md
+++ b/.claude/skills/vercel-react-best-practices/rules/bundle-dynamic-imports.md
@@ -0,0 +1,35 @@
 ---
 title: Dynamic Imports for Heavy Components
 impact: CRITICAL
 impactDescription: directly affects TTI and LCP
 tags: bundle, dynamic-import, code-splitting, next-dynamic
 ---
 ## Dynamic Imports for Heavy Components
 Use `next/dynamic` to lazy-load large components not needed on initial render.
 **Incorrect (Monaco bundles with main chunk ~300KB):**
 ```tsx
 import { MonacoEditor } from './monaco-editor'
 function CodePanel({ code }: { code: string }) {
  return <MonacoEditor value={code} />
 }
 ```
 **Correct (Monaco loads on demand):**
 ```tsx
 import dynamic from 'next/dynamic'
 const MonacoEditor = dynamic(
  () => import('./monaco-editor').then(m => m.MonacoEditor),
  { ssr: false }
 )
 function CodePanel({ code }: { code: string }) {
  return <MonacoEditor value={code} />
 }
 ```
--- a/.claude/skills/vercel-react-best-practices/rules/bundle-preload.md
+++ b/.claude/skills/vercel-react-best-practices/rules/bundle-preload.md
@@ -0,0 +1,50 @@
 ---
 title: Preload Based on User Intent
 impact: MEDIUM
 impactDescription: reduces perceived latency
 tags: bundle, preload, user-intent, hover
 ---
 ## Preload Based on User Intent
 Preload heavy bundles before they're needed to reduce perceived latency.
 **Example (preload on hover/focus):**
 ```tsx
 function EditorButton({ onClick }: { onClick: () => void }) {
  const preload = () => {
    if (typeof window !== 'undefined') {
      void import('./monaco-editor')
    }
  }
  return (
    <button
      onMouseEnter={preload}
      onFocus={preload}
      onClick={onClick}
    >
      Open Editor
    </button>
  )
 }
 ```
 **Example (preload when feature flag is enabled):**
 ```tsx
 function FlagsProvider({ children, flags }: Props) {
  useEffect(() => {
    if (flags.editorEnabled && typeof window !== 'undefined') {
      void import('./monaco-editor').then(mod => mod.init())
    }
  }, [flags.editorEnabled])
  return <FlagsContext.Provider value={flags}>
    {children}
  </FlagsContext.Provider>
 }
 ```
 The `typeof window !== 'undefined'` check prevents bundling preloaded modules for SSR, optimizing server bundle size and build speed.
--- a/.claude/skills/vercel-react-best-practices/rules/client-event-listeners.md
+++ b/.claude/skills/vercel-react-best-practices/rules/client-event-listeners.md
@@ -0,0 +1,74 @@
 ---
 title: Deduplicate Global Event Listeners
 impact: LOW
 impactDescription: single listener for N components
 tags: client, swr, event-listeners, subscription
 ---
 ## Deduplicate Global Event Listeners
 Use `useSWRSubscription()` to share global event listeners across component instances.
 **Incorrect (N instances = N listeners):**
 ```tsx
 function useKeyboardShortcut(key: string, callback: () => void) {
  useEffect(() => {
    const handler = (e: KeyboardEvent) => {
      if (e.metaKey && e.key === key) {
        callback()
      }
    }
    window.addEventListener('keydown', handler)
    return () => window.removeEventListener('keydown', handler)
  }, [key, callback])
 }
 ```
 When using the `useKeyboardShortcut` hook multiple times, each instance will register a new listener.
 **Correct (N instances = 1 listener):**
 ```tsx
 import useSWRSubscription from 'swr/subscription'
 // Module-level Map to track callbacks per key
 const keyCallbacks = new Map<string, Set<() => void>>()
 function useKeyboardShortcut(key: string, callback: () => void) {
  // Register this callback in the Map
  useEffect(() => {
    if (!keyCallbacks.has(key)) {
      keyCallbacks.set(key, new Set())
    }
    keyCallbacks.get(key)!.add(callback)
    return () => {
      const set = keyCallbacks.get(key)
      if (set) {
        set.delete(callback)
        if (set.size === 0) {
          keyCallbacks.delete(key)
        }
      }
    }
  }, [key, callback])
  useSWRSubscription('global-keydown', () => {
    const handler = (e: KeyboardEvent) => {
      if (e.metaKey && keyCallbacks.has(e.key)) {
        keyCallbacks.get(e.key)!.forEach(cb => cb())
      }
    }
    window.addEventListener('keydown', handler)
    return () => window.removeEventListener('keydown', handler)
  })
 }
 function Profile() {
  // Multiple shortcuts will share the same listener
  useKeyboardShortcut('p', () => { /* ... */ })
  useKeyboardShortcut('k', () => { /* ... */ })
  // ...
 }
 ```
--- a/.claude/skills/vercel-react-best-practices/rules/client-swr-dedup.md
+++ b/.claude/skills/vercel-react-best-practices/rules/client-swr-dedup.md
@@ -0,0 +1,56 @@
 ---
 title: Use SWR for Automatic Deduplication
 impact: MEDIUM-HIGH
 impactDescription: automatic deduplication
 tags: client, swr, deduplication, data-fetching
 ---
 ## Use SWR for Automatic Deduplication
 SWR enables request deduplication, caching, and revalidation across component instances.
 **Incorrect (no deduplication, each instance fetches):**
 ```tsx
 function UserList() {
  const [users, setUsers] = useState([])
  useEffect(() => {
    fetch('/api/users')
      .then(r => r.json())
      .then(setUsers)
  }, [])
 }
 ```
 **Correct (multiple instances share one request):**
 ```tsx
 import useSWR from 'swr'
 function UserList() {
  const { data: users } = useSWR('/api/users', fetcher)
 }
 ```
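The deduplication idea itself can be illustrated without React. The following is a conceptual sketch only, not SWR's actual implementation: `dedupe`, `pending`, and `fetchUsers` are hypothetical names, and the sketch covers only in-flight request sharing, not caching or revalidation.

```typescript
// In-flight requests keyed by a cache key (hypothetical helper, not part of SWR).
const pending = new Map<string, Promise<unknown>>()

function dedupe<T>(key: string, fetcher: () => Promise<T>): Promise<T> {
  const existing = pending.get(key)
  // A request for this key is already running: share its promise.
  if (existing) return existing as Promise<T>

  const request = fetcher()
  pending.set(key, request)
  // Forget the entry once the request settles so later calls fetch again.
  const clear = () => {
    pending.delete(key)
  }
  request.then(clear, clear)
  return request
}

// Hypothetical fetcher that counts how often it actually runs.
let fetchCount = 0
async function fetchUsers(): Promise<string[]> {
  fetchCount++
  return ['ada', 'grace']
}
```

Two callers that request the same key while a fetch is in progress receive the same promise, so the fetcher runs once for both. A call made after that request has settled starts a new fetch.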
 **For immutable data:**
 ```tsx
 import useSWRImmutable from 'swr/immutable'
 function StaticContent() {
  const { data } = useSWRImmutable('/api/config', fetcher)
 }
 ```
 **For mutations:**
 ```tsx
 import useSWRMutation from 'swr/mutation'
 function UpdateButton() {
  const { trigger } = useSWRMutation('/api/user', updateUser)
  return <button onClick={() => trigger()}>Update</button>
 }
 ```
 Reference: [https://swr.vercel.app](https://swr.vercel.app)
--- a/.claude/skills/vercel-react-best-practices/rules/js-batch-dom-css.md
+++ b/.claude/skills/vercel-react-best-practices/rules/js-batch-dom-css.md
@@ -0,0 +1,82 @@
 ---
 title: Batch DOM CSS Changes
 impact: MEDIUM
 impactDescription: reduces reflows/repaints
 tags: javascript, dom, css, performance, reflow
 ---
 ## Batch DOM CSS Changes
 Avoid changing styles one property at a time. Group multiple CSS changes together via classes or `cssText` to minimize browser reflows.
 **Incorrect (multiple reflows):**
 ```typescript
 function updateElementStyles(element: HTMLElement) {
  // Each write invalidates layout and can trigger a separate reflow
  element.style.width = '100px'
  element.style.height = '200px'
  element.style.backgroundColor = 'blue'
  element.style.border = '1px solid black'
 }
 ```
 **Correct (add class - single reflow):**
 ```typescript
 /* CSS file:
 .highlighted-box {
  width: 100px;
  height: 200px;
  background-color: blue;
  border: 1px solid black;
 }
 */
 // JavaScript
 function updateElementStyles(element: HTMLElement) {
  element.classList.add('highlighted-box')
 }
 ```
 **Correct (change cssText - single reflow):**
 ```typescript
 function updateElementStyles(element: HTMLElement) {
  element.style.cssText = `
    width: 100px;
    height: 200px;
    background-color: blue;
    border: 1px solid black;
  `
 }
 ```
 **React example:**
 ```tsx
 // Incorrect: changing styles one by one
 function Box({ isHighlighted }: { isHighlighted: boolean }) {
  const ref = useRef<HTMLDivElement>(null)
  useEffect(() => {
    if (ref.current && isHighlighted) {
      ref.current.style.width = '100px'
      ref.current.style.height = '200px'
      ref.current.style.backgroundColor = 'blue'
    }
  }, [isHighlighted])
  return <div ref={ref}>Content</div>
 }
 // Correct: toggle class
 function Box({ isHighlighted }: { isHighlighted: boolean }) {
  return (
    <div className={isHighlighted ? 'highlighted-box' : ''}>
      Content
    </div>
  )
 }
 ```
 Prefer CSS classes over inline styles when possible. Classes are cached by the browser and provide better separation of concerns.
--- a/.claude/skills/vercel-react-best-practices/rules/js-cache-function-results.md
+++ b/.claude/skills/vercel-react-best-practices/rules/js-cache-function-results.md
@@ -0,0 +1,80 @@
 ---
 title: Cache Repeated Function Calls
 impact: MEDIUM
 impactDescription: avoid redundant computation
 tags: javascript, cache, memoization, performance
 ---
 ## Cache Repeated Function Calls
 Use a module-level Map to cache function results when the same function is called repeatedly with the same inputs during render.
 **Incorrect (redundant computation):**
 ```tsx
 function ProjectList({ projects }: { projects: Project[] }) {
  return (
    <div>
      {projects.map(project => {
        // slugify() called 100+ times for same project names
        const slug = slugify(project.name)
        return <ProjectCard key={project.id} slug={slug} />
      })}
    </div>
  )
 }
 ```
 **Correct (cached results):**
 ```tsx
 // Module-level cache
 const slugifyCache = new Map<string, string>()
 function cachedSlugify(text: string): string {
  if (slugifyCache.has(text)) {
    return slugifyCache.get(text)!
  }
  const result = slugify(text)
  slugifyCache.set(text, result)
  return result
 }
 function ProjectList({ projects }: { projects: Project[] }) {
  return (
    <div>
      {projects.map(project => {
        // Computed only once per unique project name
        const slug = cachedSlugify(project.name)
        return <ProjectCard key={project.id} slug={slug} />
      })}
    </div>
  )
 }
 ```
 **Simpler pattern for single-value functions:**
 ```typescript
 let isLoggedInCache: boolean | null = null
 function isLoggedIn(): boolean {
  if (isLoggedInCache !== null) {
    return isLoggedInCache
  }
  isLoggedInCache = document.cookie.includes('auth=')
  return isLoggedInCache
 }
 // Clear cache when auth changes
 function onAuthChange() {
  isLoggedInCache = null
 }
 ```
 Use a Map (not a hook) so it works everywhere: utilities, event handlers, not just React components.
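When several functions need the same treatment, the Map-based cache can be factored into a small generic wrapper. This is a minimal sketch under stated assumptions: `memoize` is an illustrative helper, and the `slugify` shown here is a hypothetical implementation, since the rule does not show the real one.

```typescript
// Generic single-argument memoizer backed by a Map (illustrative helper).
// Map keys are compared by identity, so this suits primitive arguments
// such as strings and numbers.
function memoize<A, R>(fn: (arg: A) => R): (arg: A) => R {
  const cache = new Map<A, R>()
  return (arg: A): R => {
    if (cache.has(arg)) {
      return cache.get(arg) as R
    }
    const result = fn(arg)
    cache.set(arg, result)
    return result
  }
}

// Hypothetical slugify that counts how often it actually runs.
let slugifyCalls = 0
function slugify(text: string): string {
  slugifyCalls++
  return text
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9]+/g, '-') // collapse non-alphanumeric runs into one dash
    .replace(/^-|-$/g, '') // drop a leading or trailing dash
}

const cachedSlugify = memoize(slugify)
```

Calling `cachedSlugify` twice with the same string runs `slugify` once. The cache is never evicted in this sketch, so it fits inputs with a bounded set of distinct values.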
 Reference: [How we made the Vercel Dashboard twice as fast](https://vercel.com/blog/how-we-made-the-vercel-dashboard-twice-as-fast)
--- a/.claude/skills/vercel-react-best-practices/rules/js-cache-property-access.md
+++ b/.claude/skills/vercel-react-best-practices/rules/js-cache-property-access.md
@@ -0,0 +1,28 @@
 ---
 title: Cache Property Access in Loops
 impact: LOW-MEDIUM
 impactDescription: reduces lookups
 tags: javascript, loops, optimization, caching
 ---
 ## Cache Property Access in Loops
 Cache object property lookups in hot paths.
 **Incorrect (3 lookups × N iterations):**
 ```typescript
 for (let i = 0; i < arr.length; i++) {
  process(obj.config.settings.value)
 }
 ```
 **Correct (1 lookup total):**
 ```typescript
 const value = obj.config.settings.value
 const len = arr.length
 for (let i = 0; i < len; i++) {
  process(value)
 }
 ```
--- a/.claude/skills/vercel-react-best-practices/rules/js-cache-storage.md
+++ b/.claude/skills/vercel-react-best-practices/rules/js-cache-storage.md
@@ -0,0 +1,70 @@
 ---
 title: Cache Storage API Calls
 impact: LOW-MEDIUM
 impactDescription: reduces expensive I/O
 tags: javascript, localStorage, storage, caching, performance
 ---
 ## Cache Storage API Calls
 `localStorage`, `sessionStorage`, and `document.cookie` are synchronous and expensive. Cache reads in memory.
 **Incorrect (reads storage on every call):**
 ```typescript
 function getTheme() {
  return localStorage.getItem('theme') ?? 'light'
 }
 // Called 10 times = 10 storage reads
 ```
 **Correct (Map cache):**
 ```typescript
 const storageCache = new Map<string, string | null>()
 function getLocalStorage(key: string) {
  if (!storageCache.has(key)) {
    storageCache.set(key, localStorage.getItem(key))
  }
  return storageCache.get(key)
 }
 function setLocalStorage(key: string, value: string) {
  localStorage.setItem(key, value)
  storageCache.set(key, value)  // keep cache in sync
 }
 ```
 Use a Map (not a hook) so it works everywhere: utilities, event handlers, not just React components.
 **Cookie caching:**
 ```typescript
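 let cookieCache: Record<string, string> | null = null
 function getCookie(name: string) {
  if (!cookieCache) {
    cookieCache = Object.fromEntries(
      document.cookie.split('; ').filter(Boolean).map(c => {
        // Split on the first '=' only, since cookie values may contain '='
        const i = c.indexOf('=')
        return i === -1 ? [c, ''] : [c.slice(0, i), c.slice(i + 1)]
      })
    )
  }
  return cookieCache[name]
 }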
 ```
 **Important (invalidate on external changes):**
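 If storage can change externally (another tab, server-set cookies), invalidate the cache:
 ```typescript
 window.addEventListener('storage', (e) => {
  // e.key is null when the other tab called localStorage.clear()
  if (e.key) storageCache.delete(e.key)
  else storageCache.clear()
 })
 document.addEventListener('visibilitychange', () => {
  if (document.visibilityState === 'visible') {
    storageCache.clear()
    cookieCache = null  // cookies may have changed while the tab was hidden
  }
 })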
 ```
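 **Sketch (cache with an injectable storage backend):** The read-through cache and its invalidation can be bundled behind one small object, which also makes the behavior checkable outside a browser. This is a sketch under assumptions: `StorageLike`, `createStorageCache`, and `fakeStorage` are hypothetical names, and the in-memory backend stands in for `localStorage`.
 ```typescript
 // Minimal subset of the Web Storage interface that the cache relies on.
 interface StorageLike {
  getItem(key: string): string | null
  setItem(key: string, value: string): void
 }
 // Hypothetical factory: cached get/set/invalidate for one storage backend.
 function createStorageCache(storage: StorageLike) {
  const cache = new Map<string, string | null>()
  return {
    get(key: string): string | null {
      if (!cache.has(key)) {
        cache.set(key, storage.getItem(key))
      }
      return cache.get(key) ?? null
    },
    set(key: string, value: string): void {
      storage.setItem(key, value)
      cache.set(key, value) // keep cache in sync
    },
    // Pass a key to drop one entry, or nothing to drop everything.
    invalidate(key?: string): void {
      if (key === undefined) cache.clear()
      else cache.delete(key)
    },
  }
 }
 // In-memory stand-in that counts reads; in a browser you would pass localStorage.
 let storageReads = 0
 const backing = new Map<string, string>([['theme', 'dark']])
 const fakeStorage: StorageLike = {
  getItem: key => { storageReads++; return backing.get(key) ?? null },
  setItem: (key, value) => { backing.set(key, value) },
 }
 const themeStore = createStorageCache(fakeStorage)
 themeStore.get('theme')
 themeStore.get('theme')
 const readsAfterTwoGets = storageReads // 1: the second get() is served from memory
 ```
 Until `invalidate()` is called, a value changed behind the cache's back stays stale, which is why the event listeners above matter.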
--- a/.claude/skills/vercel-react-best-practices/rules/js-combine-iterations.md
+++ b/.claude/skills/vercel-react-best-practices/rules/js-combine-iterations.md
@@ -0,0 +1,32 @@
 ---
 title: Combine Multiple Array Iterations
 impact: LOW-MEDIUM
 impactDescription: reduces iterations
 tags: javascript, arrays, loops, performance
 ---
 ## Combine Multiple Array Iterations
 Multiple `.filter()` or `.map()` calls iterate the array multiple times. Combine into one loop.
 **Incorrect (3 iterations):**
 ```typescript
 const admins = users.filter(u => u.isAdmin)
 const testers = users.filter(u => u.isTester)
 const inactive = users.filter(u => !u.isActive)
 ```
 **Correct (1 iteration):**
 ```typescript
 const admins: User[] = []
 const testers: User[] = []
 const inactive: User[] = []
 for (const user of users) {
  if (user.isAdmin) admins.push(user)
  if (user.isTester) testers.push(user)
  if (!user.isActive) inactive.push(user)
 }
 ```
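 **Sketch (chained `.filter().map()`):** The same idea applies when a `.filter()` feeds a `.map()`: the chain walks the data twice and allocates an intermediate array. A minimal illustration, assuming a hypothetical `Member` shape and sample data:
 ```typescript
 interface Member {
  id: number
  email: string
  isActive: boolean
 }
 const members: Member[] = [
  { id: 1, email: 'a@example.com', isActive: true },
  { id: 2, email: 'b@example.com', isActive: false },
  { id: 3, email: 'c@example.com', isActive: true },
 ]
 // Two passes plus an intermediate array
 const chained = members.filter(m => m.isActive).map(m => m.email)
 // One pass, no intermediate array
 const activeEmails: string[] = []
 for (const m of members) {
  if (m.isActive) activeEmails.push(m.email)
 }
 ```
 For short arrays the chained form is usually fine and reads better; the single loop pays off in hot paths or on large inputs.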
--- a/.claude/skills/vercel-react-best-practices/rules/js-early-exit.md
+++ b/.claude/skills/vercel-react-best-practices/rules/js-early-exit.md
@@ -0,0 +1,50 @@
 ---
 title: Early Return from Functions
 impact: LOW-MEDIUM
 impactDescription: avoids unnecessary computation
 tags: javascript, functions, optimization, early-return
 ---
 ## Early Return from Functions
 Return early once the result is determined, so the remaining work is skipped.
 **Incorrect (processes all items even after finding answer):**
 ```typescript
 function validateUsers(users: User[]) {
  let hasError = false
  let errorMessage = ''
  for (const user of users) {
    if (!user.email) {
      hasError = true
      errorMessage = 'Email required'
    }
    if (!user.name) {
      hasError = true
      errorMessage = 'Name required'
    }
    // Continues checking all users even after error found
  }
  return hasError ? { valid: false, error: errorMessage } : { valid: true }
 }
 ```
 **Correct (returns immediately on first error):**
 ```typescript
 function validateUsers(users: User[]) {
  for (const user of users) {
    if (!user.email) {
      return { valid: false, error: 'Email required' }
    }
    if (!user.name) {
      return { valid: false, error: 'Name required' }
    }
  }
  return { valid: true }
 }
 ```
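 **Sketch (short-circuiting array methods):** When only a yes/no answer or the first offending element is needed, `.some()` and `.find()` stop iterating at the first match, which gives the same early exit without a hand-written loop. The `Account` shape and sample data are assumptions for illustration.
 ```typescript
 interface Account {
  email: string
  displayName: string
 }
 const accounts: Account[] = [
  { email: 'a@example.com', displayName: 'A' },
  { email: '', displayName: 'B' },
  { email: 'c@example.com', displayName: 'C' },
 ]
 let inspected = 0
 // some() stops at the first element for which the callback returns true
 const hasMissingEmail = accounts.some(account => {
  inspected++
  return !account.email
 })
 // find() stops the same way and returns the offending element (or undefined)
 const firstInvalid = accounts.find(account => !account.email || !account.displayName)
 ```
 Here only the first two accounts are inspected; the third is never visited.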
--- a/.claude/skills/vercel-react-best-practices/rules/js-hoist-regexp.md
+++ b/.claude/skills/vercel-react-best-practices/rules/js-hoist-regexp.md
@@ -0,0 +1,45 @@
 ---
 title: Hoist RegExp Creation
 impact: LOW-MEDIUM
 impactDescription: avoids recreation
 tags: javascript, regexp, optimization, memoization
 ---
 ## Hoist RegExp Creation
 Don't create RegExp inside render. Hoist to module scope or memoize with `useMemo()`.
 **Incorrect (new RegExp every render):**
 ```tsx
 function Highlighter({ text, query }: Props) {
  const regex = new RegExp(`(${query})`, 'gi')
  const parts = text.split(regex)
  return <>{parts.map((part, i) => ...)}</>
 }
 ```
 **Correct (memoize or hoist):**
 ```tsx
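 // Static pattern: hoist to module scope so it is created once
 const EMAIL_REGEX = /^[^\s@]+@[^\s@]+\.[^\s@]+$/
 function Highlighter({ text, query }: Props) {
  // Dynamic pattern: memoize so it is rebuilt only when query changes.
  // escapeRegex is assumed to be a helper that escapes regex metacharacters.
  const regex = useMemo(
    () => new RegExp(`(${escapeRegex(query)})`, 'gi'),
    [query]
  )
  const parts = text.split(regex)
  return <>{parts.map((part, i) => ...)}</>
 }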
 ```
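 **Sketch (escaping user input):** The memoized example passes `query` through `escapeRegex` before building the pattern. The rule does not define that helper, so the version below is an assumption: a common way to escape the characters that are special in a RegExp source.
 ```typescript
 // Hypothetical helper: escapes characters that have special meaning in a RegExp source.
 function escapeRegex(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
 }
 // Without escaping, '.' matches any character; with escaping it matches a literal dot.
 const unescapedMatch = new RegExp('a.b').test('axb')
 const escapedMatch = new RegExp(escapeRegex('a.b')).test('axb')
 // Same split-with-capture-group shape as the Highlighter component
 const pieces = 'see a.b here'.split(new RegExp(`(${escapeRegex('a.b')})`, 'gi'))
 ```
 Without escaping, a query such as `(` would also make `new RegExp` throw a `SyntaxError` during render.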
 **Warning (global regex has mutable state):**
 Global regex (`/g`) has mutable `lastIndex` state:
 ```typescript
 const regex = /foo/g
 regex.test('foo')  // true, lastIndex = 3
 regex.test('foo')  // false, lastIndex = 0
 ```
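 **Sketch (avoiding the `lastIndex` trap):** This matters most for hoisted regexes, because a module-level instance is shared by every caller. Two common remedies are sketched below; `testFromStart` is a hypothetical helper name.
 ```typescript
 // A hoisted global regex keeps lastIndex between calls, so results alternate.
 const GLOBAL_FOO = /foo/g
 const firstCall = GLOBAL_FOO.test('foo')  // true, lastIndex is now 3
 const secondCall = GLOBAL_FOO.test('foo') // false, lastIndex resets to 0
 // Option 1: hoist a regex without the g flag when you only need a yes/no answer.
 const PLAIN_FOO = /foo/
 const plainResults = [PLAIN_FOO.test('foo'), PLAIN_FOO.test('foo')]
 // Option 2: reset lastIndex before reusing a global regex.
 function testFromStart(regex: RegExp, input: string): boolean {
  regex.lastIndex = 0
  return regex.test(input)
 }
 const resetResults = [testFromStart(GLOBAL_FOO, 'foo'), testFromStart(GLOBAL_FOO, 'foo')]
 ```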
--- a/.claude/skills/vercel-react-best-practices/rules/js-index-maps.md
+++ b/.claude/skills/vercel-react-best-practices/rules/js-index-maps.md
@@ -0,0 +1,37 @@
 ---
 title: Build Index Maps for Repeated Lookups
 impact: LOW-MEDIUM
 impactDescription: 1M ops to 2K ops
 tags: javascript, map, indexing, optimization, performance
 ---
 ## Build Index Maps for Repeated Lookups
 Multiple `.find()` calls by the same key should use a Map.
 **Incorrect (O(n) per lookup):**
 ```typescript
 function processOrders(orders: Order[], users: User[]) {
  return orders.map(order => ({
    ...order,
    user: users.find(u => u.id === order.userId)
  }))
 }
 ```
 **Correct (O(1) per lookup):**
 ```typescript
 function processOrders(orders: Order[], users: User[]) {
  const userById = new Map(users.map(u => [u.id, u]))
  return orders.map(order => ({
    ...order,
    user: userById.get(order.userId)
  }))
 }
 ```
 Build map once (O(n)), then all lookups are O(1).
 For 1000 orders × 1000 users: 1M ops → 2K ops.
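 **Sketch (one-to-many index):** The same technique covers the reverse lookup, where one key maps to several records and the repeated call would otherwise be `.filter()`. The `Purchase` shape and the `groupByBuyer` helper are hypothetical names used for illustration.
 ```typescript
 interface Purchase {
  id: string
  buyerId: string
 }
 // Hypothetical helper: builds buyerId -> purchases[] in one pass.
 function groupByBuyer(purchases: Purchase[]): Map<string, Purchase[]> {
  const byBuyer = new Map<string, Purchase[]>()
  for (const purchase of purchases) {
    const bucket = byBuyer.get(purchase.buyerId)
    if (bucket) bucket.push(purchase)
    else byBuyer.set(purchase.buyerId, [purchase])
  }
  return byBuyer
 }
 const purchasesByBuyer = groupByBuyer([
  { id: 'p1', buyerId: 'u1' },
  { id: 'p2', buyerId: 'u2' },
  { id: 'p3', buyerId: 'u1' },
 ])
 // Each lookup is a single Map.get() instead of a full scan
 const u1PurchaseIds = (purchasesByBuyer.get('u1') ?? []).map(p => p.id)
 ```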
--- a/.claude/skills/vercel-react-best-practices/rules/js-length-check-first.md
+++ b/.claude/skills/vercel-react-best-practices/rules/js-length-check-first.md
@@ -0,0 +1,49 @@
 ---
 title: Early Length Check for Array Comparisons
 impact: MEDIUM-HIGH
 impactDescription: avoids expensive operations when lengths differ
 tags: javascript, arrays, performance, optimization, comparison
 ---
 ## Early Length Check for Array Comparisons
 When comparing arrays with expensive operations (sorting, deep equality, serialization), check lengths first. If lengths differ, the arrays cannot be equal.
 In real-world applications, this optimization is especially valuable when the comparison runs in hot paths (event handlers, render loops).
 **Incorrect (always runs expensive comparison):**
 ```typescript
 function hasChanges(current: string[], original: string[]) {
  // Always sorts and joins, even when lengths differ
  return current.sort().join() !== original.sort().join()
 }
 ```
 Two O(n log n) sorts run even when `current.length` is 5 and `original.length` is 100. There is also the overhead of joining the arrays and comparing the resulting strings.
 **Correct (O(1) length check first):**
 ```typescript
 function hasChanges(current: string[], original: string[]) {
  // Early return if lengths differ
  if (current.length !== original.length) {
    return true
  }
  // Only sort and compare when lengths match
  const currentSorted = current.toSorted()
  const originalSorted = original.toSorted()
  for (let i = 0; i < currentSorted.length; i++) {
    if (currentSorted[i] !== originalSorted[i]) {
      return true
    }
  }
  return false
 }
 ```
 This approach is more efficient and safer because:
 - It avoids the overhead of sorting and joining the arrays when lengths differ
 - It avoids consuming memory for the joined strings (especially important for large arrays)
 - It avoids mutating the original arrays
 - It returns early when a difference is found
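 **Sketch (sort-free variant):** When order does not matter, the comparison after the length check can also be done without sorting by counting occurrences. This is an alternative technique, not the approach shown above, and `hasChangesByCount` is a hypothetical name.
 ```typescript
 // Compares the two arrays as multisets by counting occurrences.
 function hasChangesByCount(current: string[], original: string[]): boolean {
  // Same O(1) length check first
  if (current.length !== original.length) return true
  const counts = new Map<string, number>()
  for (const item of original) {
    counts.set(item, (counts.get(item) ?? 0) + 1)
  }
  for (const item of current) {
    const remaining = counts.get(item)
    // Missing or already used up: the arrays differ
    if (!remaining) return true
    counts.set(item, remaining - 1)
  }
  return false
 }
 ```
 This runs in O(n) time with O(n) extra memory, and like the version above it leaves both inputs unmodified.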
--- a/.claude/skills/vercel-react-best-practices/rules/js-min-max-loop.md
+++ b/.claude/skills/vercel-react-best-practices/rules/js-min-max-loop.md
@@ -0,0 +1,82 @@
 ---
 title: Use Loop for Min/Max Instead of Sort
 impact: LOW
 impactDescription: O(n) instead of O(n log n)
 tags: javascript, arrays, performance, sorting, algorithms
 ---
 ## Use Loop for Min/Max Instead of Sort
 Finding the smallest or largest element only requires a single pass through the array. Sorting is wasteful and slower.
 **Incorrect (O(n log n) - sort to find latest):**
 ```typescript
 interface Project {
  id: string
  name: string
  updatedAt: number
 }
 function getLatestProject(projects: Project[]) {
  const sorted = [...projects].sort((a, b) => b.updatedAt - a.updatedAt)
  return sorted[0]
 }
 ```
 Sorts the entire array just to find the maximum value.
 **Incorrect (O(n log n) - sort for oldest and newest):**
 ```typescript
 function getOldestAndNewest(projects: Project[]) {
  const sorted = [...projects].sort((a, b) => a.updatedAt - b.updatedAt)
  return { oldest: sorted[0], newest: sorted[sorted.length - 1] }
 }
 ```
 Still sorts unnecessarily when only min/max are needed.
 **Correct (O(n) - single loop):**
 ```typescript
 function getLatestProject(projects: Project[]) {
  if (projects.length === 0) return null
  let latest = projects[0]
  for (let i = 1; i < projects.length; i++) {
    if (projects[i].updatedAt > latest.updatedAt) {
      latest = projects[i]
    }
  }
  return latest
 }
 function getOldestAndNewest(projects: Project[]) {
  if (projects.length === 0) return { oldest: null, newest: null }
  let oldest = projects[0]
  let newest = projects[0]
  for (let i = 1; i < projects.length; i++) {
    if (projects[i].updatedAt < oldest.updatedAt) oldest = projects[i]
    if (projects[i].updatedAt > newest.updatedAt) newest = projects[i]
  }
  return { oldest, newest }
 }
 ```
 Single pass through the array, no copying, no sorting.
 **Alternative (Math.min/Math.max for small arrays):**
 ```typescript
 const numbers = [5, 2, 8, 1, 9]
 const min = Math.min(...numbers)
 const max = Math.max(...numbers)
 ```
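 This works for small arrays, but spreading a very large array passes every element as a separate function argument, which can be slow and can throw a `RangeError` once the engine's argument limit is exceeded. Use the loop approach for reliability.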
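 **Sketch (reusable single-pass helper):** The loop can be wrapped in a generic helper so call sites stay as short as the sort-based version. `maxBy` is a hypothetical name and the sample data is made up for illustration.
 ```typescript
 // Hypothetical helper: returns the element with the largest score, or null for an empty array.
 function maxBy<T>(items: readonly T[], score: (item: T) => number): T | null {
  if (items.length === 0) return null
  let best = items[0]
  let bestScore = score(best)
  for (let i = 1; i < items.length; i++) {
    const candidate = score(items[i])
    if (candidate > bestScore) {
      best = items[i]
      bestScore = candidate
    }
  }
  return best
 }
 const releases = [
  { tag: 'v1', updatedAt: 100 },
  { tag: 'v3', updatedAt: 300 },
  { tag: 'v2', updatedAt: 200 },
 ]
 const latestRelease = maxBy(releases, r => r.updatedAt)
 // Negating the score turns the same helper into a minimum search
 const oldestRelease = maxBy(releases, r => -r.updatedAt)
 ```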
--- a/.claude/skills/vercel-react-best-practices/rules/js-set-map-lookups.md
+++ b/.claude/skills/vercel-react-best-practices/rules/js-set-map-lookups.md
@@ -0,0 +1,24 @@
 ---
 title: Use Set/Map for O(1) Lookups
 impact: LOW-MEDIUM
 impactDescription: O(n) to O(1)
 tags: javascript, set, map, data-structures, performance
 ---
 ## Use Set/Map for O(1) Lookups
 Convert arrays to Set/Map for repeated membership checks.
 **Incorrect (O(n) per check):**
 ```typescript
 const allowedIds = ['a', 'b', 'c', ...]
 items.filter(item => allowedIds.includes(item.id))
 ```
 **Correct (O(1) per check):**
 ```typescript
 const allowedIds = new Set(['a', 'b', 'c', ...])
 items.filter(item => allowedIds.has(item.id))
 ```
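 **Sketch (Map for key-to-value lookups):** The example above covers membership checks with a Set. When each key carries a value, a Map plays the same role. The data below is made up for illustration.
 ```typescript
 // Array of pairs: every lookup scans the array, O(n) per check
 const rolePairs: Array<[string, string]> = [
  ['u1', 'admin'],
  ['u2', 'tester'],
  ['u3', 'viewer'],
 ]
 const slowRole = rolePairs.find(([userId]) => userId === 'u2')?.[1]
 // Map built once: each get() is O(1) on average
 const roleByUserId = new Map(rolePairs)
 const fastRole = roleByUserId.get('u2')
 const missingRole = roleByUserId.get('u9') ?? 'none'
 ```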
--- a/.claude/skills/vercel-react-best-practices/rules/js-tosorted-immutable.md
+++ b/.claude/skills/vercel-react-best-practices/rules/js-tosorted-immutable.md
@@ -0,0 +1,57 @@
 ---
 title: Use toSorted() Instead of sort() for Immutability
 impact: MEDIUM-HIGH
 impactDescription: prevents mutation bugs in React state
 tags: javascript, arrays, immutability, react, state, mutation
 ---
 ## Use toSorted() Instead of sort() for Immutability
 `.sort()` mutates the array in place, which can cause bugs with React state and props. Use `.toSorted()` to create a new sorted array without mutation.
 **Incorrect (mutates original array):**
 ```tsx
 function UserList({ users }: { users: User[] }) {
  // Mutates the users prop array!
  const sorted = useMemo(
    () => users.sort((a, b) => a.name.localeCompare(b.name)),
    [users]
  )
  return <div>{sorted.map(renderUser)}</div>
 }
 ```
 **Correct (creates new array):**
 ```tsx
 function UserList({ users }: { users: User[] }) {
  // Creates new sorted array, original unchanged
  const sorted = useMemo(
    () => users.toSorted((a, b) => a.name.localeCompare(b.name)),
    [users]
  )
  return <div>{sorted.map(renderUser)}</div>
 }
 ```
 **Why this matters in React:**
 1. Props/state mutations break React's immutability model - React expects props and state to be treated as read-only
 2. Causes stale closure bugs - Mutating arrays inside closures (callbacks, effects) can lead to unexpected behavior
 **Browser support (fallback for older browsers):**
 `.toSorted()` is available in modern browsers and runtimes (Chrome 110+, Safari 16+, Firefox 115+, Node.js 20+). For older environments, copy the array with the spread operator before sorting:
 ```typescript
 // Fallback for older browsers
 const sorted = [...items].sort((a, b) => a.value - b.value)
 ```
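 **Sketch (observing the mutation):** The difference is easy to verify outside React. The snippet below uses the spread fallback so it runs in older environments too; the variable names are arbitrary.
 ```typescript
 const scores = [3, 1, 2]
 // Copy first, then sort the copy: the source array keeps its order
 const sortedCopy = [...scores].sort((a, b) => a - b)
 const orderAfterCopy = scores.join(',') // '3,1,2'
 // sort() sorts in place and returns the same array instance
 const sortedInPlace = scores.sort((a, b) => a - b)
 const sameInstance = sortedInPlace === scores // true
 const orderAfterSort = scores.join(',') // '1,2,3'
 ```
 Because `sort()` returns the same instance it was called on, code that compares references sees no change even though the contents were reordered.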
 **Other immutable array methods:**
 - `.toSorted()` - immutable sort
 - `.toReversed()` - immutable reverse
 - `.toSpliced()` - immutable splice
 - `.with()` - immutable element replacement
--- a/.claude/skills/vercel-react-best-practices/rules/rendering-activity.md
+++ b/.claude/skills/vercel-react-best-practices/rules/rendering-activity.md
@@ -0,0 +1,26 @@
 ---
 title: Use Activity Component for Show/Hide
 impact: MEDIUM
 impactDescription: preserves state/DOM
 tags: rendering, activity, visibility, state-preservation
 ---
 ## Use Activity Component for Show/Hide
 Use React's `<Activity>` to preserve state/DOM for expensive components that frequently toggle visibility.
 **Usage:**
 ```tsx
 import { Activity } from 'react'
 function Dropdown({ isOpen }: Props) {
  return (
    <Activity mode={isOpen ? 'visible' : 'hidden'}>
      <ExpensiveMenu />
    </Activity>
  )
 }
 ```
 Avoids expensive re-renders and state loss.
--- a/.claude/skills/vercel-react-best-practices/rules/rendering-animate-svg-wrapper.md
+++ b/.claude/skills/vercel-react-best-practices/rules/rendering-animate-svg-wrapper.md
@@ -0,0 +1,47 @@
 ---
 title: Animate SVG Wrapper Instead of SVG Element
 impact: LOW
 impactDescription: enables hardware acceleration
 tags: rendering, svg, css, animation, performance
 ---
 ## Animate SVG Wrapper Instead of SVG Element
 Many browsers don't have hardware acceleration for CSS3 animations on SVG elements. Wrap SVG in a `<div>` and animate the wrapper instead.
 **Incorrect (animating SVG directly - no hardware acceleration):**
 ```tsx
 function LoadingSpinner() {
  return (
    <svg
      className="animate-spin"
      width="24"
      height="24"
      viewBox="0 0 24 24"
    >
      <circle cx="12" cy="12" r="10" stroke="currentColor" />
    </svg>
  )
 }
 ```
 **Correct (animating wrapper div - hardware accelerated):**
 ```tsx
 function LoadingSpinner() {
  return (
    <div className="animate-spin">
      <svg
        width="24"
        height="24"
        viewBox="0 0 24 24"
      >
        <circle cx="12" cy="12" r="10" stroke="currentColor" />
      </svg>
    </div>
  )
 }
 ```
 This applies to all CSS transforms and transitions (`transform`, `opacity`, `translate`, `scale`, `rotate`). The wrapper div allows browsers to use GPU acceleration for smoother animations.
--- a/.claude/skills/vercel-react-best-practices/rules/rendering-conditional-render.md
+++ b/.claude/skills/vercel-react-best-practices/rules/rendering-conditional-render.md
@@ -0,0 +1,40 @@
 ---
 title: Use Explicit Conditional Rendering
 impact: LOW
 impactDescription: prevents rendering 0 or NaN
 tags: rendering, conditional, jsx, falsy-values
 ---
 ## Use Explicit Conditional Rendering
 Use explicit ternary operators (`? :`) instead of `&&` for conditional rendering when the condition can be `0`, `NaN`, or other falsy values that render.
 **Incorrect (renders "0" when count is 0):**
 ```tsx
 function Badge({ count }: { count: number }) {
  return (
    <div>
      {count && <span className="badge">{count}</span>}
    </div>
  )
 }
 // When count = 0, renders: <div>0</div>
 // When count = 5, renders: <div><span class="badge">5</span></div>
 ```
 **Correct (renders nothing when count is 0):**
 ```tsx
 function Badge({ count }: { count: number }) {
  return (
    <div>
      {count > 0 ? <span className="badge">{count}</span> : null}
    </div>
  )
 }
 // When count = 0, renders: <div></div>
 // When count = 5, renders: <div><span class="badge">5</span></div>
 ```
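 **Sketch (what `&&` actually returns):** The behavior comes from JavaScript, not JSX: `a && b` evaluates to `a` itself when `a` is falsy. The snippet models this with a string in place of the JSX element; `andResult` and `ternaryResult` are hypothetical names.
 ```typescript
 // What `{value && <span />}` hands to React is simply the result of `&&`.
 function andResult(value: number): number | string {
  return value && 'badge'
 }
 const whenZero = andResult(0)  // 0 is returned, and React renders it as text
 const whenNaN = andResult(NaN) // NaN is returned, and React renders it as text
 const whenFive = andResult(5)  // 'badge'
 // The explicit comparison never lets a number through on the falsy branch.
 function ternaryResult(value: number): string | null {
  return value > 0 ? 'badge' : null
 }
 ```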
--- a/.claude/skills/vercel-react-best-practices/rules/rendering-content-visibility.md
+++ b/.claude/skills/vercel-react-best-practices/rules/rendering-content-visibility.md
@@ -0,0 +1,38 @@
 ---
 title: CSS content-visibility for Long Lists
 impact: HIGH
 impactDescription: faster initial render
 tags: rendering, css, content-visibility, long-lists
 ---
 ## CSS content-visibility for Long Lists
 Apply `content-visibility: auto` to defer off-screen rendering.
 **CSS:**
 ```css
 .message-item {
  content-visibility: auto;
  contain-intrinsic-size: 0 80px;
 }
 ```
 **Example:**
 ```tsx
 function MessageList({ messages }: { messages: Message[] }) {
  return (
    <div className="overflow-y-auto h-screen">
      {messages.map(msg => (
        <div key={msg.id} className="message-item">
          <Avatar user={msg.author} />
          <div>{msg.content}</div>
        </div>
      ))}
    </div>
  )
 }
 ```
 For 1000 messages, the browser skips layout and paint for the roughly 990 off-screen items, which can make the initial render about 10× faster.
--- a/.claude/skills/vercel-react-best-practices/rules/rendering-hoist-jsx.md
+++ b/.claude/skills/vercel-react-best-practices/rules/rendering-hoist-jsx.md
@@ -0,0 +1,46 @@
 ---
 title: Hoist Static JSX Elements
 impact: LOW
 impactDescription: avoids re-creation
 tags: rendering, jsx, static, optimization
 ---
 ## Hoist Static JSX Elements
 Extract static JSX outside components to avoid re-creation.
 **Incorrect (recreates element every render):**
 ```tsx
 function LoadingSkeleton() {
  return <div className="animate-pulse h-20 bg-gray-200" />
 }
 function Container() {
  return (
    <div>
      {loading && <LoadingSkeleton />}
    </div>
  )
 }
 ```
 **Correct (reuses same element):**
 ```tsx
 const loadingSkeleton = (
  <div className="animate-pulse h-20 bg-gray-200" />
 )
 function Container() {
  return (
    <div>
      {loading && loadingSkeleton}
    </div>
  )
 }
 ```
 This is especially helpful for large and static SVG nodes, which can be expensive to recreate on every render.
 **Note:** If your project has [React Compiler](https://react.dev/learn/react-compiler) enabled, the compiler automatically hoists static JSX elements and optimizes component re-renders, making manual hoisting unnecessary.
--- a/.claude/skills/vercel-react-best-practices/rules/rendering-hydration-no-flicker.md
+++ b/.claude/skills/vercel-react-best-practices/rules/rendering-hydration-no-flicker.md
@@ -0,0 +1,82 @@
 ---
 title: Prevent Hydration Mismatch Without Flickering
 impact: MEDIUM
 impactDescription: avoids visual flicker and hydration errors
 tags: rendering, ssr, hydration, localStorage, flicker
 ---
 ## Prevent Hydration Mismatch Without Flickering
 When rendering content that depends on client-side storage (localStorage, cookies), avoid both SSR breakage and post-hydration flickering by injecting a synchronous script that updates the DOM before React hydrates.
 **Incorrect (breaks SSR):**
 ```tsx
 function ThemeWrapper({ children }: { children: ReactNode }) {
  // localStorage is not available on server - throws error
  const theme = localStorage.getItem('theme') || 'light'
  return (
    <div className={theme}>
      {children}
    </div>
  )
 }
 ```
 Server-side rendering will fail because `localStorage` is undefined.
 **Incorrect (visual flickering):**
 ```tsx
 function ThemeWrapper({ children }: { children: ReactNode }) {
  const [theme, setTheme] = useState('light')
  useEffect(() => {
    // Runs after hydration - causes visible flash
    const stored = localStorage.getItem('theme')
    if (stored) {
      setTheme(stored)
    }
  }, [])
  return (
    <div className={theme}>
      {children}
    </div>
  )
 }
 ```
 Component first renders with default value (`light`), then updates after hydration, causing a visible flash of incorrect content.
 **Correct (no flicker, no hydration mismatch):**
 ```tsx
 function ThemeWrapper({ children }: { children: ReactNode }) {
  return (
    <>
      <div id="theme-wrapper">
        {children}
      </div>
      <script
        dangerouslySetInnerHTML={{
          __html: `
            (function() {
              try {
                var theme = localStorage.getItem('theme') || 'light';
                var el = document.getElementById('theme-wrapper');
                if (el) el.className = theme;
              } catch (e) {}
            })();
          `,
        }}
      />
    </>
  )
 }
 ```
 The inline script executes synchronously before showing the element, ensuring the DOM already has the correct value. No flickering, no hydration mismatch.
 This pattern is especially useful for theme toggles, user preferences, authentication states, and any client-only data that should render immediately without flashing default values.
--- a/.claude/skills/vercel-react-best-practices/rules/rendering-svg-precision.md
+++ b/.claude/skills/vercel-react-best-practices/rules/rendering-svg-precision.md
@@ -0,0 +1,28 @@
 ---
 title: Optimize SVG Precision
 impact: LOW
 impactDescription: reduces file size
 tags: rendering, svg, optimization, svgo
 ---
 ## Optimize SVG Precision
 Reduce SVG coordinate precision to decrease file size. The optimal precision depends on the viewBox size, but in general reducing precision should be considered.
 **Incorrect (excessive precision):**
 ```svg
 <path d="M 10.293847 20.847362 L 30.938472 40.192837" />
 ```
 **Correct (1 decimal place):**
 ```svg
 <path d="M 10.3 20.8 L 30.9 40.2" />
 ```
 **Automate with SVGO:**
 ```bash
 npx svgo --precision=1 --multipass icon.svg
 ```
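 **Sketch (what precision reduction does):** To make the transformation concrete, the function below rounds the decimal numbers in a path string. It is an illustration only and not how SVGO is implemented: `roundPathData` is a hypothetical name, and it ignores cases such as exponent notation that a real optimizer handles.
 ```typescript
 // Hypothetical helper for illustration only; SVGO performs many more optimizations.
 function roundPathData(d: string, precision: number): string {
  // Match decimal numbers and re-emit them with at most `precision` fraction digits
  return d.replace(/-?\d*\.\d+/g, match => String(Number(Number(match).toFixed(precision))))
 }
 const roundedPath = roundPathData('M 10.293847 20.847362 L 30.938472 40.192837', 1)
 // roundedPath is 'M 10.3 20.8 L 30.9 40.2'
 ```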
--- a/.claude/skills/vercel-react-best-practices/rules/rerender-defer-reads.md
+++ b/.claude/skills/vercel-react-best-practices/rules/rerender-defer-reads.md
@@ -0,0 +1,39 @@
 ---
 title: Defer State Reads to Usage Point
 impact: MEDIUM
 impactDescription: avoids unnecessary subscriptions
 tags: rerender, searchParams, localStorage, optimization
 ---
 ## Defer State Reads to Usage Point
 Don't subscribe to dynamic state (searchParams, localStorage) if you only read it inside callbacks.
 **Incorrect (subscribes to all searchParams changes):**
 ```tsx
 function ShareButton({ chatId }: { chatId: string }) {
  const searchParams = useSearchParams()
  const handleShare = () => {
    const ref = searchParams.get('ref')
    shareChat(chatId, { ref })
  }
  return <button onClick={handleShare}>Share</button>
 }
 ```
 **Correct (reads on demand, no subscription):**
 ```tsx
 function ShareButton({ chatId }: { chatId: string }) {
  const handleShare = () => {
    const params = new URLSearchParams(window.location.search)
    const ref = params.get('ref')
    shareChat(chatId, { ref })
  }
  return <button onClick={handleShare}>Share</button>
 }
 ```
--- a/.claude/skills/vercel-react-best-practices/rules/rerender-dependencies.md
+++ b/.claude/skills/vercel-react-best-practices/rules/rerender-dependencies.md
@@ -0,0 +1,45 @@
 ---
 title: Narrow Effect Dependencies
 impact: LOW
 impactDescription: minimizes effect re-runs
 tags: rerender, useEffect, dependencies, optimization
 ---
 ## Narrow Effect Dependencies
 Specify primitive dependencies instead of objects to minimize effect re-runs.
 **Incorrect (re-runs on any user field change):**
 ```tsx
 useEffect(() => {
  console.log(user.id)
 }, [user])
 ```
 **Correct (re-runs only when id changes):**
 ```tsx
 useEffect(() => {
  console.log(user.id)
 }, [user.id])
 ```
 **For derived state, compute outside effect:**
 ```tsx
 // Incorrect: runs on width=767, 766, 765...
 useEffect(() => {
  if (width < 768) {
    enableMobileMode()
  }
 }, [width])
 // Correct: runs only on boolean transition
 const isMobile = width < 768
 useEffect(() => {
  if (isMobile) {
    enableMobileMode()
  }
 }, [isMobile])
 ```
--- a/.claude/skills/vercel-react-best-practices/rules/rerender-derived-state.md
+++ b/.claude/skills/vercel-react-best-practices/rules/rerender-derived-state.md
@@ -0,0 +1,29 @@
 ---
 title: Subscribe to Derived State
 impact: MEDIUM
 impactDescription: reduces re-render frequency
 tags: rerender, derived-state, media-query, optimization
 ---
 ## Subscribe to Derived State
 Subscribe to derived boolean state instead of continuous values to reduce re-render frequency.
 **Incorrect (re-renders on every pixel change):**
 ```tsx
 function Sidebar() {
  const width = useWindowWidth()  // updates continuously
  const isMobile = width < 768
  return <nav className={isMobile ? 'mobile' : 'desktop'} />
 }
 ```
 **Correct (re-renders only when boolean changes):**
 ```tsx
 function Sidebar() {
  const isMobile = useMediaQuery('(max-width: 767px)')
  return <nav className={isMobile ? 'mobile' : 'desktop'} />
 }
 ```
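 **Sketch (the idea without React):** The principle behind the rule is to notify subscribers only when the derived value changes, not whenever the source value does. The framework-free model below is a sketch under assumptions: `onDerivedChange` is a hypothetical helper, and the width list simulates resize events.
 ```typescript
 type Listener<T> = (value: T) => void
 // Hypothetical helper: wraps a listener so it only fires when the derived value changes.
 function onDerivedChange<S, D>(derive: (source: S) => D, listener: Listener<D>): Listener<S> {
  let hasPrevious = false
  let previous: D | undefined
  return (source: S) => {
    const next = derive(source)
    if (hasPrevious && Object.is(previous, next)) return
    hasPrevious = true
    previous = next
    listener(next)
  }
 }
 // Six simulated width updates produce three notifications: the initial value plus two transitions.
 const notifications: boolean[] = []
 const handleWidth = onDerivedChange((width: number) => width < 768, isMobile => {
  notifications.push(isMobile)
 })
 for (const width of [1024, 1000, 800, 767, 700, 900]) {
  handleWidth(width)
 }
 ```
 A media-query hook gets the same effect from the browser, which only reports when the query result flips.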
--- a/.claude/skills/vercel-react-best-practices/rules/rerender-functional-setstate.md
+++ b/.claude/skills/vercel-react-best-practices/rules/rerender-functional-setstate.md
@@ -0,0 +1,74 @@
 ---
 title: Use Functional setState Updates
 impact: MEDIUM
 impactDescription: prevents stale closures and unnecessary callback recreations
 tags: react, hooks, useState, useCallback, callbacks, closures
 ---
 ## Use Functional setState Updates
 When updating state based on the current state value, use the functional update form of setState instead of directly referencing the state variable. This prevents stale closures, eliminates unnecessary dependencies, and creates stable callback references.
 **Incorrect (requires state as dependency):**
 ```tsx
 function TodoList() {
  const [items, setItems] = useState(initialItems)
  // Callback must depend on items, recreated on every items change
  const addItems = useCallback((newItems: Item[]) => {
    setItems([...items, ...newItems])
  }, [items])  // ❌ items dependency causes recreations
  // Risk of stale closure if dependency is forgotten
  const removeItem = useCallback((id: string) => {
    setItems(items.filter(item => item.id !== id))
  }, [])  // ❌ Missing items dependency - will use stale items!
  return <ItemsEditor items={items} onAdd={addItems} onRemove={removeItem} />
 }
 ```
 The first callback is recreated every time `items` changes, which can cause child components to re-render unnecessarily. The second callback has a stale closure bug—it will always reference the initial `items` value.
 **Correct (stable callbacks, no stale closures):**
 ```tsx
 function TodoList() {
  const [items, setItems] = useState(initialItems)
  // Stable callback, never recreated
  const addItems = useCallback((newItems: Item[]) => {
    setItems(curr => [...curr, ...newItems])
  }, [])  // ✅ No dependencies needed
  // Always uses latest state, no stale closure risk
  const removeItem = useCallback((id: string) => {
    setItems(curr => curr.filter(item => item.id !== id))
  }, [])  // ✅ Safe and stable
  return <ItemsEditor items={items} onAdd={addItems} onRemove={removeItem} />
 }
 ```
 **Benefits:**
 1. **Stable callback references** - Callbacks don't need to be recreated when state changes
 2. **No stale closures** - Always operates on the latest state value
 3. **Fewer dependencies** - Simplifies dependency arrays and reduces memory leaks
 4. **Prevents bugs** - Eliminates the most common source of React closure bugs
 **When to use functional updates:**
 - Any setState that depends on the current state value
 - Inside useCallback/useMemo when state is needed
 - Event handlers that reference state
 - Async operations that update state
 **When direct updates are fine:**
 - Setting state to a static value: `setCount(0)`
 - Setting state from props/arguments only: `setName(newName)`
 - State doesn't depend on previous value
 **Note:** If your project has [React Compiler](https://react.dev/learn/react-compiler) enabled, the compiler can automatically optimize some cases, but functional updates are still recommended for correctness and to prevent stale closure bugs.
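 **Sketch (why the direct form loses updates):** The stale-value problem can be reproduced without React. The state cell below is a simplified model, not React's implementation: it queues updates and applies them in order on `flush()`, similar to a batched render. `createStateCell` is a hypothetical name.
 ```typescript
 type Updater<T> = T | ((current: T) => T)
 // Simplified model of a state cell, NOT React's implementation.
 function createStateCell<T>(initial: T) {
  let value = initial
  const queue: Array<Updater<T>> = []
  return {
    read: () => value,
    set(update: Updater<T>) { queue.push(update) },
    flush() {
      for (const update of queue) {
        value = typeof update === 'function' ? (update as (current: T) => T)(value) : update
      }
      queue.length = 0
    },
  }
 }
 // Direct form: both updates were computed from the same captured snapshot.
 const directCell = createStateCell<string[]>([])
 const captured = directCell.read()
 directCell.set([...captured, 'a'])
 directCell.set([...captured, 'b'])
 directCell.flush() // value is ['b'], the 'a' update was overwritten
 // Functional form: each update receives the result of the previous one.
 const functionalCell = createStateCell<string[]>([])
 functionalCell.set(curr => [...curr, 'a'])
 functionalCell.set(curr => [...curr, 'b'])
 functionalCell.flush() // value is ['a', 'b']
 ```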
--- a/.claude/skills/vercel-react-best-practices/rules/rerender-lazy-state-init.md
+++ b/.claude/skills/vercel-react-best-practices/rules/rerender-lazy-state-init.md
@@ -0,0 +1,58 @@
 ---
 title: Use Lazy State Initialization
 impact: MEDIUM
 impactDescription: avoids wasted computation on every render
 tags: react, hooks, useState, performance, initialization
 ---
 ## Use Lazy State Initialization
 Pass a function to `useState` for expensive initial values. Without the function form, the initializer runs on every render even though the value is only used once.
 **Incorrect (runs on every render):**
 ```tsx
 function FilteredList({ items }: { items: Item[] }) {
  // buildSearchIndex() runs on EVERY render, even after initialization
  const [searchIndex, setSearchIndex] = useState(buildSearchIndex(items))
  const [query, setQuery] = useState('')
  // When query changes, buildSearchIndex runs again unnecessarily
  return <SearchResults index={searchIndex} query={query} />
 }
 function UserProfile() {
  // JSON.parse runs on every render
  const [settings, setSettings] = useState(
    JSON.parse(localStorage.getItem('settings') || '{}')
  )
  return <SettingsForm settings={settings} onChange={setSettings} />
 }
 ```
 **Correct (runs only once):**
 ```tsx
 function FilteredList({ items }: { items: Item[] }) {
  // buildSearchIndex() runs ONLY on initial render
  const [searchIndex, setSearchIndex] = useState(() => buildSearchIndex(items))
  const [query, setQuery] = useState('')
  return <SearchResults index={searchIndex} query={query} />
 }
 function UserProfile() {
  // JSON.parse runs only on initial render
  const [settings, setSettings] = useState(() => {
    const stored = localStorage.getItem('settings')
    return stored ? JSON.parse(stored) : {}
  })
  return <SettingsForm settings={settings} onChange={setSettings} />
 }
 ```
 Use lazy initialization when computing initial values from localStorage/sessionStorage, building data structures (indexes, maps), reading from the DOM, or performing heavy transformations.
 For simple primitives (`useState(0)`), direct references (`useState(props.value)`), or cheap literals (`useState({})`), the function form is unnecessary.
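 **Sketch (why the eager form re-runs):** The cost comes from JavaScript evaluating arguments before the call, so React never gets a chance to skip it. The model below is a simplification, not React's implementation: `createStateModel` and `buildIndex` are hypothetical names, and each loop iteration stands in for one render.
 ```typescript
 let indexBuilds = 0
 // Stand-in for an expensive initializer such as buildSearchIndex(items)
 function buildIndex(): string[] {
  indexBuilds++
  return ['a', 'b']
 }
 // Simplified model: the value is stored on the first call only,
 // but arguments are still evaluated on every call.
 function createStateModel<T>() {
  let initialized = false
  let stored: T | undefined
  return function useStateModel(initial: T | (() => T)): T {
    if (!initialized) {
      stored = typeof initial === 'function' ? (initial as () => T)() : initial
      initialized = true
    }
    return stored as T
  }
 }
 // Eager: buildIndex() is evaluated for each of the three simulated renders.
 const eagerState = createStateModel<string[]>()
 for (let render = 0; render < 3; render++) eagerState(buildIndex())
 const eagerBuilds = indexBuilds // 3
 // Lazy: the function is passed each time but only invoked on the first render.
 indexBuilds = 0
 const lazyState = createStateModel<string[]>()
 for (let render = 0; render < 3; render++) lazyState(() => buildIndex())
 const lazyBuilds = indexBuilds // 1
 ```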
--- a/.claude/skills/vercel-react-best-practices/rules/rerender-memo.md
+++ b/.claude/skills/vercel-react-best-practices/rules/rerender-memo.md
@@ -0,0 +1,44 @@
 ---
 title: Extract to Memoized Components
 impact: MEDIUM
 impactDescription: enables early returns
 tags: rerender, memo, useMemo, optimization
 ---
 ## Extract to Memoized Components
 Extract expensive work into memoized components to enable early returns before computation.
 **Incorrect (computes avatar even when loading):**
 ```tsx
 function Profile({ user, loading }: Props) {
  const avatar = useMemo(() => {
    const id = computeAvatarId(user)
    return <Avatar id={id} />
  }, [user])
  if (loading) return <Skeleton />
  return <div>{avatar}</div>
 }
 ```
 **Correct (skips computation when loading):**
 ```tsx
 const UserAvatar = memo(function UserAvatar({ user }: { user: User }) {
  const id = useMemo(() => computeAvatarId(user), [user])
  return <Avatar id={id} />
 })
 function Profile({ user, loading }: Props) {
  if (loading) return <Skeleton />
  return (
    <div>
      <UserAvatar user={user} />
    </div>
  )
 }
 ```
 **Note:** If your project has [React Compiler](https://react.dev/learn/react-compiler) enabled, manual memoization with `memo()` and `useMemo()` is not necessary. The compiler automatically optimizes re-renders.
--- a/.claude/skills/vercel-react-best-practices/rules/rerender-transitions.md
+++ b/.claude/skills/vercel-react-best-practices/rules/rerender-transitions.md
@@ -0,0 +1,40 @@
 ---
 title: Use Transitions for Non-Urgent Updates
 impact: MEDIUM
 impactDescription: maintains UI responsiveness
 tags: rerender, transitions, startTransition, performance
 ---
 ## Use Transitions for Non-Urgent Updates
 Mark frequent, non-urgent state updates as transitions to maintain UI responsiveness.
 **Incorrect (blocks UI on every scroll):**
 ```tsx
 function ScrollTracker() {
  const [scrollY, setScrollY] = useState(0)
  useEffect(() => {
    const handler = () => setScrollY(window.scrollY)
    window.addEventListener('scroll', handler, { passive: true })
    return () => window.removeEventListener('scroll', handler)
  }, [])
 }
 ```
 **Correct (non-blocking updates):**
 ```tsx
 import { startTransition } from 'react'
 function ScrollTracker() {
  const [scrollY, setScrollY] = useState(0)
  useEffect(() => {
    const handler = () => {
      startTransition(() => setScrollY(window.scrollY))
    }
    window.addEventListener('scroll', handler, { passive: true })
    return () => window.removeEventListener('scroll', handler)
  }, [])
 }
 ```
--- a/.claude/skills/vercel-react-best-practices/rules/server-after-nonblocking.md
+++ b/.claude/skills/vercel-react-best-practices/rules/server-after-nonblocking.md
@@ -0,0 +1,73 @@
 ---
 title: Use after() for Non-Blocking Operations
 impact: MEDIUM
 impactDescription: faster response times
 tags: server, async, logging, analytics, side-effects
 ---
 ## Use after() for Non-Blocking Operations
 Use Next.js's `after()` to schedule work that should execute after a response is sent. This prevents logging, analytics, and other side effects from blocking the response.
 **Incorrect (blocks response):**
 ```tsx
 import { logUserAction } from '@/app/utils'
 export async function POST(request: Request) {
  // Perform mutation
  await updateDatabase(request)
  // Logging blocks the response
  const userAgent = request.headers.get('user-agent') || 'unknown'
  await logUserAction({ userAgent })
  return new Response(JSON.stringify({ status: 'success' }), {
    status: 200,
    headers: { 'Content-Type': 'application/json' }
  })
 }
 ```
 **Correct (non-blocking):**
 ```tsx
 import { after } from 'next/server'
 import { headers, cookies } from 'next/headers'
 import { logUserAction } from '@/app/utils'
 export async function POST(request: Request) {
  // Perform mutation
  await updateDatabase(request)
  // Log after response is sent
  after(async () => {
    const userAgent = (await headers()).get('user-agent') || 'unknown'
    const sessionCookie = (await cookies()).get('session-id')?.value || 'anonymous'
     // Await so the callback stays pending until logging has finished
     await logUserAction({ sessionCookie, userAgent })
  })
  return new Response(JSON.stringify({ status: 'success' }), {
    status: 200,
    headers: { 'Content-Type': 'application/json' }
  })
 }
 ```
 The response is sent immediately while logging happens in the background.
 **Common use cases:**
 - Analytics tracking
 - Audit logging
 - Sending notifications
 - Cache invalidation
 - Cleanup tasks
 **Important notes:**
 - `after()` runs even if the response fails or redirects
 - Works in Server Actions, Route Handlers, and Server Components
 Reference: [https://nextjs.org/docs/app/api-reference/functions/after](https://nextjs.org/docs/app/api-reference/functions/after)
--- a/.claude/skills/vercel-react-best-practices/rules/server-cache-lru.md
+++ b/.claude/skills/vercel-react-best-practices/rules/server-cache-lru.md
@@ -0,0 +1,41 @@
 ---
 title: Cross-Request LRU Caching
 impact: HIGH
 impactDescription: caches across requests
 tags: server, cache, lru, cross-request
 ---
 ## Cross-Request LRU Caching
 `React.cache()` only works within one request. For data shared across sequential requests (user clicks button A then button B), use an LRU cache.
 **Implementation:**
 ```typescript
 import { LRUCache } from 'lru-cache'
 const cache = new LRUCache<string, any>({
  max: 1000,
  ttl: 5 * 60 * 1000  // 5 minutes
 })
 export async function getUser(id: string) {
  const cached = cache.get(id)
  if (cached) return cached
  const user = await db.user.findUnique({ where: { id } })
  cache.set(id, user)
  return user
 }
 // Request 1: DB query, result cached
 // Request 2: cache hit, no DB query
 ```
 Use when sequential user actions hit multiple endpoints needing the same data within seconds.
 **With Vercel's [Fluid Compute](https://vercel.com/docs/fluid-compute):** LRU caching is especially effective because multiple concurrent requests can share the same function instance and cache. This means the cache persists across requests without needing external storage like Redis.
 **In traditional serverless:** Each invocation runs in isolation, so consider Redis for cross-process caching.
 Reference: [https://github.com/isaacs/node-lru-cache](https://github.com/isaacs/node-lru-cache)
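
When concurrent requests share one instance, two requests for the same key can both miss the cache and both query the database. One possible refinement is to cache the pending promise rather than the resolved value. The sketch below is an illustration under stated assumptions: `TinyLRU` is a hypothetical Map-based stand-in (not the `lru-cache` API), and `fakeFindUser` replaces the real database call.

```typescript
// Hypothetical dependency-free LRU with TTL, used only for this sketch.
class TinyLRU<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>()
  private max: number
  private ttlMs: number

  constructor(max: number, ttlMs: number) {
    this.max = max
    this.ttlMs = ttlMs
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key)
    if (!entry) return undefined
    if (entry.expiresAt <= Date.now()) {
      this.entries.delete(key)
      return undefined
    }
    // Re-insert to mark as most recently used (Map keeps insertion order)
    this.entries.delete(key)
    this.entries.set(key, entry)
    return entry.value
  }

  set(key: string, value: V): void {
    this.entries.delete(key)
    this.entries.set(key, { value, expiresAt: Date.now() + this.ttlMs })
    if (this.entries.size > this.max) {
      // Evict the least recently used entry (first key in insertion order)
      const oldest = this.entries.keys().next().value
      if (oldest !== undefined) this.entries.delete(oldest)
    }
  }

  delete(key: string): void {
    this.entries.delete(key)
  }
}

type User = { id: string; name: string }

let dbCalls = 0
// Hypothetical stand-in for db.user.findUnique()
async function fakeFindUser(id: string): Promise<User> {
  dbCalls++
  return { id, name: `user-${id}` }
}

const userCache = new TinyLRU<Promise<User>>(1000, 5 * 60 * 1000)

function getUserShared(id: string): Promise<User> {
  const cached = userCache.get(id)
  if (cached) return cached
  // Store the in-flight promise so concurrent callers share one query;
  // drop it on failure so the error is not cached for the full TTL.
  const pending = fakeFindUser(id).catch((err) => {
    userCache.delete(id)
    throw err
  })
  userCache.set(id, pending)
  return pending
}
```

Calling `getUserShared('1')` twice before the first call resolves triggers a single `fakeFindUser` call. The trade-off is that callers always receive a promise, and failed lookups need the explicit eviction shown above.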
--- a/.claude/skills/vercel-react-best-practices/rules/server-cache-react.md
+++ b/.claude/skills/vercel-react-best-practices/rules/server-cache-react.md
@@ -0,0 +1,26 @@
 ---
 title: Per-Request Deduplication with React.cache()
 impact: MEDIUM
 impactDescription: deduplicates within request
 tags: server, cache, react-cache, deduplication
 ---
 ## Per-Request Deduplication with React.cache()
 Use `React.cache()` for server-side request deduplication. Authentication and database queries benefit most.
 **Usage:**
 ```typescript
 import { cache } from 'react'
 export const getCurrentUser = cache(async () => {
  const session = await auth()
  if (!session?.user?.id) return null
  return await db.user.findUnique({
    where: { id: session.user.id }
  })
 })
 ```
 Within a single request, multiple calls to `getCurrentUser()` execute the query only once.
--- a/.claude/skills/vercel-react-best-practices/rules/server-parallel-fetching.md
+++ b/.claude/skills/vercel-react-best-practices/rules/server-parallel-fetching.md
@@ -0,0 +1,79 @@
 ---
 title: Parallel Data Fetching with Component Composition
 impact: CRITICAL
 impactDescription: eliminates server-side waterfalls
 tags: server, rsc, parallel-fetching, composition
 ---
 ## Parallel Data Fetching with Component Composition
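 An async Server Component must finish its own awaits before the components it returns start rendering, so a fetch in a parent and a fetch in its child run one after the other. Restructure with composition so the fetches start in parallel.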
 **Incorrect (Sidebar waits for Page's fetch to complete):**
 ```tsx
 export default async function Page() {
  const header = await fetchHeader()
  return (
    <div>
      <div>{header}</div>
      <Sidebar />
    </div>
  )
 }
 async function Sidebar() {
  const items = await fetchSidebarItems()
  return <nav>{items.map(renderItem)}</nav>
 }
 ```
 **Correct (both fetch simultaneously):**
 ```tsx
 async function Header() {
  const data = await fetchHeader()
  return <div>{data}</div>
 }
 async function Sidebar() {
  const items = await fetchSidebarItems()
  return <nav>{items.map(renderItem)}</nav>
 }
 export default function Page() {
  return (
    <div>
      <Header />
      <Sidebar />
    </div>
  )
 }
 ```
 **Alternative with children prop:**
 ```tsx
 import type { ReactNode } from 'react'
 async function Layout({ children }: { children: ReactNode }) {
  const header = await fetchHeader()
  return (
    <div>
      <div>{header}</div>
      {children}
    </div>
  )
 }
 async function Sidebar() {
  const items = await fetchSidebarItems()
  return <nav>{items.map(renderItem)}</nav>
 }
 export default function Page() {
  return (
    <Layout>
      <Sidebar />
    </Layout>
  )
 }
 ```
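
The ordering that the first two examples produce can be modeled without React. This is a rough sketch, assuming hypothetical `fakeFetchHeader` and `fakeFetchSidebarItems` helpers that record when they start and finish; it models only the order of awaits, not React's rendering.

```typescript
const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms))

// Hypothetical stand-ins for fetchHeader() and fetchSidebarItems()
async function fakeFetchHeader(log: string[]): Promise<string> {
  log.push('header:start')
  await sleep(20)
  log.push('header:end')
  return 'Header'
}

async function fakeFetchSidebarItems(log: string[]): Promise<string[]> {
  log.push('sidebar:start')
  await sleep(20)
  log.push('sidebar:end')
  return ['Home', 'Settings']
}

// Models the waterfall: the second fetch starts only after the first resolves.
async function loadSequential(log: string[]) {
  const header = await fakeFetchHeader(log)
  const items = await fakeFetchSidebarItems(log)
  return { header, items }
}

// Models sibling components: both fetches start before either resolves.
async function loadParallel(log: string[]) {
  const [header, items] = await Promise.all([
    fakeFetchHeader(log),
    fakeFetchSidebarItems(log),
  ])
  return { header, items }
}
```

Inspecting the log after each call shows `sidebar:start` recorded after `header:end` in the sequential version and before it in the parallel version.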
--- a/.claude/skills/vercel-react-best-practices/rules/server-serialization.md
+++ b/.claude/skills/vercel-react-best-practices/rules/server-serialization.md
@@ -0,0 +1,38 @@
 ---
 title: Minimize Serialization at RSC Boundaries
 impact: HIGH
 impactDescription: reduces data transfer size
 tags: server, rsc, serialization, props
 ---
 ## Minimize Serialization at RSC Boundaries
 Props passed across the React Server/Client boundary are serialized and embedded in the HTML response and in subsequent RSC responses. This serialized data adds directly to page weight and load time, so pass only the fields the client actually uses.
 **Incorrect (serializes all 50 fields):**
 ```tsx
 async function Page() {
  const user = await fetchUser()  // 50 fields
  return <Profile user={user} />
 }
 'use client'
 function Profile({ user }: { user: User }) {
  return <div>{user.name}</div>  // uses 1 field
 }
 ```
 **Correct (serializes only 1 field):**
 ```tsx
 async function Page() {
  const user = await fetchUser()
  return <Profile name={user.name} />
 }
 'use client'
 function Profile({ name }: { name: string }) {
  return <div>{name}</div>
 }
 ```
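
To get a feel for the size difference, the sketch below approximates the payload with `JSON.stringify`. This is an assumption for illustration only: React uses its own wire format for RSC payloads, so real byte counts will differ, and `makeUser` and `approxPayloadBytes` are hypothetical helpers.

```typescript
type UserRecord = Record<string, string>

// Hypothetical record with `fieldCount` string fields, including `name`
function makeUser(fieldCount: number): UserRecord {
  const user: UserRecord = { name: 'Ada' }
  for (let i = 1; i < fieldCount; i++) {
    user[`field${i}`] = `value-${i}`
  }
  return user
}

// Rough proxy for serialized size: UTF-8 byte length of the JSON form
function approxPayloadBytes(props: unknown): number {
  return new TextEncoder().encode(JSON.stringify(props)).length
}

const fullUser = makeUser(50)
const fullBytes = approxPayloadBytes({ user: fullUser })
const pickedBytes = approxPayloadBytes({ name: fullUser.name })

console.log({ fullBytes, pickedBytes })
```

The same comparison can be run against real objects during review to spot props that carry far more data than the client component reads.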
--- a/.dockerignore
+++ b/.dockerignore
@@ -1,6 +1,9 @@
 # Ignore everything by default, selectively add things to context
 *
 # Documentation (for embeddings/search)
 !docs/
 # Platform - Libs
 !autogpt_platform/autogpt_libs/autogpt_libs/
 !autogpt_platform/autogpt_libs/pyproject.toml
--- a/.github/workflows/classic-autogpt-ci.yml
+++ b/.github/workflows/classic-autogpt-ci.yml
@@ -6,11 +6,15 @@ on:
    paths:
      - '.github/workflows/classic-autogpt-ci.yml'
      - 'classic/original_autogpt/**'
      - 'classic/direct_benchmark/**'
      - 'classic/forge/**'
  pull_request:
    branches: [ master, dev, release-* ]
    paths:
      - '.github/workflows/classic-autogpt-ci.yml'
      - 'classic/original_autogpt/**'
      - 'classic/direct_benchmark/**'
      - 'classic/forge/**'
 concurrency:
  group: ${{ format('classic-autogpt-ci-{0}', github.head_ref && format('{0}-{1}', github.event_name, github.event.pull_request.number) || github.sha) }}
@@ -19,47 +23,22 @@ concurrency:
 defaults:
  run:
    shell: bash
-    working-directory: classic/original_autogpt
+    working-directory: classic
 jobs:
  test:
    permissions:
      contents: read
    timeout-minutes: 30
-    strategy:
+    runs-on: ubuntu-latest
      fail-fast: false
      matrix:
        python-version: ["3.10"]
        platform-os: [ubuntu, macos, macos-arm64, windows]
    runs-on: ${{ matrix.platform-os != 'macos-arm64' && format('{0}-latest', matrix.platform-os) || 'macos-14' }}
    steps:
-      # Quite slow on macOS (2~4 minutes to set up Docker)
+      - name: Start MinIO service
      # - name: Set up Docker (macOS)
      #   if: runner.os == 'macOS'
      #   uses: crazy-max/ghaction-setup-docker@v3
      - name: Start MinIO service (Linux)
        if: runner.os == 'Linux'
        working-directory: '.'
        run: |
          docker pull minio/minio:edge-cicd
          docker run -d -p 9000:9000 minio/minio:edge-cicd
      - name: Start MinIO service (macOS)
        if: runner.os == 'macOS'
        working-directory: ${{ runner.temp }}
        run: |
          brew install minio/stable/minio
          mkdir data
          minio server ./data &
      # No MinIO on Windows:
      # - Windows doesn't support running Linux Docker containers
      # - It doesn't seem possible to start background processes on Windows. They are
      #   killed after the step returns.
      #   See: https://github.com/actions/runner/issues/598#issuecomment-2011890429
      - name: Checkout repository
        uses: actions/checkout@v4
        with:
@@ -71,41 +50,23 @@ jobs:
          git config --global user.name "Auto-GPT-Bot"
          git config --global user.email "github-bot@agpt.co"
-      - name: Set up Python ${{ matrix.python-version }}
+      - name: Set up Python 3.12
        uses: actions/setup-python@v5
        with:
-          python-version: ${{ matrix.python-version }}
+          python-version: "3.12"
      - id: get_date
        name: Get date
        run: echo "date=$(date +'%Y-%m-%d')" >> $GITHUB_OUTPUT
      - name: Set up Python dependency cache
        # On Windows, unpacking cached dependencies takes longer than just installing them
        if: runner.os != 'Windows'
        uses: actions/cache@v4
        with:
-          path: ${{ runner.os == 'macOS' && '~/Library/Caches/pypoetry' || '~/.cache/pypoetry' }}
+          path: ~/.cache/pypoetry
-          key: poetry-${{ runner.os }}-${{ hashFiles('classic/original_autogpt/poetry.lock') }}
+          key: poetry-${{ runner.os }}-${{ hashFiles('classic/poetry.lock') }}
-      - name: Install Poetry (Unix)
+      - name: Install Poetry
-        if: runner.os != 'Windows'
+        run: curl -sSL https://install.python-poetry.org | python3 -
        run: |
          curl -sSL https://install.python-poetry.org | python3 -
          if [ "${{ runner.os }}" = "macOS" ]; then
            PATH="$HOME/.local/bin:$PATH"
            echo "$HOME/.local/bin" >> $GITHUB_PATH
          fi
      - name: Install Poetry (Windows)
        if: runner.os == 'Windows'
        shell: pwsh
        run: |
          (Invoke-WebRequest -Uri https://install.python-poetry.org -UseBasicParsing).Content | python -
          $env:PATH += ";$env:APPDATA\Python\Scripts"
          echo "$env:APPDATA\Python\Scripts" >> $env:GITHUB_PATH
      - name: Install Python dependencies
        run: poetry install
@@ -116,12 +77,12 @@ jobs:
            --cov=autogpt --cov-branch --cov-report term-missing --cov-report xml \
            --numprocesses=logical --durations=10 \
            --junitxml=junit.xml -o junit_family=legacy \
-            tests/unit tests/integration
+            original_autogpt/tests/unit original_autogpt/tests/integration
        env:
          CI: true
          PLAIN_OUTPUT: True
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
-          S3_ENDPOINT_URL: ${{ runner.os != 'Windows' && 'http://127.0.0.1:9000' || '' }}
+          S3_ENDPOINT_URL: http://127.0.0.1:9000
          AWS_ACCESS_KEY_ID: minioadmin
          AWS_SECRET_ACCESS_KEY: minioadmin
@@ -135,11 +96,11 @@ jobs:
        uses: codecov/codecov-action@v5
        with:
          token: ${{ secrets.CODECOV_TOKEN }}
-          flags: autogpt-agent,${{ runner.os }}
+          flags: autogpt-agent
      - name: Upload logs to artifact
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: test-logs
-          path: classic/original_autogpt/logs/
+          path: classic/logs/
--- a/.github/workflows/classic-autogpts-ci.yml
+++ b/.github/workflows/classic-autogpts-ci.yml
@@ -11,9 +11,6 @@ on:
      - 'classic/original_autogpt/**'
      - 'classic/forge/**'
      - 'classic/benchmark/**'
      - 'classic/run'
      - 'classic/cli.py'
      - 'classic/setup.py'
      - '!**/*.md'
  pull_request:
    branches: [ master, dev, release-* ]
@@ -22,9 +19,6 @@ on:
      - 'classic/original_autogpt/**'
      - 'classic/forge/**'
      - 'classic/benchmark/**'
      - 'classic/run'
      - 'classic/cli.py'
      - 'classic/setup.py'
      - '!**/*.md'
 defaults:
@@ -35,13 +29,9 @@ defaults:
 jobs:
  serve-agent-protocol:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        agent-name: [ original_autogpt ]
      fail-fast: false
    timeout-minutes: 20
    env:
-      min-python-version: '3.10'
+      min-python-version: '3.12'
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
@@ -55,22 +45,22 @@ jobs:
          python-version: ${{ env.min-python-version }}
      - name: Install Poetry
        working-directory: ./classic/${{ matrix.agent-name }}/
        run: |
          curl -sSL https://install.python-poetry.org | python -
-      - name: Run regression tests
+      - name: Install dependencies
        run: poetry install
      - name: Run smoke tests with direct-benchmark
        run: |
-          ./run agent start ${{ matrix.agent-name }}
+          poetry run direct-benchmark run \
-          cd ${{ matrix.agent-name }}
+            --strategies one_shot \
-          poetry run agbenchmark --mock --test=BasicRetrieval --test=Battleship --test=WebArenaTask_0
+            --models claude \
-          poetry run agbenchmark --test=WriteFile
+            --tests ReadFile,WriteFile \
            --json
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
-          AGENT_NAME: ${{ matrix.agent-name }}
+          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          REQUESTS_CA_BUNDLE: /etc/ssl/certs/ca-certificates.crt
-          HELICONE_CACHE_ENABLED: false
+          NONINTERACTIVE_MODE: "true"
-          HELICONE_PROPERTY_AGENT: ${{ matrix.agent-name }}
+          CI: true
          REPORTS_FOLDER: ${{ format('../../reports/{0}', matrix.agent-name) }}
          TELEMETRY_ENVIRONMENT: autogpt-ci
          TELEMETRY_OPT_IN: ${{ github.ref_name == 'master' }}
--- a/.github/workflows/classic-benchmark-ci.yml
+++ b/.github/workflows/classic-benchmark-ci.yml
@@ -1,17 +1,21 @@
-name: Classic - AGBenchmark CI
+name: Classic - Direct Benchmark CI
 on:
  push:
    branches: [ master, dev, ci-test* ]
    paths:
-      - 'classic/benchmark/**'
+      - 'classic/direct_benchmark/**'
-      - '!classic/benchmark/reports/**'
+      - 'classic/benchmark/agbenchmark/challenges/**'
      - 'classic/original_autogpt/**'
      - 'classic/forge/**'
      - .github/workflows/classic-benchmark-ci.yml
  pull_request:
    branches: [ master, dev, release-* ]
    paths:
-      - 'classic/benchmark/**'
+      - 'classic/direct_benchmark/**'
-      - '!classic/benchmark/reports/**'
+      - 'classic/benchmark/agbenchmark/challenges/**'
      - 'classic/original_autogpt/**'
      - 'classic/forge/**'
      - .github/workflows/classic-benchmark-ci.yml
 concurrency:
@@ -23,23 +27,16 @@ defaults:
    shell: bash
 env:
-  min-python-version: '3.10'
+  min-python-version: '3.12'
 jobs:
-  test:
+  benchmark-tests:
-    permissions:
+    runs-on: ubuntu-latest
      contents: read
    timeout-minutes: 30
    strategy:
      fail-fast: false
      matrix:
        python-version: ["3.10"]
        platform-os: [ubuntu, macos, macos-arm64, windows]
    runs-on: ${{ matrix.platform-os != 'macos-arm64' && format('{0}-latest', matrix.platform-os) || 'macos-14' }}
    defaults:
      run:
        shell: bash
-        working-directory: classic/benchmark
+        working-directory: classic
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
@@ -47,71 +44,84 @@ jobs:
          fetch-depth: 0
          submodules: true
-      - name: Set up Python ${{ matrix.python-version }}
+      - name: Set up Python ${{ env.min-python-version }}
        uses: actions/setup-python@v5
        with:
-          python-version: ${{ matrix.python-version }}
+          python-version: ${{ env.min-python-version }}
      - name: Set up Python dependency cache
        # On Windows, unpacking cached dependencies takes longer than just installing them
        if: runner.os != 'Windows'
        uses: actions/cache@v4
        with:
-          path: ${{ runner.os == 'macOS' && '~/Library/Caches/pypoetry' || '~/.cache/pypoetry' }}
+          path: ~/.cache/pypoetry
-          key: poetry-${{ runner.os }}-${{ hashFiles('classic/benchmark/poetry.lock') }}
+          key: poetry-${{ runner.os }}-${{ hashFiles('classic/poetry.lock') }}
-      - name: Install Poetry (Unix)
+      - name: Install Poetry
        if: runner.os != 'Windows'
        run: |
          curl -sSL https://install.python-poetry.org | python3 -
-          if [ "${{ runner.os }}" = "macOS" ]; then
+      - name: Install dependencies
            PATH="$HOME/.local/bin:$PATH"
            echo "$HOME/.local/bin" >> $GITHUB_PATH
          fi
      - name: Install Poetry (Windows)
        if: runner.os == 'Windows'
        shell: pwsh
        run: |
          (Invoke-WebRequest -Uri https://install.python-poetry.org -UseBasicParsing).Content | python -
          $env:PATH += ";$env:APPDATA\Python\Scripts"
          echo "$env:APPDATA\Python\Scripts" >> $env:GITHUB_PATH
      - name: Install Python dependencies
        run: poetry install
-      - name: Run pytest with coverage
+      - name: Run basic benchmark tests
        run: |
-          poetry run pytest -vv \
+          echo "Testing ReadFile challenge with one_shot strategy..."
-            --cov=agbenchmark --cov-branch --cov-report term-missing --cov-report xml \
+          poetry run direct-benchmark run \
-            --durations=10 \
+            --strategies one_shot \
-            --junitxml=junit.xml -o junit_family=legacy \
+            --models claude \
-            tests
+            --tests ReadFile \
            --json
          echo "Testing WriteFile challenge..."
          poetry run direct-benchmark run \
            --strategies one_shot \
            --models claude \
            --tests WriteFile \
            --json
        env:
          CI: true
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
          NONINTERACTIVE_MODE: "true"
-      - name: Upload test results to Codecov
+      - name: Test category filtering
-        if: ${{ !cancelled() }}  # Run even if tests fail
+        run: |
-        uses: codecov/test-results-action@v1
+          echo "Testing coding category..."
-        with:
+          poetry run direct-benchmark run \
-          token: ${{ secrets.CODECOV_TOKEN }}
+            --strategies one_shot \
            --models claude \
            --categories coding \
            --tests ReadFile,WriteFile \
            --json
        env:
          CI: true
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
          NONINTERACTIVE_MODE: "true"
-      - name: Upload coverage reports to Codecov
+      - name: Test multiple strategies
-        uses: codecov/codecov-action@v5
+        run: |
-        with:
+          echo "Testing multiple strategies..."
-          token: ${{ secrets.CODECOV_TOKEN }}
+          poetry run direct-benchmark run \
-          flags: agbenchmark,${{ runner.os }}
+            --strategies one_shot,plan_execute \
            --models claude \
            --tests ReadFile \
            --parallel 2 \
            --json
        env:
          CI: true
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
          NONINTERACTIVE_MODE: "true"
-  self-test-with-agent:
+  # Run regression tests on maintain challenges
  regression-tests:
    runs-on: ubuntu-latest
-    strategy:
+    timeout-minutes: 45
-      matrix:
+    if: github.ref == 'refs/heads/master' || github.ref == 'refs/heads/dev'
-        agent-name: [forge]
+    defaults:
-      fail-fast: false
+      run:
-    timeout-minutes: 20
+        shell: bash
        working-directory: classic
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
@@ -126,51 +136,22 @@ jobs:
      - name: Install Poetry
        run: |
-          curl -sSL https://install.python-poetry.org | python -
+          curl -sSL https://install.python-poetry.org | python3 -
      - name: Install dependencies
        run: poetry install
      - name: Run regression tests
        working-directory: classic
        run: |
-          ./run agent start ${{ matrix.agent-name }}
+          echo "Running regression tests (previously beaten challenges)..."
-          cd ${{ matrix.agent-name }}
+          poetry run direct-benchmark run \
-
+            --strategies one_shot \
-          set +e # Ignore non-zero exit codes and continue execution
+            --models claude \
-          echo "Running the following command: poetry run agbenchmark --maintain --mock"
+            --maintain \
-          poetry run agbenchmark --maintain --mock
+            --parallel 4 \
-          EXIT_CODE=$?
+            --json
          set -e  # Stop ignoring non-zero exit codes
          # Check if the exit code was 5, and if so, exit with 0 instead
          if [ $EXIT_CODE -eq 5 ]; then
            echo "regression_tests.json is empty."
          fi
          echo "Running the following command: poetry run agbenchmark --mock"
          poetry run agbenchmark --mock
          echo "Running the following command: poetry run agbenchmark --mock --category=data"
          poetry run agbenchmark --mock --category=data
          echo "Running the following command: poetry run agbenchmark --mock --category=coding"
          poetry run agbenchmark --mock --category=coding
          # echo "Running the following command: poetry run agbenchmark --test=WriteFile"
          # poetry run agbenchmark --test=WriteFile
          cd ../benchmark
          poetry install
          echo "Adding the BUILD_SKILL_TREE environment variable. This will attempt to add new elements in the skill tree. If new elements are added, the CI fails because they should have been pushed"
          export BUILD_SKILL_TREE=true
          # poetry run agbenchmark --mock
          # CHANGED=$(git diff --name-only | grep -E '(agbenchmark/challenges)|(../classic/frontend/assets)') || echo "No diffs"
          # if [ ! -z "$CHANGED" ]; then
          #   echo "There are unstaged changes please run agbenchmark and commit those changes since they are needed."
          #   echo "$CHANGED"
          #   exit 1
          # else
          #   echo "No unstaged changes."
          # fi
        env:
          CI: true
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
-          TELEMETRY_ENVIRONMENT: autogpt-benchmark-ci
+          NONINTERACTIVE_MODE: "true"
          TELEMETRY_OPT_IN: ${{ github.ref_name == 'master' }}
--- a/.github/workflows/classic-forge-ci.yml
+++ b/.github/workflows/classic-forge-ci.yml
@@ -6,13 +6,11 @@ on:
    paths:
      - '.github/workflows/classic-forge-ci.yml'
      - 'classic/forge/**'
      - '!classic/forge/tests/vcr_cassettes'
  pull_request:
    branches: [ master, dev, release-* ]
    paths:
      - '.github/workflows/classic-forge-ci.yml'
      - 'classic/forge/**'
      - '!classic/forge/tests/vcr_cassettes'
 concurrency:
  group: ${{ format('forge-ci-{0}', github.head_ref && format('{0}-{1}', github.event_name, github.event.pull_request.number) || github.sha) }}
@@ -21,115 +19,38 @@ concurrency:
 defaults:
  run:
    shell: bash
-    working-directory: classic/forge
+    working-directory: classic
 jobs:
  test:
    permissions:
      contents: read
    timeout-minutes: 30
-    strategy:
+    runs-on: ubuntu-latest
      fail-fast: false
      matrix:
        python-version: ["3.10"]
        platform-os: [ubuntu, macos, macos-arm64, windows]
    runs-on: ${{ matrix.platform-os != 'macos-arm64' && format('{0}-latest', matrix.platform-os) || 'macos-14' }}
    steps:
-      # Quite slow on macOS (2~4 minutes to set up Docker)
+      - name: Start MinIO service
      # - name: Set up Docker (macOS)
      #   if: runner.os == 'macOS'
      #   uses: crazy-max/ghaction-setup-docker@v3
      - name: Start MinIO service (Linux)
        if: runner.os == 'Linux'
        working-directory: '.'
        run: |
          docker pull minio/minio:edge-cicd
          docker run -d -p 9000:9000 minio/minio:edge-cicd
      - name: Start MinIO service (macOS)
        if: runner.os == 'macOS'
        working-directory: ${{ runner.temp }}
        run: |
          brew install minio/stable/minio
          mkdir data
          minio server ./data &
      # No MinIO on Windows:
      # - Windows doesn't support running Linux Docker containers
      # - It doesn't seem possible to start background processes on Windows. They are
      #   killed after the step returns.
      #   See: https://github.com/actions/runner/issues/598#issuecomment-2011890429
      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          fetch-depth: 0
          submodules: true
-      - name: Checkout cassettes
+      - name: Set up Python 3.12
        if: ${{ startsWith(github.event_name, 'pull_request') }}
        env:
          PR_BASE: ${{ github.event.pull_request.base.ref }}
          PR_BRANCH: ${{ github.event.pull_request.head.ref }}
          PR_AUTHOR: ${{ github.event.pull_request.user.login }}
        run: |
          cassette_branch="${PR_AUTHOR}-${PR_BRANCH}"
          cassette_base_branch="${PR_BASE}"
          cd tests/vcr_cassettes
          if ! git ls-remote --exit-code --heads origin $cassette_base_branch ; then
            cassette_base_branch="master"
          fi
          if git ls-remote --exit-code --heads origin $cassette_branch ; then
            git fetch origin $cassette_branch
            git fetch origin $cassette_base_branch
            git checkout $cassette_branch
            # Pick non-conflicting cassette updates from the base branch
            git merge --no-commit --strategy-option=ours origin/$cassette_base_branch
            echo "Using cassettes from mirror branch '$cassette_branch'," \
              "synced to upstream branch '$cassette_base_branch'."
          else
            git checkout -b $cassette_branch
            echo "Branch '$cassette_branch' does not exist in cassette submodule." \
              "Using cassettes from '$cassette_base_branch'."
          fi
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
-          python-version: ${{ matrix.python-version }}
+          python-version: "3.12"
      - name: Set up Python dependency cache
        # On Windows, unpacking cached dependencies takes longer than just installing them
        if: runner.os != 'Windows'
        uses: actions/cache@v4
        with:
-          path: ${{ runner.os == 'macOS' && '~/Library/Caches/pypoetry' || '~/.cache/pypoetry' }}
+          path: ~/.cache/pypoetry
-          key: poetry-${{ runner.os }}-${{ hashFiles('classic/forge/poetry.lock') }}
+          key: poetry-${{ runner.os }}-${{ hashFiles('classic/poetry.lock') }}
-      - name: Install Poetry (Unix)
+      - name: Install Poetry
-        if: runner.os != 'Windows'
+        run: curl -sSL https://install.python-poetry.org | python3 -
        run: |
          curl -sSL https://install.python-poetry.org | python3 -
          if [ "${{ runner.os }}" = "macOS" ]; then
            PATH="$HOME/.local/bin:$PATH"
            echo "$HOME/.local/bin" >> $GITHUB_PATH
          fi
      - name: Install Poetry (Windows)
        if: runner.os == 'Windows'
        shell: pwsh
        run: |
          (Invoke-WebRequest -Uri https://install.python-poetry.org -UseBasicParsing).Content | python -
          $env:PATH += ";$env:APPDATA\Python\Scripts"
          echo "$env:APPDATA\Python\Scripts" >> $env:GITHUB_PATH
      - name: Install Python dependencies
        run: poetry install
@@ -140,12 +61,15 @@ jobs:
            --cov=forge --cov-branch --cov-report term-missing --cov-report xml \
            --durations=10 \
            --junitxml=junit.xml -o junit_family=legacy \
-            forge
+            forge/forge forge/tests
        env:
          CI: true
          PLAIN_OUTPUT: True
          # API keys - tests that need these will skip if not available
          # Secrets are not available to fork PRs (GitHub security feature)
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
-          S3_ENDPOINT_URL: ${{ runner.os != 'Windows' && 'http://127.0.0.1:9000' || '' }}
+          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          S3_ENDPOINT_URL: http://127.0.0.1:9000
          AWS_ACCESS_KEY_ID: minioadmin
          AWS_SECRET_ACCESS_KEY: minioadmin
@@ -159,85 +83,11 @@ jobs:
        uses: codecov/codecov-action@v5
        with:
          token: ${{ secrets.CODECOV_TOKEN }}
-          flags: forge,${{ runner.os }}
+          flags: forge
      - id: setup_git_auth
        name: Set up git token authentication
        # Cassettes may be pushed even when tests fail
        if: success() || failure()
        run: |
          config_key="http.${{ github.server_url }}/.extraheader"
          if [ "${{ runner.os }}" = 'macOS' ]; then
            base64_pat=$(echo -n "pat:${{ secrets.PAT_REVIEW }}" | base64)
          else
            base64_pat=$(echo -n "pat:${{ secrets.PAT_REVIEW }}" | base64 -w0)
          fi
          git config "$config_key" \
            "Authorization: Basic $base64_pat"
          cd tests/vcr_cassettes
          git config "$config_key" \
            "Authorization: Basic $base64_pat"
          echo "config_key=$config_key" >> $GITHUB_OUTPUT
      - id: push_cassettes
        name: Push updated cassettes
        # For pull requests, push updated cassettes even when tests fail
        if: github.event_name == 'push' || (! github.event.pull_request.head.repo.fork && (success() || failure()))
        env:
          PR_BRANCH: ${{ github.event.pull_request.head.ref }}
          PR_AUTHOR: ${{ github.event.pull_request.user.login }}
        run: |
          if [ "${{ startsWith(github.event_name, 'pull_request') }}" = "true" ]; then
            is_pull_request=true
            cassette_branch="${PR_AUTHOR}-${PR_BRANCH}"
          else
            cassette_branch="${{ github.ref_name }}"
          fi
          cd tests/vcr_cassettes
          # Commit & push changes to cassettes if any
          if ! git diff --quiet; then
            git add .
            git commit -m "Auto-update cassettes"
            git push origin HEAD:$cassette_branch
            if [ ! $is_pull_request ]; then
              cd ../..
              git add tests/vcr_cassettes
              git commit -m "Update cassette submodule"
              git push origin HEAD:$cassette_branch
            fi
            echo "updated=true" >> $GITHUB_OUTPUT
          else
            echo "updated=false" >> $GITHUB_OUTPUT
            echo "No cassette changes to commit"
          fi
      - name: Post Set up git token auth
        if: steps.setup_git_auth.outcome == 'success'
        run: |
          git config --unset-all '${{ steps.setup_git_auth.outputs.config_key }}'
          git submodule foreach git config --unset-all '${{ steps.setup_git_auth.outputs.config_key }}'
      - name: Apply "behaviour change" label and comment on PR
        if: ${{ startsWith(github.event_name, 'pull_request') }}
        run: |
          PR_NUMBER="${{ github.event.pull_request.number }}"
          TOKEN="${{ secrets.PAT_REVIEW }}"
          REPO="${{ github.repository }}"
          if [[ "${{ steps.push_cassettes.outputs.updated }}" == "true" ]]; then
            echo "Adding label and comment..."
            echo $TOKEN | gh auth login --with-token
            gh issue edit $PR_NUMBER --add-label "behaviour change"
            gh issue comment $PR_NUMBER --body "You changed AutoGPT's behaviour on ${{ runner.os }}. The cassettes have been updated and will be merged to the submodule when this Pull Request gets merged."
          fi
      - name: Upload logs to artifact
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: test-logs
-          path: classic/forge/logs/
+          path: classic/logs/
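Note on the `${{ cond && a || b }}` expressions dropped in this file (for example the `runner.os == 'macOS' && ... || ...` cache path and the `S3_ENDPOINT_URL` line): GitHub Actions expressions have no ternary operator, so workflows emulate one with `&&` and `||`. The sketch below models that idiom with Python's `and`/`or`, which return operand values in a similar way. The helper name `gha_ternary` is hypothetical, and Python truthiness only approximates the Actions rules, so treat it as an analogy rather than a reimplementation.

```python
def gha_ternary(condition, if_true, if_false):
    """Approximate `${{ condition && if_true || if_false }}`.

    Hypothetical helper for illustration only. Like the workflow idiom,
    it falls through to `if_false` whenever `if_true` is itself falsy
    (for example an empty string).
    """
    return (condition and if_true) or if_false


# The common case behaves like a ternary.
linux_path = gha_ternary(False, "~/Library/Caches/pypoetry", "~/.cache/pypoetry")
mac_path = gha_ternary(True, "~/Library/Caches/pypoetry", "~/.cache/pypoetry")

# Pitfall: a falsy "true" branch is silently replaced by the fallback.
surprising = gha_ternary(True, "", "fallback")
```

With a single runner OS, the hard-coded values in the new version of the workflow avoid this idiom altogether.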
--- a/.github/workflows/classic-frontend-ci.yml
+++ b/.github/workflows/classic-frontend-ci.yml
@@ -1,60 +0,0 @@
 name: Classic - Frontend CI/CD
 on:
  push:
    branches:
      - master
      - dev
      - 'ci-test*' # This will match any branch that starts with "ci-test"
    paths:
      - 'classic/frontend/**'
      - '.github/workflows/classic-frontend-ci.yml'
  pull_request:
    paths:
      - 'classic/frontend/**'
      - '.github/workflows/classic-frontend-ci.yml'
 jobs:
  build:
    permissions:
      contents: write
      pull-requests: write
    runs-on: ubuntu-latest
    env:
      BUILD_BRANCH: ${{ format('classic-frontend-build/{0}', github.ref_name) }}
    steps:
      - name: Checkout Repo
        uses: actions/checkout@v4
      - name: Setup Flutter
        uses: subosito/flutter-action@v2
        with:
          flutter-version: '3.13.2'
      - name: Build Flutter to Web
        run: |
          cd classic/frontend
          flutter build web --base-href /app/
      # - name: Commit and Push to ${{ env.BUILD_BRANCH }}
      #   if: github.event_name == 'push'
      #   run: |
      #     git config --local user.email "action@github.com"
      #     git config --local user.name "GitHub Action"
      #     git add classic/frontend/build/web
      #     git checkout -B ${{ env.BUILD_BRANCH }}
      #     git commit -m "Update frontend build to ${GITHUB_SHA:0:7}" -a
      #     git push -f origin ${{ env.BUILD_BRANCH }}
      - name: Create PR ${{ env.BUILD_BRANCH }} -> ${{ github.ref_name }}
        if: github.event_name == 'push'
        uses: peter-evans/create-pull-request@v7
        with:
          add-paths: classic/frontend/build/web
          base: ${{ github.ref_name }}
          branch: ${{ env.BUILD_BRANCH }}
          delete-branch: true
          title: "Update frontend build in `${{ github.ref_name }}`"
          body: "This PR updates the frontend build based on commit ${{ github.sha }}."
          commit-message: "Update frontend build based on commit ${{ github.sha }}"
--- a/.github/workflows/classic-python-checks.yml
+++ b/.github/workflows/classic-python-checks.yml
@@ -7,7 +7,9 @@ on:
      - '.github/workflows/classic-python-checks-ci.yml'
      - 'classic/original_autogpt/**'
      - 'classic/forge/**'
-      - 'classic/benchmark/**'
+      - 'classic/direct_benchmark/**'
      - 'classic/pyproject.toml'
      - 'classic/poetry.lock'
      - '**.py'
      - '!classic/forge/tests/vcr_cassettes'
  pull_request:
@@ -16,7 +18,9 @@ on:
      - '.github/workflows/classic-python-checks-ci.yml'
      - 'classic/original_autogpt/**'
      - 'classic/forge/**'
-      - 'classic/benchmark/**'
+      - 'classic/direct_benchmark/**'
      - 'classic/pyproject.toml'
      - 'classic/poetry.lock'
      - '**.py'
      - '!classic/forge/tests/vcr_cassettes'
@@ -27,44 +31,13 @@ concurrency:
 defaults:
  run:
    shell: bash
    working-directory: classic
 jobs:
  get-changed-parts:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      - id: changes-in
        name: Determine affected subprojects
        uses: dorny/paths-filter@v3
        with:
          filters: |
            original_autogpt:
              - classic/original_autogpt/autogpt/**
              - classic/original_autogpt/tests/**
              - classic/original_autogpt/poetry.lock
            forge:
              - classic/forge/forge/**
              - classic/forge/tests/**
              - classic/forge/poetry.lock
            benchmark:
              - classic/benchmark/agbenchmark/**
              - classic/benchmark/tests/**
              - classic/benchmark/poetry.lock
    outputs:
      changed-parts: ${{ steps.changes-in.outputs.changes }}
  lint:
    needs: get-changed-parts
    runs-on: ubuntu-latest
    env:
-      min-python-version: "3.10"
+      min-python-version: "3.12"
    strategy:
      matrix:
        sub-package: ${{ fromJson(needs.get-changed-parts.outputs.changed-parts) }}
      fail-fast: false
    steps:
      - name: Checkout repository
@@ -81,42 +54,31 @@ jobs:
        uses: actions/cache@v4
        with:
          path: ~/.cache/pypoetry
-          key: ${{ runner.os }}-poetry-${{ hashFiles(format('{0}/poetry.lock', matrix.sub-package)) }}
+          key: ${{ runner.os }}-poetry-${{ hashFiles('classic/poetry.lock') }}
      - name: Install Poetry
        run: curl -sSL https://install.python-poetry.org | python3 -
      # Install dependencies
      - name: Install Python dependencies
-        run: poetry -C classic/${{ matrix.sub-package }} install
+        run: poetry install
      # Lint
      - name: Lint (isort)
        run: poetry run isort --check .
        working-directory: classic/${{ matrix.sub-package }}
      - name: Lint (Black)
        if: success() || failure()
        run: poetry run black --check .
        working-directory: classic/${{ matrix.sub-package }}
      - name: Lint (Flake8)
        if: success() || failure()
        run: poetry run flake8 .
        working-directory: classic/${{ matrix.sub-package }}
  types:
    needs: get-changed-parts
    runs-on: ubuntu-latest
    env:
-      min-python-version: "3.10"
+      min-python-version: "3.12"
    strategy:
      matrix:
        sub-package: ${{ fromJson(needs.get-changed-parts.outputs.changed-parts) }}
      fail-fast: false
    steps:
      - name: Checkout repository
@@ -133,19 +95,16 @@ jobs:
        uses: actions/cache@v4
        with:
          path: ~/.cache/pypoetry
-          key: ${{ runner.os }}-poetry-${{ hashFiles(format('{0}/poetry.lock', matrix.sub-package)) }}
+          key: ${{ runner.os }}-poetry-${{ hashFiles('classic/poetry.lock') }}
      - name: Install Poetry
        run: curl -sSL https://install.python-poetry.org | python3 -
      # Install dependencies
      - name: Install Python dependencies
-        run: poetry -C classic/${{ matrix.sub-package }} install
+        run: poetry install
      # Typecheck
      - name: Typecheck
        if: success() || failure()
        run: poetry run pyright
        working-directory: classic/${{ matrix.sub-package }}
--- a/.github/workflows/claude-ci-failure-auto-fix.yml
+++ b/.github/workflows/claude-ci-failure-auto-fix.yml
@@ -93,5 +93,5 @@ jobs:
            Error logs:
            ${{ toJSON(fromJSON(steps.failure_details.outputs.result).errorLogs) }}
-          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
+          claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
          claude_args: "--allowedTools 'Edit,MultiEdit,Write,Read,Glob,Grep,LS,Bash(git:*),Bash(bun:*),Bash(npm:*),Bash(npx:*),Bash(gh:*)'"
--- a/.github/workflows/claude-dependabot.yml
+++ b/.github/workflows/claude-dependabot.yml
@@ -7,7 +7,7 @@
 # - Provide actionable recommendations for the development team
 #
 # Triggered on: Dependabot PRs (opened, synchronize)
-# Requirements: ANTHROPIC_API_KEY secret must be configured
+# Requirements: CLAUDE_CODE_OAUTH_TOKEN secret must be configured
 name: Claude Dependabot PR Review
@@ -308,7 +308,7 @@ jobs:
        id: claude_review
        uses: anthropics/claude-code-action@v1
        with:
-          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
+          claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
          claude_args: |
            --allowedTools "Bash(npm:*),Bash(pnpm:*),Bash(poetry:*),Bash(git:*),Edit,Replace,NotebookEditCell,mcp__github_inline_comment__create_inline_comment,Bash(gh pr comment:*), Bash(gh pr diff:*), Bash(gh pr view:*)"
          prompt: |
--- a/.github/workflows/claude.yml
+++ b/.github/workflows/claude.yml
@@ -323,7 +323,7 @@ jobs:
        id: claude
        uses: anthropics/claude-code-action@v1
        with:
-          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
+          claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
          claude_args: |
            --allowedTools "Bash(npm:*),Bash(pnpm:*),Bash(poetry:*),Bash(git:*),Edit,Replace,NotebookEditCell,mcp__github_inline_comment__create_inline_comment,Bash(gh pr comment:*), Bash(gh pr diff:*), Bash(gh pr view:*), Bash(gh pr edit:*)"
            --model opus
--- a/.github/workflows/docs-block-sync.yml
+++ b/.github/workflows/docs-block-sync.yml
@@ -0,0 +1,78 @@
 name: Block Documentation Sync Check
 on:
  push:
    branches: [master, dev]
    paths:
      - "autogpt_platform/backend/backend/blocks/**"
      - "docs/integrations/**"
      - "autogpt_platform/backend/scripts/generate_block_docs.py"
      - ".github/workflows/docs-block-sync.yml"
  pull_request:
    branches: [master, dev]
    paths:
      - "autogpt_platform/backend/backend/blocks/**"
      - "docs/integrations/**"
      - "autogpt_platform/backend/scripts/generate_block_docs.py"
      - ".github/workflows/docs-block-sync.yml"
 jobs:
  check-docs-sync:
    runs-on: ubuntu-latest
    timeout-minutes: 15
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          fetch-depth: 1
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - name: Set up Python dependency cache
        uses: actions/cache@v4
        with:
          path: ~/.cache/pypoetry
          key: poetry-${{ runner.os }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
          restore-keys: |
            poetry-${{ runner.os }}-
      - name: Install Poetry
        run: |
          cd autogpt_platform/backend
          HEAD_POETRY_VERSION=$(python3 ../../.github/workflows/scripts/get_package_version_from_lockfile.py poetry)
          echo "Found Poetry version ${HEAD_POETRY_VERSION} in backend/poetry.lock"
          curl -sSL https://install.python-poetry.org | POETRY_VERSION=$HEAD_POETRY_VERSION python3 -
          echo "$HOME/.local/bin" >> $GITHUB_PATH
      - name: Install dependencies
        working-directory: autogpt_platform/backend
        run: |
          poetry install --only main
          poetry run prisma generate
      - name: Check block documentation is in sync
        working-directory: autogpt_platform/backend
        run: |
          echo "Checking if block documentation is in sync with code..."
          poetry run python scripts/generate_block_docs.py --check
      - name: Show diff if out of sync
        if: failure()
        working-directory: autogpt_platform/backend
        run: |
          echo "::error::Block documentation is out of sync with code!"
          echo ""
          echo "To fix this, run the following command locally:"
          echo "  cd autogpt_platform/backend && poetry run python scripts/generate_block_docs.py"
          echo ""
          echo "Then commit the updated documentation files."
          echo ""
          echo "Regenerating docs to show diff..."
          poetry run python scripts/generate_block_docs.py
          echo ""
          echo "Changes detected:"
          git diff ../../docs/integrations/ || true
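Note on the sync check above: the workflow depends on `generate_block_docs.py --check` exiting non-zero when the committed docs are stale. That script is not part of this excerpt, so the following is only a minimal sketch of how such a check mode could work. It assumes the generator can produce a mapping of relative doc path to expected content; `find_stale_docs` and `run_check` are hypothetical names, not the script's actual API.

```python
import sys
import tempfile
from pathlib import Path


def find_stale_docs(expected: dict[str, str], docs_root: Path) -> list[str]:
    """Return relative paths that are missing on disk or differ from `expected`."""
    stale = []
    for rel_path, content in sorted(expected.items()):
        target = docs_root / rel_path
        if not target.is_file() or target.read_text(encoding="utf-8") != content:
            stale.append(rel_path)
    return stale


def run_check(expected: dict[str, str], docs_root: Path) -> int:
    """Report stale files on stderr and return a process exit code (0 = in sync)."""
    stale = find_stale_docs(expected, docs_root)
    for rel_path in stale:
        print(f"out of sync: {rel_path}", file=sys.stderr)
    return 1 if stale else 0


if __name__ == "__main__":
    # Tiny demonstration against a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "github.md").write_text("# GitHub\n", encoding="utf-8")
        expected_docs = {"github.md": "# GitHub\n", "slack.md": "# Slack\n"}
        # slack.md is missing on disk, so the check reports it as stale.
        print(run_check(expected_docs, root))
```

Returning an exit code rather than raising keeps the check easy to wire into a CI step, where a non-zero status is what triggers the `if: failure()` follow-up.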
--- a/.github/workflows/docs-claude-review.yml
+++ b/.github/workflows/docs-claude-review.yml
@@ -0,0 +1,95 @@
 name: Claude Block Docs Review
 on:
  pull_request:
    types: [opened, synchronize]
    paths:
      - "docs/integrations/**"
      - "autogpt_platform/backend/backend/blocks/**"
 jobs:
  claude-review:
    # Only run for PRs from members/collaborators
    if: |
      github.event.pull_request.author_association == 'OWNER' ||
      github.event.pull_request.author_association == 'MEMBER' ||
      github.event.pull_request.author_association == 'COLLABORATOR'
    runs-on: ubuntu-latest
    timeout-minutes: 15
    permissions:
      contents: read
      pull-requests: write
      id-token: write
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - name: Set up Python dependency cache
        uses: actions/cache@v4
        with:
          path: ~/.cache/pypoetry
          key: poetry-${{ runner.os }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
          restore-keys: |
            poetry-${{ runner.os }}-
      - name: Install Poetry
        run: |
          cd autogpt_platform/backend
          HEAD_POETRY_VERSION=$(python3 ../../.github/workflows/scripts/get_package_version_from_lockfile.py poetry)
          curl -sSL https://install.python-poetry.org | POETRY_VERSION=$HEAD_POETRY_VERSION python3 -
          echo "$HOME/.local/bin" >> $GITHUB_PATH
      - name: Install dependencies
        working-directory: autogpt_platform/backend
        run: |
          poetry install --only main
          poetry run prisma generate
      - name: Run Claude Code Review
        uses: anthropics/claude-code-action@v1
        with:
          claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
          claude_args: |
            --allowedTools "Read,Glob,Grep,Bash(gh pr comment:*),Bash(gh pr diff:*),Bash(gh pr view:*)"
          prompt: |
            You are reviewing a PR that modifies block documentation or block code for AutoGPT.
            ## Your Task
            Review the changes in this PR and provide constructive feedback. Focus on:
            1. **Documentation Accuracy**: For any block code changes, verify that:
               - Input/output tables in docs match the actual block schemas
               - Description text accurately reflects what the block does
               - Any new blocks have corresponding documentation
            2. **Manual Content Quality**: Check manual sections (marked with `<!-- MANUAL: -->` markers):
               - "How it works" sections should have clear technical explanations
               - "Possible use case" sections should have practical, real-world examples
               - Content should be helpful for users trying to understand the blocks
            3. **Template Compliance**: Ensure docs follow the standard template:
               - What it is (brief intro)
               - What it does (description)
               - How it works (technical explanation)
               - Inputs table
               - Outputs table
               - Possible use case
            4. **Cross-references**: Check that links and anchors are correct
            ## Review Process
            1. First, get the PR diff to see what changed: `gh pr diff ${{ github.event.pull_request.number }}`
            2. Read any modified block files to understand the implementation
            3. Read corresponding documentation files to verify accuracy
            4. Provide your feedback as a PR comment
            Be constructive and specific. If everything looks good, say so!
            If there are issues, explain what's wrong and suggest how to fix it.
--- a/.github/workflows/docs-enhance.yml
+++ b/.github/workflows/docs-enhance.yml
@@ -0,0 +1,194 @@
 name: Enhance Block Documentation
 on:
  workflow_dispatch:
    inputs:
      block_pattern:
        description: 'Block file pattern to enhance (e.g., "google/*.md" or "*" for all blocks)'
        required: true
        default: '*'
        type: string
      dry_run:
        description: 'Dry run mode - show proposed changes without committing'
        type: boolean
        default: true
      max_blocks:
        description: 'Maximum number of blocks to process (0 for unlimited)'
        type: number
        default: 10
 jobs:
  enhance-docs:
    runs-on: ubuntu-latest
    timeout-minutes: 45
    permissions:
      contents: write
      pull-requests: write
      id-token: write
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          fetch-depth: 1
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - name: Set up Python dependency cache
        uses: actions/cache@v4
        with:
          path: ~/.cache/pypoetry
          key: poetry-${{ runner.os }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
          restore-keys: |
            poetry-${{ runner.os }}-
      - name: Install Poetry
        run: |
          cd autogpt_platform/backend
          HEAD_POETRY_VERSION=$(python3 ../../.github/workflows/scripts/get_package_version_from_lockfile.py poetry)
          curl -sSL https://install.python-poetry.org | POETRY_VERSION=$HEAD_POETRY_VERSION python3 -
          echo "$HOME/.local/bin" >> $GITHUB_PATH
      - name: Install dependencies
        working-directory: autogpt_platform/backend
        run: |
          poetry install --only main
          poetry run prisma generate
      - name: Run Claude Enhancement
        uses: anthropics/claude-code-action@v1
        with:
          claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
          claude_args: |
            --allowedTools "Read,Edit,Glob,Grep,Write,Bash(git:*),Bash(gh:*),Bash(find:*),Bash(ls:*)"
          prompt: |
            You are enhancing block documentation for AutoGPT. Your task is to improve the MANUAL sections
            of block documentation files by reading the actual block implementations and writing helpful content.
            ## Configuration
            - Block pattern: ${{ inputs.block_pattern }}
            - Dry run: ${{ inputs.dry_run }}
            - Max blocks to process: ${{ inputs.max_blocks }}
            ## Your Task
            1. **Find Documentation Files**
               Find block documentation files matching the pattern in `docs/integrations/`
               Pattern: ${{ inputs.block_pattern }}
               Use: `find docs/integrations -name "*.md" -type f`
            2. **For Each Documentation File** (up to ${{ inputs.max_blocks }} files):
               a. Read the documentation file
               b. Identify which block(s) it documents (look for the block class name)
               c. Find and read the corresponding block implementation in `autogpt_platform/backend/backend/blocks/`
               d. Improve the MANUAL sections:
                  **"How it works" section** (within `<!-- MANUAL: how_it_works -->` markers):
                  - Explain the technical flow of the block
                  - Describe what APIs or services it connects to
                  - Note any important configuration or prerequisites
                  - Keep it concise but informative (2-4 paragraphs)
                  **"Possible use case" section** (within `<!-- MANUAL: use_case -->` markers):
                  - Provide 2-3 practical, real-world examples
                  - Make them specific and actionable
                  - Show how this block could be used in an automation workflow
            3. **Important Rules**
               - ONLY modify content within `<!-- MANUAL: -->` and `<!-- END MANUAL -->` markers
               - Do NOT modify auto-generated sections (inputs/outputs tables, descriptions)
               - Keep content accurate based on the actual block implementation
               - Write for users who may not be technical experts
            4. **Output**
               ${{ inputs.dry_run == true && 'DRY RUN MODE: Show proposed changes for each file but do NOT actually edit the files. Describe what you would change.' || 'LIVE MODE: Actually edit the files to improve the documentation.' }}
            ## Example Improvements
            **Before (How it works):**
            ```
            _Add technical explanation here._
            ```
            **After (How it works):**
            ```
            This block connects to the GitHub API to retrieve issue information. When executed,
            it authenticates using your GitHub credentials and fetches issue details including
            title, body, labels, and assignees.
            The block requires a valid GitHub OAuth connection with repository access permissions.
            It supports both public and private repositories you have access to.
            ```
            **Before (Possible use case):**
            ```
            _Add practical use case examples here._
            ```
            **After (Possible use case):**
            ```
            **Customer Support Automation**: Monitor a GitHub repository for new issues with
            the "bug" label, then automatically create a ticket in your support system and
            notify the on-call engineer via Slack.
            **Release Notes Generation**: When a new release is published, gather all closed
            issues since the last release and generate a summary for your changelog.
            ```
            Begin by finding and listing the documentation files to process.
      - name: Create PR with enhanced documentation
        if: ${{ inputs.dry_run == false }}
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          # Check if there are changes
          if git diff --quiet docs/integrations/; then
            echo "No changes to commit"
            exit 0
          fi
          # Configure git
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          # Create branch and commit
          BRANCH_NAME="docs/enhance-blocks-$(date +%Y%m%d-%H%M%S)"
          git checkout -b "$BRANCH_NAME"
          git add docs/integrations/
          git commit -m "docs: enhance block documentation with LLM-generated content
          Pattern: ${{ inputs.block_pattern }}
          Max blocks: ${{ inputs.max_blocks }}
          🤖 Generated with [Claude Code](https://claude.com/claude-code)
          Co-Authored-By: Claude <noreply@anthropic.com>"
          # Push and create PR
          git push -u origin "$BRANCH_NAME"
          gh pr create \
            --title "docs: LLM-enhanced block documentation" \
            --body "## Summary
          This PR contains LLM-enhanced documentation for block files matching pattern: \`${{ inputs.block_pattern }}\`
          The following manual sections were improved:
          - **How it works**: Technical explanations based on block implementations
          - **Possible use case**: Practical, real-world examples
          ## Review Checklist
          - [ ] Content is accurate based on block implementations
          - [ ] Examples are practical and helpful
          - [ ] No auto-generated sections were modified
          ---
          🤖 Generated with [Claude Code](https://claude.com/claude-code)" \
            --base dev
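Note on the "Important Rules" in the prompt above: edits are supposed to stay inside `<!-- MANUAL: name -->` and `<!-- END MANUAL -->` markers. The workflow leaves enforcement to the model, so a reviewer may want a mechanical way to apply or verify such edits. The sketch below is one possible approach, assuming section names are simple identifiers such as `how_it_works` and `use_case` and that markers sit on their own lines. `replace_manual_sections` is a hypothetical helper, not something defined in the repository.

```python
import re

# Matches one manual section: opening marker, body, closing marker.
MANUAL_BLOCK = re.compile(
    r"(<!-- MANUAL: (?P<name>\w+) -->\n)(?P<body>.*?)(\n?<!-- END MANUAL -->)",
    re.DOTALL,
)


def replace_manual_sections(doc: str, updates: dict[str, str]) -> str:
    """Replace the body of named MANUAL sections; leave everything else untouched."""

    def _substitute(match: re.Match) -> str:
        name = match.group("name")
        if name not in updates:
            return match.group(0)  # section not targeted: keep verbatim
        return f"{match.group(1)}{updates[name].strip()}\n<!-- END MANUAL -->"

    return MANUAL_BLOCK.sub(_substitute, doc)


sample = (
    "## Inputs\n| a | b |\n"
    "<!-- MANUAL: how_it_works -->\n"
    "_Add technical explanation here._\n"
    "<!-- END MANUAL -->\n"
    "<!-- MANUAL: use_case -->\n"
    "_Add practical use case examples here._\n"
    "<!-- END MANUAL -->\n"
)
updated = replace_manual_sections(sample, {"how_it_works": "Calls the GitHub API."})
```

Because text outside the markers is never rewritten, diffing `sample` against `updated` shows changes only in the targeted section, which matches the "No auto-generated sections were modified" item in the PR checklist.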
--- a/.github/workflows/platform-backend-ci.yml
+++ b/.github/workflows/platform-backend-ci.yml
@@ -176,7 +176,7 @@ jobs:
          }
      - name: Run Database Migrations
-        run: poetry run prisma migrate dev --name updates
+        run: poetry run prisma migrate deploy
        env:
          DATABASE_URL: ${{ steps.supabase.outputs.DB_URL }}
          DIRECT_URL: ${{ steps.supabase.outputs.DB_URL }}
--- a/.github/workflows/platform-frontend-ci.yml
+++ b/.github/workflows/platform-frontend-ci.yml
@@ -11,6 +11,7 @@ on:
      - ".github/workflows/platform-frontend-ci.yml"
      - "autogpt_platform/frontend/**"
  merge_group:
  workflow_dispatch:
 concurrency:
  group: ${{ github.workflow }}-${{ github.event_name == 'merge_group' && format('merge-queue-{0}', github.ref) || format('{0}-{1}', github.ref, github.event.pull_request.number || github.sha) }}
@@ -151,6 +152,14 @@ jobs:
        run: |
          cp ../.env.default ../.env
      - name: Copy backend .env and set OpenAI API key
        run: |
          cp ../backend/.env.default ../backend/.env
          echo "OPENAI_INTERNAL_API_KEY=${{ secrets.OPENAI_API_KEY }}" >> ../backend/.env
        env:
          # Used by E2E test data script to generate embeddings for approved store agents
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
@@ -226,13 +235,25 @@ jobs:
      - name: Run Playwright tests
        run: pnpm test:no-build
        continue-on-error: false
-      - name: Upload Playwright artifacts
+      - name: Upload Playwright report
-        if: failure()
+        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: playwright-report
          path: playwright-report
          if-no-files-found: ignore
          retention-days: 3
      - name: Upload Playwright test results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: playwright-test-results
          path: test-results
          if-no-files-found: ignore
          retention-days: 3
      - name: Print Final Docker Compose logs
        if: always()
--- a/.gitignore
+++ b/.gitignore
@@ -3,6 +3,7 @@
 classic/original_autogpt/keys.py
 classic/original_autogpt/*.json
 auto_gpt_workspace/*
 .autogpt/
 *.mpeg
 .env
 # Root .env files
@@ -177,5 +178,5 @@ autogpt_platform/backend/settings.py
 *.ign.*
 .test-contents
-.claude/settings.local.json
+**/.claude/settings.local.json
 /autogpt_platform/backend/logs
--- a/.gitmodules
+++ b/.gitmodules
@@ -1,3 +0,0 @@
 [submodule "classic/forge/tests/vcr_cassettes"]
 	path = classic/forge/tests/vcr_cassettes
 	url = https://github.com/Significant-Gravitas/Auto-GPT-test-cassettes
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -43,29 +43,10 @@ repos:
        pass_filenames: false
      - id: poetry-install
-        name: Check & Install dependencies - Classic - AutoGPT
+        name: Check & Install dependencies - Classic
-        alias: poetry-install-classic-autogpt
+        alias: poetry-install-classic
-        entry: poetry -C classic/original_autogpt install
+        entry: poetry -C classic install
-        # include forge source (since it's a path dependency)
+        files: ^classic/poetry\.lock$
        files: ^classic/(original_autogpt|forge)/poetry\.lock$
        types: [file]
        language: system
        pass_filenames: false
      - id: poetry-install
        name: Check & Install dependencies - Classic - Forge
        alias: poetry-install-classic-forge
        entry: poetry -C classic/forge install
        files: ^classic/forge/poetry\.lock$
        types: [file]
        language: system
        pass_filenames: false
      - id: poetry-install
        name: Check & Install dependencies - Classic - Benchmark
        alias: poetry-install-classic-benchmark
        entry: poetry -C classic/benchmark install
        files: ^classic/benchmark/poetry\.lock$
        types: [file]
        language: system
        pass_filenames: false
@@ -116,26 +97,10 @@ repos:
        language: system
      - id: isort
-        name: Lint (isort) - Classic - AutoGPT
+        name: Lint (isort) - Classic
-        alias: isort-classic-autogpt
+        alias: isort-classic
-        entry: poetry -P classic/original_autogpt run isort -p autogpt
+        entry: bash -c 'cd classic && poetry run isort $(echo "$@" | sed "s|classic/||g")' --
-        files: ^classic/original_autogpt/
+        files: ^classic/(original_autogpt|forge|direct_benchmark)/
        types: [file, python]
        language: system
      - id: isort
        name: Lint (isort) - Classic - Forge
        alias: isort-classic-forge
        entry: poetry -P classic/forge run isort -p forge
        files: ^classic/forge/
        types: [file, python]
        language: system
      - id: isort
        name: Lint (isort) - Classic - Benchmark
        alias: isort-classic-benchmark
        entry: poetry -P classic/benchmark run isort -p agbenchmark
        files: ^classic/benchmark/
        types: [file, python]
        language: system
@@ -149,26 +114,13 @@ repos:
  - repo: https://github.com/PyCQA/flake8
    rev: 7.0.0
-    # To have flake8 load the config of the individual subprojects, we have to call
+    # Use consolidated flake8 config at classic/.flake8
    # them separately.
    hooks:
      - id: flake8
-        name: Lint (Flake8) - Classic - AutoGPT
+        name: Lint (Flake8) - Classic
-        alias: flake8-classic-autogpt
+        alias: flake8-classic
-        files: ^classic/original_autogpt/(autogpt|scripts|tests)/
+        files: ^classic/(original_autogpt|forge|direct_benchmark)/
-        args: [--config=classic/original_autogpt/.flake8]
+        args: [--config=classic/.flake8]
      - id: flake8
        name: Lint (Flake8) - Classic - Forge
        alias: flake8-classic-forge
        files: ^classic/forge/(forge|tests)/
        args: [--config=classic/forge/.flake8]
      - id: flake8
        name: Lint (Flake8) - Classic - Benchmark
        alias: flake8-classic-benchmark
        files: ^classic/benchmark/(agbenchmark|tests)/((?!reports).)*[/.]
        args: [--config=classic/benchmark/.flake8]
  - repo: local
    hooks:
@@ -204,29 +156,10 @@ repos:
        pass_filenames: false
      - id: pyright
-        name: Typecheck - Classic - AutoGPT
+        name: Typecheck - Classic
-        alias: pyright-classic-autogpt
+        alias: pyright-classic
-        entry: poetry -C classic/original_autogpt run pyright
+        entry: poetry -C classic run pyright
-        # include forge source (since it's a path dependency) but exclude *_test.py files:
-        files: ^(classic/original_autogpt/((autogpt|scripts|tests)/|poetry\.lock$)|classic/forge/(forge/.*(?<!_test)\.py|poetry\.lock)$)
+        files: ^classic/(original_autogpt|forge|direct_benchmark)/.*\.py$|^classic/poetry\.lock$
        types: [file]
        language: system
        pass_filenames: false
      - id: pyright
        name: Typecheck - Classic - Forge
        alias: pyright-classic-forge
        entry: poetry -C classic/forge run pyright
        files: ^classic/forge/(forge/|poetry\.lock$)
        types: [file]
        language: system
        pass_filenames: false
      - id: pyright
        name: Typecheck - Classic - Benchmark
        alias: pyright-classic-benchmark
        entry: poetry -C classic/benchmark run pyright
        files: ^classic/benchmark/(agbenchmark/|tests/|poetry\.lock$)
        types: [file]
        language: system
        pass_filenames: false
--- a/autogpt_platform/Makefile
+++ b/autogpt_platform/Makefile
@@ -6,9 +6,10 @@ start-core:
 # Stop core services
 stop-core:
-	docker compose stop deps
+	docker compose stop 
 reset-db:
 	docker compose stop db
 	rm -rf db/docker/volumes/db/data
 	cd backend && poetry run prisma migrate deploy
 	cd backend && poetry run prisma generate
@@ -60,4 +61,4 @@ help:
 	@echo "  run-backend - Run the backend FastAPI server"
 	@echo "  run-frontend - Run the frontend Next.js development server"
 	@echo "  test-data - Run the test data creator"
-	@echo "  load-store-agents - Load store agents from agents/ folder into test database"
+	@echo "  load-store-agents - Load store agents from agents/ folder into test database"
--- a/autogpt_platform/backend/.env.default
+++ b/autogpt_platform/backend/.env.default
@@ -58,6 +58,13 @@ V0_API_KEY=
 OPEN_ROUTER_API_KEY=
 NVIDIA_API_KEY=
 # Langfuse Prompt Management
 # Used for managing the CoPilot system prompt externally
 # Get credentials from https://cloud.langfuse.com or your self-hosted instance
 LANGFUSE_PUBLIC_KEY=
 LANGFUSE_SECRET_KEY=
 LANGFUSE_HOST=https://cloud.langfuse.com
 # OAuth Credentials
 # For the OAuth callback URL, use <your_frontend_url>/auth/integrations/oauth_callback,
 # e.g. http://localhost:3000/auth/integrations/oauth_callback
--- a/autogpt_platform/backend/.gitignore
+++ b/autogpt_platform/backend/.gitignore
@@ -18,3 +18,4 @@ load-tests/results/
 load-tests/*.json
 load-tests/*.log
 load-tests/node_modules/*
 migrations/*/rollback*.sql
--- a/autogpt_platform/backend/Dockerfile
+++ b/autogpt_platform/backend/Dockerfile
@@ -100,6 +100,7 @@ COPY autogpt_platform/backend/migrations /app/autogpt_platform/backend/migration
 FROM server_dependencies AS server
 COPY autogpt_platform/backend /app/autogpt_platform/backend
 COPY docs /app/docs
 RUN poetry install --no-ansi --only-root
 ENV PORT=8000
--- a/autogpt_platform/backend/backend/api/external/v1/tools.py
+++ b/autogpt_platform/backend/backend/api/external/v1/tools.py
@@ -70,7 +70,7 @@ class RunAgentRequest(BaseModel):
    )
-def _create_ephemeral_session(user_id: str | None) -> ChatSession:
+def _create_ephemeral_session(user_id: str) -> ChatSession:
    """Create an ephemeral session for stateless API requests."""
    return ChatSession.new(user_id)
--- a/autogpt_platform/backend/backend/api/features/admin/execution_analytics_routes.py
+++ b/autogpt_platform/backend/backend/api/features/admin/execution_analytics_routes.py
@@ -28,6 +28,7 @@ from backend.executor.manager import get_db_async_client
 from backend.util.settings import Settings
 logger = logging.getLogger(__name__)
 settings = Settings()
 class ExecutionAnalyticsRequest(BaseModel):
@@ -63,6 +64,8 @@ class ExecutionAnalyticsResult(BaseModel):
    score: Optional[float]
    status: str  # "success", "failed", "skipped"
    error_message: Optional[str] = None
    started_at: Optional[datetime] = None
    ended_at: Optional[datetime] = None
 class ExecutionAnalyticsResponse(BaseModel):
@@ -224,11 +227,6 @@ async def generate_execution_analytics(
    )
    try:
        # Validate model configuration
        settings = Settings()
        if not settings.secrets.openai_internal_api_key:
            raise HTTPException(status_code=500, detail="OpenAI API key not configured")
        # Get database client
        db_client = get_db_async_client()
@@ -320,6 +318,8 @@ async def generate_execution_analytics(
                    ),
                    status="skipped",
                    error_message=None,  # Not an error - just already processed
                    started_at=execution.started_at,
                    ended_at=execution.ended_at,
                )
            )
@@ -349,6 +349,9 @@ async def _process_batch(
 ) -> list[ExecutionAnalyticsResult]:
    """Process a batch of executions concurrently."""
    if not settings.secrets.openai_internal_api_key:
        raise HTTPException(status_code=500, detail="OpenAI API key not configured")
    async def process_single_execution(execution) -> ExecutionAnalyticsResult:
        try:
            # Generate activity status and score using the specified model
@@ -387,6 +390,8 @@ async def _process_batch(
                    score=None,
                    status="skipped",
                    error_message="Activity generation returned None",
                    started_at=execution.started_at,
                    ended_at=execution.ended_at,
                )
            # Update the execution stats
@@ -416,6 +421,8 @@ async def _process_batch(
                summary_text=activity_response["activity_status"],
                score=activity_response["correctness_score"],
                status="success",
                started_at=execution.started_at,
                ended_at=execution.ended_at,
            )
        except Exception as e:
@@ -429,6 +436,8 @@ async def _process_batch(
                score=None,
                status="failed",
                error_message=str(e),
                started_at=execution.started_at,
                ended_at=execution.ended_at,
            )
    # Process all executions in the batch concurrently
--- a/autogpt_platform/backend/backend/api/features/chat/config.py
+++ b/autogpt_platform/backend/backend/api/features/chat/config.py
@@ -1,7 +1,6 @@
 """Configuration management for chat system."""
 import os
 from pathlib import Path
 from pydantic import Field, field_validator
 from pydantic_settings import BaseSettings
@@ -12,7 +11,11 @@ class ChatConfig(BaseSettings):
    # OpenAI API Configuration
    model: str = Field(
-        default="qwen/qwen3-235b-a22b-2507", description="Default model to use"
+        default="anthropic/claude-opus-4.5", description="Default model to use"
    )
    title_model: str = Field(
        default="openai/gpt-4o-mini",
        description="Model to use for generating session titles (should be fast/cheap)",
    )
    api_key: str | None = Field(default=None, description="OpenAI API key")
    base_url: str | None = Field(
@@ -23,12 +26,6 @@ class ChatConfig(BaseSettings):
    # Session TTL Configuration - 12 hours
    session_ttl: int = Field(default=43200, description="Session TTL in seconds")
    # System Prompt Configuration
    system_prompt_path: str = Field(
        default="prompts/chat_system.md",
        description="Path to system prompt file relative to chat module",
    )
    # Streaming Configuration
    max_context_messages: int = Field(
        default=50, ge=1, le=200, description="Maximum context messages"
@@ -41,6 +38,13 @@ class ChatConfig(BaseSettings):
        default=3, description="Maximum number of agent schedules"
    )
    # Langfuse Prompt Management Configuration
    # Note: Langfuse credentials are in Settings().secrets (settings.py)
    langfuse_prompt_name: str = Field(
        default="CoPilot Prompt",
        description="Name of the prompt in Langfuse to fetch",
    )
    @field_validator("api_key", mode="before")
    @classmethod
    def get_api_key(cls, v):
@@ -72,43 +76,11 @@ class ChatConfig(BaseSettings):
                v = "https://openrouter.ai/api/v1"
        return v
-    def get_system_prompt(self, **template_vars) -> str:
-        """Load and render the system prompt from file.
-
-        Args:
-            **template_vars: Variables to substitute in the template
-        Returns:
-            Rendered system prompt string
-        """
-        # Get the path relative to this module
-        module_dir = Path(__file__).parent
-        prompt_path = module_dir / self.system_prompt_path
-        # Check for .j2 extension first (Jinja2 template)
-        j2_path = Path(str(prompt_path) + ".j2")
-        if j2_path.exists():
-            try:
-                from jinja2 import Template
-                template = Template(j2_path.read_text())
-                return template.render(**template_vars)
-            except ImportError:
-                # Jinja2 not installed, fall back to reading as plain text
-                return j2_path.read_text()
-        # Check for markdown file
-        if prompt_path.exists():
-            content = prompt_path.read_text()
-            # Simple variable substitution if Jinja2 is not available
-            for key, value in template_vars.items():
-                placeholder = f"{{{key}}}"
-                content = content.replace(placeholder, str(value))
-            return content
-        raise FileNotFoundError(f"System prompt file not found: {prompt_path}")
+    # Prompt paths for different contexts
+    PROMPT_PATHS: dict[str, str] = {
+        "default": "prompts/chat_system.md",
+        "onboarding": "prompts/onboarding_system.md",
+    }
    class Config:
        """Pydantic config."""
--- a/autogpt_platform/backend/backend/api/features/chat/db.py
+++ b/autogpt_platform/backend/backend/api/features/chat/db.py
@@ -0,0 +1,249 @@
 """Database operations for chat sessions."""
 import asyncio
 import logging
 from datetime import UTC, datetime
 from typing import Any, cast
 from prisma.models import ChatMessage as PrismaChatMessage
 from prisma.models import ChatSession as PrismaChatSession
 from prisma.types import (
    ChatMessageCreateInput,
    ChatSessionCreateInput,
    ChatSessionUpdateInput,
    ChatSessionWhereInput,
 )
 from backend.data.db import transaction
 from backend.util.json import SafeJson
 logger = logging.getLogger(__name__)
 async def get_chat_session(session_id: str) -> PrismaChatSession | None:
    """Get a chat session by ID from the database."""
    session = await PrismaChatSession.prisma().find_unique(
        where={"id": session_id},
        include={"Messages": True},
    )
    if session and session.Messages:
        # Sort messages by sequence in Python - Prisma Python client doesn't support
        # order_by in include clauses (unlike Prisma JS), so we sort after fetching
        session.Messages.sort(key=lambda m: m.sequence)
    return session
 async def create_chat_session(
    session_id: str,
    user_id: str,
 ) -> PrismaChatSession:
    """Create a new chat session in the database."""
    data = ChatSessionCreateInput(
        id=session_id,
        userId=user_id,
        credentials=SafeJson({}),
        successfulAgentRuns=SafeJson({}),
        successfulAgentSchedules=SafeJson({}),
    )
    return await PrismaChatSession.prisma().create(
        data=data,
        include={"Messages": True},
    )
 async def update_chat_session(
    session_id: str,
    credentials: dict[str, Any] | None = None,
    successful_agent_runs: dict[str, Any] | None = None,
    successful_agent_schedules: dict[str, Any] | None = None,
    total_prompt_tokens: int | None = None,
    total_completion_tokens: int | None = None,
    title: str | None = None,
 ) -> PrismaChatSession | None:
    """Update a chat session's metadata."""
    data: ChatSessionUpdateInput = {"updatedAt": datetime.now(UTC)}
    if credentials is not None:
        data["credentials"] = SafeJson(credentials)
    if successful_agent_runs is not None:
        data["successfulAgentRuns"] = SafeJson(successful_agent_runs)
    if successful_agent_schedules is not None:
        data["successfulAgentSchedules"] = SafeJson(successful_agent_schedules)
    if total_prompt_tokens is not None:
        data["totalPromptTokens"] = total_prompt_tokens
    if total_completion_tokens is not None:
        data["totalCompletionTokens"] = total_completion_tokens
    if title is not None:
        data["title"] = title
    session = await PrismaChatSession.prisma().update(
        where={"id": session_id},
        data=data,
        include={"Messages": True},
    )
    if session and session.Messages:
        # Sort in Python - Prisma Python doesn't support order_by in include clauses
        session.Messages.sort(key=lambda m: m.sequence)
    return session
 async def add_chat_message(
    session_id: str,
    role: str,
    sequence: int,
    content: str | None = None,
    name: str | None = None,
    tool_call_id: str | None = None,
    refusal: str | None = None,
    tool_calls: list[dict[str, Any]] | None = None,
    function_call: dict[str, Any] | None = None,
 ) -> PrismaChatMessage:
    """Add a message to a chat session."""
    # Build input dict dynamically rather than using ChatMessageCreateInput directly
    # because Prisma's TypedDict validation rejects optional fields set to None.
    # We only include fields that have values, then cast at the end.
    data: dict[str, Any] = {
        "Session": {"connect": {"id": session_id}},
        "role": role,
        "sequence": sequence,
    }
    # Add optional string fields
    if content is not None:
        data["content"] = content
    if name is not None:
        data["name"] = name
    if tool_call_id is not None:
        data["toolCallId"] = tool_call_id
    if refusal is not None:
        data["refusal"] = refusal
    # Add optional JSON fields only when they have values
    if tool_calls is not None:
        data["toolCalls"] = SafeJson(tool_calls)
    if function_call is not None:
        data["functionCall"] = SafeJson(function_call)
    # Run message create and session timestamp update in parallel for lower latency
    _, message = await asyncio.gather(
        PrismaChatSession.prisma().update(
            where={"id": session_id},
            data={"updatedAt": datetime.now(UTC)},
        ),
        PrismaChatMessage.prisma().create(data=cast(ChatMessageCreateInput, data)),
    )
    return message
 async def add_chat_messages_batch(
    session_id: str,
    messages: list[dict[str, Any]],
    start_sequence: int,
 ) -> list[PrismaChatMessage]:
    """Add multiple messages to a chat session in a batch.
    Uses a transaction for atomicity - if any message creation fails,
    the entire batch is rolled back.
    """
    if not messages:
        return []
    created_messages = []
    async with transaction() as tx:
        for i, msg in enumerate(messages):
            # Build input dict dynamically rather than using ChatMessageCreateInput
            # directly because Prisma's TypedDict validation rejects optional fields
            # set to None. We only include fields that have values, then cast.
            data: dict[str, Any] = {
                "Session": {"connect": {"id": session_id}},
                "role": msg["role"],
                "sequence": start_sequence + i,
            }
            # Add optional string fields
            if msg.get("content") is not None:
                data["content"] = msg["content"]
            if msg.get("name") is not None:
                data["name"] = msg["name"]
            if msg.get("tool_call_id") is not None:
                data["toolCallId"] = msg["tool_call_id"]
            if msg.get("refusal") is not None:
                data["refusal"] = msg["refusal"]
            # Add optional JSON fields only when they have values
            if msg.get("tool_calls") is not None:
                data["toolCalls"] = SafeJson(msg["tool_calls"])
            if msg.get("function_call") is not None:
                data["functionCall"] = SafeJson(msg["function_call"])
            created = await PrismaChatMessage.prisma(tx).create(
                data=cast(ChatMessageCreateInput, data)
            )
            created_messages.append(created)
        # Update session's updatedAt timestamp within the same transaction.
        # Note: Token usage (total_prompt_tokens, total_completion_tokens) is updated
        # separately via update_chat_session() after streaming completes.
        await PrismaChatSession.prisma(tx).update(
            where={"id": session_id},
            data={"updatedAt": datetime.now(UTC)},
        )
    return created_messages
 async def get_user_chat_sessions(
    user_id: str,
    limit: int = 50,
    offset: int = 0,
 ) -> list[PrismaChatSession]:
    """Get chat sessions for a user, ordered by most recent."""
    return await PrismaChatSession.prisma().find_many(
        where={"userId": user_id},
        order={"updatedAt": "desc"},
        take=limit,
        skip=offset,
    )
 async def get_user_session_count(user_id: str) -> int:
    """Get the total number of chat sessions for a user."""
    return await PrismaChatSession.prisma().count(where={"userId": user_id})
 async def delete_chat_session(session_id: str, user_id: str | None = None) -> bool:
    """Delete a chat session and all its messages.
    Args:
        session_id: The session ID to delete.
        user_id: If provided, validates that the session belongs to this user
            before deletion. This prevents unauthorized deletion of other
            users' sessions.
    Returns:
        True if deleted successfully, False otherwise.
    """
    try:
        # Build typed where clause with optional user_id validation
        where_clause: ChatSessionWhereInput = {"id": session_id}
        if user_id is not None:
            where_clause["userId"] = user_id
        result = await PrismaChatSession.prisma().delete_many(where=where_clause)
        if result == 0:
            logger.warning(
                f"No session deleted for {session_id} "
                f"(user_id validation: {user_id is not None})"
            )
            return False
        return True
    except Exception as e:
        logger.error(f"Failed to delete chat session {session_id}: {e}")
        return False
 async def get_chat_session_message_count(session_id: str) -> int:
    """Get the number of messages in a chat session."""
    count = await PrismaChatMessage.prisma().count(where={"sessionId": session_id})
    return count
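Review note: `add_chat_message` and `add_chat_messages_batch` both build the create payload by hand so that optional fields holding `None` are never sent. A minimal sketch of that pattern as one helper follows. The names `FIELD_MAP` and `build_message_data` are hypothetical and not part of this change, and the sketch leaves out the `SafeJson` wrapping that the real code applies to the JSON fields.

```python
from typing import Any

# Hypothetical mapping from snake_case argument names to camelCase column names.
FIELD_MAP = {
    "content": "content",
    "name": "name",
    "tool_call_id": "toolCallId",
    "refusal": "refusal",
    "tool_calls": "toolCalls",
    "function_call": "functionCall",
}


def build_message_data(
    session_id: str, role: str, sequence: int, **optional: Any
) -> dict[str, Any]:
    """Build a create payload, leaving out optional fields that are None."""
    data: dict[str, Any] = {
        "Session": {"connect": {"id": session_id}},
        "role": role,
        "sequence": sequence,
    }
    for arg_name, column in FIELD_MAP.items():
        value = optional.get(arg_name)
        if value is not None:
            # Only fields with a value are included in the payload.
            data[column] = value
    return data
```

A helper along these lines might let the single-message and batch paths share the same field handling, at the cost of one more indirection.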
--- a/autogpt_platform/backend/backend/api/features/chat/model.py
+++ b/autogpt_platform/backend/backend/api/features/chat/model.py
@@ -1,6 +1,9 @@
 import asyncio
 import logging
 import uuid
 from datetime import UTC, datetime
 from typing import Any
 from weakref import WeakValueDictionary
 from openai.types.chat import (
    ChatCompletionAssistantMessageParam,
@@ -16,17 +19,63 @@ from openai.types.chat.chat_completion_message_tool_call_param import (
    ChatCompletionMessageToolCallParam,
    Function,
 )
 from prisma.models import ChatMessage as PrismaChatMessage
 from prisma.models import ChatSession as PrismaChatSession
 from pydantic import BaseModel
 from backend.data.redis_client import get_redis_async
-from backend.util.exceptions import RedisError
+from backend.util import json
+from backend.util.exceptions import DatabaseError, RedisError
 from . import db as chat_db
 from .config import ChatConfig
 logger = logging.getLogger(__name__)
 config = ChatConfig()
 def _parse_json_field(value: str | dict | list | None, default: Any = None) -> Any:
    """Parse a JSON field that may be stored as string or already parsed."""
    if value is None:
        return default
    if isinstance(value, str):
        return json.loads(value)
    return value
 # Redis cache key prefix for chat sessions
 CHAT_SESSION_CACHE_PREFIX = "chat:session:"
 def _get_session_cache_key(session_id: str) -> str:
    """Get the Redis cache key for a chat session."""
    return f"{CHAT_SESSION_CACHE_PREFIX}{session_id}"
 # Session-level locks to prevent race conditions during concurrent upserts.
 # Uses WeakValueDictionary to automatically garbage collect locks when no longer referenced,
 # preventing unbounded memory growth while maintaining lock semantics for active sessions.
 # Invalidation: Locks are auto-removed by GC when no coroutine holds a reference (after
 # async with lock: completes). Explicit cleanup also occurs in delete_chat_session().
 _session_locks: WeakValueDictionary[str, asyncio.Lock] = WeakValueDictionary()
 _session_locks_mutex = asyncio.Lock()
 async def _get_session_lock(session_id: str) -> asyncio.Lock:
    """Get or create a lock for a specific session to prevent concurrent upserts.
    Uses WeakValueDictionary for automatic cleanup: locks are garbage collected
    when no coroutine holds a reference to them, preventing memory leaks from
    unbounded growth of session locks.
    """
    async with _session_locks_mutex:
        lock = _session_locks.get(session_id)
        if lock is None:
            lock = asyncio.Lock()
            _session_locks[session_id] = lock
        return lock
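Review note: the per-session lock registry above can be illustrated in isolation. This is a minimal sketch, assuming a single event loop. The names `_locks`, `get_lock`, and `demo` are hypothetical, and the sketch omits the `_session_locks_mutex` guard because its lookup contains no `await`.

```python
import asyncio
from weakref import WeakValueDictionary

# Entries disappear once no coroutine holds a strong reference to the lock.
_locks: WeakValueDictionary[str, asyncio.Lock] = WeakValueDictionary()


async def get_lock(key: str) -> asyncio.Lock:
    """Return the lock registered for key, creating it on first use."""
    lock = _locks.get(key)
    if lock is None:
        lock = asyncio.Lock()
        _locks[key] = lock
    return lock


async def demo() -> list[str]:
    """Run two workers on the same key and record the order of events."""
    events: list[str] = []

    async def worker(name: str) -> None:
        lock = await get_lock("session-1")
        async with lock:
            events.append(f"{name}:start")
            await asyncio.sleep(0)  # yield so the other worker can try the lock
            events.append(f"{name}:end")

    await asyncio.gather(worker("a"), worker("b"))
    return events
```

Because both workers receive the same lock object, the second one waits until the first leaves its `async with` block, so the two critical sections do not interleave.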
 class ChatMessage(BaseModel):
    role: str
    content: str | None = None
@@ -45,7 +94,8 @@ class Usage(BaseModel):
 class ChatSession(BaseModel):
    session_id: str
-    user_id: str | None
+    user_id: str
    title: str | None = None
    messages: list[ChatMessage]
    usage: list[Usage]
    credentials: dict[str, dict] = {}  # Map of provider -> credential metadata
@@ -55,10 +105,11 @@ class ChatSession(BaseModel):
    successful_agent_schedules: dict[str, int] = {}
    @staticmethod
-    def new(user_id: str | None) -> "ChatSession":
+    def new(user_id: str) -> "ChatSession":
        return ChatSession(
            session_id=str(uuid.uuid4()),
            user_id=user_id,
            title=None,
            messages=[],
            usage=[],
            credentials={},
@@ -66,6 +117,61 @@ class ChatSession(BaseModel):
            updated_at=datetime.now(UTC),
        )
    @staticmethod
    def from_db(
        prisma_session: PrismaChatSession,
        prisma_messages: list[PrismaChatMessage] | None = None,
    ) -> "ChatSession":
        """Convert Prisma models to Pydantic ChatSession."""
        messages = []
        if prisma_messages:
            for msg in prisma_messages:
                messages.append(
                    ChatMessage(
                        role=msg.role,
                        content=msg.content,
                        name=msg.name,
                        tool_call_id=msg.toolCallId,
                        refusal=msg.refusal,
                        tool_calls=_parse_json_field(msg.toolCalls),
                        function_call=_parse_json_field(msg.functionCall),
                    )
                )
        # Parse JSON fields from Prisma
        credentials = _parse_json_field(prisma_session.credentials, default={})
        successful_agent_runs = _parse_json_field(
            prisma_session.successfulAgentRuns, default={}
        )
        successful_agent_schedules = _parse_json_field(
            prisma_session.successfulAgentSchedules, default={}
        )
        # Calculate usage from token counts
        usage = []
        if prisma_session.totalPromptTokens or prisma_session.totalCompletionTokens:
            usage.append(
                Usage(
                    prompt_tokens=prisma_session.totalPromptTokens or 0,
                    completion_tokens=prisma_session.totalCompletionTokens or 0,
                    total_tokens=(prisma_session.totalPromptTokens or 0)
                    + (prisma_session.totalCompletionTokens or 0),
                )
            )
        return ChatSession(
            session_id=prisma_session.id,
            user_id=prisma_session.userId,
            title=prisma_session.title,
            messages=messages,
            usage=usage,
            credentials=credentials,
            started_at=prisma_session.createdAt,
            updated_at=prisma_session.updatedAt,
            successful_agent_runs=successful_agent_runs,
            successful_agent_schedules=successful_agent_schedules,
        )
    def to_openai_messages(self) -> list[ChatCompletionMessageParam]:
        messages = []
        for message in self.messages:
@@ -155,50 +261,337 @@ class ChatSession(BaseModel):
        return messages
-async def get_chat_session(
-    session_id: str,
-    user_id: str | None,
-) -> ChatSession | None:
-    """Get a chat session by ID."""
-    redis_key = f"chat:session:{session_id}"
+async def _get_session_from_cache(session_id: str) -> ChatSession | None:
+    """Get a chat session from Redis cache."""
+    redis_key = _get_session_cache_key(session_id)
    async_redis = await get_redis_async()
    raw_session: bytes | None = await async_redis.get(redis_key)
    if raw_session is None:
        logger.warning(f"Session {session_id} not found in Redis")
        return None
    try:
        session = ChatSession.model_validate_json(raw_session)
        logger.info(
            f"Loading session {session_id} from cache: "
            f"message_count={len(session.messages)}, "
            f"roles={[m.role for m in session.messages]}"
        )
        return session
    except Exception as e:
        logger.error(f"Failed to deserialize session {session_id}: {e}", exc_info=True)
        raise RedisError(f"Corrupted session data for {session_id}") from e
-    if session.user_id is not None and session.user_id != user_id:
+
 async def _cache_session(session: ChatSession) -> None:
    """Cache a chat session in Redis."""
    redis_key = _get_session_cache_key(session.session_id)
    async_redis = await get_redis_async()
    await async_redis.setex(redis_key, config.session_ttl, session.model_dump_json())
 async def _get_session_from_db(session_id: str) -> ChatSession | None:
    """Get a chat session from the database."""
    prisma_session = await chat_db.get_chat_session(session_id)
    if not prisma_session:
        return None
    messages = prisma_session.Messages
    logger.info(
        f"Loading session {session_id} from DB: "
        f"has_messages={messages is not None}, "
        f"message_count={len(messages) if messages else 0}, "
        f"roles={[m.role for m in messages] if messages else []}"
    )
    return ChatSession.from_db(prisma_session, messages)
 async def _save_session_to_db(
    session: ChatSession, existing_message_count: int
 ) -> None:
    """Save or update a chat session in the database."""
    # Check if session exists in DB
    existing = await chat_db.get_chat_session(session.session_id)
    if not existing:
        # Create new session
        await chat_db.create_chat_session(
            session_id=session.session_id,
            user_id=session.user_id,
        )
        existing_message_count = 0
    # Calculate total tokens from usage
    total_prompt = sum(u.prompt_tokens for u in session.usage)
    total_completion = sum(u.completion_tokens for u in session.usage)
    # Update session metadata
    await chat_db.update_chat_session(
        session_id=session.session_id,
        credentials=session.credentials,
        successful_agent_runs=session.successful_agent_runs,
        successful_agent_schedules=session.successful_agent_schedules,
        total_prompt_tokens=total_prompt,
        total_completion_tokens=total_completion,
    )
    # Add new messages (only those after existing count)
    new_messages = session.messages[existing_message_count:]
    if new_messages:
        messages_data = []
        for msg in new_messages:
            messages_data.append(
                {
                    "role": msg.role,
                    "content": msg.content,
                    "name": msg.name,
                    "tool_call_id": msg.tool_call_id,
                    "refusal": msg.refusal,
                    "tool_calls": msg.tool_calls,
                    "function_call": msg.function_call,
                }
            )
        logger.info(
            f"Saving {len(new_messages)} new messages to DB for session {session.session_id}: "
            f"roles={[m['role'] for m in messages_data]}, "
            f"start_sequence={existing_message_count}"
        )
        await chat_db.add_chat_messages_batch(
            session_id=session.session_id,
            messages=messages_data,
            start_sequence=existing_message_count,
        )
 async def get_chat_session(
    session_id: str,
    user_id: str | None = None,
 ) -> ChatSession | None:
    """Get a chat session by ID.
    Checks Redis cache first, falls back to database if not found.
    Caches database results back to Redis.
    Args:
        session_id: The session ID to fetch.
        user_id: If provided, validates that the session belongs to this user.
            If None, ownership is not validated (admin/system access).
    """
    # Try cache first
    try:
        session = await _get_session_from_cache(session_id)
        if session:
            # Verify user ownership if user_id was provided for validation
            if user_id is not None and session.user_id != user_id:
                logger.warning(
                    f"Session {session_id} user id mismatch: {session.user_id} != {user_id}"
                )
                return None
            return session
    except RedisError:
        logger.warning(f"Cache error for session {session_id}, trying database")
    except Exception as e:
        logger.warning(f"Unexpected cache error for session {session_id}: {e}")
    # Fall back to database
    logger.info(f"Session {session_id} not in cache, checking database")
    session = await _get_session_from_db(session_id)
    if session is None:
        logger.warning(f"Session {session_id} not found in cache or database")
        return None
    # Verify user ownership if user_id was provided for validation
    if user_id is not None and session.user_id != user_id:
        logger.warning(
            f"Session {session_id} user id mismatch: {session.user_id} != {user_id}"
        )
        return None
    # Cache the session from DB
    try:
        await _cache_session(session)
        logger.info(f"Cached session {session_id} from database")
    except Exception as e:
        logger.warning(f"Failed to cache session {session_id}: {e}")
    return session
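Review note: the read path above is a cache-aside lookup with an ownership check. A minimal sketch of that flow follows, using in-memory dictionaries as stand-ins for Redis and the database. `Session`, `CACHE`, `DATABASE`, and `get_session` are hypothetical names, and the sketch leaves out the error handling around cache failures.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class Session:
    session_id: str
    user_id: str


# Stand-ins for Redis and the database (assumptions for illustration only).
CACHE: dict[str, Session] = {}
DATABASE: dict[str, Session] = {"s1": Session("s1", "alice")}


async def get_session(
    session_id: str, user_id: Optional[str] = None
) -> Optional[Session]:
    """Look in the cache first, then the database; cache database hits."""
    session = CACHE.get(session_id)
    from_cache = session is not None
    if session is None:
        session = DATABASE.get(session_id)
    if session is None:
        return None
    # Ownership is only validated when a user_id is supplied.
    if user_id is not None and session.user_id != user_id:
        return None
    if not from_cache:
        CACHE[session_id] = session
    return session
```

In this sketch a session fetched from the database is written back to the cache only after the ownership check passes, which matches the order used in `get_chat_session`.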
 async def upsert_chat_session(
    session: ChatSession,
 ) -> ChatSession:
-    """Update a chat session with the given messages."""
-    redis_key = f"chat:session:{session.session_id}"
-    async_redis = await get_redis_async()
-    resp = await async_redis.setex(
-        redis_key, config.session_ttl, session.model_dump_json()
-    )
-    if not resp:
-        raise RedisError(
-            f"Failed to persist chat session {session.session_id} to Redis: {resp}"
+    """Update a chat session in both cache and database.
+    Uses session-level locking to prevent race conditions when concurrent
+    operations (e.g., background title update and main stream handler)
+    attempt to upsert the same session simultaneously.
+    Raises:
+        DatabaseError: If the database write fails. The cache is still updated
+            as a best-effort optimization, but the error is propagated to ensure
+            callers are aware of the persistence failure.
+        RedisError: If the cache write fails (after successful DB write).
+    """
+    # Acquire session-specific lock to prevent concurrent upserts
+    lock = await _get_session_lock(session.session_id)
+    async with lock:
+        # Get existing message count from DB for incremental saves
+        existing_message_count = await chat_db.get_chat_session_message_count(
+            session.session_id
        )
        db_error: Exception | None = None
        # Save to database (primary storage)
        try:
            await _save_session_to_db(session, existing_message_count)
        except Exception as e:
            logger.error(
                f"Failed to save session {session.session_id} to database: {e}"
            )
            db_error = e
        # Save to cache (best-effort, even if DB failed)
        try:
            await _cache_session(session)
        except Exception as e:
            # If DB succeeded but cache failed, raise cache error
            if db_error is None:
                raise RedisError(
                    f"Failed to persist chat session {session.session_id} to Redis: {e}"
                ) from e
            # If both failed, log cache error but raise DB error (more critical)
            logger.warning(
                f"Cache write also failed for session {session.session_id}: {e}"
            )
        # Propagate DB error after attempting cache (prevents data loss)
        if db_error is not None:
            raise DatabaseError(
                f"Failed to persist chat session {session.session_id} to database"
            ) from db_error
        return session
 async def create_chat_session(user_id: str) -> ChatSession:
    """Create a new chat session and persist it.
    Raises:
        DatabaseError: If the database write fails. We fail fast to ensure
            callers never receive a non-persisted session that only exists
            in cache (which would be lost when the cache expires).
    """
    session = ChatSession.new(user_id)
    # Create in database first - fail fast if this fails
    try:
        await chat_db.create_chat_session(
            session_id=session.session_id,
            user_id=user_id,
        )
    except Exception as e:
        logger.error(f"Failed to create session {session.session_id} in database: {e}")
        raise DatabaseError(
            f"Failed to create chat session {session.session_id} in database"
        ) from e
    # Cache the session (best-effort optimization, DB is source of truth)
    try:
        await _cache_session(session)
    except Exception as e:
        logger.warning(f"Failed to cache new session {session.session_id}: {e}")
    return session
 async def get_user_sessions(
    user_id: str,
    limit: int = 50,
    offset: int = 0,
 ) -> tuple[list[ChatSession], int]:
    """Get chat sessions for a user from the database with total count.
    Returns:
        A tuple of (sessions, total_count) where total_count is the overall
        number of sessions for the user (not just the current page).
    """
    prisma_sessions = await chat_db.get_user_chat_sessions(user_id, limit, offset)
    total_count = await chat_db.get_user_session_count(user_id)
    sessions = []
    for prisma_session in prisma_sessions:
        # Convert without messages for listing (lighter weight)
        sessions.append(ChatSession.from_db(prisma_session, None))
    return sessions, total_count
 async def delete_chat_session(session_id: str, user_id: str | None = None) -> bool:
    """Delete a chat session from both cache and database.
    Args:
        session_id: The session ID to delete.
        user_id: If provided, validates that the session belongs to this user
            before deletion. This prevents unauthorized deletion.
    Returns:
        True if deleted successfully, False otherwise.
    """
    # Delete from database first (with optional user_id validation)
    # This confirms ownership before invalidating cache
    deleted = await chat_db.delete_chat_session(session_id, user_id)
    if not deleted:
        return False
    # Only invalidate cache and clean up lock after DB confirms deletion
    try:
        redis_key = _get_session_cache_key(session_id)
        async_redis = await get_redis_async()
        await async_redis.delete(redis_key)
    except Exception as e:
        logger.warning(f"Failed to delete session {session_id} from cache: {e}")
    # Clean up session lock (belt-and-suspenders with WeakValueDictionary)
    async with _session_locks_mutex:
        _session_locks.pop(session_id, None)
    return True
 async def update_session_title(session_id: str, title: str) -> bool:
    """Update only the title of a chat session.
    This is a lightweight operation that doesn't touch messages, avoiding
    race conditions with concurrent message updates. Use this for background
    title generation instead of upsert_chat_session.
    Args:
        session_id: The session ID to update.
        title: The new title to set.
    Returns:
        True if updated successfully, False otherwise.
    """
    try:
        result = await chat_db.update_chat_session(session_id=session_id, title=title)
        if result is None:
            logger.warning(f"Session {session_id} not found for title update")
            return False
        # Invalidate cache so next fetch gets updated title
        try:
            redis_key = _get_session_cache_key(session_id)
            async_redis = await get_redis_async()
            await async_redis.delete(redis_key)
        except Exception as e:
            logger.warning(f"Failed to invalidate cache for session {session_id}: {e}")
        return True
    except Exception as e:
        logger.error(f"Failed to update title for session {session_id}: {e}")
        return False
--- a/autogpt_platform/backend/backend/api/features/chat/model_test.py
+++ b/autogpt_platform/backend/backend/api/features/chat/model_test.py
@@ -43,9 +43,9 @@ async def test_chatsession_serialization_deserialization():
@pytest.mark.asyncio(loop_scope="session")
-async def test_chatsession_redis_storage():
+async def test_chatsession_redis_storage(setup_test_user, test_user_id):
-    s = ChatSession.new(user_id=None)
+    s = ChatSession.new(user_id=test_user_id)
    s.messages = messages
    s = await upsert_chat_session(s)
@@ -59,12 +59,61 @@ async def test_chatsession_redis_storage():
@pytest.mark.asyncio(loop_scope="session")
-async def test_chatsession_redis_storage_user_id_mismatch():
+async def test_chatsession_redis_storage_user_id_mismatch(
    setup_test_user, test_user_id
 ):
-    s = ChatSession.new(user_id="abc123")
+    s = ChatSession.new(user_id=test_user_id)
    s.messages = messages
    s = await upsert_chat_session(s)
-    s2 = await get_chat_session(s.session_id, None)
+    s2 = await get_chat_session(s.session_id, "different_user_id")
    assert s2 is None
@pytest.mark.asyncio(loop_scope="session")
 async def test_chatsession_db_storage(setup_test_user, test_user_id):
    """Test that messages are correctly saved to and loaded from DB (not cache)."""
    from backend.data.redis_client import get_redis_async
    # Create session with messages including assistant message
    s = ChatSession.new(user_id=test_user_id)
    s.messages = messages  # Contains user, assistant, and tool messages
    assert s.session_id is not None, "Session id is not set"
    # Upsert to save to both cache and DB
    s = await upsert_chat_session(s)
    # Clear the Redis cache to force DB load
    redis_key = f"chat:session:{s.session_id}"
    async_redis = await get_redis_async()
    await async_redis.delete(redis_key)
    # Load from DB (cache was cleared)
    s2 = await get_chat_session(
        session_id=s.session_id,
        user_id=s.user_id,
    )
    assert s2 is not None, "Session not found after loading from DB"
    assert len(s2.messages) == len(
        s.messages
    ), f"Message count mismatch: expected {len(s.messages)}, got {len(s2.messages)}"
    # Verify all roles are present
    roles = [m.role for m in s2.messages]
    assert "user" in roles, f"User message missing. Roles found: {roles}"
    assert "assistant" in roles, f"Assistant message missing. Roles found: {roles}"
    assert "tool" in roles, f"Tool message missing. Roles found: {roles}"
    # Verify message content
    for orig, loaded in zip(s.messages, s2.messages):
        assert orig.role == loaded.role, f"Role mismatch: {orig.role} != {loaded.role}"
        assert (
            orig.content == loaded.content
        ), f"Content mismatch for {orig.role}: {orig.content} != {loaded.content}"
        if orig.tool_calls:
            assert (
                loaded.tool_calls is not None
            ), f"Tool calls missing for {orig.role} message"
            assert len(orig.tool_calls) == len(loaded.tool_calls)
--- a/autogpt_platform/backend/backend/api/features/chat/prompts/chat_system.md
+++ b/autogpt_platform/backend/backend/api/features/chat/prompts/chat_system.md
@@ -1,104 +0,0 @@
 You are Otto, an AI Co-Pilot and Forward Deployed Engineer for AutoGPT, an AI Business Automation tool. Your mission is to help users quickly find and set up AutoGPT agents to solve their business problems.
 Here are the functions available to you:
 <functions>
 1. **find_agent** - Search for agents that solve the user's problem
 2. **run_agent** - Run or schedule an agent (automatically handles setup)
 </functions>
 ## HOW run_agent WORKS
 The `run_agent` tool automatically handles the entire setup flow:
 1. **First call** (no inputs) → Returns available inputs so user can decide what values to use
 2. **Credentials check** → If missing, UI automatically prompts user to add them (you don't need to mention this)
 3. **Execution** → Runs when you provide `inputs` OR set `use_defaults=true`
 Parameters:
 - `username_agent_slug` (required): Agent identifier like "creator/agent-name"
 - `inputs`: Object with input values for the agent
 - `use_defaults`: Set to `true` to run with default values (only after user confirms)
 - `schedule_name` + `cron`: For scheduled execution
 ## WORKFLOW
 1. **find_agent** - Search for agents that solve the user's problem
 2. **run_agent** (first call, no inputs) - Get available inputs for the agent
 3. **Ask user** what values they want to use OR if they want to use defaults
 4. **run_agent** (second call) - Either with `inputs={...}` or `use_defaults=true`
 ## YOUR APPROACH
 **Step 1: Understand the Problem**
 - Ask maximum 1-2 targeted questions
 - Focus on: What business problem are they solving?
 - Move quickly to searching for solutions
 **Step 2: Find Agents**
 - Use `find_agent` immediately with relevant keywords
 - Suggest the best option from search results
 - Explain briefly how it solves their problem
 **Step 3: Get Agent Inputs**
 - Call `run_agent(username_agent_slug="creator/agent-name")` without inputs
 - This returns the available inputs (required and optional)
 - Present these to the user and ask what values they want
 **Step 4: Run with User's Choice**
 - If user provides values: `run_agent(username_agent_slug="...", inputs={...})`
 - If user says "use defaults": `run_agent(username_agent_slug="...", use_defaults=true)`
 - On success, share the agent link with the user
 **For Scheduled Execution:**
 - Add `schedule_name` and `cron` parameters
 - Example: `run_agent(username_agent_slug="...", inputs={...}, schedule_name="Daily Report", cron="0 9 * * *")`
 ## FUNCTION CALL FORMAT
 To call a function, use this exact format:
 `<function_call>function_name(parameter="value")</function_call>`
 Examples:
 - `<function_call>find_agent(query="social media automation")</function_call>`
 - `<function_call>run_agent(username_agent_slug="creator/agent-name")</function_call>` (get inputs)
 - `<function_call>run_agent(username_agent_slug="creator/agent-name", inputs={"topic": "AI news"})</function_call>`
 - `<function_call>run_agent(username_agent_slug="creator/agent-name", use_defaults=true)</function_call>`
 ## KEY RULES
 **What You DON'T Do:**
 - Don't help with login (frontend handles this)
 - Don't mention or explain credentials to the user (frontend handles this automatically)
 - Don't run agents without first showing available inputs to the user
 - Don't use `use_defaults=true` without user explicitly confirming
 - Don't write responses longer than 3 sentences
 **What You DO:**
 - Always call run_agent first without inputs to see what's available
 - Ask user what values they want OR if they want to use defaults
 - Keep all responses to maximum 3 sentences
 - Include the agent link in your response after successful execution
 **Error Handling:**
 - Authentication needed → "Please sign in via the interface"
 - Credentials missing → The UI handles this automatically. Focus on asking the user about input values instead.
 ## RESPONSE STRUCTURE
 Before responding, wrap your analysis in <thinking> tags to systematically plan your approach:
 - Extract the key business problem or request from the user's message
 - Determine what function call (if any) you need to make next
 - Plan your response to stay under the 3-sentence maximum
 Example interaction:
 ```
 User: "Run the AI news agent for me"
 Otto: <function_call>run_agent(username_agent_slug="autogpt/ai-news")</function_call>
 [Tool returns: Agent accepts inputs - Required: topic. Optional: num_articles (default: 5)]
 Otto: The AI News agent needs a topic. What topic would you like news about, or should I use the defaults?
 User: "Use defaults"
 Otto: <function_call>run_agent(username_agent_slug="autogpt/ai-news", use_defaults=true)</function_call>
 ```
 KEEP ANSWERS TO 3 SENTENCES
--- a/autogpt_platform/backend/backend/api/features/chat/response_model.py
+++ b/autogpt_platform/backend/backend/api/features/chat/response_model.py
@@ -1,3 +1,10 @@
 """
 Response models for Vercel AI SDK UI Stream Protocol.
 This module implements the AI SDK UI Stream Protocol (v1) for streaming chat responses.
 See: https://ai-sdk.dev/docs/ai-sdk-ui/stream-protocol
 """
 from enum import Enum
 from typing import Any
@@ -5,97 +12,133 @@ from pydantic import BaseModel, Field
 class ResponseType(str, Enum):
-    """Types of streaming responses."""
-    TEXT_CHUNK = "text_chunk"
-    TEXT_ENDED = "text_ended"
-    TOOL_CALL = "tool_call"
-    TOOL_CALL_START = "tool_call_start"
-    TOOL_RESPONSE = "tool_response"
+    """Types of streaming responses following AI SDK protocol."""
+    # Message lifecycle
+    START = "start"
+    FINISH = "finish"
+
+    # Text streaming
+    TEXT_START = "text-start"
+    TEXT_DELTA = "text-delta"
+    TEXT_END = "text-end"
+    # Tool interaction
+    TOOL_INPUT_START = "tool-input-start"
+    TOOL_INPUT_AVAILABLE = "tool-input-available"
+    TOOL_OUTPUT_AVAILABLE = "tool-output-available"
    # Other
    ERROR = "error"
    USAGE = "usage"
    STREAM_END = "stream_end"
 class StreamBaseResponse(BaseModel):
    """Base response model for all streaming responses."""
    type: ResponseType
    timestamp: str | None = None
    def to_sse(self) -> str:
        """Convert to SSE format."""
        return f"data: {self.model_dump_json()}\n\n"
-class StreamTextChunk(StreamBaseResponse):
-    """Streaming text content from the assistant."""
-    type: ResponseType = ResponseType.TEXT_CHUNK
-    content: str = Field(..., description="Text content chunk")
+# ========== Message Lifecycle ==========
-class StreamToolCallStart(StreamBaseResponse):
+class StreamStart(StreamBaseResponse):
    """Start of a new message."""
    type: ResponseType = ResponseType.START
    messageId: str = Field(..., description="Unique message ID")
 class StreamFinish(StreamBaseResponse):
    """End of message/stream."""
    type: ResponseType = ResponseType.FINISH
 # ========== Text Streaming ==========
 class StreamTextStart(StreamBaseResponse):
    """Start of a text block."""
    type: ResponseType = ResponseType.TEXT_START
    id: str = Field(..., description="Text block ID")
 class StreamTextDelta(StreamBaseResponse):
    """Streaming text content delta."""
    type: ResponseType = ResponseType.TEXT_DELTA
    id: str = Field(..., description="Text block ID")
    delta: str = Field(..., description="Text content delta")
 class StreamTextEnd(StreamBaseResponse):
    """End of a text block."""
    type: ResponseType = ResponseType.TEXT_END
    id: str = Field(..., description="Text block ID")
 # ========== Tool Interaction ==========
 class StreamToolInputStart(StreamBaseResponse):
    """Tool call started notification."""
-    type: ResponseType = ResponseType.TOOL_CALL_START
+    type: ResponseType = ResponseType.TOOL_INPUT_START
-    tool_name: str = Field(..., description="Name of the tool that was executed")
+    toolCallId: str = Field(..., description="Unique tool call ID")
-    tool_id: str = Field(..., description="Unique tool call ID")
+    toolName: str = Field(..., description="Name of the tool being called")
-class StreamToolCall(StreamBaseResponse):
+class StreamToolInputAvailable(StreamBaseResponse):
-    """Tool invocation notification."""
+    """Tool input is ready for execution."""
-    type: ResponseType = ResponseType.TOOL_CALL
+    type: ResponseType = ResponseType.TOOL_INPUT_AVAILABLE
-    tool_id: str = Field(..., description="Unique tool call ID")
+    toolCallId: str = Field(..., description="Unique tool call ID")
-    tool_name: str = Field(..., description="Name of the tool being called")
+    toolName: str = Field(..., description="Name of the tool being called")
-    arguments: dict[str, Any] = Field(
+    input: dict[str, Any] = Field(
-        default_factory=dict, description="Tool arguments"
+        default_factory=dict, description="Tool input arguments"
    )
-class StreamToolExecutionResult(StreamBaseResponse):
+class StreamToolOutputAvailable(StreamBaseResponse):
    """Tool execution result."""
-    type: ResponseType = ResponseType.TOOL_RESPONSE
+    type: ResponseType = ResponseType.TOOL_OUTPUT_AVAILABLE
-    tool_id: str = Field(..., description="Tool call ID this responds to")
+    toolCallId: str = Field(..., description="Tool call ID this responds to")
-    tool_name: str = Field(..., description="Name of the tool that was executed")
+    output: str | dict[str, Any] = Field(..., description="Tool execution output")
-    result: str | dict[str, Any] = Field(..., description="Tool execution result")
+    # Additional fields for internal use (not part of AI SDK spec but useful)
    toolName: str | None = Field(
        default=None, description="Name of the tool that was executed"
    )
    success: bool = Field(
        default=True, description="Whether the tool execution succeeded"
    )
 # ========== Other ==========
 class StreamUsage(StreamBaseResponse):
    """Token usage statistics."""
    type: ResponseType = ResponseType.USAGE
-    prompt_tokens: int
+    promptTokens: int = Field(..., description="Number of prompt tokens")
-    completion_tokens: int
+    completionTokens: int = Field(..., description="Number of completion tokens")
-    total_tokens: int
+    totalTokens: int = Field(..., description="Total number of tokens")
 class StreamError(StreamBaseResponse):
    """Error response."""
    type: ResponseType = ResponseType.ERROR
-    message: str = Field(..., description="Error message")
+    errorText: str = Field(..., description="Error message text")
    code: str | None = Field(default=None, description="Error code")
    details: dict[str, Any] | None = Field(
        default=None, description="Additional error details"
    )
-class StreamTextEnded(StreamBaseResponse):
-    """Text streaming completed marker."""
-    type: ResponseType = ResponseType.TEXT_ENDED
 class StreamEnd(StreamBaseResponse):
    """End of stream marker."""
    type: ResponseType = ResponseType.STREAM_END
    summary: dict[str, Any] | None = Field(
        default=None, description="Stream summary statistics"
    )
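To make the wire format concrete: `to_sse` emits each event as a `data: <json>` frame followed by a blank line, and the routes in this change end the stream with a literal `data: [DONE]` frame. The sketch below imitates that framing with plain dictionaries instead of the Pydantic models, so it runs with the standard library only. The `to_sse` and `parse_sse` helpers are illustrative assumptions, not the AI SDK client or the project's code; the field names (`messageId`, `id`, `delta`) are taken from the models above.

```python
import json


def to_sse(event: dict) -> str:
    """Frame one event the way StreamBaseResponse.to_sse does (dict stand-in)."""
    return f"data: {json.dumps(event)}\n\n"


def parse_sse(stream: str) -> list[dict]:
    """Split a buffered SSE body into events, stopping at the [DONE] marker."""
    events: list[dict] = []
    for frame in stream.split("\n\n"):
        if not frame.startswith("data: "):
            continue  # skip the empty tail left after the final separator
        payload = frame[len("data: "):]
        if payload == "[DONE]":
            break
        events.append(json.loads(payload))
    return events


# One assistant message streamed as two text deltas.
frames = [
    to_sse({"type": "start", "messageId": "msg-1"}),
    to_sse({"type": "text-start", "id": "txt-1"}),
    to_sse({"type": "text-delta", "id": "txt-1", "delta": "Hel"}),
    to_sse({"type": "text-delta", "id": "txt-1", "delta": "lo"}),
    to_sse({"type": "text-end", "id": "txt-1"}),
    to_sse({"type": "finish"}),
    "data: [DONE]\n\n",
]

events = parse_sse("".join(frames))
text = "".join(e["delta"] for e in events if e["type"] == "text-delta")
```

A consumer reassembles the assistant text by concatenating the `delta` fields of the `text-delta` events, which is the same accumulation the updated `service_test.py` performs with `StreamTextDelta`.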
--- a/autogpt_platform/backend/backend/api/features/chat/routes.py
+++ b/autogpt_platform/backend/backend/api/features/chat/routes.py
@@ -13,12 +13,25 @@ from backend.util.exceptions import NotFoundError
 from . import service as chat_service
 from .config import ChatConfig
 from .model import ChatSession, create_chat_session, get_chat_session, get_user_sessions
 config = ChatConfig()
 logger = logging.getLogger(__name__)
 async def _validate_and_get_session(
    session_id: str,
    user_id: str | None,
 ) -> ChatSession:
    """Validate session exists and belongs to user."""
    session = await get_chat_session(session_id, user_id)
    if not session:
        raise NotFoundError(f"Session {session_id} not found.")
    return session
 router = APIRouter(
    tags=["chat"],
 )
@@ -26,6 +39,14 @@ router = APIRouter(
 # ========== Request/Response Models ==========
 class StreamChatRequest(BaseModel):
    """Request model for streaming chat with optional context."""
    message: str
    is_user_message: bool = True
    context: dict[str, str] | None = None  # {url: str, content: str}
 class CreateSessionResponse(BaseModel):
    """Response model containing information on a newly created chat session."""
@@ -44,22 +65,77 @@ class SessionDetailResponse(BaseModel):
    messages: list[dict]
 class SessionSummaryResponse(BaseModel):
    """Response model for a session summary (without messages)."""
    id: str
    created_at: str
    updated_at: str
    title: str | None = None
 class ListSessionsResponse(BaseModel):
    """Response model for listing chat sessions."""
    sessions: list[SessionSummaryResponse]
    total: int
 # ========== Routes ==========
@router.get(
    "/sessions",
    dependencies=[Security(auth.requires_user)],
 )
 async def list_sessions(
    user_id: Annotated[str, Security(auth.get_user_id)],
    limit: int = Query(default=50, ge=1, le=100),
    offset: int = Query(default=0, ge=0),
 ) -> ListSessionsResponse:
    """
    List chat sessions for the authenticated user.
    Returns a paginated list of chat sessions belonging to the current user,
    ordered by most recently updated.
    Args:
        user_id: The authenticated user's ID.
        limit: Maximum number of sessions to return (1-100).
        offset: Number of sessions to skip for pagination.
    Returns:
        ListSessionsResponse: List of session summaries and total count.
    """
    sessions, total_count = await get_user_sessions(user_id, limit, offset)
    return ListSessionsResponse(
        sessions=[
            SessionSummaryResponse(
                id=session.session_id,
                created_at=session.started_at.isoformat(),
                updated_at=session.updated_at.isoformat(),
                title=session.title,
            )
            for session in sessions
        ],
        total=total_count,
    )
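The `limit`/`offset`/`total` contract of `list_sessions` can be consumed page by page. A minimal sketch, assuming a hypothetical `fetch_page(limit, offset)` callable that returns `(items, total)` in the same shape as `get_user_sessions`; the in-memory `fake_fetch` stands in for the real endpoint.

```python
from typing import Callable

# (items on this page, total number of items across all pages)
Page = tuple[list[str], int]


def fetch_all(fetch_page: Callable[[int, int], Page], limit: int = 50) -> list[str]:
    """Collect every item by advancing the offset until the total is reached."""
    items: list[str] = []
    offset = 0
    while True:
        page, total = fetch_page(limit, offset)
        items.extend(page)
        offset += len(page)
        # Stop on an empty page as well, in case the total shrinks mid-iteration.
        if not page or offset >= total:
            return items


# Hypothetical data source standing in for the /sessions endpoint.
_ALL_IDS = [f"session-{i}" for i in range(120)]


def fake_fetch(limit: int, offset: int) -> Page:
    return _ALL_IDS[offset : offset + limit], len(_ALL_IDS)


all_ids = fetch_all(fake_fetch, limit=50)
```

With 120 sessions and a page size of 50, this issues three requests (50, 50, and 20 items). Offset pagination can skip or repeat rows if sessions are created or deleted between requests, which is usually acceptable for a listing view.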
@router.post(
    "/sessions",
 )
 async def create_session(
-    user_id: Annotated[str | None, Depends(auth.get_user_id)],
+    user_id: Annotated[str, Depends(auth.get_user_id)],
 ) -> CreateSessionResponse:
    """
    Create a new chat session.
-    Initiates a new chat session for either an authenticated or anonymous user.
+    Initiates a new chat session for the authenticated user.
    Args:
-        user_id: The optional authenticated user ID parsed from the JWT. If missing, creates an anonymous session.
+        user_id: The authenticated user ID parsed from the JWT (required).
    Returns:
        CreateSessionResponse: Details of the created session.
@@ -67,15 +143,15 @@ async def create_session(
    """
    logger.info(
        f"Creating session with user_id: "
-        f"...{user_id[-8:] if user_id and len(user_id) > 8 else '<redacted>'}"
+        f"...{user_id[-8:] if len(user_id) > 8 else '<redacted>'}"
    )
-    session = await chat_service.create_chat_session(user_id)
+    session = await create_chat_session(user_id)
    return CreateSessionResponse(
        id=session.session_id,
        created_at=session.started_at.isoformat(),
-        user_id=session.user_id or None,
+        user_id=session.user_id,
    )
@@ -99,29 +175,88 @@ async def get_session(
        SessionDetailResponse: Details for the requested session; raises NotFoundError if not found.
    """
-    session = await chat_service.get_session(session_id, user_id)
+    session = await get_chat_session(session_id, user_id)
    if not session:
        raise NotFoundError(f"Session {session_id} not found")
    messages = [message.model_dump() for message in session.messages]
    logger.info(
        f"Returning session {session_id}: "
        f"message_count={len(messages)}, "
        f"roles={[m.get('role') for m in messages]}"
    )
    return SessionDetailResponse(
        id=session.session_id,
        created_at=session.started_at.isoformat(),
        updated_at=session.updated_at.isoformat(),
        user_id=session.user_id or None,
-        messages=[message.model_dump() for message in session.messages],
+        messages=messages,
    )
@router.post(
    "/sessions/{session_id}/stream",
 )
 async def stream_chat_post(
    session_id: str,
    request: StreamChatRequest,
    user_id: str | None = Depends(auth.get_user_id),
 ):
    """
    Stream chat responses for a session (POST with context support).
    Streams the AI/completion responses in real time over Server-Sent Events (SSE), including:
      - Text fragments as they are generated
      - Tool call UI elements (if invoked)
      - Tool execution results
    Args:
        session_id: The chat session identifier to associate with the streamed messages.
        request: Request body containing message, is_user_message, and optional context.
        user_id: Optional authenticated user ID.
    Returns:
        StreamingResponse: SSE-formatted response chunks.
    """
    session = await _validate_and_get_session(session_id, user_id)
    async def event_generator() -> AsyncGenerator[str, None]:
        async for chunk in chat_service.stream_chat_completion(
            session_id,
            request.message,
            is_user_message=request.is_user_message,
            user_id=user_id,
            session=session,  # Pass pre-fetched session to avoid double-fetch
            context=request.context,
        ):
            yield chunk.to_sse()
        # AI SDK protocol termination
        yield "data: [DONE]\n\n"
    return StreamingResponse(
        event_generator(),
        media_type="text/event-stream",
        headers={
            "Cache-Control": "no-cache",
            "Connection": "keep-alive",
            "X-Accel-Buffering": "no",  # Disable nginx buffering
            "x-vercel-ai-ui-message-stream": "v1",  # AI SDK protocol header
        },
    )
@router.get(
    "/sessions/{session_id}/stream",
 )
-async def stream_chat(
+async def stream_chat_get(
    session_id: str,
    message: Annotated[str, Query(min_length=1, max_length=10000)],
    user_id: str | None = Depends(auth.get_user_id),
    is_user_message: bool = Query(default=True),
 ):
    """
-    Stream chat responses for a session.
+    Stream chat responses for a session (GET - legacy endpoint).
    Streams the AI/completion responses in real time over Server-Sent Events (SSE), including:
      - Text fragments as they are generated
@@ -137,14 +272,7 @@ async def stream_chat(
        StreamingResponse: SSE-formatted response chunks.
    """
-    # Validate session exists before starting the stream
-    # This prevents errors after the response has already started
-    session = await chat_service.get_session(session_id, user_id)
-    if not session:
-        raise NotFoundError(f"Session {session_id} not found. ")
-    if session.user_id is None and user_id is not None:
-        session = await chat_service.assign_user_to_session(session_id, user_id)
+    session = await _validate_and_get_session(session_id, user_id)
    async def event_generator() -> AsyncGenerator[str, None]:
        async for chunk in chat_service.stream_chat_completion(
@@ -155,6 +283,8 @@ async def stream_chat(
            session=session,  # Pass pre-fetched session to avoid double-fetch
        ):
            yield chunk.to_sse()
        # AI SDK protocol termination
        yield "data: [DONE]\n\n"
    return StreamingResponse(
        event_generator(),
@@ -163,6 +293,7 @@ async def stream_chat(
            "Cache-Control": "no-cache",
            "Connection": "keep-alive",
            "X-Accel-Buffering": "no",  # Disable nginx buffering
            "x-vercel-ai-ui-message-stream": "v1",  # AI SDK protocol header
        },
    )
@@ -201,16 +332,28 @@ async def health_check() -> dict:
    """
    Health check endpoint for the chat service.
-    Performs a full cycle test of session creation, assignment, and retrieval. Should always return healthy
+    Performs a full cycle test of session creation and retrieval. Should always return healthy
    if the service and data layer are operational.
    Returns:
        dict: A status dictionary indicating health, service name, and API version.
    """
-    session = await chat_service.create_chat_session(None)
-    await chat_service.assign_user_to_session(session.session_id, "test_user")
-    await chat_service.get_session(session.session_id, "test_user")
+    from backend.data.user import get_or_create_user
+
+    # Ensure health check user exists (required for FK constraint)
+    health_check_user_id = "health-check-user"
+    await get_or_create_user(
+        {
+            "sub": health_check_user_id,
+            "email": "health-check@system.local",
+            "user_metadata": {"name": "Health Check User"},
+        }
+    )
+    # Create and retrieve session to verify full data layer
+    session = await create_chat_session(health_check_user_id)
+    await get_chat_session(session.session_id, health_check_user_id)
    return {
        "status": "healthy",
--- a/autogpt_platform/backend/backend/api/features/chat/service.py
+++ b/autogpt_platform/backend/backend/api/features/chat/service.py
--- a/autogpt_platform/backend/backend/api/features/chat/service_test.py
+++ b/autogpt_platform/backend/backend/api/features/chat/service_test.py
@@ -4,18 +4,19 @@ from os import getenv
 import pytest
 from . import service as chat_service
 from .model import create_chat_session, get_chat_session, upsert_chat_session
 from .response_model import (
    StreamEnd,
    StreamError,
-    StreamTextChunk,
+    StreamFinish,
-    StreamToolExecutionResult,
+    StreamTextDelta,
    StreamToolOutputAvailable,
 )
 logger = logging.getLogger(__name__)
@pytest.mark.asyncio(loop_scope="session")
-async def test_stream_chat_completion():
+async def test_stream_chat_completion(setup_test_user, test_user_id):
    """
    Test the stream_chat_completion function.
    """
@@ -23,7 +24,7 @@ async def test_stream_chat_completion():
    if not api_key:
        return pytest.skip("OPEN_ROUTER_API_KEY is not set, skipping test")
-    session = await chat_service.create_chat_session()
+    session = await create_chat_session(test_user_id)
    has_errors = False
    has_ended = False
@@ -34,9 +35,9 @@ async def test_stream_chat_completion():
        logger.info(chunk)
        if isinstance(chunk, StreamError):
            has_errors = True
-        if isinstance(chunk, StreamTextChunk):
+        if isinstance(chunk, StreamTextDelta):
-            assistant_message += chunk.content
+            assistant_message += chunk.delta
-        if isinstance(chunk, StreamEnd):
+        if isinstance(chunk, StreamFinish):
            has_ended = True
    assert has_ended, "Chat completion did not end"
@@ -45,7 +46,7 @@ async def test_stream_chat_completion():
@pytest.mark.asyncio(loop_scope="session")
-async def test_stream_chat_completion_with_tool_calls():
+async def test_stream_chat_completion_with_tool_calls(setup_test_user, test_user_id):
    """
    Test the stream_chat_completion function.
    """
@@ -53,8 +54,8 @@ async def test_stream_chat_completion_with_tool_calls():
    if not api_key:
        return pytest.skip("OPEN_ROUTER_API_KEY is not set, skipping test")
-    session = await chat_service.create_chat_session()
+    session = await create_chat_session(test_user_id)
-    session = await chat_service.upsert_chat_session(session)
+    session = await upsert_chat_session(session)
    has_errors = False
    has_ended = False
@@ -68,14 +69,14 @@ async def test_stream_chat_completion_with_tool_calls():
        if isinstance(chunk, StreamError):
            has_errors = True
-        if isinstance(chunk, StreamEnd):
+        if isinstance(chunk, StreamFinish):
            has_ended = True
-        if isinstance(chunk, StreamToolExecutionResult):
+        if isinstance(chunk, StreamToolOutputAvailable):
            had_tool_calls = True
    assert has_ended, "Chat completion did not end"
    assert not has_errors, "Error occurred while streaming chat completion"
    assert had_tool_calls, "Tool calls did not occur"
-    session = await chat_service.get_session(session.session_id)
+    session = await get_chat_session(session.session_id)
    assert session, "Session not found"
    assert session.usage, "Usage is empty"
--- a/autogpt_platform/backend/backend/api/features/chat/tools/__init__.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/__init__.py
@@ -4,21 +4,44 @@ from openai.types.chat import ChatCompletionToolParam
 from backend.api.features.chat.model import ChatSession
 from .add_understanding import AddUnderstandingTool
 from .agent_output import AgentOutputTool
 from .base import BaseTool
 from .create_agent import CreateAgentTool
 from .edit_agent import EditAgentTool
 from .find_agent import FindAgentTool
 from .find_block import FindBlockTool
 from .find_library_agent import FindLibraryAgentTool
 from .get_doc_page import GetDocPageTool
 from .run_agent import RunAgentTool
 from .run_block import RunBlockTool
 from .search_docs import SearchDocsTool
 if TYPE_CHECKING:
-    from backend.api.features.chat.response_model import StreamToolExecutionResult
+    from backend.api.features.chat.response_model import StreamToolOutputAvailable
-# Initialize tool instances
-find_agent_tool = FindAgentTool()
-run_agent_tool = RunAgentTool()
+# Single source of truth for all tools
+TOOL_REGISTRY: dict[str, BaseTool] = {
+    "add_understanding": AddUnderstandingTool(),
+    "create_agent": CreateAgentTool(),
+    "edit_agent": EditAgentTool(),
+    "find_agent": FindAgentTool(),
+    "find_block": FindBlockTool(),
+    "find_library_agent": FindLibraryAgentTool(),
+    "run_agent": RunAgentTool(),
+    "run_block": RunBlockTool(),
+    "view_agent_output": AgentOutputTool(),
+    "search_docs": SearchDocsTool(),
+    "get_doc_page": GetDocPageTool(),
+}
-# Export tools as OpenAI format
+# Export individual tool instances for backwards compatibility
+find_agent_tool = TOOL_REGISTRY["find_agent"]
+run_agent_tool = TOOL_REGISTRY["run_agent"]
+# Generated from registry for OpenAI API
 tools: list[ChatCompletionToolParam] = [
-    find_agent_tool.as_openai_tool(),
-    run_agent_tool.as_openai_tool(),
+    tool.as_openai_tool() for tool in TOOL_REGISTRY.values()
 ]
@@ -28,14 +51,9 @@ async def execute_tool(
    user_id: str | None,
    session: ChatSession,
    tool_call_id: str,
-) -> "StreamToolExecutionResult":
-
-    tool_map: dict[str, BaseTool] = {
-        "find_agent": find_agent_tool,
-        "run_agent": run_agent_tool,
-    }
-    if tool_name not in tool_map:
+) -> "StreamToolOutputAvailable":
+    """Execute a tool by name."""
+    tool = TOOL_REGISTRY.get(tool_name)
+    if not tool:
        raise ValueError(f"Tool {tool_name} not found")
-    return await tool_map[tool_name].execute(
-        user_id, session, tool_call_id, **parameters
-    )
+    return await tool.execute(user_id, session, tool_call_id, **parameters)
--- a/autogpt_platform/backend/backend/api/features/chat/tools/_test_data.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/_test_data.py
@@ -3,6 +3,7 @@ from datetime import UTC, datetime
 from os import getenv
 import pytest
 from prisma.types import ProfileCreateInput
 from pydantic import SecretStr
 from backend.api.features.chat.model import ChatSession
@@ -17,7 +18,7 @@ from backend.data.user import get_or_create_user
 from backend.integrations.credentials_store import IntegrationCredentialsStore
-def make_session(user_id: str | None = None):
+def make_session(user_id: str):
    return ChatSession(
        session_id=str(uuid.uuid4()),
        user_id=user_id,
@@ -49,13 +50,13 @@ async def setup_test_data():
    # 1b. Create a profile with username for the user (required for store agent lookup)
    username = user.email.split("@")[0]
    await prisma.profile.create(
-        data={
+        data=ProfileCreateInput(
-            "userId": user.id,
+            userId=user.id,
-            "username": username,
+            username=username,
-            "name": f"Test User {username}",
+            name=f"Test User {username}",
-            "description": "Test user profile",
+            description="Test user profile",
-            "links": [],  # Required field - empty array for test profiles
+            links=[],  # Required field - empty array for test profiles
-        }
+        )
    )
    # 2. Create a test graph with agent input -> agent output
@@ -172,13 +173,13 @@ async def setup_llm_test_data():
    # 1b. Create a profile with username for the user (required for store agent lookup)
    username = user.email.split("@")[0]
    await prisma.profile.create(
-        data={
+        data=ProfileCreateInput(
-            "userId": user.id,
+            userId=user.id,
-            "username": username,
+            username=username,
-            "name": f"Test User {username}",
+            name=f"Test User {username}",
-            "description": "Test user profile for LLM tests",
+            description="Test user profile for LLM tests",
-            "links": [],  # Required field - empty array for test profiles
+            links=[],  # Required field - empty array for test profiles
-        }
+        )
    )
    # 2. Create test OpenAI credentials for the user
@@ -332,13 +333,13 @@ async def setup_firecrawl_test_data():
    # 1b. Create a profile with username for the user (required for store agent lookup)
    username = user.email.split("@")[0]
    await prisma.profile.create(
-        data={
+        data=ProfileCreateInput(
-            "userId": user.id,
+            userId=user.id,
-            "username": username,
+            username=username,
-            "name": f"Test User {username}",
+            name=f"Test User {username}",
-            "description": "Test user profile for Firecrawl tests",
+            description="Test user profile for Firecrawl tests",
-            "links": [],  # Required field - empty array for test profiles
+            links=[],  # Required field - empty array for test profiles
-        }
+        )
    )
    # NOTE: We deliberately do NOT create Firecrawl credentials for this user
--- a/autogpt_platform/backend/backend/api/features/chat/tools/add_understanding.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/add_understanding.py
@@ -0,0 +1,122 @@
 """Tool for capturing user business understanding incrementally."""
 import logging
 from typing import Any
 from langfuse import observe
 from backend.api.features.chat.model import ChatSession
 from backend.data.understanding import (
    BusinessUnderstandingInput,
    upsert_business_understanding,
 )
 from .base import BaseTool
 from .models import ErrorResponse, ToolResponseBase, UnderstandingUpdatedResponse
 logger = logging.getLogger(__name__)
 class AddUnderstandingTool(BaseTool):
    """Tool for capturing user's business understanding incrementally."""
    @property
    def name(self) -> str:
        return "add_understanding"
    @property
    def description(self) -> str:
        return """Capture and store information about the user's business context,
 workflows, pain points, and automation goals. Call this tool whenever the user
 shares information about their business. Each call incrementally adds to the
 existing understanding - you don't need to provide all fields at once.
 Use this to build a comprehensive profile that helps recommend better agents
 and automations for the user's specific needs."""
    @property
    def parameters(self) -> dict[str, Any]:
        # Auto-generate from Pydantic model schema
        schema = BusinessUnderstandingInput.model_json_schema()
        properties = {}
        for field_name, field_schema in schema.get("properties", {}).items():
            prop: dict[str, Any] = {"description": field_schema.get("description", "")}
            # Handle anyOf for Optional types
            if "anyOf" in field_schema:
                for option in field_schema["anyOf"]:
                    if option.get("type") != "null":
                        prop["type"] = option.get("type", "string")
                        if "items" in option:
                            prop["items"] = option["items"]
                        break
            else:
                prop["type"] = field_schema.get("type", "string")
                if "items" in field_schema:
                    prop["items"] = field_schema["items"]
            properties[field_name] = prop
        return {"type": "object", "properties": properties, "required": []}
    @property
    def requires_auth(self) -> bool:
        """Requires authentication to store user-specific data."""
        return True
    @observe(as_type="tool", name="add_understanding")
    async def _execute(
        self,
        user_id: str | None,
        session: ChatSession,
        **kwargs,
    ) -> ToolResponseBase:
        """
        Capture and store business understanding incrementally.
        Each call merges new data with existing understanding:
        - String fields are overwritten if provided
        - List fields are appended (with deduplication)
        """
        session_id = session.session_id
        if not user_id:
            return ErrorResponse(
                message="Authentication required to save business understanding.",
                session_id=session_id,
            )
        # Check if any data was provided
        if not any(v is not None for v in kwargs.values()):
            return ErrorResponse(
                message="Please provide at least one field to update.",
                session_id=session_id,
            )
        # Build input model from kwargs (only include fields defined in the model)
        valid_fields = set(BusinessUnderstandingInput.model_fields.keys())
        input_data = BusinessUnderstandingInput(
            **{k: v for k, v in kwargs.items() if k in valid_fields}
        )
        # Track which fields were updated
        updated_fields = [
            k for k, v in kwargs.items() if k in valid_fields and v is not None
        ]
        # Upsert with merge
        understanding = await upsert_business_understanding(user_id, input_data)
        # Build current understanding summary (filter out empty values)
        current_understanding = {
            k: v
            for k, v in understanding.model_dump(
                exclude={"id", "user_id", "created_at", "updated_at"}
            ).items()
            if v is not None and v != [] and v != ""
        }
        return UnderstandingUpdatedResponse(
            message=f"Updated understanding with: {', '.join(updated_fields)}. "
            "I now have a better picture of your business context.",
            session_id=session_id,
            updated_fields=updated_fields,
            current_understanding=current_understanding,
        )
--- a/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/__init__.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/__init__.py
@@ -0,0 +1,29 @@
 """Agent generator package - Creates agents from natural language."""
 from .core import (
    apply_agent_patch,
    decompose_goal,
    generate_agent,
    generate_agent_patch,
    get_agent_as_json,
    save_agent_to_library,
 )
 from .fixer import apply_all_fixes
 from .utils import get_blocks_info
 from .validator import validate_agent
 __all__ = [
    # Core functions
    "decompose_goal",
    "generate_agent",
    "generate_agent_patch",
    "apply_agent_patch",
    "save_agent_to_library",
    "get_agent_as_json",
    # Fixer
    "apply_all_fixes",
    # Validator
    "validate_agent",
    # Utils
    "get_blocks_info",
 ]
--- a/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/client.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/client.py
@@ -0,0 +1,25 @@
 """OpenRouter client configuration for agent generation."""
 import os
 from openai import AsyncOpenAI
 # Configuration - use OPEN_ROUTER_API_KEY for consistency with chat/config.py
 OPENROUTER_API_KEY = os.getenv("OPEN_ROUTER_API_KEY")
 AGENT_GENERATOR_MODEL = os.getenv("AGENT_GENERATOR_MODEL", "anthropic/claude-opus-4.5")
 # OpenRouter client (OpenAI-compatible API)
 _client: AsyncOpenAI | None = None
 def get_client() -> AsyncOpenAI:
    """Get or create the OpenRouter client."""
    global _client
    if _client is None:
        if not OPENROUTER_API_KEY:
            raise ValueError("OPEN_ROUTER_API_KEY environment variable is required")
        _client = AsyncOpenAI(
            base_url="https://openrouter.ai/api/v1",
            api_key=OPENROUTER_API_KEY,
        )
    return _client
--- a/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/core.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/core.py
@@ -0,0 +1,390 @@
 """Core agent generation functions."""
 import copy
 import json
 import logging
 import uuid
 from typing import Any
 from backend.api.features.library import db as library_db
 from backend.data.graph import Graph, Link, Node, create_graph
 from .client import AGENT_GENERATOR_MODEL, get_client
 from .prompts import DECOMPOSITION_PROMPT, GENERATION_PROMPT, PATCH_PROMPT
 from .utils import get_block_summaries, parse_json_from_llm
 logger = logging.getLogger(__name__)
 async def decompose_goal(description: str, context: str = "") -> dict[str, Any] | None:
    """Break down a goal into steps or return clarifying questions.
    Args:
        description: Natural language goal description
        context: Additional context (e.g., answers to previous questions)
    Returns:
        Dict with either:
        - {"type": "clarifying_questions", "questions": [...]}
        - {"type": "instructions", "steps": [...]}
        Or None on error
    """
    client = get_client()
    prompt = DECOMPOSITION_PROMPT.format(block_summaries=get_block_summaries())
    full_description = description
    if context:
        full_description = f"{description}\n\nAdditional context:\n{context}"
    try:
        response = await client.chat.completions.create(
            model=AGENT_GENERATOR_MODEL,
            messages=[
                {"role": "system", "content": prompt},
                {"role": "user", "content": full_description},
            ],
            temperature=0,
        )
        content = response.choices[0].message.content
        if content is None:
            logger.error("LLM returned empty content for decomposition")
            return None
        result = parse_json_from_llm(content)
        if result is None:
            logger.error(f"Failed to parse decomposition response: {content[:200]}")
            return None
        return result
    except Exception as e:
        logger.error(f"Error decomposing goal: {e}")
        return None
 async def generate_agent(instructions: dict[str, Any]) -> dict[str, Any] | None:
    """Generate agent JSON from instructions.
    Args:
        instructions: Structured instructions from decompose_goal
    Returns:
        Agent JSON dict or None on error
    """
    client = get_client()
    prompt = GENERATION_PROMPT.format(block_summaries=get_block_summaries())
    try:
        response = await client.chat.completions.create(
            model=AGENT_GENERATOR_MODEL,
            messages=[
                {"role": "system", "content": prompt},
                {"role": "user", "content": json.dumps(instructions, indent=2)},
            ],
            temperature=0,
        )
        content = response.choices[0].message.content
        if content is None:
            logger.error("LLM returned empty content for agent generation")
            return None
        result = parse_json_from_llm(content)
        if result is None:
            logger.error(f"Failed to parse agent JSON: {content[:200]}")
            return None
        # Ensure required fields
        if "id" not in result:
            result["id"] = str(uuid.uuid4())
        if "version" not in result:
            result["version"] = 1
        if "is_active" not in result:
            result["is_active"] = True
        return result
    except Exception as e:
        logger.error(f"Error generating agent: {e}")
        return None
 def json_to_graph(agent_json: dict[str, Any]) -> Graph:
    """Convert agent JSON dict to Graph model.
    Args:
        agent_json: Agent JSON with nodes and links
    Returns:
        Graph ready for saving
    """
    nodes = []
    for n in agent_json.get("nodes", []):
        node = Node(
            id=n.get("id", str(uuid.uuid4())),
            block_id=n["block_id"],
            input_default=n.get("input_default", {}),
            metadata=n.get("metadata", {}),
        )
        nodes.append(node)
    links = []
    for link_data in agent_json.get("links", []):
        link = Link(
            id=link_data.get("id", str(uuid.uuid4())),
            source_id=link_data["source_id"],
            sink_id=link_data["sink_id"],
            source_name=link_data["source_name"],
            sink_name=link_data["sink_name"],
            is_static=link_data.get("is_static", False),
        )
        links.append(link)
    return Graph(
        id=agent_json.get("id", str(uuid.uuid4())),
        version=agent_json.get("version", 1),
        is_active=agent_json.get("is_active", True),
        name=agent_json.get("name", "Generated Agent"),
        description=agent_json.get("description", ""),
        nodes=nodes,
        links=links,
    )
 def _reassign_node_ids(graph: Graph) -> None:
    """Reassign all node and link IDs to new UUIDs.
    This is needed when creating a new version to avoid unique constraint violations.
    """
    # Create mapping from old node IDs to new UUIDs
    id_map = {node.id: str(uuid.uuid4()) for node in graph.nodes}
    # Reassign node IDs
    for node in graph.nodes:
        node.id = id_map[node.id]
    # Update link references to use new node IDs
    for link in graph.links:
        link.id = str(uuid.uuid4())  # Also give links new IDs
        if link.source_id in id_map:
            link.source_id = id_map[link.source_id]
        if link.sink_id in id_map:
            link.sink_id = id_map[link.sink_id]
 async def save_agent_to_library(
    agent_json: dict[str, Any], user_id: str, is_update: bool = False
 ) -> tuple[Graph, Any]:
    """Save agent to database and user's library.
    Args:
        agent_json: Agent JSON dict
        user_id: User ID
        is_update: Whether this is an update to an existing agent
    Returns:
        Tuple of (created Graph, LibraryAgent)
    """
    from backend.data.graph import get_graph_all_versions
    graph = json_to_graph(agent_json)
    if is_update:
        # For updates, keep the same graph ID but increment version
        # and reassign node/link IDs to avoid conflicts
        if graph.id:
            existing_versions = await get_graph_all_versions(graph.id, user_id)
            if existing_versions:
                latest_version = max(v.version for v in existing_versions)
                graph.version = latest_version + 1
                # Reassign node IDs (but keep graph ID the same)
                _reassign_node_ids(graph)
                logger.info(f"Updating agent {graph.id} to version {graph.version}")
    else:
        # For new agents, always generate a fresh UUID to avoid collisions
        graph.id = str(uuid.uuid4())
        graph.version = 1
        # Reassign all node IDs as well
        _reassign_node_ids(graph)
        logger.info(f"Creating new agent with ID {graph.id}")
    # Save to database
    created_graph = await create_graph(graph, user_id)
    # Add to user's library (or update existing library agent)
    library_agents = await library_db.create_library_agent(
        graph=created_graph,
        user_id=user_id,
        create_library_agents_for_sub_graphs=False,
    )
    return created_graph, library_agents[0]
 async def get_agent_as_json(
    graph_id: str, user_id: str | None
 ) -> dict[str, Any] | None:
    """Fetch an agent and convert to JSON format for editing.
    Args:
        graph_id: Graph ID or library agent ID
        user_id: User ID
    Returns:
        Agent as JSON dict or None if not found
    """
    from backend.data.graph import get_graph
    # Try to get the graph (version=None gets the active version)
    graph = await get_graph(graph_id, version=None, user_id=user_id)
    if not graph:
        return None
    # Convert to JSON format
    nodes = []
    for node in graph.nodes:
        nodes.append(
            {
                "id": node.id,
                "block_id": node.block_id,
                "input_default": node.input_default,
                "metadata": node.metadata,
            }
        )
    links = []
    for node in graph.nodes:
        for link in node.output_links:
            links.append(
                {
                    "id": link.id,
                    "source_id": link.source_id,
                    "sink_id": link.sink_id,
                    "source_name": link.source_name,
                    "sink_name": link.sink_name,
                    "is_static": link.is_static,
                }
            )
    return {
        "id": graph.id,
        "name": graph.name,
        "description": graph.description,
        "version": graph.version,
        "is_active": graph.is_active,
        "nodes": nodes,
        "links": links,
    }
 async def generate_agent_patch(
    update_request: str, current_agent: dict[str, Any]
 ) -> dict[str, Any] | None:
    """Generate a patch to update an existing agent.
    Args:
        update_request: Natural language description of changes
        current_agent: Current agent JSON
    Returns:
        Patch dict or clarifying questions, or None on error
    """
    client = get_client()
    prompt = PATCH_PROMPT.format(
        current_agent=json.dumps(current_agent, indent=2),
        block_summaries=get_block_summaries(),
    )
    try:
        response = await client.chat.completions.create(
            model=AGENT_GENERATOR_MODEL,
            messages=[
                {"role": "system", "content": prompt},
                {"role": "user", "content": update_request},
            ],
            temperature=0,
        )
        content = response.choices[0].message.content
        if content is None:
            logger.error("LLM returned empty content for patch generation")
            return None
        return parse_json_from_llm(content)
    except Exception as e:
        logger.error(f"Error generating patch: {e}")
        return None
 def apply_agent_patch(
    current_agent: dict[str, Any], patch: dict[str, Any]
 ) -> dict[str, Any]:
    """Apply a patch to an existing agent.
    Args:
        current_agent: Current agent JSON
        patch: Patch dict with operations
    Returns:
        Updated agent JSON
    """
    agent = copy.deepcopy(current_agent)
    patches = patch.get("patches", [])
    for p in patches:
        patch_type = p.get("type")
        if patch_type == "modify":
            node_id = p.get("node_id")
            changes = p.get("changes", {})
            for node in agent.get("nodes", []):
                if node["id"] == node_id:
                    _deep_update(node, changes)
                    logger.debug(f"Modified node {node_id}")
                    break
        elif patch_type == "add":
            new_nodes = p.get("new_nodes", [])
            new_links = p.get("new_links", [])
            agent["nodes"] = agent.get("nodes", []) + new_nodes
            agent["links"] = agent.get("links", []) + new_links
            logger.debug(f"Added {len(new_nodes)} nodes, {len(new_links)} links")
        elif patch_type == "remove":
            node_ids_to_remove = set(p.get("node_ids", []))
            link_ids_to_remove = set(p.get("link_ids", []))
            # Remove nodes
            agent["nodes"] = [
                n for n in agent.get("nodes", []) if n["id"] not in node_ids_to_remove
            ]
            # Remove links (both explicit and those referencing removed nodes)
            agent["links"] = [
                link
                for link in agent.get("links", [])
                if link["id"] not in link_ids_to_remove
                and link["source_id"] not in node_ids_to_remove
                and link["sink_id"] not in node_ids_to_remove
            ]
            logger.debug(
                f"Removed {len(node_ids_to_remove)} nodes, {len(link_ids_to_remove)} links"
            )
    return agent
 def _deep_update(target: dict, source: dict) -> None:
    """Recursively update a dict with another dict."""
    for key, value in source.items():
        if key in target and isinstance(target[key], dict) and isinstance(value, dict):
            _deep_update(target[key], value)
        else:
            target[key] = value
--- a/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/fixer.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/fixer.py
@@ -0,0 +1,606 @@
 """Agent fixer - Fixes common LLM generation errors."""
 import logging
 import re
 import uuid
 from typing import Any
 from .utils import (
    ADDTODICTIONARY_BLOCK_ID,
    ADDTOLIST_BLOCK_ID,
    CODE_EXECUTION_BLOCK_ID,
    CONDITION_BLOCK_ID,
    CREATEDICT_BLOCK_ID,
    CREATELIST_BLOCK_ID,
    DATA_SAMPLING_BLOCK_ID,
    DOUBLE_CURLY_BRACES_BLOCK_IDS,
    GET_CURRENT_DATE_BLOCK_ID,
    STORE_VALUE_BLOCK_ID,
    UNIVERSAL_TYPE_CONVERTER_BLOCK_ID,
    get_blocks_info,
    is_valid_uuid,
 )
 logger = logging.getLogger(__name__)
 def fix_agent_ids(agent: dict[str, Any]) -> dict[str, Any]:
    """Fix invalid UUIDs in agent, node, and link IDs."""
    # Fix agent ID
    if not is_valid_uuid(agent.get("id", "")):
        agent["id"] = str(uuid.uuid4())
        logger.debug(f"Fixed agent ID: {agent['id']}")
    # Fix node IDs
    id_mapping = {}  # Old ID -> New ID
    for node in agent.get("nodes", []):
        if not is_valid_uuid(node.get("id", "")):
            old_id = node.get("id", "")
            new_id = str(uuid.uuid4())
            id_mapping[old_id] = new_id
            node["id"] = new_id
            logger.debug(f"Fixed node ID: {old_id} -> {new_id}")
    # Fix link IDs and update references
    for link in agent.get("links", []):
        if not is_valid_uuid(link.get("id", "")):
            link["id"] = str(uuid.uuid4())
            logger.debug(f"Fixed link ID: {link['id']}")
        # Update source/sink IDs if they were remapped
        if link.get("source_id") in id_mapping:
            link["source_id"] = id_mapping[link["source_id"]]
        if link.get("sink_id") in id_mapping:
            link["sink_id"] = id_mapping[link["sink_id"]]
    return agent
 def fix_double_curly_braces(agent: dict[str, Any]) -> dict[str, Any]:
    """Fix single curly braces to double in template blocks."""
    for node in agent.get("nodes", []):
        if node.get("block_id") not in DOUBLE_CURLY_BRACES_BLOCK_IDS:
            continue
        input_data = node.get("input_default", {})
        for key in ("prompt", "format"):
            if key in input_data and isinstance(input_data[key], str):
                original = input_data[key]
                # Fix simple variable references: {var} -> {{var}}
                fixed = re.sub(
                    r"(?<!\{)\{([a-zA-Z_][a-zA-Z0-9_]*)\}(?!\})",
                    r"{{\1}}",
                    original,
                )
                if fixed != original:
                    input_data[key] = fixed
                    logger.debug(f"Fixed curly braces in {key}")
    return agent
 def fix_storevalue_before_condition(agent: dict[str, Any]) -> dict[str, Any]:
    """Add StoreValueBlock before ConditionBlock if needed for value2."""
    nodes = agent.get("nodes", [])
    links = agent.get("links", [])
    # Find all ConditionBlock nodes
    condition_node_ids = {
        node["id"] for node in nodes if node.get("block_id") == CONDITION_BLOCK_ID
    }
    if not condition_node_ids:
        return agent
    new_nodes = []
    new_links = []
    processed_conditions = set()
    for link in links:
        sink_id = link.get("sink_id")
        sink_name = link.get("sink_name")
        # Check if this link goes to a ConditionBlock's value2
        if sink_id in condition_node_ids and sink_name == "value2":
            source_node = next(
                (n for n in nodes if n["id"] == link.get("source_id")), None
            )
            # Skip if source is already a StoreValueBlock
            if source_node and source_node.get("block_id") == STORE_VALUE_BLOCK_ID:
                continue
            # Skip if we already processed this condition
            if sink_id in processed_conditions:
                continue
            processed_conditions.add(sink_id)
            # Create StoreValueBlock
            store_node_id = str(uuid.uuid4())
            store_node = {
                "id": store_node_id,
                "block_id": STORE_VALUE_BLOCK_ID,
                "input_default": {"data": None},
                "metadata": {"position": {"x": 0, "y": -100}},
            }
            new_nodes.append(store_node)
            # Create link: original source -> StoreValueBlock
            new_links.append(
                {
                    "id": str(uuid.uuid4()),
                    "source_id": link["source_id"],
                    "source_name": link["source_name"],
                    "sink_id": store_node_id,
                    "sink_name": "input",
                    "is_static": False,
                }
            )
            # Update original link: StoreValueBlock -> ConditionBlock
            link["source_id"] = store_node_id
            link["source_name"] = "output"
            logger.debug(f"Added StoreValueBlock before ConditionBlock {sink_id}")
    if new_nodes:
        agent["nodes"] = nodes + new_nodes
        # Also attach the source -> StoreValueBlock links created above
        agent["links"] = links + new_links
    return agent
 def fix_addtolist_blocks(agent: dict[str, Any]) -> dict[str, Any]:
    """Fix AddToList blocks by adding prerequisite empty AddToList block.
    When an AddToList block is found:
    1. Checks if there's a CreateListBlock before it
    2. Removes CreateListBlock if linked directly to AddToList
    3. Adds an empty AddToList block before the original
    4. Ensures the original has a self-referencing link
    """
    nodes = agent.get("nodes", [])
    links = agent.get("links", [])
    new_nodes = []
    original_addtolist_ids = set()
    nodes_to_remove = set()
    links_to_remove = []
    # First pass: identify CreateListBlock nodes to remove
    for link in links:
        source_node = next(
            (n for n in nodes if n.get("id") == link.get("source_id")), None
        )
        sink_node = next((n for n in nodes if n.get("id") == link.get("sink_id")), None)
        if (
            source_node
            and sink_node
            and source_node.get("block_id") == CREATELIST_BLOCK_ID
            and sink_node.get("block_id") == ADDTOLIST_BLOCK_ID
        ):
            nodes_to_remove.add(source_node.get("id"))
            links_to_remove.append(link)
            logger.debug(f"Removing CreateListBlock {source_node.get('id')}")
    # Second pass: process AddToList blocks
    filtered_nodes = []
    for node in nodes:
        if node.get("id") in nodes_to_remove:
            continue
        if node.get("block_id") == ADDTOLIST_BLOCK_ID:
            original_addtolist_ids.add(node.get("id"))
            node_id = node.get("id")
            pos = node.get("metadata", {}).get("position", {"x": 0, "y": 0})
            # Check if already has prerequisite
            has_prereq = any(
                link.get("sink_id") == node_id
                and link.get("sink_name") == "list"
                and link.get("source_name") == "updated_list"
                for link in links
            )
            if not has_prereq:
                # Remove links to "list" input (except self-reference)
                for link in links:
                    if (
                        link.get("sink_id") == node_id
                        and link.get("sink_name") == "list"
                        and link.get("source_id") != node_id
                        and link not in links_to_remove
                    ):
                        links_to_remove.append(link)
                # Create prerequisite AddToList block
                prereq_id = str(uuid.uuid4())
                prereq_node = {
                    "id": prereq_id,
                    "block_id": ADDTOLIST_BLOCK_ID,
                    "input_default": {"list": [], "entry": None, "entries": []},
                    "metadata": {
                        "position": {"x": pos.get("x", 0) - 800, "y": pos.get("y", 0)}
                    },
                }
                new_nodes.append(prereq_node)
                # Link prerequisite to original
                links.append(
                    {
                        "id": str(uuid.uuid4()),
                        "source_id": prereq_id,
                        "source_name": "updated_list",
                        "sink_id": node_id,
                        "sink_name": "list",
                        "is_static": False,
                    }
                )
                logger.debug(f"Added prerequisite AddToList block for {node_id}")
        filtered_nodes.append(node)
    # Remove marked links
    filtered_links = [link for link in links if link not in links_to_remove]
    # Add self-referencing links for original AddToList blocks
    for node in filtered_nodes + new_nodes:
        if (
            node.get("block_id") == ADDTOLIST_BLOCK_ID
            and node.get("id") in original_addtolist_ids
        ):
            node_id = node.get("id")
            has_self_ref = any(
                link.get("source_id") == node_id
                and link.get("sink_id") == node_id
                and link.get("source_name") == "updated_list"
                and link.get("sink_name") == "list"
                for link in filtered_links
            )
            if not has_self_ref:
                filtered_links.append(
                    {
                        "id": str(uuid.uuid4()),
                        "source_id": node_id,
                        "source_name": "updated_list",
                        "sink_id": node_id,
                        "sink_name": "list",
                        "is_static": False,
                    }
                )
                logger.debug(f"Added self-reference for AddToList {node_id}")
    agent["nodes"] = filtered_nodes + new_nodes
    agent["links"] = filtered_links
    return agent
 def fix_addtodictionary_blocks(agent: dict[str, Any]) -> dict[str, Any]:
    """Fix AddToDictionary blocks by removing empty CreateDictionary nodes."""
    nodes = agent.get("nodes", [])
    links = agent.get("links", [])
    nodes_to_remove = set()
    links_to_remove = []
    for link in links:
        source_node = next(
            (n for n in nodes if n.get("id") == link.get("source_id")), None
        )
        sink_node = next((n for n in nodes if n.get("id") == link.get("sink_id")), None)
        if (
            source_node
            and sink_node
            and source_node.get("block_id") == CREATEDICT_BLOCK_ID
            and sink_node.get("block_id") == ADDTODICTIONARY_BLOCK_ID
        ):
            nodes_to_remove.add(source_node.get("id"))
            links_to_remove.append(link)
            logger.debug(f"Removing CreateDictionary {source_node.get('id')}")
    agent["nodes"] = [n for n in nodes if n.get("id") not in nodes_to_remove]
    agent["links"] = [link for link in links if link not in links_to_remove]
    return agent
 def fix_code_execution_output(agent: dict[str, Any]) -> dict[str, Any]:
    """Fix CodeExecutionBlock output: change 'response' to 'stdout_logs'."""
    nodes = agent.get("nodes", [])
    links = agent.get("links", [])
    for link in links:
        source_node = next(
            (n for n in nodes if n.get("id") == link.get("source_id")), None
        )
        if (
            source_node
            and source_node.get("block_id") == CODE_EXECUTION_BLOCK_ID
            and link.get("source_name") == "response"
        ):
            link["source_name"] = "stdout_logs"
            logger.debug("Fixed CodeExecutionBlock output: response -> stdout_logs")
    return agent
 def fix_data_sampling_sample_size(agent: dict[str, Any]) -> dict[str, Any]:
    """Fix DataSamplingBlock by setting sample_size to 1 as default."""
    nodes = agent.get("nodes", [])
    links = agent.get("links", [])
    links_to_remove = []
    for node in nodes:
        if node.get("block_id") == DATA_SAMPLING_BLOCK_ID:
            node_id = node.get("id")
            input_default = node.get("input_default", {})
            # Remove links to sample_size
            for link in links:
                if (
                    link.get("sink_id") == node_id
                    and link.get("sink_name") == "sample_size"
                ):
                    links_to_remove.append(link)
            # Set default
            input_default["sample_size"] = 1
            node["input_default"] = input_default
            logger.debug(f"Fixed DataSamplingBlock {node_id} sample_size to 1")
    if links_to_remove:
        agent["links"] = [link for link in links if link not in links_to_remove]
    return agent
 def fix_node_x_coordinates(agent: dict[str, Any]) -> dict[str, Any]:
    """Fix node x-coordinates to ensure 800+ unit spacing between linked nodes."""
    nodes = agent.get("nodes", [])
    links = agent.get("links", [])
    node_lookup = {n.get("id"): n for n in nodes}
    for link in links:
        source_id = link.get("source_id")
        sink_id = link.get("sink_id")
        source_node = node_lookup.get(source_id)
        sink_node = node_lookup.get(sink_id)
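        # Skip dangling links and self-referencing links (e.g. the AddToList
        # updated_list -> list loop), which would otherwise shift the node itself.
        if not source_node or not sink_node or source_id == sink_id:
            continue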
        source_pos = source_node.get("metadata", {}).get("position", {})
        sink_pos = sink_node.get("metadata", {}).get("position", {})
        source_x = source_pos.get("x", 0)
        sink_x = sink_pos.get("x", 0)
        if abs(sink_x - source_x) < 800:
            new_x = source_x + 800
            if "metadata" not in sink_node:
                sink_node["metadata"] = {}
            if "position" not in sink_node["metadata"]:
                sink_node["metadata"]["position"] = {}
            sink_node["metadata"]["position"]["x"] = new_x
            logger.debug(f"Fixed node {sink_id} x: {sink_x} -> {new_x}")
    return agent
 def fix_getcurrentdate_offset(agent: dict[str, Any]) -> dict[str, Any]:
    """Fix GetCurrentDateBlock offset to ensure it's positive."""
    for node in agent.get("nodes", []):
        if node.get("block_id") == GET_CURRENT_DATE_BLOCK_ID:
            input_default = node.get("input_default", {})
            if "offset" in input_default:
                offset = input_default["offset"]
                if isinstance(offset, (int, float)) and offset < 0:
                    input_default["offset"] = abs(offset)
                    logger.debug(f"Fixed offset: {offset} -> {abs(offset)}")
    return agent
 def fix_ai_model_parameter(
    agent: dict[str, Any],
    blocks_info: list[dict[str, Any]],
    default_model: str = "gpt-4o",
 ) -> dict[str, Any]:
    """Add default model parameter to AI blocks if missing."""
    block_map = {b.get("id"): b for b in blocks_info}
    for node in agent.get("nodes", []):
        block_id = node.get("block_id")
        block = block_map.get(block_id)
        if not block:
            continue
        # Check if block has AI category
        categories = block.get("categories", [])
        is_ai_block = any(
            cat.get("category") == "AI" for cat in categories if isinstance(cat, dict)
        )
        if is_ai_block:
            input_default = node.get("input_default", {})
            if "model" not in input_default:
                input_default["model"] = default_model
                node["input_default"] = input_default
                logger.debug(
                    f"Added model '{default_model}' to AI block {node.get('id')}"
                )
    return agent
 def fix_link_static_properties(
    agent: dict[str, Any], blocks_info: list[dict[str, Any]]
 ) -> dict[str, Any]:
    """Fix is_static property based on source block's staticOutput."""
    block_map = {b.get("id"): b for b in blocks_info}
    node_lookup = {n.get("id"): n for n in agent.get("nodes", [])}
    for link in agent.get("links", []):
        source_node = node_lookup.get(link.get("source_id"))
        if not source_node:
            continue
        source_block = block_map.get(source_node.get("block_id"))
        if not source_block:
            continue
        static_output = source_block.get("staticOutput", False)
        if link.get("is_static") != static_output:
            link["is_static"] = static_output
            logger.debug(f"Fixed link {link.get('id')} is_static to {static_output}")
    return agent
 def fix_data_type_mismatch(
    agent: dict[str, Any], blocks_info: list[dict[str, Any]]
 ) -> dict[str, Any]:
    """Fix data type mismatches by inserting UniversalTypeConverterBlock."""
    nodes = agent.get("nodes", [])
    links = agent.get("links", [])
    block_map = {b.get("id"): b for b in blocks_info}
    node_lookup = {n.get("id"): n for n in nodes}
    def get_property_type(schema: dict, name: str) -> str | None:
        if "_#_" in name:
            parent, child = name.split("_#_", 1)
            parent_schema = schema.get(parent, {})
            if "properties" in parent_schema:
                return parent_schema["properties"].get(child, {}).get("type")
            return None
        return schema.get(name, {}).get("type")
    def are_types_compatible(src: str, sink: str) -> bool:
        if {src, sink} <= {"integer", "number"}:
            return True
        return src == sink
    type_mapping = {
        "string": "string",
        "text": "string",
        "integer": "number",
        "number": "number",
        "float": "number",
        "boolean": "boolean",
        "bool": "boolean",
        "array": "list",
        "list": "list",
        "object": "dictionary",
        "dict": "dictionary",
        "dictionary": "dictionary",
    }
    new_links = []
    nodes_to_add = []
    for link in links:
        source_node = node_lookup.get(link.get("source_id"))
        sink_node = node_lookup.get(link.get("sink_id"))
        if not source_node or not sink_node:
            new_links.append(link)
            continue
        source_block = block_map.get(source_node.get("block_id"))
        sink_block = block_map.get(sink_node.get("block_id"))
        if not source_block or not sink_block:
            new_links.append(link)
            continue
        source_outputs = source_block.get("outputSchema", {}).get("properties", {})
        sink_inputs = sink_block.get("inputSchema", {}).get("properties", {})
        source_type = get_property_type(source_outputs, link.get("source_name", ""))
        sink_type = get_property_type(sink_inputs, link.get("sink_name", ""))
        if (
            source_type
            and sink_type
            and not are_types_compatible(source_type, sink_type)
        ):
            # Insert type converter
            converter_id = str(uuid.uuid4())
            target_type = type_mapping.get(sink_type, sink_type)
            converter_node = {
                "id": converter_id,
                "block_id": UNIVERSAL_TYPE_CONVERTER_BLOCK_ID,
                "input_default": {"type": target_type},
                "metadata": {"position": {"x": 0, "y": 100}},
            }
            nodes_to_add.append(converter_node)
            # source -> converter
            new_links.append(
                {
                    "id": str(uuid.uuid4()),
                    "source_id": link["source_id"],
                    "source_name": link["source_name"],
                    "sink_id": converter_id,
                    "sink_name": "value",
                    "is_static": False,
                }
            )
            # converter -> sink
            new_links.append(
                {
                    "id": str(uuid.uuid4()),
                    "source_id": converter_id,
                    "source_name": "value",
                    "sink_id": link["sink_id"],
                    "sink_name": link["sink_name"],
                    "is_static": False,
                }
            )
            logger.debug(f"Inserted type converter: {source_type} -> {target_type}")
        else:
            new_links.append(link)
    if nodes_to_add:
        agent["nodes"] = nodes + nodes_to_add
        agent["links"] = new_links
    return agent
 def apply_all_fixes(
    agent: dict[str, Any], blocks_info: list[dict[str, Any]] | None = None
 ) -> dict[str, Any]:
    """Apply all fixes to an agent JSON.
    Args:
        agent: Agent JSON dict
        blocks_info: Optional list of block info dicts for advanced fixes
    Returns:
        Fixed agent JSON
    """
    # Basic fixes (no block info needed)
    agent = fix_agent_ids(agent)
    agent = fix_double_curly_braces(agent)
    agent = fix_storevalue_before_condition(agent)
    agent = fix_addtolist_blocks(agent)
    agent = fix_addtodictionary_blocks(agent)
    agent = fix_code_execution_output(agent)
    agent = fix_data_sampling_sample_size(agent)
    agent = fix_node_x_coordinates(agent)
    agent = fix_getcurrentdate_offset(agent)
    # Advanced fixes (require block info)
    if blocks_info is None:
        blocks_info = get_blocks_info()
    agent = fix_ai_model_parameter(agent, blocks_info)
    agent = fix_link_static_properties(agent, blocks_info)
    agent = fix_data_type_mismatch(agent, blocks_info)
    return agent
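
The StoreValueBlock and type-converter fixes above share one pattern: an existing link A -> B is replaced by A -> X -> B, where X is a newly created node. A minimal standalone sketch of that splice follows. The helper name `splice_node`, the default pin names, and the sample IDs are illustrative assumptions, not part of this PR.

```python
import uuid
from typing import Any


def splice_node(
    link: dict[str, Any],
    block_id: str,
    in_pin: str = "input",
    out_pin: str = "output",
) -> tuple[dict[str, Any], list[dict[str, Any]]]:
    """Return a new node plus the two links that route `link` through it.

    Hypothetical helper for illustration; the input link is not mutated.
    """
    node_id = str(uuid.uuid4())
    node = {
        "id": node_id,
        "block_id": block_id,
        "input_default": {},
        "metadata": {"position": {"x": 0, "y": 0}},
    }
    # Original source now feeds the new node.
    upstream = {
        "id": str(uuid.uuid4()),
        "source_id": link["source_id"],
        "source_name": link["source_name"],
        "sink_id": node_id,
        "sink_name": in_pin,
        "is_static": False,
    }
    # The new node feeds the original sink pin.
    downstream = {
        "id": str(uuid.uuid4()),
        "source_id": node_id,
        "source_name": out_pin,
        "sink_id": link["sink_id"],
        "sink_name": link["sink_name"],
        "is_static": False,
    }
    return node, [upstream, downstream]


# Sample link with placeholder IDs (real agents use UUID v4 strings).
original = {
    "id": str(uuid.uuid4()),
    "source_id": "node-a",
    "source_name": "result",
    "sink_id": "node-b",
    "sink_name": "value1",
    "is_static": False,
}
new_node, new_links = splice_node(original, block_id="example-block-id")
```

Returning new links instead of mutating the original keeps the splice easy to test; the fixers in this file instead rewrite the existing link in place or rebuild the link list, which is equivalent for a single pass.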
--- a/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/prompts.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/prompts.py
@@ -0,0 +1,225 @@
 """Prompt templates for agent generation."""
 DECOMPOSITION_PROMPT = """
 You are an expert AutoGPT Workflow Decomposer. Your task is to analyze a user's high-level goal and break it down into a clear, step-by-step plan using the available blocks.
 Each step should represent a distinct, automatable action suitable for execution by an AI automation system.
 ---
 FIRST: Analyze the user's goal and determine:
 1) Design-time configuration (fixed settings that won't change per run)
 2) Runtime inputs (values the agent's end-user will provide each time it runs)
 For anything that can vary per run (email addresses, names, dates, search terms, etc.):
 - DO NOT ask for the actual value
 - Instead, define it as an Agent Input with a clear name, type, and description
 Only ask clarifying questions about design-time config that affects how you build the workflow:
 - Which external service to use (e.g., "Gmail vs Outlook", "Notion vs Google Docs")
 - Required formats or structures (e.g., "CSV, JSON, or PDF output?")
 - Business rules that must be hard-coded
 IMPORTANT CLARIFICATIONS POLICY:
 - Ask no more than five essential questions
 - Do not ask for concrete values that can be provided at runtime as Agent Inputs
 - Do not ask for API keys or credentials; the platform handles those directly
 - If there is enough information to infer reasonable defaults, prefer to propose defaults
 ---
 GUIDELINES:
 1. List each step as a numbered item
 2. Describe the action clearly and specify inputs/outputs
 3. Ensure steps are in logical, sequential order
 4. Mention block names naturally (e.g., "Use GetWeatherByLocationBlock to...")
 5. Help the user reach their goal efficiently
 ---
 RULES:
 1. OUTPUT FORMAT: Only output either clarifying questions OR step-by-step instructions, not both
 2. USE ONLY THE BLOCKS PROVIDED
 3. ALL required_input fields must be provided
 4. Data types of linked properties must match
 5. Write expert-level prompts for AI-related blocks
 ---
 CRITICAL BLOCK RESTRICTIONS:
 1. AddToListBlock: Outputs the updated list after EVERY addition, not only after all additions
 2. SendEmailBlock: Draft the email for user review; set SMTP config based on email type
 3. ConditionBlock: value2 is reference, value1 is contrast
 4. CodeExecutionBlock: DO NOT USE - use AI blocks instead
 5. ReadCsvBlock: Only use the 'rows' output, not 'row'
 ---
 OUTPUT FORMAT:
 If more information is needed:
 ```json
 {{
  "type": "clarifying_questions",
  "questions": [
    {{
      "question": "Which email provider should be used? (Gmail, Outlook, custom SMTP)",
      "keyword": "email_provider",
      "example": "Gmail"
    }}
  ]
 }}
 ```
 If ready to proceed:
 ```json
 {{
  "type": "instructions",
  "steps": [
    {{
      "step_number": 1,
      "block_name": "AgentShortTextInputBlock",
      "description": "Get the URL of the content to analyze.",
      "inputs": [{{"name": "name", "value": "URL"}}],
      "outputs": [{{"name": "result", "description": "The URL entered by user"}}]
    }}
  ]
 }}
 ```
 ---
 AVAILABLE BLOCKS:
 {block_summaries}
 """
 GENERATION_PROMPT = """
 You are an expert AI workflow builder. Generate a valid agent JSON from the given instructions.
 ---
 NODES:
 Each node must include:
 - `id`: Unique UUID v4 (e.g. `a8f5b1e2-c3d4-4e5f-8a9b-0c1d2e3f4a5b`)
 - `block_id`: The block identifier (must match an Allowed Block)
 - `input_default`: Dict of inputs (can be empty if no static inputs needed)
 - `metadata`: Must contain:
  - `position`: {{"x": number, "y": number}} - adjacent nodes should differ by 800+ in X
  - `customized_name`: Clear name describing this block's purpose in the workflow
 ---
 LINKS:
 Each link connects a source node's output to a sink node's input:
 - `id`: MUST be UUID v4 (NOT "link-1", "link-2", etc.)
 - `source_id`: ID of the source node
 - `source_name`: Output field name from the source block
 - `sink_id`: ID of the sink node
 - `sink_name`: Input field name on the sink block
 - `is_static`: true only if source block has static_output: true
 CRITICAL: All IDs must be valid UUID v4 format!
 ---
 AGENT (GRAPH):
 Wrap nodes and links in:
 - `id`: UUID of the agent
 - `name`: Short, generic name (avoid specific company names, URLs)
 - `description`: Short, generic description
 - `nodes`: List of all nodes
 - `links`: List of all links
 - `version`: 1
 - `is_active`: true
 ---
 TIPS:
 - All required_input fields must be provided via input_default or a valid link
 - Ensure consistent source_id and sink_id references
 - Avoid dangling links
 - Input/output pins must match block schemas
 - Do not invent unknown block_ids
 ---
 ALLOWED BLOCKS:
 {block_summaries}
 ---
 Generate the complete agent JSON. Output ONLY valid JSON, no explanation.
 """
 PATCH_PROMPT = """
 You are an expert at modifying AutoGPT agent workflows. Given the current agent and a modification request, generate a JSON patch to update the agent.
 CURRENT AGENT:
 {current_agent}
 AVAILABLE BLOCKS:
 {block_summaries}
 ---
 PATCH FORMAT:
 Return a JSON object with the following structure:
 ```json
 {{
  "type": "patch",
  "intent": "Brief description of what the patch does",
  "patches": [
    {{
      "type": "modify",
      "node_id": "uuid-of-node-to-modify",
      "changes": {{
        "input_default": {{"field": "new_value"}},
        "metadata": {{"customized_name": "New Name"}}
      }}
    }},
    {{
      "type": "add",
      "new_nodes": [
        {{
          "id": "new-uuid",
          "block_id": "block-uuid",
          "input_default": {{}},
          "metadata": {{"position": {{"x": 0, "y": 0}}, "customized_name": "Name"}}
        }}
      ],
      "new_links": [
        {{
          "id": "link-uuid",
          "source_id": "source-node-id",
          "source_name": "output_field",
          "sink_id": "sink-node-id",
          "sink_name": "input_field"
        }}
      ]
    }},
    {{
      "type": "remove",
      "node_ids": ["uuid-of-node-to-remove"],
      "link_ids": ["uuid-of-link-to-remove"]
    }}
  ]
 }}
 ```
 If you need more information, return:
 ```json
 {{
  "type": "clarifying_questions",
  "questions": [
    {{
      "question": "What specific change do you want?",
      "keyword": "change_type",
      "example": "Add error handling"
    }}
  ]
 }}
 ```
 Generate the minimal patch needed. Output ONLY valid JSON.
 """
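
The templates above double every literal brace (`{{`, `}}`) and leave `{block_summaries}` and `{current_agent}` single, which is consistent with rendering through `str.format`. The sketch below assumes that rendering path (the call site is not shown in this diff) and uses a shortened, hypothetical template to show how the doubled braces collapse into valid JSON.

```python
import json

# Shortened stand-in for the real templates; only the brace handling matters.
TEMPLATE = """
OUTPUT FORMAT:
{{"type": "instructions", "steps": []}}
AVAILABLE BLOCKS:
{block_summaries}
"""

# Hypothetical summary text that itself contains braces.
summary = "- ExampleBlock (id: 1234): demo\n  in: {text:string}"
rendered = TEMPLATE.format(block_summaries=summary)

# The JSON example line now has single braces and parses cleanly.
json_line = rendered.strip().splitlines()[1]
parsed = json.loads(json_line)
```

Substituted values are not re-scanned for placeholders, so summaries that contain braces pass through `str.format` unchanged.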
--- a/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/utils.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/utils.py
@@ -0,0 +1,213 @@
 """Utilities for agent generation."""
 import json
 import re
 from typing import Any
 from backend.data.block import get_blocks
 # UUID validation regex
 UUID_REGEX = re.compile(
    r"^[a-f0-9]{8}-[a-f0-9]{4}-4[a-f0-9]{3}-[89ab][a-f0-9]{3}-[a-f0-9]{12}$"
 )
 # Block IDs for various fixes
 STORE_VALUE_BLOCK_ID = "1ff065e9-88e8-4358-9d82-8dc91f622ba9"
 CONDITION_BLOCK_ID = "715696a0-e1da-45c8-b209-c2fa9c3b0be6"
 ADDTOLIST_BLOCK_ID = "aeb08fc1-2fc1-4141-bc8e-f758f183a822"
 ADDTODICTIONARY_BLOCK_ID = "31d1064e-7446-4693-a7d4-65e5ca1180d1"
 CREATELIST_BLOCK_ID = "a912d5c7-6e00-4542-b2a9-8034136930e4"
 CREATEDICT_BLOCK_ID = "b924ddf4-de4f-4b56-9a85-358930dcbc91"
 CODE_EXECUTION_BLOCK_ID = "0b02b072-abe7-11ef-8372-fb5d162dd712"
 DATA_SAMPLING_BLOCK_ID = "4a448883-71fa-49cf-91cf-70d793bd7d87"
 UNIVERSAL_TYPE_CONVERTER_BLOCK_ID = "95d1b990-ce13-4d88-9737-ba5c2070c97b"
 GET_CURRENT_DATE_BLOCK_ID = "b29c1b50-5d0e-4d9f-8f9d-1b0e6fcbf0b1"
 DOUBLE_CURLY_BRACES_BLOCK_IDS = [
    "44f6c8ad-d75c-4ae1-8209-aad1c0326928",  # FillTextTemplateBlock
    "6ab085e2-20b3-4055-bc3e-08036e01eca6",
    "90f8c45e-e983-4644-aa0b-b4ebe2f531bc",
    "363ae599-353e-4804-937e-b2ee3cef3da4",  # AgentOutputBlock
    "3b191d9f-356f-482d-8238-ba04b6d18381",
    "db7d8f02-2f44-4c55-ab7a-eae0941f0c30",
    "3a7c4b8d-6e2f-4a5d-b9c1-f8d23c5a9b0e",
    "ed1ae7a0-b770-4089-b520-1f0005fad19a",
    "a892b8d9-3e4e-4e9c-9c1e-75f8efcf1bfa",
    "b29c1b50-5d0e-4d9f-8f9d-1b0e6fcbf0b1",
    "716a67b3-6760-42e7-86dc-18645c6e00fc",
    "530cf046-2ce0-4854-ae2c-659db17c7a46",
    "ed55ac19-356e-4243-a6cb-bc599e9b716f",
    "1f292d4a-41a4-4977-9684-7c8d560b9f91",  # LLM blocks
    "32a87eab-381e-4dd4-bdb8-4c47151be35a",
 ]
 def is_valid_uuid(value: str) -> bool:
    """Check if a string is a valid UUID v4."""
    return isinstance(value, str) and UUID_REGEX.match(value) is not None
 def _compact_schema(schema: dict) -> dict[str, str]:
    """Extract compact type info from a JSON schema properties dict.
    Returns a dict of {field_name: type_string} for essential info only.
    """
    props = schema.get("properties", {})
    result = {}
    for name, prop in props.items():
        # Skip internal/complex fields
        if name.startswith("_"):
            continue
        # Get type string
        type_str = prop.get("type", "any")
        # Handle anyOf (optional/union types); allOf is reported as "object"
        if "anyOf" in prop:
            types = [t.get("type", "?") for t in prop["anyOf"] if t.get("type")]
            type_str = "|".join(types) if types else "any"
        elif "allOf" in prop:
            type_str = "object"
        # Add array item type if present
        if type_str == "array" and "items" in prop:
            items = prop["items"]
            if isinstance(items, dict):
                item_type = items.get("type", "any")
                type_str = f"array[{item_type}]"
        result[name] = type_str
    return result
 def get_block_summaries(include_schemas: bool = True) -> str:
    """Generate compact block summaries for prompts.
    Args:
        include_schemas: Whether to include input/output type info
    Returns:
        Formatted string of block summaries (compact format)
    """
    blocks = get_blocks()
    summaries = []
    for block_id, block_cls in blocks.items():
        block = block_cls()
        name = block.name
        desc = getattr(block, "description", "") or ""
        # Truncate description
        if len(desc) > 150:
            desc = desc[:147] + "..."
        if not include_schemas:
            summaries.append(f"- {name} (id: {block_id}): {desc}")
        else:
            # Compact format with type info only
            inputs = {}
            outputs = {}
            required = []
            if hasattr(block, "input_schema"):
                try:
                    schema = block.input_schema.jsonschema()
                    inputs = _compact_schema(schema)
                    required = schema.get("required", [])
                except Exception:
                    pass
            if hasattr(block, "output_schema"):
                try:
                    schema = block.output_schema.jsonschema()
                    outputs = _compact_schema(schema)
                except Exception:
                    pass
            # Build compact line format
            # Format: NAME (id): desc | in: {field:type, ...} [required] | out: {field:type}
            in_str = ", ".join(f"{k}:{v}" for k, v in inputs.items())
            out_str = ", ".join(f"{k}:{v}" for k, v in outputs.items())
            req_str = f" req=[{','.join(required)}]" if required else ""
            static = " [static]" if getattr(block, "static_output", False) else ""
            line = f"- {name} (id: {block_id}): {desc}"
            if in_str:
                line += f"\n  in: {{{in_str}}}{req_str}"
            if out_str:
                line += f"\n  out: {{{out_str}}}{static}"
            summaries.append(line)
    return "\n".join(summaries)
 def get_blocks_info() -> list[dict[str, Any]]:
    """Get block information with schemas for validation and fixing."""
    blocks = get_blocks()
    blocks_info = []
    for block_id, block_cls in blocks.items():
        block = block_cls()
        blocks_info.append(
            {
                "id": block_id,
                "name": block.name,
                "description": getattr(block, "description", ""),
                "categories": getattr(block, "categories", []),
                "staticOutput": getattr(block, "static_output", False),
                "inputSchema": (
                    block.input_schema.jsonschema()
                    if hasattr(block, "input_schema")
                    else {}
                ),
                "outputSchema": (
                    block.output_schema.jsonschema()
                    if hasattr(block, "output_schema")
                    else {}
                ),
            }
        )
    return blocks_info
 def parse_json_from_llm(text: str) -> dict[str, Any] | None:
    """Extract JSON from LLM response (handles markdown code blocks)."""
    if not text:
        return None
    # Try fenced code block
    match = re.search(r"```(?:json)?\s*([\s\S]*?)```", text, re.IGNORECASE)
    if match:
        try:
            return json.loads(match.group(1).strip())
        except json.JSONDecodeError:
            pass
    # Try raw text
    try:
        return json.loads(text.strip())
    except json.JSONDecodeError:
        pass
    # Try finding {...} span
    start = text.find("{")
    end = text.rfind("}")
    if start != -1 and end > start:
        try:
            return json.loads(text[start : end + 1])
        except json.JSONDecodeError:
            pass
    # Try finding [...] span (note: a successful parse here yields a list, not a dict)
    start = text.find("[")
    end = text.rfind("]")
    if start != -1 and end > start:
        try:
            return json.loads(text[start : end + 1])
        except json.JSONDecodeError:
            pass
    return None
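
`UUID_REGEX` above accepts only lowercase, version-4 UUID strings. The standalone check below copies the pattern to show what that implies in practice. The sample values come from this diff (the example ID in `GENERATION_PROMPT` and `CODE_EXECUTION_BLOCK_ID`); whether `is_valid_uuid` is ever applied to block IDs is not visible in this section, so that case is shown only as an observation.

```python
import re
import uuid

# Same pattern as UUID_REGEX in utils.py, copied so the snippet runs standalone.
UUID_V4 = re.compile(
    r"^[a-f0-9]{8}-[a-f0-9]{4}-4[a-f0-9]{3}-[89ab][a-f0-9]{3}-[a-f0-9]{12}$"
)


def matches(value: str) -> bool:
    """Return True if `value` passes the copied v4 pattern."""
    return UUID_V4.match(value) is not None


example_v4 = "a8f5b1e2-c3d4-4e5f-8a9b-0c1d2e3f4a5b"  # example ID from GENERATION_PROMPT
code_exec_block = "0b02b072-abe7-11ef-8372-fb5d162dd712"  # CODE_EXECUTION_BLOCK_ID

results = {
    "fresh uuid4": matches(str(uuid.uuid4())),    # str() of a UUID is lowercase
    "example v4": matches(example_v4),
    "uppercase v4": matches(example_v4.upper()),  # pattern is case-sensitive
    "v1-style block id": matches(code_exec_block),  # third group starts with 1, not 4
    "link-1": matches("link-1"),
}
```

If uppercase IDs should be accepted, one option is to lowercase the value before matching; that would be a behavior change and is not something this PR does.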
--- a/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/validator.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/validator.py
@@ -0,0 +1,279 @@
 """Agent validator - Validates agent structure and connections."""
 import logging
 import re
 from typing import Any
 from .utils import get_blocks_info
 logger = logging.getLogger(__name__)
 class AgentValidator:
    """Validator for AutoGPT agents with detailed error reporting."""
    def __init__(self):
        self.errors: list[str] = []
    def add_error(self, error: str) -> None:
        """Add an error message."""
        self.errors.append(error)
    def validate_block_existence(
        self, agent: dict[str, Any], blocks_info: list[dict[str, Any]]
    ) -> bool:
        """Validate all block IDs exist in the blocks library."""
        valid = True
        valid_block_ids = {b.get("id") for b in blocks_info if b.get("id")}
        for node in agent.get("nodes", []):
            block_id = node.get("block_id")
            node_id = node.get("id")
            if not block_id:
                self.add_error(f"Node '{node_id}' is missing 'block_id' field.")
                valid = False
                continue
            if block_id not in valid_block_ids:
                self.add_error(
                    f"Node '{node_id}' references block_id '{block_id}' which does not exist."
                )
                valid = False
        return valid
    def validate_link_node_references(self, agent: dict[str, Any]) -> bool:
        """Validate all node IDs referenced in links exist."""
        valid = True
        valid_node_ids = {n.get("id") for n in agent.get("nodes", []) if n.get("id")}
        for link in agent.get("links", []):
            link_id = link.get("id", "Unknown")
            source_id = link.get("source_id")
            sink_id = link.get("sink_id")
            if not source_id:
                self.add_error(f"Link '{link_id}' is missing 'source_id'.")
                valid = False
            elif source_id not in valid_node_ids:
                self.add_error(
                    f"Link '{link_id}' references non-existent source_id '{source_id}'."
                )
                valid = False
            if not sink_id:
                self.add_error(f"Link '{link_id}' is missing 'sink_id'.")
                valid = False
            elif sink_id not in valid_node_ids:
                self.add_error(
                    f"Link '{link_id}' references non-existent sink_id '{sink_id}'."
                )
                valid = False
        return valid
    def validate_required_inputs(
        self, agent: dict[str, Any], blocks_info: list[dict[str, Any]]
    ) -> bool:
        """Validate required inputs are provided."""
        valid = True
        block_map = {b.get("id"): b for b in blocks_info}
        for node in agent.get("nodes", []):
            block_id = node.get("block_id")
            block = block_map.get(block_id)
            if not block:
                continue
            required_inputs = block.get("inputSchema", {}).get("required", [])
            input_defaults = node.get("input_default", {})
            node_id = node.get("id")
            # Get linked inputs
            linked_inputs = {
                link.get("sink_name")
                for link in agent.get("links", [])
                if link.get("sink_id") == node_id
            }
            for req_input in required_inputs:
                if (
                    req_input not in input_defaults
                    and req_input not in linked_inputs
                    and req_input != "credentials"
                ):
                    block_name = block.get("name", "Unknown Block")
                    self.add_error(
                        f"Node '{node_id}' ({block_name}) is missing required input '{req_input}'."
                    )
                    valid = False
        return valid
    def validate_data_type_compatibility(
        self, agent: dict[str, Any], blocks_info: list[dict[str, Any]]
    ) -> bool:
        """Validate linked data types are compatible."""
        valid = True
        block_map = {b.get("id"): b for b in blocks_info}
        node_lookup = {n.get("id"): n for n in agent.get("nodes", [])}
        def get_type(schema: dict, name: str) -> str | None:
            if "_#_" in name:
                parent, child = name.split("_#_", 1)
                parent_schema = schema.get(parent, {})
                if "properties" in parent_schema:
                    return parent_schema["properties"].get(child, {}).get("type")
                return None
            return schema.get(name, {}).get("type")
        def are_compatible(src: str, sink: str) -> bool:
            if {src, sink} <= {"integer", "number"}:
                return True
            return src == sink
        for link in agent.get("links", []):
            source_node = node_lookup.get(link.get("source_id"))
            sink_node = node_lookup.get(link.get("sink_id"))
            if not source_node or not sink_node:
                continue
            source_block = block_map.get(source_node.get("block_id"))
            sink_block = block_map.get(sink_node.get("block_id"))
            if not source_block or not sink_block:
                continue
            source_outputs = source_block.get("outputSchema", {}).get("properties", {})
            sink_inputs = sink_block.get("inputSchema", {}).get("properties", {})
            source_type = get_type(source_outputs, link.get("source_name", ""))
            sink_type = get_type(sink_inputs, link.get("sink_name", ""))
            if source_type and sink_type and not are_compatible(source_type, sink_type):
                self.add_error(
                    f"Type mismatch: {source_block.get('name')} output '{link['source_name']}' "
                    f"({source_type}) -> {sink_block.get('name')} input '{link['sink_name']}' ({sink_type})."
                )
                valid = False
        return valid
    def validate_nested_sink_links(
        self, agent: dict[str, Any], blocks_info: list[dict[str, Any]]
    ) -> bool:
        """Validate nested sink links (with _#_ notation)."""
        valid = True
        block_map = {b.get("id"): b for b in blocks_info}
        node_lookup = {n.get("id"): n for n in agent.get("nodes", [])}
        for link in agent.get("links", []):
            sink_name = link.get("sink_name", "")
            if "_#_" in sink_name:
                parent, child = sink_name.split("_#_", 1)
                sink_node = node_lookup.get(link.get("sink_id"))
                if not sink_node:
                    continue
                block = block_map.get(sink_node.get("block_id"))
                if not block:
                    continue
                input_props = block.get("inputSchema", {}).get("properties", {})
                parent_schema = input_props.get(parent)
                if not parent_schema:
                    self.add_error(
                        f"Invalid nested link '{sink_name}': parent '{parent}' not found."
                    )
                    valid = False
                    continue
                if not parent_schema.get("additionalProperties"):
                    if not (
                        isinstance(parent_schema, dict)
                        and "properties" in parent_schema
                        and child in parent_schema.get("properties", {})
                    ):
                        self.add_error(
                            f"Invalid nested link '{sink_name}': child '{child}' not found in '{parent}'."
                        )
                        valid = False
        return valid
    def validate_prompt_spaces(self, agent: dict[str, Any]) -> bool:
        """Validate prompts don't have spaces in template variables."""
        valid = True
        for node in agent.get("nodes", []):
            input_default = node.get("input_default", {})
            prompt = input_default.get("prompt", "")
            if not isinstance(prompt, str):
                continue
            # Find {{...}} with spaces
            matches = re.finditer(r"\{\{([^}]+)\}\}", prompt)
            for match in matches:
                content = match.group(1)
                if " " in content:
                    self.add_error(
                        f"Node '{node.get('id')}' has spaces in template variable: "
                        f"'{{{{{content}}}}}' should be '{{{{{content.replace(' ', '_')}}}}}'."
                    )
                    valid = False
        return valid
    def validate(
        self, agent: dict[str, Any], blocks_info: list[dict[str, Any]] | None = None
    ) -> tuple[bool, str | None]:
        """Run all validations.
        Returns:
            Tuple of (is_valid, error_message)
        """
        self.errors = []
        if blocks_info is None:
            blocks_info = get_blocks_info()
        checks = [
            self.validate_block_existence(agent, blocks_info),
            self.validate_link_node_references(agent),
            self.validate_required_inputs(agent, blocks_info),
            self.validate_data_type_compatibility(agent, blocks_info),
            self.validate_nested_sink_links(agent, blocks_info),
            self.validate_prompt_spaces(agent),
        ]
        all_passed = all(checks)
        if all_passed:
            logger.info("Agent validation successful")
            return True, None
        error_message = "Agent validation failed:\n"
        for i, error in enumerate(self.errors, 1):
            error_message += f"{i}. {error}\n"
        logger.warning(f"Agent validation failed with {len(self.errors)} errors")
        return False, error_message
 def validate_agent(
    agent: dict[str, Any], blocks_info: list[dict[str, Any]] | None = None
 ) -> tuple[bool, str | None]:
    """Convenience function to validate an agent.
    Returns:
        Tuple of (is_valid, error_message)
    """
    validator = AgentValidator()
    return validator.validate(agent, blocks_info)
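The type check in `validate_data_type_compatibility` can be tried in isolation. The sketch below lifts the two nested helpers to module level; the `NESTED_SEP` constant and the sample `sink_inputs` schema are hypothetical and exist only for illustration, and the lookup is condensed into one chained `.get` call rather than copied verbatim.

```python
from __future__ import annotations

from typing import Any

NESTED_SEP = "_#_"  # hypothetical constant; the validator uses the literal "_#_"


def get_type(schema: dict[str, Any], name: str) -> str | None:
    """Return the JSON-schema type of a field, resolving 'parent_#_child' names."""
    if NESTED_SEP in name:
        parent, child = name.split(NESTED_SEP, 1)
        parent_schema = schema.get(parent, {})
        # A parent without explicit "properties" cannot be resolved, so yield None.
        return parent_schema.get("properties", {}).get(child, {}).get("type")
    return schema.get(name, {}).get("type")


def are_compatible(src: str, sink: str) -> bool:
    """Treat integer and number as interchangeable; otherwise require equality."""
    return {src, sink} <= {"integer", "number"} or src == sink


# Made-up input schema for a sink block (assumption, not taken from a real block).
sink_inputs = {
    "count": {"type": "number"},
    "options": {"type": "object", "properties": {"limit": {"type": "integer"}}},
}

print(get_type(sink_inputs, "options_#_limit"))
print(are_compatible("integer", "number"))
print(are_compatible("string", "number"))
```

Because an unresolved type is `None`, the validator skips the comparison instead of reporting a mismatch, so links into free-form objects are not flagged by this check.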
--- a/autogpt_platform/backend/backend/api/features/chat/tools/agent_output.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/agent_output.py
@@ -0,0 +1,448 @@
 """Tool for retrieving agent execution outputs from user's library."""
 import logging
 import re
 from datetime import datetime, timedelta, timezone
 from typing import Any
 from langfuse import observe
 from pydantic import BaseModel, field_validator
 from backend.api.features.chat.model import ChatSession
 from backend.api.features.library import db as library_db
 from backend.api.features.library.model import LibraryAgent
 from backend.data import execution as execution_db
 from backend.data.execution import ExecutionStatus, GraphExecution, GraphExecutionMeta
 from .base import BaseTool
 from .models import (
    AgentOutputResponse,
    ErrorResponse,
    ExecutionOutputInfo,
    NoResultsResponse,
    ToolResponseBase,
 )
 from .utils import fetch_graph_from_store_slug
 logger = logging.getLogger(__name__)
 class AgentOutputInput(BaseModel):
    """Input parameters for the agent_output tool."""
    agent_name: str = ""
    library_agent_id: str = ""
    store_slug: str = ""
    execution_id: str = ""
    run_time: str = "latest"
    @field_validator(
        "agent_name",
        "library_agent_id",
        "store_slug",
        "execution_id",
        "run_time",
        mode="before",
    )
    @classmethod
    def strip_strings(cls, v: Any) -> Any:
        """Strip whitespace from string fields."""
        return v.strip() if isinstance(v, str) else v
 def parse_time_expression(
    time_expr: str | None,
 ) -> tuple[datetime | None, datetime | None]:
    """
    Parse time expression into datetime range (start, end).
    Supports: "latest", "yesterday", "today", "last week", "last 7 days",
    "last month", "last 30 days", ISO date "YYYY-MM-DD", ISO datetime.
    """
    if not time_expr or time_expr.lower() == "latest":
        return None, None
    now = datetime.now(timezone.utc)
    today_start = now.replace(hour=0, minute=0, second=0, microsecond=0)
    expr = time_expr.lower().strip()
    # Relative time expressions lookup
    relative_times: dict[str, tuple[datetime, datetime]] = {
        "yesterday": (today_start - timedelta(days=1), today_start),
        "today": (today_start, now),
        "last week": (now - timedelta(days=7), now),
        "last 7 days": (now - timedelta(days=7), now),
        "last month": (now - timedelta(days=30), now),
        "last 30 days": (now - timedelta(days=30), now),
    }
    if expr in relative_times:
        return relative_times[expr]
    # Try ISO date format (YYYY-MM-DD)
    date_match = re.match(r"^(\d{4})-(\d{2})-(\d{2})$", expr)
    if date_match:
        try:
            year, month, day = map(int, date_match.groups())
            start = datetime(year, month, day, 0, 0, 0, tzinfo=timezone.utc)
            return start, start + timedelta(days=1)
        except ValueError:
            # Invalid date components (e.g., month=13, day=32)
            pass
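    # Try ISO datetime. `expr` was lowercased above, so restore the uppercase
    # "T"/"Z" designators before mapping a trailing "Z" to an explicit UTC offset.
    try:
        parsed = datetime.fromisoformat(expr.upper().replace("Z", "+00:00"))
        if parsed.tzinfo is None:
            parsed = parsed.replace(tzinfo=timezone.utc)
        # A single instant is widened to a window of one hour on either side.
        return parsed - timedelta(hours=1), parsed + timedelta(hours=1)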
    except ValueError:
        return None, None
 class AgentOutputTool(BaseTool):
    """Tool for retrieving execution outputs from user's library agents."""
    @property
    def name(self) -> str:
        return "view_agent_output"
    @property
    def description(self) -> str:
        return """Retrieve execution outputs from agents in the user's library.
        Identify the agent using one of:
        - agent_name: Fuzzy search in user's library
        - library_agent_id: Exact library agent ID
        - store_slug: Marketplace format 'username/agent-name'
        Select which run to retrieve using:
        - execution_id: Specific execution ID
        - run_time: 'latest' (default), 'yesterday', 'last week', or ISO date 'YYYY-MM-DD'
        """
    @property
    def parameters(self) -> dict[str, Any]:
        return {
            "type": "object",
            "properties": {
                "agent_name": {
                    "type": "string",
                    "description": "Agent name to search for in user's library (fuzzy match)",
                },
                "library_agent_id": {
                    "type": "string",
                    "description": "Exact library agent ID",
                },
                "store_slug": {
                    "type": "string",
                    "description": "Marketplace identifier: 'username/agent-slug'",
                },
                "execution_id": {
                    "type": "string",
                    "description": "Specific execution ID to retrieve",
                },
                "run_time": {
                    "type": "string",
                    "description": (
                        "Time filter: 'latest', 'yesterday', 'last week', or 'YYYY-MM-DD'"
                    ),
                },
            },
            "required": [],
        }
    @property
    def requires_auth(self) -> bool:
        return True
    async def _resolve_agent(
        self,
        user_id: str,
        agent_name: str | None,
        library_agent_id: str | None,
        store_slug: str | None,
    ) -> tuple[LibraryAgent | None, str | None]:
        """
        Resolve agent from provided identifiers.
        Returns (library_agent, error_message).
        """
        # Priority 1: Exact library agent ID
        if library_agent_id:
            try:
                agent = await library_db.get_library_agent(library_agent_id, user_id)
                return agent, None
            except Exception as e:
                logger.warning(f"Failed to get library agent by ID: {e}")
                return None, f"Library agent '{library_agent_id}' not found"
        # Priority 2: Store slug (username/agent-name)
        if store_slug and "/" in store_slug:
            username, agent_slug = store_slug.split("/", 1)
            graph, _ = await fetch_graph_from_store_slug(username, agent_slug)
            if not graph:
                return None, f"Agent '{store_slug}' not found in marketplace"
            # Find in user's library by graph_id
            agent = await library_db.get_library_agent_by_graph_id(user_id, graph.id)
            if not agent:
                return (
                    None,
                    f"Agent '{store_slug}' is not in your library. "
                    "Add it first to see outputs.",
                )
            return agent, None
        # Priority 3: Fuzzy name search in library
        if agent_name:
            try:
                response = await library_db.list_library_agents(
                    user_id=user_id,
                    search_term=agent_name,
                    page_size=5,
                )
                if not response.agents:
                    return (
                        None,
                        f"No agents matching '{agent_name}' found in your library",
                    )
                # Return best match (first result from search)
                return response.agents[0], None
            except Exception as e:
                logger.error(f"Error searching library agents: {e}")
                return None, f"Error searching for agent: {e}"
        return (
            None,
            "Please specify an agent name, library_agent_id, or store_slug",
        )
    async def _get_execution(
        self,
        user_id: str,
        graph_id: str,
        execution_id: str | None,
        time_start: datetime | None,
        time_end: datetime | None,
    ) -> tuple[GraphExecution | None, list[GraphExecutionMeta], str | None]:
        """
        Fetch execution(s) based on filters.
        Returns (single_execution, available_executions_meta, error_message).
        """
        # If specific execution_id provided, fetch it directly
        if execution_id:
            execution = await execution_db.get_graph_execution(
                user_id=user_id,
                execution_id=execution_id,
                include_node_executions=False,
            )
            if not execution:
                return None, [], f"Execution '{execution_id}' not found"
            return execution, [], None
        # Get completed executions with time filters
        executions = await execution_db.get_graph_executions(
            graph_id=graph_id,
            user_id=user_id,
            statuses=[ExecutionStatus.COMPLETED],
            created_time_gte=time_start,
            created_time_lte=time_end,
            limit=10,
        )
        if not executions:
            return None, [], None  # No error, just no executions
        # Fetch full details for the latest match. The metadata list is returned
        # only when several executions matched, so the caller can list alternatives.
        full_execution = await execution_db.get_graph_execution(
            user_id=user_id,
            execution_id=executions[0].id,
            include_node_executions=False,
        )
        available = executions if len(executions) > 1 else []
        return full_execution, available, None
    def _build_response(
        self,
        agent: LibraryAgent,
        execution: GraphExecution | None,
        available_executions: list[GraphExecutionMeta],
        session_id: str | None,
    ) -> AgentOutputResponse:
        """Build the response based on execution data."""
        library_agent_link = f"/library/agents/{agent.id}"
        if not execution:
            return AgentOutputResponse(
                message=f"No completed executions found for agent '{agent.name}'",
                session_id=session_id,
                agent_name=agent.name,
                agent_id=agent.graph_id,
                library_agent_id=agent.id,
                library_agent_link=library_agent_link,
                total_executions=0,
            )
        execution_info = ExecutionOutputInfo(
            execution_id=execution.id,
            status=execution.status.value,
            started_at=execution.started_at,
            ended_at=execution.ended_at,
            outputs=dict(execution.outputs),
            inputs_summary=execution.inputs if execution.inputs else None,
        )
        available_list = None
        if len(available_executions) > 1:
            available_list = [
                {
                    "id": e.id,
                    "status": e.status.value,
                    "started_at": e.started_at.isoformat() if e.started_at else None,
                }
                for e in available_executions[:5]
            ]
        message = f"Found execution outputs for agent '{agent.name}'"
        if len(available_executions) > 1:
            message += (
                f". Showing latest of {len(available_executions)} matching executions."
            )
        return AgentOutputResponse(
            message=message,
            session_id=session_id,
            agent_name=agent.name,
            agent_id=agent.graph_id,
            library_agent_id=agent.id,
            library_agent_link=library_agent_link,
            execution=execution_info,
            available_executions=available_list,
            total_executions=len(available_executions) if available_executions else 1,
        )
    @observe(as_type="tool", name="view_agent_output")
    async def _execute(
        self,
        user_id: str | None,
        session: ChatSession,
        **kwargs,
    ) -> ToolResponseBase:
        """Execute the agent_output tool."""
        session_id = session.session_id
        # Parse and validate input
        try:
            input_data = AgentOutputInput(**kwargs)
        except Exception as e:
            logger.error(f"Invalid input: {e}")
            return ErrorResponse(
                message="Invalid input parameters",
                error=str(e),
                session_id=session_id,
            )
        # Ensure user_id is present (should be guaranteed by requires_auth)
        if not user_id:
            return ErrorResponse(
                message="User authentication required",
                session_id=session_id,
            )
        # Check if at least one identifier is provided
        if not any(
            [
                input_data.agent_name,
                input_data.library_agent_id,
                input_data.store_slug,
                input_data.execution_id,
            ]
        ):
            return ErrorResponse(
                message=(
                    "Please specify at least one of: agent_name, "
                    "library_agent_id, store_slug, or execution_id"
                ),
                session_id=session_id,
            )
        # If only execution_id provided, we need to find the agent differently
        if (
            input_data.execution_id
            and not input_data.agent_name
            and not input_data.library_agent_id
            and not input_data.store_slug
        ):
            # Fetch execution directly to get graph_id
            execution = await execution_db.get_graph_execution(
                user_id=user_id,
                execution_id=input_data.execution_id,
                include_node_executions=False,
            )
            if not execution:
                return ErrorResponse(
                    message=f"Execution '{input_data.execution_id}' not found",
                    session_id=session_id,
                )
            # Find library agent by graph_id
            agent = await library_db.get_library_agent_by_graph_id(
                user_id, execution.graph_id
            )
            if not agent:
                return NoResultsResponse(
                    message=(
                        "Execution found but agent not in your library. "
                        f"Graph ID: {execution.graph_id}"
                    ),
                    session_id=session_id,
                    suggestions=["Add the agent to your library to see more details"],
                )
            return self._build_response(agent, execution, [], session_id)
        # Resolve agent from identifiers
        agent, error = await self._resolve_agent(
            user_id=user_id,
            agent_name=input_data.agent_name or None,
            library_agent_id=input_data.library_agent_id or None,
            store_slug=input_data.store_slug or None,
        )
        if error or not agent:
            return NoResultsResponse(
                message=error or "Agent not found",
                session_id=session_id,
                suggestions=[
                    "Check the agent name or ID",
                    "Make sure the agent is in your library",
                ],
            )
        # Parse time expression
        time_start, time_end = parse_time_expression(input_data.run_time)
        # Fetch execution(s)
        execution, available_executions, exec_error = await self._get_execution(
            user_id=user_id,
            graph_id=agent.graph_id,
            execution_id=input_data.execution_id or None,
            time_start=time_start,
            time_end=time_end,
        )
        if exec_error:
            return ErrorResponse(
                message=exec_error,
                session_id=session_id,
            )
        return self._build_response(agent, execution, available_executions, session_id)
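The ISO branches of `parse_time_expression` map a date or a datetime to a search window. The following is a minimal standalone sketch of that mapping; the function name `iso_window` is hypothetical, and it returns `None` for unparseable input where the tool returns `(None, None)`. It uppercases the input before handling a trailing `Z`, which is an assumption about the intended behavior rather than a copy of the code above.

```python
from __future__ import annotations

import re
from datetime import datetime, timedelta, timezone


def iso_window(expr: str) -> tuple[datetime, datetime] | None:
    """Map an ISO date or datetime string to a (start, end) window in UTC."""
    text = expr.strip()

    # A bare date covers the whole calendar day: [midnight, next midnight].
    date_match = re.fullmatch(r"(\d{4})-(\d{2})-(\d{2})", text)
    if date_match:
        try:
            year, month, day = map(int, date_match.groups())
            start = datetime(year, month, day, tzinfo=timezone.utc)
            return start, start + timedelta(days=1)
        except ValueError:
            pass  # e.g. month 13; fall through to the datetime attempt

    # A full datetime is widened to one hour on either side of the instant.
    try:
        parsed = datetime.fromisoformat(text.upper().replace("Z", "+00:00"))
    except ValueError:
        return None
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed - timedelta(hours=1), parsed + timedelta(hours=1)


print(iso_window("2026-01-20"))
print(iso_window("2026-01-20T12:00:00Z"))
print(iso_window("not a date"))
```

Both bounds are passed to inclusive `created_time_gte` / `created_time_lte` filters in `_get_execution`, so an execution created exactly at midnight could match two adjacent day windows.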
--- a/autogpt_platform/backend/backend/api/features/chat/tools/agent_search.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/agent_search.py
@@ -0,0 +1,151 @@
 """Shared agent search functionality for find_agent and find_library_agent tools."""
 import logging
 from typing import Literal
 from backend.api.features.library import db as library_db
 from backend.api.features.store import db as store_db
 from backend.util.exceptions import DatabaseError, NotFoundError
 from .models import (
    AgentInfo,
    AgentsFoundResponse,
    ErrorResponse,
    NoResultsResponse,
    ToolResponseBase,
 )
 logger = logging.getLogger(__name__)
 SearchSource = Literal["marketplace", "library"]
 async def search_agents(
    query: str,
    source: SearchSource,
    session_id: str | None,
    user_id: str | None = None,
 ) -> ToolResponseBase:
    """
    Search for agents in marketplace or user library.
    Args:
        query: Search query string
        source: "marketplace" or "library"
        session_id: Chat session ID
        user_id: User ID (required for library search)
    Returns:
        AgentsFoundResponse, NoResultsResponse, or ErrorResponse
    """
    if not query:
        return ErrorResponse(
            message="Please provide a search query", session_id=session_id
        )
    if source == "library" and not user_id:
        return ErrorResponse(
            message="User authentication required to search library",
            session_id=session_id,
        )
    agents: list[AgentInfo] = []
    try:
        if source == "marketplace":
            logger.info(f"Searching marketplace for: {query}")
            results = await store_db.get_store_agents(search_query=query, page_size=5)
            for agent in results.agents:
                agents.append(
                    AgentInfo(
                        id=f"{agent.creator}/{agent.slug}",
                        name=agent.agent_name,
                        description=agent.description or "",
                        source="marketplace",
                        in_library=False,
                        creator=agent.creator,
                        category="general",
                        rating=agent.rating,
                        runs=agent.runs,
                        is_featured=False,
                    )
                )
        else:  # library
            logger.info(f"Searching user library for: {query}")
            results = await library_db.list_library_agents(
                user_id=user_id,  # type: ignore[arg-type]
                search_term=query,
                page_size=10,
            )
            for agent in results.agents:
                agents.append(
                    AgentInfo(
                        id=agent.id,
                        name=agent.name,
                        description=agent.description or "",
                        source="library",
                        in_library=True,
                        creator=agent.creator_name,
                        status=agent.status.value,
                        can_access_graph=agent.can_access_graph,
                        has_external_trigger=agent.has_external_trigger,
                        new_output=agent.new_output,
                        graph_id=agent.graph_id,
                    )
                )
        logger.info(f"Found {len(agents)} agents in {source}")
    except NotFoundError:
        # Nothing matched: fall through to the empty-results response below.
        pass
    except DatabaseError as e:
        logger.error(f"Error searching {source}: {e}", exc_info=True)
        return ErrorResponse(
            message=f"Failed to search {source}. Please try again.",
            error=str(e),
            session_id=session_id,
        )
    if not agents:
        suggestions = (
            [
                "Try more general terms",
                "Browse categories in the marketplace",
                "Check spelling",
            ]
            if source == "marketplace"
            else [
                "Try different keywords",
                "Use find_agent to search the marketplace",
                "Check your library at /library",
            ]
        )
        no_results_msg = (
            f"No agents found matching '{query}'. Try different keywords or browse the marketplace."
            if source == "marketplace"
            else f"No agents matching '{query}' found in your library."
        )
        return NoResultsResponse(
            message=no_results_msg, session_id=session_id, suggestions=suggestions
        )
    title = f"Found {len(agents)} agent{'s' if len(agents) != 1 else ''} "
    title += (
        f"for '{query}'"
        if source == "marketplace"
        else f"in your library for '{query}'"
    )
    message = (
        "You have found some options for the user to choose from. "
        "You can link to a recommended agent at: /marketplace/agent/{agent_id}. "
        "Please ask the user if they would like to use any of these agents."
        if source == "marketplace"
        else "Found agents in the user's library. You can provide a link to view an agent at: "
        "/library/agents/{agent_id}. Use agent_output to get execution results, or run_agent to execute."
    )
    return AgentsFoundResponse(
        message=message,
        title=title,
        agents=agents,
        count=len(agents),
        session_id=session_id,
    )
--- a/autogpt_platform/backend/backend/api/features/chat/tools/base.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/base.py
@@ -6,7 +6,7 @@ from typing import Any
 from openai.types.chat import ChatCompletionToolParam
 from backend.api.features.chat.model import ChatSession
-from backend.api.features.chat.response_model import StreamToolExecutionResult
+from backend.api.features.chat.response_model import StreamToolOutputAvailable
 from .models import ErrorResponse, NeedLoginResponse, ToolResponseBase
@@ -53,7 +53,7 @@ class BaseTool:
        session: ChatSession,
        tool_call_id: str,
        **kwargs,
-    ) -> StreamToolExecutionResult:
+    ) -> StreamToolOutputAvailable:
        """Execute the tool with authentication check.
        Args:
@@ -69,10 +69,10 @@ class BaseTool:
            logger.error(
                f"Attempted tool call for {self.name} but user not authenticated"
            )
-            return StreamToolExecutionResult(
+            return StreamToolOutputAvailable(
-                tool_id=tool_call_id,
+                toolCallId=tool_call_id,
-                tool_name=self.name,
+                toolName=self.name,
-                result=NeedLoginResponse(
+                output=NeedLoginResponse(
                    message=f"Please sign in to use {self.name}",
                    session_id=session.session_id,
                ).model_dump_json(),
@@ -81,17 +81,17 @@ class BaseTool:
        try:
            result = await self._execute(user_id, session, **kwargs)
-            return StreamToolExecutionResult(
+            return StreamToolOutputAvailable(
-                tool_id=tool_call_id,
+                toolCallId=tool_call_id,
-                tool_name=self.name,
+                toolName=self.name,
-                result=result.model_dump_json(),
+                output=result.model_dump_json(),
            )
        except Exception as e:
            logger.error(f"Error in {self.name}: {e}", exc_info=True)
-            return StreamToolExecutionResult(
+            return StreamToolOutputAvailable(
-                tool_id=tool_call_id,
+                toolCallId=tool_call_id,
-                tool_name=self.name,
+                toolName=self.name,
-                result=ErrorResponse(
+                output=ErrorResponse(
                    message=f"An error occurred while executing {self.name}",
                    error=str(e),
                    session_id=session.session_id,
--- a/autogpt_platform/backend/backend/api/features/chat/tools/create_agent.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/create_agent.py
@@ -0,0 +1,282 @@
 """CreateAgentTool - Creates agents from natural language descriptions."""
 import logging
 from typing import Any
 from langfuse import observe
 from backend.api.features.chat.model import ChatSession
 from .agent_generator import (
    apply_all_fixes,
    decompose_goal,
    generate_agent,
    get_blocks_info,
    save_agent_to_library,
    validate_agent,
 )
 from .base import BaseTool
 from .models import (
    AgentPreviewResponse,
    AgentSavedResponse,
    ClarificationNeededResponse,
    ClarifyingQuestion,
    ErrorResponse,
    ToolResponseBase,
 )
 logger = logging.getLogger(__name__)
 # Maximum retries for agent generation with validation feedback
 MAX_GENERATION_RETRIES = 2
 class CreateAgentTool(BaseTool):
    """Tool for creating agents from natural language descriptions."""
    @property
    def name(self) -> str:
        return "create_agent"
    @property
    def description(self) -> str:
        return (
            "Create a new agent workflow from a natural language description. "
            "First generates a preview, then saves to library if save=true."
        )
    @property
    def requires_auth(self) -> bool:
        return True
    @property
    def parameters(self) -> dict[str, Any]:
        return {
            "type": "object",
            "properties": {
                "description": {
                    "type": "string",
                    "description": (
                        "Natural language description of what the agent should do. "
                        "Be specific about inputs, outputs, and the workflow steps."
                    ),
                },
                "context": {
                    "type": "string",
                    "description": (
                        "Additional context or answers to previous clarifying questions. "
                        "Include any preferences or constraints mentioned by the user."
                    ),
                },
                "save": {
                    "type": "boolean",
                    "description": (
                        "Whether to save the agent to the user's library. "
                        "Default is true. Set to false for preview only."
                    ),
                    "default": True,
                },
            },
            "required": ["description"],
        }
    @observe(as_type="tool", name="create_agent")
    async def _execute(
        self,
        user_id: str | None,
        session: ChatSession,
        **kwargs,
    ) -> ToolResponseBase:
        """Execute the create_agent tool.
         Flow:
         1. Decompose the description into steps (may return clarifying questions)
         2. Generate agent JSON from the steps
         3. Apply fixes to correct common LLM errors
         4. Validate the agent, retrying generation with the errors as feedback
         5. Preview or save based on the save parameter
        """
        description = kwargs.get("description", "").strip()
        context = kwargs.get("context", "")
        save = kwargs.get("save", True)
        session_id = session.session_id if session else None
        if not description:
            return ErrorResponse(
                message="Please provide a description of what the agent should do.",
                error="Missing description parameter",
                session_id=session_id,
            )
        # Step 1: Decompose goal into steps
        try:
            decomposition_result = await decompose_goal(description, context)
        except ValueError as e:
            # Handle missing API key or configuration errors
            return ErrorResponse(
                message=f"Agent generation is not configured: {str(e)}",
                error="configuration_error",
                session_id=session_id,
            )
        if decomposition_result is None:
            return ErrorResponse(
                message="Failed to analyze the goal. Please try rephrasing.",
                error="Decomposition failed",
                session_id=session_id,
            )
        # Check if LLM returned clarifying questions
        if decomposition_result.get("type") == "clarifying_questions":
            questions = decomposition_result.get("questions", [])
            return ClarificationNeededResponse(
                message=(
                    "I need some more information to create this agent. "
                    "Please answer the following questions:"
                ),
                questions=[
                    ClarifyingQuestion(
                        question=q.get("question", ""),
                        keyword=q.get("keyword", ""),
                        example=q.get("example"),
                    )
                    for q in questions
                ],
                session_id=session_id,
            )
        # Check for unachievable/vague goals
        if decomposition_result.get("type") == "unachievable_goal":
            suggested = decomposition_result.get("suggested_goal", "")
            reason = decomposition_result.get("reason", "")
            return ErrorResponse(
                message=(
                    f"This goal cannot be accomplished with the available blocks. "
                    f"{reason} "
                    f"Suggestion: {suggested}"
                ),
                error="unachievable_goal",
                details={"suggested_goal": suggested, "reason": reason},
                session_id=session_id,
            )
        if decomposition_result.get("type") == "vague_goal":
            suggested = decomposition_result.get("suggested_goal", "")
            return ErrorResponse(
                message=(
                    f"The goal is too vague to create a specific workflow. "
                    f"Suggestion: {suggested}"
                ),
                error="vague_goal",
                details={"suggested_goal": suggested},
                session_id=session_id,
            )
        # Step 2: Generate agent JSON with retry on validation failure
        blocks_info = get_blocks_info()
        agent_json = None
        validation_errors = None
        for attempt in range(MAX_GENERATION_RETRIES + 1):
            # Generate agent (include validation errors from previous attempt)
            if attempt == 0:
                agent_json = await generate_agent(decomposition_result)
            else:
                # Retry with validation error feedback
                logger.info(
                    f"Retry {attempt}/{MAX_GENERATION_RETRIES} with validation feedback"
                )
                retry_instructions = {
                    **decomposition_result,
                    "previous_errors": validation_errors,
                    "retry_instructions": (
                        "The previous generation had validation errors. "
                        "Please fix these issues in the new generation:\n"
                        f"{validation_errors}"
                    ),
                }
                agent_json = await generate_agent(retry_instructions)
            if agent_json is None:
                if attempt == MAX_GENERATION_RETRIES:
                    return ErrorResponse(
                        message="Failed to generate the agent. Please try again.",
                        error="Generation failed",
                        session_id=session_id,
                    )
                continue
            # Step 3: Apply fixes to correct common errors
            agent_json = apply_all_fixes(agent_json, blocks_info)
            # Step 4: Validate the agent
            is_valid, validation_errors = validate_agent(agent_json, blocks_info)
            if is_valid:
                logger.info(f"Agent generated successfully on attempt {attempt + 1}")
                break
            logger.warning(
                f"Validation failed on attempt {attempt + 1}: {validation_errors}"
            )
            if attempt == MAX_GENERATION_RETRIES:
                # Return error with validation details
                return ErrorResponse(
                    message=(
                         f"Generated agent has validation errors after "
                         f"{MAX_GENERATION_RETRIES + 1} attempts. "
                         "Please try rephrasing your request or simplifying the workflow."
                    ),
                    error="validation_failed",
                    details={"validation_errors": validation_errors},
                    session_id=session_id,
                )
        agent_name = agent_json.get("name", "Generated Agent")
        agent_description = agent_json.get("description", "")
        node_count = len(agent_json.get("nodes", []))
        link_count = len(agent_json.get("links", []))
         # Step 5: Preview or save
        if not save:
            return AgentPreviewResponse(
                message=(
                    f"I've generated an agent called '{agent_name}' with {node_count} blocks. "
                    f"Review it and call create_agent with save=true to save it to your library."
                ),
                agent_json=agent_json,
                agent_name=agent_name,
                description=agent_description,
                node_count=node_count,
                link_count=link_count,
                session_id=session_id,
            )
        # Save to library
        if not user_id:
            return ErrorResponse(
                message="You must be logged in to save agents.",
                error="auth_required",
                session_id=session_id,
            )
        try:
            created_graph, library_agent = await save_agent_to_library(
                agent_json, user_id
            )
            return AgentSavedResponse(
                message=f"Agent '{created_graph.name}' has been saved to your library!",
                agent_id=created_graph.id,
                agent_name=created_graph.name,
                library_agent_id=library_agent.id,
                library_agent_link=f"/library/{library_agent.id}",
                agent_page_link=f"/build?flowID={created_graph.id}",
                session_id=session_id,
            )
        except Exception as e:
            return ErrorResponse(
                message=f"Failed to save the agent: {str(e)}",
                error="save_failed",
                details={"exception": str(e)},
                session_id=session_id,
            )
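Review note: the generate, fix, validate, retry loop in `_execute` above can be read in isolation as the sketch below. It is a minimal illustration, not the code in this diff: `generate_with_feedback`, `fake_generate`, and `fake_validate` are hypothetical names, and the real `generate_agent` and `validate_agent` signatures are only assumed to resemble the callables used here.

```python
from __future__ import annotations

import asyncio
from typing import Any, Awaitable, Callable

MAX_GENERATION_RETRIES = 2  # same value as the constant in create_agent.py


async def generate_with_feedback(
    generate: Callable[[dict[str, Any]], Awaitable[dict[str, Any] | None]],
    validate: Callable[[dict[str, Any]], tuple[bool, str | None]],
    request: dict[str, Any],
) -> tuple[dict[str, Any] | None, str | None]:
    """Return (agent, None) on success, or (None, last_error) after all attempts."""
    errors: str | None = None
    for _attempt in range(MAX_GENERATION_RETRIES + 1):
        payload = dict(request)
        if errors is not None:
            # Feed the previous failure back into the next generation request.
            payload["previous_errors"] = errors
        candidate = await generate(payload)
        if candidate is None:
            # Keep an earlier validation message if there is one.
            errors = errors or "generation returned no result"
            continue
        ok, errors = validate(candidate)
        if ok:
            return candidate, None
    return None, errors


# Hypothetical stand-ins for the LLM call and the validator.
async def fake_generate(payload: dict[str, Any]) -> dict[str, Any] | None:
    if "previous_errors" in payload:
        return {"name": "Demo", "nodes": [{"id": "n1"}]}
    return {"name": "Demo", "nodes": []}


def fake_validate(agent: dict[str, Any]) -> tuple[bool, str | None]:
    if not agent["nodes"]:
        return False, "agent has no nodes"
    return True, None


agent, error = asyncio.run(
    generate_with_feedback(fake_generate, fake_validate, {"goal": "demo"})
)
```

One difference from the diff is worth noting: when generation returns nothing, the sketch records a placeholder error, so a later retry never embeds the literal `None` in its feedback text.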
--- a/autogpt_platform/backend/backend/api/features/chat/tools/edit_agent.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/edit_agent.py
@@ -0,0 +1,297 @@
 """EditAgentTool - Edits existing agents using natural language."""
 import logging
 from typing import Any
 from langfuse import observe
 from backend.api.features.chat.model import ChatSession
 from .agent_generator import (
    apply_agent_patch,
    apply_all_fixes,
    generate_agent_patch,
    get_agent_as_json,
    get_blocks_info,
    save_agent_to_library,
    validate_agent,
 )
 from .base import BaseTool
 from .models import (
    AgentPreviewResponse,
    AgentSavedResponse,
    ClarificationNeededResponse,
    ClarifyingQuestion,
    ErrorResponse,
    ToolResponseBase,
 )
 logger = logging.getLogger(__name__)
 # Maximum retries for patch generation with validation feedback
 MAX_GENERATION_RETRIES = 2
 class EditAgentTool(BaseTool):
    """Tool for editing existing agents using natural language."""
    @property
    def name(self) -> str:
        return "edit_agent"
    @property
    def description(self) -> str:
        return (
            "Edit an existing agent from the user's library using natural language. "
            "Generates a patch to update the agent while preserving unchanged parts."
        )
    @property
    def requires_auth(self) -> bool:
        return True
    @property
    def parameters(self) -> dict[str, Any]:
        return {
            "type": "object",
            "properties": {
                "agent_id": {
                    "type": "string",
                    "description": (
                        "The ID of the agent to edit. "
                        "Can be a graph ID or library agent ID."
                    ),
                },
                "changes": {
                    "type": "string",
                    "description": (
                        "Natural language description of what changes to make. "
                        "Be specific about what to add, remove, or modify."
                    ),
                },
                "context": {
                    "type": "string",
                    "description": (
                        "Additional context or answers to previous clarifying questions."
                    ),
                },
                "save": {
                    "type": "boolean",
                    "description": (
                        "Whether to save the changes. "
                        "Default is true. Set to false for preview only."
                    ),
                    "default": True,
                },
            },
            "required": ["agent_id", "changes"],
        }
    @observe(as_type="tool", name="edit_agent")
    async def _execute(
        self,
        user_id: str | None,
        session: ChatSession,
        **kwargs,
    ) -> ToolResponseBase:
        """Execute the edit_agent tool.
         Flow:
         1. Fetch the current agent
         2. Generate a patch based on the requested changes
         3. Apply the patch and fixes to create an updated agent
         4. Validate the updated agent, retrying with the errors as feedback
         5. Preview or save based on the save parameter
        """
        agent_id = kwargs.get("agent_id", "").strip()
        changes = kwargs.get("changes", "").strip()
        context = kwargs.get("context", "")
        save = kwargs.get("save", True)
        session_id = session.session_id if session else None
        if not agent_id:
            return ErrorResponse(
                message="Please provide the agent ID to edit.",
                error="Missing agent_id parameter",
                session_id=session_id,
            )
        if not changes:
            return ErrorResponse(
                message="Please describe what changes you want to make.",
                error="Missing changes parameter",
                session_id=session_id,
            )
        # Step 1: Fetch current agent
        current_agent = await get_agent_as_json(agent_id, user_id)
        if current_agent is None:
            return ErrorResponse(
                message=f"Could not find agent with ID '{agent_id}' in your library.",
                error="agent_not_found",
                session_id=session_id,
            )
        # Build the update request with context
        update_request = changes
        if context:
            update_request = f"{changes}\n\nAdditional context:\n{context}"
        # Step 2: Generate patch with retry on validation failure
        blocks_info = get_blocks_info()
        updated_agent = None
        validation_errors = None
        intent = "Applied requested changes"
        for attempt in range(MAX_GENERATION_RETRIES + 1):
            # Generate patch (include validation errors from previous attempt)
            try:
                if attempt == 0:
                    patch_result = await generate_agent_patch(
                        update_request, current_agent
                    )
                else:
                    # Retry with validation error feedback
                    logger.info(
                        f"Retry {attempt}/{MAX_GENERATION_RETRIES} with validation feedback"
                    )
                    retry_request = (
                        f"{update_request}\n\n"
                        f"IMPORTANT: The previous edit had validation errors. "
                        f"Please fix these issues:\n{validation_errors}"
                    )
                    patch_result = await generate_agent_patch(
                        retry_request, current_agent
                    )
            except ValueError as e:
                # Handle missing API key or configuration errors
                return ErrorResponse(
                    message=f"Agent generation is not configured: {str(e)}",
                    error="configuration_error",
                    session_id=session_id,
                )
            if patch_result is None:
                if attempt == MAX_GENERATION_RETRIES:
                    return ErrorResponse(
                        message="Failed to generate changes. Please try rephrasing.",
                        error="Patch generation failed",
                        session_id=session_id,
                    )
                continue
            # Check if LLM returned clarifying questions
            if patch_result.get("type") == "clarifying_questions":
                questions = patch_result.get("questions", [])
                return ClarificationNeededResponse(
                    message=(
                        "I need some more information about the changes. "
                        "Please answer the following questions:"
                    ),
                    questions=[
                        ClarifyingQuestion(
                            question=q.get("question", ""),
                            keyword=q.get("keyword", ""),
                            example=q.get("example"),
                        )
                        for q in questions
                    ],
                    session_id=session_id,
                )
            # Step 3: Apply patch and fixes
            try:
                updated_agent = apply_agent_patch(current_agent, patch_result)
                updated_agent = apply_all_fixes(updated_agent, blocks_info)
            except Exception as e:
                if attempt == MAX_GENERATION_RETRIES:
                    return ErrorResponse(
                        message=f"Failed to apply changes: {str(e)}",
                        error="patch_apply_failed",
                        details={"exception": str(e)},
                        session_id=session_id,
                    )
                validation_errors = str(e)
                continue
            # Step 4: Validate the updated agent
            is_valid, validation_errors = validate_agent(updated_agent, blocks_info)
            if is_valid:
                logger.info(f"Agent edited successfully on attempt {attempt + 1}")
                intent = patch_result.get("intent", "Applied requested changes")
                break
            logger.warning(
                f"Validation failed on attempt {attempt + 1}: {validation_errors}"
            )
            if attempt == MAX_GENERATION_RETRIES:
                # Return error with validation details
                return ErrorResponse(
                    message=(
                        f"Updated agent has validation errors after "
                        f"{MAX_GENERATION_RETRIES + 1} attempts. "
                         "Please try rephrasing your request or simplifying the changes."
                    ),
                    error="validation_failed",
                    details={"validation_errors": validation_errors},
                    session_id=session_id,
                )
        # At this point, updated_agent is guaranteed to be set (we return on all failure paths)
        assert updated_agent is not None
        agent_name = updated_agent.get("name", "Updated Agent")
        agent_description = updated_agent.get("description", "")
        node_count = len(updated_agent.get("nodes", []))
        link_count = len(updated_agent.get("links", []))
        # Step 5: Preview or save
        if not save:
            return AgentPreviewResponse(
                message=(
                    f"I've updated the agent. Changes: {intent}. "
                    f"The agent now has {node_count} blocks. "
                    f"Review it and call edit_agent with save=true to save the changes."
                ),
                agent_json=updated_agent,
                agent_name=agent_name,
                description=agent_description,
                node_count=node_count,
                link_count=link_count,
                session_id=session_id,
            )
        # Save to library (creates a new version)
        if not user_id:
            return ErrorResponse(
                message="You must be logged in to save agents.",
                error="auth_required",
                session_id=session_id,
            )
        try:
            created_graph, library_agent = await save_agent_to_library(
                updated_agent, user_id, is_update=True
            )
            return AgentSavedResponse(
                message=(
                    f"Updated agent '{created_graph.name}' has been saved to your library! "
                    f"Changes: {intent}"
                ),
                agent_id=created_graph.id,
                agent_name=created_graph.name,
                library_agent_id=library_agent.id,
                library_agent_link=f"/library/{library_agent.id}",
                agent_page_link=f"/build?flowID={created_graph.id}",
                session_id=session_id,
            )
        except Exception as e:
            return ErrorResponse(
                message=f"Failed to save the updated agent: {str(e)}",
                error="save_failed",
                details={"exception": str(e)},
                session_id=session_id,
            )
--- a/autogpt_platform/backend/backend/api/features/chat/tools/find_agent.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/find_agent.py
@@ -1,26 +1,18 @@
-"""Tool for discovering agents from marketplace and user library."""
+"""Tool for discovering agents from marketplace."""
 import logging
 from typing import Any
 from langfuse import observe
 from backend.api.features.chat.model import ChatSession
 from backend.api.features.store import db as store_db
 from backend.util.exceptions import DatabaseError, NotFoundError
 from .agent_search import search_agents
 from .base import BaseTool
-from .models import (
-    AgentCarouselResponse,
-    AgentInfo,
-    ErrorResponse,
-    NoResultsResponse,
-    ToolResponseBase,
-)
+from .models import ToolResponseBase
 logger = logging.getLogger(__name__)
 class FindAgentTool(BaseTool):
-    """Tool for discovering agents based on user needs."""
+    """Tool for discovering agents from the marketplace."""
    @property
    def name(self) -> str:
@@ -45,85 +37,13 @@ class FindAgentTool(BaseTool):
            "required": ["query"],
        }
    @observe(as_type="tool", name="find_agent")
    async def _execute(
-        self,
-        user_id: str | None,
-        session: ChatSession,
-        **kwargs,
+        self, user_id: str | None, session: ChatSession, **kwargs
     ) -> ToolResponseBase:
-        """Search for agents in the marketplace.
-
-        Args:
-            user_id: User ID (may be anonymous)
-            session_id: Chat session ID
-            query: Search query
-        Returns:
-            AgentCarouselResponse: List of agents found in the marketplace
-            NoResultsResponse: No agents found in the marketplace
-            ErrorResponse: Error message
-        """
-        query = kwargs.get("query", "").strip()
-        session_id = session.session_id
-        if not query:
-            return ErrorResponse(
-                message="Please provide a search query",
-                session_id=session_id,
-            )
-        agents = []
-        try:
-            logger.info(f"Searching marketplace for: {query}")
-            store_results = await store_db.get_store_agents(
-                search_query=query,
-                page_size=5,
-            )
-            logger.info(f"Find agents tool found {len(store_results.agents)} agents")
-            for agent in store_results.agents:
-                agent_id = f"{agent.creator}/{agent.slug}"
-                logger.info(f"Building agent ID = {agent_id}")
-                agents.append(
-                    AgentInfo(
-                        id=agent_id,
-                        name=agent.agent_name,
-                        description=agent.description or "",
-                        source="marketplace",
-                        in_library=False,
-                        creator=agent.creator,
-                        category="general",
-                        rating=agent.rating,
-                        runs=agent.runs,
-                        is_featured=False,
-                    ),
-                )
-        except NotFoundError:
-            pass
-        except DatabaseError as e:
-            logger.error(f"Error searching agents: {e}", exc_info=True)
-            return ErrorResponse(
-                message="Failed to search for agents. Please try again.",
-                error=str(e),
-                session_id=session_id,
-            )
-        if not agents:
-            return NoResultsResponse(
-                message=f"No agents found matching '{query}'. Try different keywords or browse the marketplace. If you have 3 consecutive find_agent tool calls results and found no agents. Please stop trying and ask the user if there is anything else you can help with.",
-                session_id=session_id,
-                suggestions=[
-                    "Try more general terms",
-                    "Browse categories in the marketplace",
-                    "Check spelling",
-                ],
-            )
-        # Return formatted carousel
-        title = (
-            f"Found {len(agents)} agent{'s' if len(agents) != 1 else ''} for '{query}'"
-        )
-        return AgentCarouselResponse(
-            message="Now you have found some options for the user to choose from. You can add a link to a recommended agent at: /marketplace/agent/agent_id Please ask the user if they would like to use any of these agents. If they do, please call the get_agent_details tool for this agent.",
-            title=title,
-            agents=agents,
-            count=len(agents),
-            session_id=session_id,
+        return await search_agents(
+            query=kwargs.get("query", "").strip(),
+            source="marketplace",
+            session_id=session.session_id,
+            user_id=user_id,
        )
--- a/autogpt_platform/backend/backend/api/features/chat/tools/find_block.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/find_block.py
@@ -0,0 +1,194 @@
 import logging
 from typing import Any
 from langfuse import observe
 from prisma.enums import ContentType
 from backend.api.features.chat.model import ChatSession
 from backend.api.features.chat.tools.base import BaseTool, ToolResponseBase
 from backend.api.features.chat.tools.models import (
    BlockInfoSummary,
    BlockInputFieldInfo,
    BlockListResponse,
    ErrorResponse,
    NoResultsResponse,
 )
 from backend.api.features.store.hybrid_search import unified_hybrid_search
 from backend.data.block import get_block
 logger = logging.getLogger(__name__)
 class FindBlockTool(BaseTool):
    """Tool for searching available blocks."""
    @property
    def name(self) -> str:
        return "find_block"
    @property
    def description(self) -> str:
        return (
            "Search for available blocks by name or description. "
            "Blocks are reusable components that perform specific tasks like "
            "sending emails, making API calls, processing text, etc. "
            "IMPORTANT: Use this tool FIRST to get the block's 'id' before calling run_block. "
            "The response includes each block's id, required_inputs, and input_schema."
        )
    @property
    def parameters(self) -> dict[str, Any]:
        return {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": (
                        "Search query to find blocks by name or description. "
                        "Use keywords like 'email', 'http', 'text', 'ai', etc."
                    ),
                },
            },
            "required": ["query"],
        }
    @property
    def requires_auth(self) -> bool:
        return True
    @observe(as_type="tool", name="find_block")
    async def _execute(
        self,
        user_id: str | None,
        session: ChatSession,
        **kwargs,
    ) -> ToolResponseBase:
        """Search for blocks matching the query.
        Args:
            user_id: User ID (required)
            session: Chat session
            query: Search query
        Returns:
            BlockListResponse: List of matching blocks
            NoResultsResponse: No blocks found
            ErrorResponse: Error message
        """
        query = kwargs.get("query", "").strip()
        session_id = session.session_id
        if not query:
            return ErrorResponse(
                message="Please provide a search query",
                session_id=session_id,
            )
        try:
            # Search for blocks using hybrid search
            results, total = await unified_hybrid_search(
                query=query,
                content_types=[ContentType.BLOCK],
                page=1,
                page_size=10,
            )
            if not results:
                return NoResultsResponse(
                    message=f"No blocks found for '{query}'",
                    suggestions=[
                        "Try broader keywords like 'email', 'http', 'text', 'ai'",
                        "Check spelling of technical terms",
                    ],
                    session_id=session_id,
                )
            # Enrich results with full block information
            blocks: list[BlockInfoSummary] = []
            for result in results:
                block_id = result["content_id"]
                block = get_block(block_id)
                if block:
                    # Get input/output schemas
                    input_schema = {}
                    output_schema = {}
                    try:
                        input_schema = block.input_schema.jsonschema()
                    except Exception:
                        pass
                    try:
                        output_schema = block.output_schema.jsonschema()
                    except Exception:
                        pass
                    # Get categories from block instance
                    categories = []
                    if hasattr(block, "categories") and block.categories:
                        categories = [cat.value for cat in block.categories]
                     # Summarize non-credential input fields, flagging which are required
                    required_inputs: list[BlockInputFieldInfo] = []
                    if input_schema:
                        properties = input_schema.get("properties", {})
                        required_fields = set(input_schema.get("required", []))
                        # Get credential field names to exclude from required inputs
                        credentials_fields = set(
                            block.input_schema.get_credentials_fields().keys()
                        )
                        for field_name, field_schema in properties.items():
                            # Skip credential fields - they're handled separately
                            if field_name in credentials_fields:
                                continue
                            required_inputs.append(
                                BlockInputFieldInfo(
                                    name=field_name,
                                    type=field_schema.get("type", "string"),
                                    description=field_schema.get("description", ""),
                                    required=field_name in required_fields,
                                    default=field_schema.get("default"),
                                )
                            )
                    blocks.append(
                        BlockInfoSummary(
                            id=block_id,
                            name=block.name,
                            description=block.description or "",
                            categories=categories,
                            input_schema=input_schema,
                            output_schema=output_schema,
                            required_inputs=required_inputs,
                        )
                    )
            if not blocks:
                return NoResultsResponse(
                    message=f"No blocks found for '{query}'",
                    suggestions=[
                        "Try broader keywords like 'email', 'http', 'text', 'ai'",
                    ],
                    session_id=session_id,
                )
            return BlockListResponse(
                message=(
                    f"Found {len(blocks)} block(s) matching '{query}'. "
                    "To execute a block, use run_block with the block's 'id' field "
                    "and provide 'input_data' matching the block's input_schema."
                ),
                blocks=blocks,
                count=len(blocks),
                query=query,
                session_id=session_id,
            )
        except Exception as e:
            logger.error(f"Error searching blocks: {e}", exc_info=True)
            return ErrorResponse(
                message="Failed to search blocks",
                error=str(e),
                session_id=session_id,
            )
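Review note: the input-field summary built inside `FindBlockTool._execute` boils down to the transformation sketched below. This is an illustration under stated assumptions: `summarize_inputs` is a hypothetical helper, plain dicts stand in for `BlockInputFieldInfo`, and the sample schema is invented for the example.

```python
from __future__ import annotations

from typing import Any


def summarize_inputs(
    input_schema: dict[str, Any], credential_fields: set[str]
) -> list[dict[str, Any]]:
    """List every non-credential property and flag the ones the schema requires."""
    properties = input_schema.get("properties", {})
    required = set(input_schema.get("required", []))
    return [
        {
            "name": name,
            "type": spec.get("type", "string"),
            "description": spec.get("description", ""),
            "required": name in required,
            "default": spec.get("default"),
        }
        for name, spec in properties.items()
        if name not in credential_fields  # credentials are handled separately
    ]


# Invented schema, shaped like a JSON Schema object definition.
sample_schema = {
    "properties": {
        "to": {"type": "string", "description": "Recipient address"},
        "retries": {"type": "integer", "default": 3},
        "credentials": {"type": "object"},
    },
    "required": ["to", "credentials"],
}

fields = summarize_inputs(sample_schema, {"credentials"})
```

As in the diff, optional fields are included too, so a caller that only wants mandatory inputs has to filter on the `required` flag.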
--- a/autogpt_platform/backend/backend/api/features/chat/tools/find_library_agent.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/find_library_agent.py
@@ -0,0 +1,55 @@
 """Tool for searching agents in the user's library."""
 from typing import Any
 from langfuse import observe
 from backend.api.features.chat.model import ChatSession
 from .agent_search import search_agents
 from .base import BaseTool
 from .models import ToolResponseBase
 class FindLibraryAgentTool(BaseTool):
    """Tool for searching agents in the user's library."""
    @property
    def name(self) -> str:
        return "find_library_agent"
    @property
    def description(self) -> str:
        return (
            "Search for agents in the user's library. Use this to find agents "
            "the user has already added to their library, including agents they "
            "created or added from the marketplace."
        )
    @property
    def parameters(self) -> dict[str, Any]:
        return {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "Search query to find agents by name or description.",
                },
            },
            "required": ["query"],
        }
    @property
    def requires_auth(self) -> bool:
        return True
    @observe(as_type="tool", name="find_library_agent")
    async def _execute(
        self, user_id: str | None, session: ChatSession, **kwargs
    ) -> ToolResponseBase:
        return await search_agents(
            # `or ""` also covers an explicit null/None passed by the caller.
            query=(kwargs.get("query") or "").strip(),
            source="library",
            session_id=session.session_id,
            user_id=user_id,
        )
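The `parameters` property above declares a JSON-Schema-style object with `query` as the only required key, while `_execute` reads it from `**kwargs`. As a rough sketch of how a caller might check arguments against that schema before dispatching, the snippet below uses a hypothetical helper, `missing_required`, which is not part of the diff. It treats absent, `None`, and blank-string values as missing, which is an assumption about the desired behavior rather than something the source states.

```python
from typing import Any

# Schema copied from FindLibraryAgentTool.parameters in the diff above.
PARAMETERS: dict[str, Any] = {
    "type": "object",
    "properties": {
        "query": {
            "type": "string",
            "description": "Search query to find agents by name or description.",
        },
    },
    "required": ["query"],
}


def missing_required(schema: dict[str, Any], arguments: dict[str, Any]) -> list[str]:
    """Hypothetical helper: list required keys that are absent, None, or blank."""
    missing = []
    for key in schema.get("required", []):
        value = arguments.get(key)
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(key)
    return missing
```

A caller could return an error response when the list is non-empty instead of forwarding an empty query to `search_agents`.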
--- a/autogpt_platform/backend/backend/api/features/chat/tools/get_doc_page.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/get_doc_page.py
@@ -0,0 +1,151 @@
 """GetDocPageTool - Fetch full content of a documentation page."""
 import logging
 from pathlib import Path
 from typing import Any
 from langfuse import observe
 from backend.api.features.chat.model import ChatSession
 from backend.api.features.chat.tools.base import BaseTool
 from backend.api.features.chat.tools.models import (
    DocPageResponse,
    ErrorResponse,
    ToolResponseBase,
 )
 logger = logging.getLogger(__name__)
 # Base URL for documentation (can be configured)
 DOCS_BASE_URL = "https://docs.agpt.co"
 class GetDocPageTool(BaseTool):
    """Tool for fetching full content of a documentation page."""
    @property
    def name(self) -> str:
        return "get_doc_page"
    @property
    def description(self) -> str:
        return (
            "Get the full content of a documentation page by its path. "
            "Use this after search_docs to read the complete content of a relevant page."
        )
    @property
    def parameters(self) -> dict[str, Any]:
        return {
            "type": "object",
            "properties": {
                "path": {
                    "type": "string",
                    "description": (
                        "The path to the documentation file, as returned by search_docs. "
                        "Example: 'platform/block-sdk-guide.md'"
                    ),
                },
            },
            "required": ["path"],
        }
    @property
    def requires_auth(self) -> bool:
        return False  # Documentation is public
    def _get_docs_root(self) -> Path:
        """Get the documentation root directory."""
        this_file = Path(__file__)
        # Eight levels up from .../chat/tools/get_doc_page.py is the repository root.
        project_root = this_file.parents[7]
        return project_root / "docs"
    def _extract_title(self, content: str, fallback: str) -> str:
        """Extract title from markdown content."""
        lines = content.split("\n")
        for line in lines:
            if line.startswith("# "):
                return line[2:].strip()
        return fallback
    def _make_doc_url(self, path: str) -> str:
        """Create a URL for a documentation page."""
        url_path = path.rsplit(".", 1)[0] if "." in path else path
        return f"{DOCS_BASE_URL}/{url_path}"
    @observe(as_type="tool", name="get_doc_page")
    async def _execute(
        self,
        user_id: str | None,
        session: ChatSession,
        **kwargs,
    ) -> ToolResponseBase:
        """Fetch full content of a documentation page.
        Args:
            user_id: User ID (not required for docs)
            session: Chat session
            path: Path to the documentation file
        Returns:
            DocPageResponse: Full document content
            ErrorResponse: Error message
        """
        # `or ""` also covers an explicit null/None passed by the caller.
        path = (kwargs.get("path") or "").strip()
        session_id = session.session_id if session else None
        if not path:
            return ErrorResponse(
                message="Please provide a documentation path.",
                error="Missing path parameter",
                session_id=session_id,
            )
        # Reject parent references and absolute paths to prevent directory traversal
        if ".." in path or path.startswith("/"):
            return ErrorResponse(
                message="Invalid documentation path.",
                error="invalid_path",
                session_id=session_id,
            )
        docs_root = self._get_docs_root()
        full_path = docs_root / path
        if not full_path.exists():
            return ErrorResponse(
                message=f"Documentation page not found: {path}",
                error="not_found",
                session_id=session_id,
            )
        # Ensure the path is within docs root
        try:
            full_path.resolve().relative_to(docs_root.resolve())
        except ValueError:
            return ErrorResponse(
                message="Invalid documentation path.",
                error="invalid_path",
                session_id=session_id,
            )
        try:
            content = full_path.read_text(encoding="utf-8")
            title = self._extract_title(content, path)
            return DocPageResponse(
                message=f"Retrieved documentation page: {title}",
                title=title,
                path=path,
                content=content,
                doc_url=self._make_doc_url(path),
                session_id=session_id,
            )
        except Exception as e:
            logger.error(
                f"Failed to read documentation page {path}: {e}", exc_info=True
            )
            return ErrorResponse(
                message=f"Failed to read documentation page: {e}",
                error="read_failed",
                session_id=session_id,
            )
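The path handling in `_execute` combines a cheap string check with a `resolve()` / `relative_to()` containment check. The following is a minimal standalone sketch of that pattern, not the tool's actual code: `resolve_doc_path`, `extract_title`, and `demo` are illustrative names, and unlike the diff it runs the containment check before touching the filesystem entry and requires a regular file.

```python
import tempfile
from pathlib import Path
from typing import Optional


def resolve_doc_path(docs_root: Path, relative_path: str) -> Optional[Path]:
    """Return the resolved file inside docs_root, or None if the path is rejected."""
    # Same cheap string checks as the tool: no parent references, no absolute paths.
    if not relative_path or ".." in relative_path or relative_path.startswith("/"):
        return None
    root = docs_root.resolve()
    candidate = (root / relative_path).resolve()
    # resolve() follows symlinks, so relative_to() raises for anything outside root.
    try:
        candidate.relative_to(root)
    except ValueError:
        return None
    return candidate if candidate.is_file() else None


def extract_title(content: str, fallback: str) -> str:
    """Return the text of the first '# ' heading, or the fallback."""
    for line in content.splitlines():
        if line.startswith("# "):
            return line[2:].strip()
    return fallback


def demo() -> tuple:
    """Build a throwaway docs tree and exercise the accepted and rejected cases."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "docs"
        (root / "platform").mkdir(parents=True)
        (root / "platform" / "guide.md").write_text("# Guide\nBody\n", encoding="utf-8")
        page = resolve_doc_path(root, "platform/guide.md")
        title = extract_title(page.read_text(encoding="utf-8"), "fallback") if page else "fallback"
        traversal_rejected = resolve_doc_path(root, "../secret.txt") is None
        missing_rejected = resolve_doc_path(root, "missing.md") is None
        return title, traversal_rejected, missing_rejected
```

Checking containment on the resolved path matters because a symlink inside the docs tree could otherwise point outside it even when the string contains no `..`.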
--- a/autogpt_platform/backend/backend/api/features/chat/tools/models.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/models.py
@@ -1,5 +1,6 @@
 """Pydantic models for tool responses."""
 from datetime import datetime
 from enum import Enum
 from typing import Any
@@ -11,14 +12,22 @@ from backend.data.model import CredentialsMetaInput
 class ResponseType(str, Enum):
    """Types of tool responses."""
-    AGENT_CAROUSEL = "agent_carousel"
+    AGENTS_FOUND = "agents_found"
    AGENT_DETAILS = "agent_details"
    SETUP_REQUIREMENTS = "setup_requirements"
    EXECUTION_STARTED = "execution_started"
    NEED_LOGIN = "need_login"
    ERROR = "error"
    NO_RESULTS = "no_results"
-    SUCCESS = "success"
+    AGENT_OUTPUT = "agent_output"
    UNDERSTANDING_UPDATED = "understanding_updated"
    AGENT_PREVIEW = "agent_preview"
    AGENT_SAVED = "agent_saved"
    CLARIFICATION_NEEDED = "clarification_needed"
    BLOCK_LIST = "block_list"
    BLOCK_OUTPUT = "block_output"
    DOC_SEARCH_RESULTS = "doc_search_results"
    DOC_PAGE = "doc_page"
 # Base response model
@@ -51,14 +60,14 @@ class AgentInfo(BaseModel):
    graph_id: str | None = None
-class AgentCarouselResponse(ToolResponseBase):
+class AgentsFoundResponse(ToolResponseBase):
    """Response for find_agent tool."""
-    type: ResponseType = ResponseType.AGENT_CAROUSEL
+    type: ResponseType = ResponseType.AGENTS_FOUND
    title: str = "Available Agents"
    agents: list[AgentInfo]
    count: int
-    name: str = "agent_carousel"
+    name: str = "agents_found"
 class NoResultsResponse(ToolResponseBase):
@@ -173,3 +182,155 @@ class ErrorResponse(ToolResponseBase):
    type: ResponseType = ResponseType.ERROR
    error: str | None = None
    details: dict[str, Any] | None = None
 # Agent output models
 class ExecutionOutputInfo(BaseModel):
    """Summary of a single execution's outputs."""
    execution_id: str
    status: str
    started_at: datetime | None = None
    ended_at: datetime | None = None
    outputs: dict[str, list[Any]]
    inputs_summary: dict[str, Any] | None = None
 class AgentOutputResponse(ToolResponseBase):
    """Response for agent_output tool."""
    type: ResponseType = ResponseType.AGENT_OUTPUT
    agent_name: str
    agent_id: str
    library_agent_id: str | None = None
    library_agent_link: str | None = None
    execution: ExecutionOutputInfo | None = None
    available_executions: list[dict[str, Any]] | None = None
    total_executions: int = 0
 # Business understanding models
 class UnderstandingUpdatedResponse(ToolResponseBase):
    """Response for add_understanding tool."""
    type: ResponseType = ResponseType.UNDERSTANDING_UPDATED
    updated_fields: list[str] = Field(default_factory=list)
    current_understanding: dict[str, Any] = Field(default_factory=dict)
 # Agent generation models
 class ClarifyingQuestion(BaseModel):
    """A question that needs user clarification."""
    question: str
    keyword: str
    example: str | None = None
 class AgentPreviewResponse(ToolResponseBase):
    """Response for previewing a generated agent before saving."""
    type: ResponseType = ResponseType.AGENT_PREVIEW
    agent_json: dict[str, Any]
    agent_name: str
    description: str
    node_count: int
    link_count: int = 0
 class AgentSavedResponse(ToolResponseBase):
    """Response when an agent is saved to the library."""
    type: ResponseType = ResponseType.AGENT_SAVED
    agent_id: str
    agent_name: str
    library_agent_id: str
    library_agent_link: str
    agent_page_link: str  # Link to the agent builder/editor page
 class ClarificationNeededResponse(ToolResponseBase):
    """Response when the LLM needs more information from the user."""
    type: ResponseType = ResponseType.CLARIFICATION_NEEDED
    questions: list[ClarifyingQuestion] = Field(default_factory=list)
 # Documentation search models
 class DocSearchResult(BaseModel):
    """A single documentation search result."""
    title: str
    path: str
    section: str
    snippet: str  # Short excerpt for UI display
    score: float
    doc_url: str | None = None
 class DocSearchResultsResponse(ToolResponseBase):
    """Response for search_docs tool."""
    type: ResponseType = ResponseType.DOC_SEARCH_RESULTS
    results: list[DocSearchResult]
    count: int
    query: str
 class DocPageResponse(ToolResponseBase):
    """Response for get_doc_page tool."""
    type: ResponseType = ResponseType.DOC_PAGE
    title: str
    path: str
    content: str  # Full document content
    doc_url: str | None = None
 # Block models
 class BlockInputFieldInfo(BaseModel):
    """Information about a block input field."""
    name: str
    type: str
    description: str = ""
    required: bool = False
    default: Any | None = None
 class BlockInfoSummary(BaseModel):
    """Summary of a block for search results."""
    id: str
    name: str
    description: str
    categories: list[str]
    input_schema: dict[str, Any]
    output_schema: dict[str, Any]
    required_inputs: list[BlockInputFieldInfo] = Field(
        default_factory=list,
        description="List of required input fields for this block",
    )
 class BlockListResponse(ToolResponseBase):
    """Response for find_block tool."""
    type: ResponseType = ResponseType.BLOCK_LIST
    blocks: list[BlockInfoSummary]
    count: int
    query: str
    usage_hint: str = Field(
        default="To execute a block, call run_block with block_id set to the block's "
        "'id' field and input_data containing the required fields from input_schema."
    )
 class BlockOutputResponse(ToolResponseBase):
    """Response for run_block tool."""
    type: ResponseType = ResponseType.BLOCK_OUTPUT
    block_id: str
    block_name: str
    outputs: dict[str, list[Any]]
    success: bool = True
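Because `ResponseType` mixes in `str`, its members serialize as plain strings, and any consumer that dispatches on the `type` field is affected by the renames in this diff (`agent_carousel` to `agents_found`, `success` to `agent_output`). The sketch below uses only the standard library and a trimmed copy of the enum; `is_known_type` and `describe` are hypothetical consumer-side helpers, not part of the codebase.

```python
import json
from enum import Enum


class ResponseType(str, Enum):
    """Trimmed copy of the enum from models.py, for illustration only."""

    AGENTS_FOUND = "agents_found"
    AGENT_OUTPUT = "agent_output"
    ERROR = "error"


def is_known_type(value: str) -> bool:
    """Hypothetical helper: True if value is a current ResponseType value."""
    try:
        ResponseType(value)
    except ValueError:
        return False
    return True


def describe(payload: dict) -> str:
    """Hypothetical consumer: dispatch on the 'type' field of a serialized response."""
    kind = ResponseType(payload["type"])
    if kind is ResponseType.AGENTS_FOUND:
        return f"{payload['count']} agent(s) found"
    if kind is ResponseType.ERROR:
        return f"error: {payload.get('error')}"
    return kind.value


# A str-based Enum member is encoded by json.dumps as its string value.
serialized = json.dumps({"type": ResponseType.AGENTS_FOUND, "count": 2})
```

Any stored payloads or frontend handlers still keyed on the old `agent_carousel` or `success` values would fail the `is_known_type` check and would need migrating alongside this change.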