refactor(frontend): streamline NodeDataViewer component and execution results handling

### Changes

- Removed unused `NodeExecutionResult` type and `executionResults` prop from `NodeDataViewerProps`.
- Simplified the logic for resolving execution results by directly using the `useNodeStore` hook.
- Updated the component to ensure consistent handling of data types and improved readability.

### Impact

- Enhances code clarity and maintainability by reducing unnecessary complexity in the component.
- Ensures that the latest execution results are effectively utilized in the data viewer.

### Testing

- Verified that the component functions correctly with the updated logic and maintains expected behavior.
Merge branch 'dev' into abhi/show-all-execution-node
2026-01-25 06:58:21 -05:00 · 2026-01-25 12:25:49 +05:30 · 2026-01-25 12:17:28 +05:30 · 2026-01-25 12:17:12 +05:30 · 2026-01-25 12:03:22 +05:30 · 2026-01-25 11:54:05 +05:30
582 changed files with 46520 additions and 6135 deletions
--- a/.claude/skills/vercel-react-best-practices/AGENTS.md
+++ b/.claude/skills/vercel-react-best-practices/AGENTS.md
--- a/.claude/skills/vercel-react-best-practices/SKILL.md
+++ b/.claude/skills/vercel-react-best-practices/SKILL.md
@@ -0,0 +1,125 @@
+---
+name: vercel-react-best-practices
+description: React and Next.js performance optimization guidelines from Vercel Engineering. This skill should be used when writing, reviewing, or refactoring React/Next.js code to ensure optimal performance patterns. Triggers on tasks involving React components, Next.js pages, data fetching, bundle optimization, or performance improvements.
+license: MIT
+metadata:
+  author: vercel
+  version: "1.0.0"
+---
+
+# Vercel React Best Practices
+
+Comprehensive performance optimization guide for React and Next.js applications, maintained by Vercel. Contains 45 rules across 8 categories, prioritized by impact to guide automated refactoring and code generation.
+
+## When to Apply
+
+Reference these guidelines when:
+- Writing new React components or Next.js pages
+- Implementing data fetching (client or server-side)
+- Reviewing code for performance issues
+- Refactoring existing React/Next.js code
+- Optimizing bundle size or load times
+
+## Rule Categories by Priority
+
+| Priority | Category | Impact | Prefix |
+|----------|----------|--------|--------|
+| 1 | Eliminating Waterfalls | CRITICAL | `async-` |
+| 2 | Bundle Size Optimization | CRITICAL | `bundle-` |
+| 3 | Server-Side Performance | HIGH | `server-` |
+| 4 | Client-Side Data Fetching | MEDIUM-HIGH | `client-` |
+| 5 | Re-render Optimization | MEDIUM | `rerender-` |
+| 6 | Rendering Performance | MEDIUM | `rendering-` |
+| 7 | JavaScript Performance | LOW-MEDIUM | `js-` |
+| 8 | Advanced Patterns | LOW | `advanced-` |
+
+## Quick Reference
+
+### 1. Eliminating Waterfalls (CRITICAL)
+
+- `async-defer-await` - Move await into branches where actually used
+- `async-parallel` - Use Promise.all() for independent operations
+- `async-dependencies` - Use better-all for partial dependencies
+- `async-api-routes` - Start promises early, await late in API routes
+- `async-suspense-boundaries` - Use Suspense to stream content
+
+### 2. Bundle Size Optimization (CRITICAL)
+
+- `bundle-barrel-imports` - Import directly, avoid barrel files
+- `bundle-dynamic-imports` - Use next/dynamic for heavy components
+- `bundle-defer-third-party` - Load analytics/logging after hydration
+- `bundle-conditional` - Load modules only when feature is activated
+- `bundle-preload` - Preload on hover/focus for perceived speed
+
+### 3. Server-Side Performance (HIGH)
+
+- `server-cache-react` - Use React.cache() for per-request deduplication
+- `server-cache-lru` - Use LRU cache for cross-request caching
+- `server-serialization` - Minimize data passed to client components
+- `server-parallel-fetching` - Restructure components to parallelize fetches
+- `server-after-nonblocking` - Use after() for non-blocking operations
+
+### 4. Client-Side Data Fetching (MEDIUM-HIGH)
+
+- `client-swr-dedup` - Use SWR for automatic request deduplication
+- `client-event-listeners` - Deduplicate global event listeners
+
+### 5. Re-render Optimization (MEDIUM)
+
+- `rerender-defer-reads` - Don't subscribe to state only used in callbacks
+- `rerender-memo` - Extract expensive work into memoized components
+- `rerender-dependencies` - Use primitive dependencies in effects
+- `rerender-derived-state` - Subscribe to derived booleans, not raw values
+- `rerender-functional-setstate` - Use functional setState for stable callbacks
+- `rerender-lazy-state-init` - Pass function to useState for expensive values
+- `rerender-transitions` - Use startTransition for non-urgent updates
+
+### 6. Rendering Performance (MEDIUM)
+
+- `rendering-animate-svg-wrapper` - Animate div wrapper, not SVG element
+- `rendering-content-visibility` - Use content-visibility for long lists
+- `rendering-hoist-jsx` - Extract static JSX outside components
+- `rendering-svg-precision` - Reduce SVG coordinate precision
+- `rendering-hydration-no-flicker` - Use inline script for client-only data
+- `rendering-activity` - Use Activity component for show/hide
+- `rendering-conditional-render` - Use ternary, not && for conditionals
+
+### 7. JavaScript Performance (LOW-MEDIUM)
+
+- `js-batch-dom-css` - Group CSS changes via classes or cssText
+- `js-index-maps` - Build Map for repeated lookups
+- `js-cache-property-access` - Cache object properties in loops
+- `js-cache-function-results` - Cache function results in module-level Map
+- `js-cache-storage` - Cache localStorage/sessionStorage reads
+- `js-combine-iterations` - Combine multiple filter/map into one loop
+- `js-length-check-first` - Check array length before expensive comparison
+- `js-early-exit` - Return early from functions
+- `js-hoist-regexp` - Hoist RegExp creation outside loops
+- `js-min-max-loop` - Use loop for min/max instead of sort
+- `js-set-map-lookups` - Use Set/Map for O(1) lookups
+- `js-tosorted-immutable` - Use toSorted() for immutability
+
+### 8. Advanced Patterns (LOW)
+
+- `advanced-event-handler-refs` - Store event handlers in refs
+- `advanced-use-latest` - useLatest for stable callback refs
+
+## How to Use
+
+Read individual rule files for detailed explanations and code examples:
+
+```
+rules/async-parallel.md
+rules/bundle-barrel-imports.md
+rules/_sections.md
+```
+
+Each rule file contains:
+- Brief explanation of why it matters
+- Incorrect code example with explanation
+- Correct code example with explanation
+- Additional context and references
+
+## Full Compiled Document
+
+For the complete guide with all rules expanded: `AGENTS.md`
--- a/.claude/skills/vercel-react-best-practices/rules/advanced-event-handler-refs.md
+++ b/.claude/skills/vercel-react-best-practices/rules/advanced-event-handler-refs.md
@@ -0,0 +1,55 @@
+---
+title: Store Event Handlers in Refs
+impact: LOW
+impactDescription: stable subscriptions
+tags: advanced, hooks, refs, event-handlers, optimization
+---
+
+## Store Event Handlers in Refs
+
+Store callbacks in refs when used in effects that shouldn't re-subscribe on callback changes.
+
+**Incorrect (re-subscribes on every render):**
+
+```tsx
+function useWindowEvent(event: string, handler: () => void) {
+  useEffect(() => {
+    window.addEventListener(event, handler)
+    return () => window.removeEventListener(event, handler)
+  }, [event, handler])
+}
+```
+
+**Correct (stable subscription):**
+
+```tsx
+function useWindowEvent(event: string, handler: () => void) {
+  const handlerRef = useRef(handler)
+  useEffect(() => {
+    handlerRef.current = handler
+  }, [handler])
+
+  useEffect(() => {
+    const listener = () => handlerRef.current()
+    window.addEventListener(event, listener)
+    return () => window.removeEventListener(event, listener)
+  }, [event])
+}
+```
+
+**Alternative: use `useEffectEvent` if you're on latest React:**
+
+```tsx
+import { useEffectEvent } from 'react'
+
+function useWindowEvent(event: string, handler: () => void) {
+  const onEvent = useEffectEvent(handler)
+
+  useEffect(() => {
+    window.addEventListener(event, onEvent)
+    return () => window.removeEventListener(event, onEvent)
+  }, [event])
+}
+```
+
+`useEffectEvent` provides a cleaner API for the same pattern: it creates a stable function reference that always calls the latest version of the handler.
--- a/.claude/skills/vercel-react-best-practices/rules/advanced-use-latest.md
+++ b/.claude/skills/vercel-react-best-practices/rules/advanced-use-latest.md
@@ -0,0 +1,49 @@
+---
+title: useLatest for Stable Callback Refs
+impact: LOW
+impactDescription: prevents effect re-runs
+tags: advanced, hooks, useLatest, refs, optimization
+---
+
+## useLatest for Stable Callback Refs
+
+Access latest values in callbacks without adding them to dependency arrays. Prevents effect re-runs while avoiding stale closures.
+
+**Implementation:**
+
+```typescript
+function useLatest<T>(value: T) {
+  const ref = useRef(value)
+  useEffect(() => {
+    ref.current = value
+  }, [value])
+  return ref
+}
+```
+
+**Incorrect (effect re-runs on every callback change):**
+
+```tsx
+function SearchInput({ onSearch }: { onSearch: (q: string) => void }) {
+  const [query, setQuery] = useState('')
+
+  useEffect(() => {
+    const timeout = setTimeout(() => onSearch(query), 300)
+    return () => clearTimeout(timeout)
+  }, [query, onSearch])
+}
+```
+
+**Correct (stable effect, fresh callback):**
+
+```tsx
+function SearchInput({ onSearch }: { onSearch: (q: string) => void }) {
+  const [query, setQuery] = useState('')
+  const onSearchRef = useLatest(onSearch)
+
+  useEffect(() => {
+    const timeout = setTimeout(() => onSearchRef.current(query), 300)
+    return () => clearTimeout(timeout)
+  }, [query])
+}
+```
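Review note: the mechanism behind `useLatest` does not depend on React. The sketch below is framework-free and uses hypothetical names (`createLatest`, `subscribe`) that are not part of this PR or of React. It only illustrates why a subscription that is set up once can still call the newest callback: the subscriber captures the mutable box, not the function stored in it.

```typescript
// Hypothetical helper (not a React API): a mutable box that is read at call time.
type Latest<T> = { current: T }

function createLatest<T>(value: T): Latest<T> {
  return { current: value }
}

// Stand-in for "an effect that subscribes once": it closes over the box,
// so it never needs to be re-created when the callback changes.
function subscribe(handlerRef: Latest<(msg: string) => void>): (msg: string) => void {
  return (msg) => handlerRef.current(msg)
}

const calls: string[] = []
const handlerRef = createLatest((msg: string) => { calls.push(`v1:${msg}`) })
const emit = subscribe(handlerRef)

emit('a')
// Swapping the callback does not require re-subscribing.
handlerRef.current = (msg: string) => { calls.push(`v2:${msg}`) }
emit('b')
```

In the rule's `SearchInput` example, `onSearchRef` plays the role of `handlerRef` and the debounced effect plays the role of `subscribe`.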
--- a/.claude/skills/vercel-react-best-practices/rules/async-api-routes.md
+++ b/.claude/skills/vercel-react-best-practices/rules/async-api-routes.md
@@ -0,0 +1,38 @@
+---
+title: Prevent Waterfall Chains in API Routes
+impact: CRITICAL
+impactDescription: 2-10× improvement
+tags: api-routes, server-actions, waterfalls, parallelization
+---
+
+## Prevent Waterfall Chains in API Routes
+
+In API routes and Server Actions, start independent operations immediately, even if you don't await them yet.
+
+**Incorrect (config waits for auth, data waits for both):**
+
+```typescript
+export async function GET(request: Request) {
+  const session = await auth()
+  const config = await fetchConfig()
+  const data = await fetchData(session.user.id)
+  return Response.json({ data, config })
+}
+```
+
+**Correct (auth and config start immediately):**
+
+```typescript
+export async function GET(request: Request) {
+  const sessionPromise = auth()
+  const configPromise = fetchConfig()
+  const session = await sessionPromise
+  const [config, data] = await Promise.all([
+    configPromise,
+    fetchData(session.user.id)
+  ])
+  return Response.json({ data, config })
+}
+```
+
+For operations with more complex dependency chains, use `better-all` to automatically maximize parallelism (see Dependency-Based Parallelization).
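Review note: the difference between the two handlers can be observed without timers, because an `async` function runs synchronously up to its first `await`. The sketch below uses hypothetical stand-ins for `auth()`, `fetchConfig()` and `fetchData()` that only record when they start; the helper `startedSynchronously` is also an assumption made for illustration.

```typescript
// Hypothetical stand-ins: each one records that it started, then resolves immediately.
const started: string[] = []

async function auth(): Promise<{ user: { id: string } }> {
  started.push('auth')
  return { user: { id: 'u1' } }
}

async function fetchConfig(): Promise<{ theme: string }> {
  started.push('config')
  return { theme: 'dark' }
}

async function fetchData(userId: string): Promise<string[]> {
  started.push(`data:${userId}`)
  return []
}

// Waterfall: config is not even started until auth has resolved.
async function getSequential() {
  const session = await auth()
  const config = await fetchConfig()
  const data = await fetchData(session.user.id)
  return { data, config }
}

// Start early, await late: auth and config are both in flight before the first await.
async function getEarly() {
  const sessionPromise = auth()
  const configPromise = fetchConfig()
  const session = await sessionPromise
  const [config, data] = await Promise.all([configPromise, fetchData(session.user.id)])
  return { data, config }
}

// Calls the handler without awaiting it and snapshots what began synchronously.
function startedSynchronously(handler: () => Promise<unknown>): string[] {
  started.length = 0
  void handler()
  return started.slice()
}

const sequentialStarts = startedSynchronously(getSequential)
const earlyStarts = startedSynchronously(getEarly)
```

With these stand-ins, `sequentialStarts` contains only `auth`, while `earlyStarts` contains both `auth` and `config`.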
--- a/.claude/skills/vercel-react-best-practices/rules/async-defer-await.md
+++ b/.claude/skills/vercel-react-best-practices/rules/async-defer-await.md
@@ -0,0 +1,80 @@
+---
+title: Defer Await Until Needed
+impact: HIGH
+impactDescription: avoids blocking unused code paths
+tags: async, await, conditional, optimization
+---
+
+## Defer Await Until Needed
+
+Move `await` operations into the branches where they're actually used to avoid blocking code paths that don't need them.
+
+**Incorrect (blocks both branches):**
+
+```typescript
+async function handleRequest(userId: string, skipProcessing: boolean) {
+  const userData = await fetchUserData(userId)
+  
+  if (skipProcessing) {
+    // Returns immediately but still waited for userData
+    return { skipped: true }
+  }
+  
+  // Only this branch uses userData
+  return processUserData(userData)
+}
+```
+
+**Correct (only blocks when needed):**
+
+```typescript
+async function handleRequest(userId: string, skipProcessing: boolean) {
+  if (skipProcessing) {
+    // Returns immediately without waiting
+    return { skipped: true }
+  }
+  
+  // Fetch only when needed
+  const userData = await fetchUserData(userId)
+  return processUserData(userData)
+}
+```
+
+**Another example (early return optimization):**
+
+```typescript
+// Incorrect: always fetches permissions
+async function updateResource(resourceId: string, userId: string) {
+  const permissions = await fetchPermissions(userId)
+  const resource = await getResource(resourceId)
+  
+  if (!resource) {
+    return { error: 'Not found' }
+  }
+  
+  if (!permissions.canEdit) {
+    return { error: 'Forbidden' }
+  }
+  
+  return await updateResourceData(resource, permissions)
+}
+
+// Correct: fetches only when needed
+async function updateResource(resourceId: string, userId: string) {
+  const resource = await getResource(resourceId)
+  
+  if (!resource) {
+    return { error: 'Not found' }
+  }
+  
+  const permissions = await fetchPermissions(userId)
+  
+  if (!permissions.canEdit) {
+    return { error: 'Forbidden' }
+  }
+  
+  return await updateResourceData(resource, permissions)
+}
+```
+
+This optimization is especially valuable when the skipped branch is frequently taken, or when the deferred operation is expensive.
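Review note: a small counter makes the cost of the eager version visible. This is a minimal sketch with a hypothetical `fetchUserData` stand-in that resolves immediately; the names `eager` and `deferred` are assumptions for illustration and mirror the two `handleRequest` variants in the rule.

```typescript
let fetchCount = 0

// Hypothetical stand-in: counts how often the "expensive" fetch is started.
async function fetchUserData(userId: string): Promise<{ id: string }> {
  fetchCount++
  return { id: userId }
}

// Eager: the fetch starts even when the result is never used.
async function eager(userId: string, skipProcessing: boolean) {
  const userData = await fetchUserData(userId)
  if (skipProcessing) return { skipped: true }
  return userData
}

// Deferred: the fetch is only started on the branch that needs it.
async function deferred(userId: string, skipProcessing: boolean) {
  if (skipProcessing) return { skipped: true }
  const userData = await fetchUserData(userId)
  return userData
}

void eager('u1', true)
const fetchesAfterEager = fetchCount

fetchCount = 0
void deferred('u1', true)
const fetchesAfterDeferred = fetchCount
```

On the skip path, the eager variant has already started one fetch, while the deferred variant has started none.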
--- a/.claude/skills/vercel-react-best-practices/rules/async-dependencies.md
+++ b/.claude/skills/vercel-react-best-practices/rules/async-dependencies.md
@@ -0,0 +1,36 @@
+---
+title: Dependency-Based Parallelization
+impact: CRITICAL
+impactDescription: 2-10× improvement
+tags: async, parallelization, dependencies, better-all
+---
+
+## Dependency-Based Parallelization
+
+For operations with partial dependencies, use `better-all` to maximize parallelism. It automatically starts each task at the earliest possible moment.
+
+**Incorrect (profile waits for config unnecessarily):**
+
+```typescript
+const [user, config] = await Promise.all([
+  fetchUser(),
+  fetchConfig()
+])
+const profile = await fetchProfile(user.id)
+```
+
+**Correct (config and profile run in parallel):**
+
+```typescript
+import { all } from 'better-all'
+
+const { user, config, profile } = await all({
+  async user() { return fetchUser() },
+  async config() { return fetchConfig() },
+  async profile() {
+    return fetchProfile((await this.$.user).id)
+  }
+})
+```
+
+Reference: [https://github.com/shuding/better-all](https://github.com/shuding/better-all)
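Review note: where adding a dependency is not desirable, the same scheduling can be approximated with plain promises by creating every promise first and chaining the dependent one off its single prerequisite. This is a sketch under that assumption, not a description of how `better-all` works internally; the three fetchers are hypothetical stand-ins.

```typescript
const startOrder: string[] = []

// Hypothetical stand-ins that record when they start.
async function fetchUser() { startOrder.push('user'); return { id: 'u1' } }
async function fetchConfig() { startOrder.push('config'); return { theme: 'dark' } }
async function fetchProfile(userId: string) { startOrder.push('profile'); return { userId, bio: '' } }

async function loadAll() {
  const userPromise = fetchUser()
  const configPromise = fetchConfig()
  // profile depends only on user, so it is chained off userPromise alone
  // and never waits for config.
  const profilePromise = userPromise.then(user => fetchProfile(user.id))

  const [user, config, profile] = await Promise.all([
    userPromise,
    configPromise,
    profilePromise
  ])
  return { user, config, profile }
}

const pending = loadAll()
// Snapshot taken before any promise has had a chance to settle.
const startedBeforeFirstAwait = startOrder.slice()
```

Both `user` and `config` are in flight before the first `await`; `profile` starts as soon as `user` resolves. This stays readable for two or three tasks, which is roughly where a declarative helper starts to pay off.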
--- a/.claude/skills/vercel-react-best-practices/rules/async-parallel.md
+++ b/.claude/skills/vercel-react-best-practices/rules/async-parallel.md
@@ -0,0 +1,28 @@
+---
+title: Promise.all() for Independent Operations
+impact: CRITICAL
+impactDescription: 2-10× improvement
+tags: async, parallelization, promises, waterfalls
+---
+
+## Promise.all() for Independent Operations
+
+When async operations have no interdependencies, execute them concurrently using `Promise.all()`.
+
+**Incorrect (sequential execution, 3 round trips):**
+
+```typescript
+const user = await fetchUser()
+const posts = await fetchPosts()
+const comments = await fetchComments()
+```
+
+**Correct (parallel execution, 1 round trip):**
+
+```typescript
+const [user, posts, comments] = await Promise.all([
+  fetchUser(),
+  fetchPosts(),
+  fetchComments()
+])
+```
--- a/.claude/skills/vercel-react-best-practices/rules/async-suspense-boundaries.md
+++ b/.claude/skills/vercel-react-best-practices/rules/async-suspense-boundaries.md
@@ -0,0 +1,99 @@
+---
+title: Strategic Suspense Boundaries
+impact: HIGH
+impactDescription: faster initial paint
+tags: async, suspense, streaming, layout-shift
+---
+
+## Strategic Suspense Boundaries
+
+Instead of awaiting data in async components before returning JSX, use Suspense boundaries to show the wrapper UI faster while data loads.
+
+**Incorrect (wrapper blocked by data fetching):**
+
+```tsx
+async function Page() {
+  const data = await fetchData() // Blocks entire page
+  
+  return (
+    <div>
+      <div>Sidebar</div>
+      <div>Header</div>
+      <div>
+        <DataDisplay data={data} />
+      </div>
+      <div>Footer</div>
+    </div>
+  )
+}
+```
+
+The entire layout waits for data even though only the middle section needs it.
+
+**Correct (wrapper shows immediately, data streams in):**
+
+```tsx
+function Page() {
+  return (
+    <div>
+      <div>Sidebar</div>
+      <div>Header</div>
+      <div>
+        <Suspense fallback={<Skeleton />}>
+          <DataDisplay />
+        </Suspense>
+      </div>
+      <div>Footer</div>
+    </div>
+  )
+}
+
+async function DataDisplay() {
+  const data = await fetchData() // Only blocks this component
+  return <div>{data.content}</div>
+}
+```
+
+Sidebar, Header, and Footer render immediately. Only DataDisplay waits for data.
+
+**Alternative (share promise across components):**
+
+```tsx
+function Page() {
+  // Start fetch immediately, but don't await
+  const dataPromise = fetchData()
+  
+  return (
+    <div>
+      <div>Sidebar</div>
+      <div>Header</div>
+      <Suspense fallback={<Skeleton />}>
+        <DataDisplay dataPromise={dataPromise} />
+        <DataSummary dataPromise={dataPromise} />
+      </Suspense>
+      <div>Footer</div>
+    </div>
+  )
+}
+
+function DataDisplay({ dataPromise }: { dataPromise: Promise<Data> }) {
+  const data = use(dataPromise) // Unwraps the promise
+  return <div>{data.content}</div>
+}
+
+function DataSummary({ dataPromise }: { dataPromise: Promise<Data> }) {
+  const data = use(dataPromise) // Reuses the same promise
+  return <div>{data.summary}</div>
+}
+```
+
+Both components share the same promise, so only one fetch occurs. Layout renders immediately while both components wait together.
+
+**When NOT to use this pattern:**
+
+- Critical data needed for layout decisions (affects positioning)
+- SEO-critical content above the fold
+- Small, fast queries where suspense overhead isn't worth it
+- When you want to avoid layout shift (loading → content jump)
+
+**Trade-off:** Faster initial paint vs potential layout shift. Choose based on your UX priorities.
--- a/.claude/skills/vercel-react-best-practices/rules/bundle-barrel-imports.md
+++ b/.claude/skills/vercel-react-best-practices/rules/bundle-barrel-imports.md
@@ -0,0 +1,59 @@
+---
+title: Avoid Barrel File Imports
+impact: CRITICAL
+impactDescription: 200-800ms import cost, slow builds
+tags: bundle, imports, tree-shaking, barrel-files, performance
+---
+
+## Avoid Barrel File Imports
+
+Import directly from source files instead of barrel files to avoid loading thousands of unused modules. **Barrel files** are entry points that re-export multiple modules (e.g., `index.js` that does `export * from './module'`).
+
+Popular icon and component libraries can have **up to 10,000 re-exports** in their entry file. For many React packages, **it takes 200-800ms just to import them**, affecting both development speed and production cold starts.
+
+**Why tree-shaking doesn't help:** When a library is marked as external (not bundled), the bundler can't optimize it. If you bundle it to enable tree-shaking, builds become substantially slower analyzing the entire module graph.
+
+**Incorrect (imports entire library):**
+
+```tsx
+import { Check, X, Menu } from 'lucide-react'
+// Loads 1,583 modules, takes ~2.8s extra in dev
+// Runtime cost: 200-800ms on every cold start
+
+import { Button, TextField } from '@mui/material'
+// Loads 2,225 modules, takes ~4.2s extra in dev
+```
+
+**Correct (imports only what you need):**
+
+```tsx
+import Check from 'lucide-react/dist/esm/icons/check'
+import X from 'lucide-react/dist/esm/icons/x'
+import Menu from 'lucide-react/dist/esm/icons/menu'
+// Loads only 3 modules (~2KB vs ~1MB)
+
+import Button from '@mui/material/Button'
+import TextField from '@mui/material/TextField'
+// Loads only what you use
+```
+
+**Alternative (Next.js 13.5+):**
+
+```js
+// next.config.js - use optimizePackageImports
+module.exports = {
+  experimental: {
+    optimizePackageImports: ['lucide-react', '@mui/material']
+  }
+}
+
+// Then you can keep the ergonomic barrel imports:
+import { Check, X, Menu } from 'lucide-react'
+// Automatically transformed to direct imports at build time
+```
+
+Direct imports provide 15-70% faster dev boot, 28% faster builds, 40% faster cold starts, and significantly faster HMR.
+
+Libraries commonly affected: `lucide-react`, `@mui/material`, `@mui/icons-material`, `@tabler/icons-react`, `react-icons`, `@headlessui/react`, `@radix-ui/react-*`, `lodash`, `ramda`, `date-fns`, `rxjs`, `react-use`.
+
+Reference: [How we optimized package imports in Next.js](https://vercel.com/blog/how-we-optimized-package-imports-in-next-js)
--- a/.claude/skills/vercel-react-best-practices/rules/bundle-conditional.md
+++ b/.claude/skills/vercel-react-best-practices/rules/bundle-conditional.md
@@ -0,0 +1,31 @@
+---
+title: Conditional Module Loading
+impact: HIGH
+impactDescription: loads large data only when needed
+tags: bundle, conditional-loading, lazy-loading
+---
+
+## Conditional Module Loading
+
+Load large data or modules only when a feature is activated.
+
+**Example (lazy-load animation frames):**
+
+```tsx
+function AnimationPlayer({ enabled, setEnabled }: { enabled: boolean; setEnabled: React.Dispatch<React.SetStateAction<boolean>> }) {
+  const [frames, setFrames] = useState<Frame[] | null>(null)
+
+  useEffect(() => {
+    if (enabled && !frames && typeof window !== 'undefined') {
+      import('./animation-frames.js')
+        .then(mod => setFrames(mod.frames))
+        // On load failure, ask the parent to turn the feature off
+        .catch(() => setEnabled(false))
+    }
+  }, [enabled, frames, setEnabled])
+
+  if (!frames) return <Skeleton />
+  return <Canvas frames={frames} />
+}
+```
+
+The `typeof window !== 'undefined'` check prevents bundling this module for SSR, optimizing server bundle size and build speed.
--- a/.claude/skills/vercel-react-best-practices/rules/bundle-defer-third-party.md
+++ b/.claude/skills/vercel-react-best-practices/rules/bundle-defer-third-party.md
@@ -0,0 +1,49 @@
+---
+title: Defer Non-Critical Third-Party Libraries
+impact: MEDIUM
+impactDescription: loads after hydration
+tags: bundle, third-party, analytics, defer
+---
+
+## Defer Non-Critical Third-Party Libraries
+
+Analytics, logging, and error tracking don't block user interaction. Load them after hydration.
+
+**Incorrect (blocks initial bundle):**
+
+```tsx
+import { Analytics } from '@vercel/analytics/react'
+
+export default function RootLayout({ children }) {
+  return (
+    <html>
+      <body>
+        {children}
+        <Analytics />
+      </body>
+    </html>
+  )
+}
+```
+
+**Correct (loads after hydration):**
+
+```tsx
+import dynamic from 'next/dynamic'
+
+const Analytics = dynamic(
+  () => import('@vercel/analytics/react').then(m => m.Analytics),
+  { ssr: false }
+)
+
+export default function RootLayout({ children }) {
+  return (
+    <html>
+      <body>
+        {children}
+        <Analytics />
+      </body>
+    </html>
+  )
+}
+```
--- a/.claude/skills/vercel-react-best-practices/rules/bundle-dynamic-imports.md
+++ b/.claude/skills/vercel-react-best-practices/rules/bundle-dynamic-imports.md
@@ -0,0 +1,35 @@
+---
+title: Dynamic Imports for Heavy Components
+impact: CRITICAL
+impactDescription: directly affects TTI and LCP
+tags: bundle, dynamic-import, code-splitting, next-dynamic
+---
+
+## Dynamic Imports for Heavy Components
+
+Use `next/dynamic` to lazy-load large components not needed on initial render.
+
+**Incorrect (Monaco bundles with main chunk ~300KB):**
+
+```tsx
+import { MonacoEditor } from './monaco-editor'
+
+function CodePanel({ code }: { code: string }) {
+  return <MonacoEditor value={code} />
+}
+```
+
+**Correct (Monaco loads on demand):**
+
+```tsx
+import dynamic from 'next/dynamic'
+
+const MonacoEditor = dynamic(
+  () => import('./monaco-editor').then(m => m.MonacoEditor),
+  { ssr: false }
+)
+
+function CodePanel({ code }: { code: string }) {
+  return <MonacoEditor value={code} />
+}
+```
--- a/.claude/skills/vercel-react-best-practices/rules/bundle-preload.md
+++ b/.claude/skills/vercel-react-best-practices/rules/bundle-preload.md
@@ -0,0 +1,50 @@
+---
+title: Preload Based on User Intent
+impact: MEDIUM
+impactDescription: reduces perceived latency
+tags: bundle, preload, user-intent, hover
+---
+
+## Preload Based on User Intent
+
+Preload heavy bundles before they're needed to reduce perceived latency.
+
+**Example (preload on hover/focus):**
+
+```tsx
+function EditorButton({ onClick }: { onClick: () => void }) {
+  const preload = () => {
+    if (typeof window !== 'undefined') {
+      void import('./monaco-editor')
+    }
+  }
+
+  return (
+    <button
+      onMouseEnter={preload}
+      onFocus={preload}
+      onClick={onClick}
+    >
+      Open Editor
+    </button>
+  )
+}
+```
+
+**Example (preload when feature flag is enabled):**
+
+```tsx
+function FlagsProvider({ children, flags }: Props) {
+  useEffect(() => {
+    if (flags.editorEnabled && typeof window !== 'undefined') {
+      void import('./monaco-editor').then(mod => mod.init())
+    }
+  }, [flags.editorEnabled])
+
+  return <FlagsContext.Provider value={flags}>
+    {children}
+  </FlagsContext.Provider>
+}
+```
+
+The `typeof window !== 'undefined'` check prevents bundling preloaded modules for SSR, optimizing server bundle size and build speed.
--- a/.claude/skills/vercel-react-best-practices/rules/client-event-listeners.md
+++ b/.claude/skills/vercel-react-best-practices/rules/client-event-listeners.md
@@ -0,0 +1,74 @@
+---
+title: Deduplicate Global Event Listeners
+impact: LOW
+impactDescription: single listener for N components
+tags: client, swr, event-listeners, subscription
+---
+
+## Deduplicate Global Event Listeners
+
+Use `useSWRSubscription()` to share global event listeners across component instances.
+
+**Incorrect (N instances = N listeners):**
+
+```tsx
+function useKeyboardShortcut(key: string, callback: () => void) {
+  useEffect(() => {
+    const handler = (e: KeyboardEvent) => {
+      if (e.metaKey && e.key === key) {
+        callback()
+      }
+    }
+    window.addEventListener('keydown', handler)
+    return () => window.removeEventListener('keydown', handler)
+  }, [key, callback])
+}
+```
+
+When using the `useKeyboardShortcut` hook multiple times, each instance will register a new listener.
+
+**Correct (N instances = 1 listener):**
+
+```tsx
+import useSWRSubscription from 'swr/subscription'
+
+// Module-level Map to track callbacks per key
+const keyCallbacks = new Map<string, Set<() => void>>()
+
+function useKeyboardShortcut(key: string, callback: () => void) {
+  // Register this callback in the Map
+  useEffect(() => {
+    if (!keyCallbacks.has(key)) {
+      keyCallbacks.set(key, new Set())
+    }
+    keyCallbacks.get(key)!.add(callback)
+
+    return () => {
+      const set = keyCallbacks.get(key)
+      if (set) {
+        set.delete(callback)
+        if (set.size === 0) {
+          keyCallbacks.delete(key)
+        }
+      }
+    }
+  }, [key, callback])
+
+  useSWRSubscription('global-keydown', () => {
+    const handler = (e: KeyboardEvent) => {
+      if (e.metaKey && keyCallbacks.has(e.key)) {
+        keyCallbacks.get(e.key)!.forEach(cb => cb())
+      }
+    }
+    window.addEventListener('keydown', handler)
+    return () => window.removeEventListener('keydown', handler)
+  })
+}
+
+function Profile() {
+  // Multiple shortcuts will share the same listener
+  useKeyboardShortcut('p', () => { /* ... */ }) 
+  useKeyboardShortcut('k', () => { /* ... */ })
+  // ...
+}
+```
--- a/.claude/skills/vercel-react-best-practices/rules/client-swr-dedup.md
+++ b/.claude/skills/vercel-react-best-practices/rules/client-swr-dedup.md
@@ -0,0 +1,56 @@
+---
+title: Use SWR for Automatic Deduplication
+impact: MEDIUM-HIGH
+impactDescription: automatic deduplication
+tags: client, swr, deduplication, data-fetching
+---
+
+## Use SWR for Automatic Deduplication
+
+SWR enables request deduplication, caching, and revalidation across component instances.
+
+**Incorrect (no deduplication, each instance fetches):**
+
+```tsx
+function UserList() {
+  const [users, setUsers] = useState([])
+  useEffect(() => {
+    fetch('/api/users')
+      .then(r => r.json())
+      .then(setUsers)
+  }, [])
+}
+```
+
+**Correct (multiple instances share one request):**
+
+```tsx
+import useSWR from 'swr'
+
+function UserList() {
+  const { data: users } = useSWR('/api/users', fetcher)
+}
+```
+
+**For immutable data:**
+
+```tsx
+import { useImmutableSWR } from '@/lib/swr'
+
+function StaticContent() {
+  const { data } = useImmutableSWR('/api/config', fetcher)
+}
+```
+
+**For mutations:**
+
+```tsx
+import useSWRMutation from 'swr/mutation'
+
+function UpdateButton() {
+  const { trigger } = useSWRMutation('/api/user', updateUser)
+  return <button onClick={() => trigger()}>Update</button>
+}
+```
+
+Reference: [https://swr.vercel.app](https://swr.vercel.app)
--- a/.claude/skills/vercel-react-best-practices/rules/js-batch-dom-css.md
+++ b/.claude/skills/vercel-react-best-practices/rules/js-batch-dom-css.md
@@ -0,0 +1,82 @@
+---
+title: Batch DOM CSS Changes
+impact: MEDIUM
+impactDescription: reduces reflows/repaints
+tags: javascript, dom, css, performance, reflow
+---
+
+## Batch DOM CSS Changes
+
+Avoid changing styles one property at a time. Group multiple CSS changes together via classes or `cssText` to minimize browser reflows.
+
+**Incorrect (multiple reflows):**
+
+```typescript
+function updateElementStyles(element: HTMLElement) {
+  // Each assignment is a separate style write; any layout read in between forces a reflow
+  element.style.width = '100px'
+  element.style.height = '200px'
+  element.style.backgroundColor = 'blue'
+  element.style.border = '1px solid black'
+}
+```
+
+**Correct (add class - single reflow):**
+
+```css
+/* CSS file */
+.highlighted-box {
+  width: 100px;
+  height: 200px;
+  background-color: blue;
+  border: 1px solid black;
+}
+```
+
+```typescript
+// JavaScript
+function updateElementStyles(element: HTMLElement) {
+  element.classList.add('highlighted-box')
+}
+```
+
+**Correct (change cssText - single reflow):**
+
+```typescript
+function updateElementStyles(element: HTMLElement) {
+  element.style.cssText = `
+    width: 100px;
+    height: 200px;
+    background-color: blue;
+    border: 1px solid black;
+  `
+}
+```
+
+**React example:**
+
+```tsx
+// Incorrect: changing styles one by one
+function Box({ isHighlighted }: { isHighlighted: boolean }) {
+  const ref = useRef<HTMLDivElement>(null)
+  
+  useEffect(() => {
+    if (ref.current && isHighlighted) {
+      ref.current.style.width = '100px'
+      ref.current.style.height = '200px'
+      ref.current.style.backgroundColor = 'blue'
+    }
+  }, [isHighlighted])
+  
+  return <div ref={ref}>Content</div>
+}
+
+// Correct: toggle class
+function Box({ isHighlighted }: { isHighlighted: boolean }) {
+  return (
+    <div className={isHighlighted ? 'highlighted-box' : ''}>
+      Content
+    </div>
+  )
+}
+```
+
+Prefer CSS classes over inline styles when possible. Classes are cached by the browser and provide better separation of concerns.
--- a/.claude/skills/vercel-react-best-practices/rules/js-cache-function-results.md
+++ b/.claude/skills/vercel-react-best-practices/rules/js-cache-function-results.md
@@ -0,0 +1,80 @@
+---
+title: Cache Repeated Function Calls
+impact: MEDIUM
+impactDescription: avoid redundant computation
+tags: javascript, cache, memoization, performance
+---
+
+## Cache Repeated Function Calls
+
+Use a module-level Map to cache function results when the same function is called repeatedly with the same inputs during render.
+
+**Incorrect (redundant computation):**
+
+```tsx
+function ProjectList({ projects }: { projects: Project[] }) {
+  return (
+    <div>
+      {projects.map(project => {
+        // slugify() called 100+ times for same project names
+        const slug = slugify(project.name)
+        
+        return <ProjectCard key={project.id} slug={slug} />
+      })}
+    </div>
+  )
+}
+```
+
+**Correct (cached results):**
+
+```tsx
+// Module-level cache
+const slugifyCache = new Map<string, string>()
+
+function cachedSlugify(text: string): string {
+  if (slugifyCache.has(text)) {
+    return slugifyCache.get(text)!
+  }
+  const result = slugify(text)
+  slugifyCache.set(text, result)
+  return result
+}
+
+function ProjectList({ projects }: { projects: Project[] }) {
+  return (
+    <div>
+      {projects.map(project => {
+        // Computed only once per unique project name
+        const slug = cachedSlugify(project.name)
+        
+        return <ProjectCard key={project.id} slug={slug} />
+      })}
+    </div>
+  )
+}
+```
+
+**Simpler pattern for single-value functions:**
+
+```typescript
+let isLoggedInCache: boolean | null = null
+
+function isLoggedIn(): boolean {
+  if (isLoggedInCache !== null) {
+    return isLoggedInCache
+  }
+  
+  isLoggedInCache = document.cookie.includes('auth=')
+  return isLoggedInCache
+}
+
+// Clear cache when auth changes
+function onAuthChange() {
+  isLoggedInCache = null
+}
+```
+
+Use a Map (not a hook) so it works everywhere: utilities, event handlers, not just React components.
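+
+The same idea can be wrapped in a small reusable helper. The sketch below is illustrative only: `memoizeByString` and the simplified `slugify` are hypothetical names, not part of the original rule. The cache is never evicted, so it assumes a bounded set of distinct inputs.
+
+```typescript
+// Hypothetical helper: wraps any string -> R function with a Map-backed cache.
+function memoizeByString<R>(fn: (input: string) => R): (input: string) => R {
+  const cache = new Map<string, R>()
+  return (input: string): R => {
+    if (cache.has(input)) {
+      return cache.get(input) as R
+    }
+    const result = fn(input)
+    cache.set(input, result)
+    return result
+  }
+}
+
+// Simplified stand-in for a real slugify implementation (assumption).
+let slugifyCalls = 0
+function slugify(text: string): string {
+  slugifyCalls++
+  return text.trim().toLowerCase().replace(/\s+/g, '-')
+}
+
+const cachedSlugify = memoizeByString(slugify)
+cachedSlugify('My Project') // computed
+cachedSlugify('My Project') // served from the cache
+```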
+
+Reference: [How we made the Vercel Dashboard twice as fast](https://vercel.com/blog/how-we-made-the-vercel-dashboard-twice-as-fast)
--- a/.claude/skills/vercel-react-best-practices/rules/js-cache-property-access.md
+++ b/.claude/skills/vercel-react-best-practices/rules/js-cache-property-access.md
@@ -0,0 +1,28 @@
+---
+title: Cache Property Access in Loops
+impact: LOW-MEDIUM
+impactDescription: reduces lookups
+tags: javascript, loops, optimization, caching
+---
+
+## Cache Property Access in Loops
+
+Cache object property lookups in hot paths.
+
+**Incorrect (3 lookups × N iterations):**
+
+```typescript
+for (let i = 0; i < arr.length; i++) {
+  process(obj.config.settings.value)
+}
+```
+
+**Correct (1 lookup total):**
+
+```typescript
+const value = obj.config.settings.value
+const len = arr.length
+for (let i = 0; i < len; i++) {
+  process(value)
+}
+```
--- a/.claude/skills/vercel-react-best-practices/rules/js-cache-storage.md
+++ b/.claude/skills/vercel-react-best-practices/rules/js-cache-storage.md
@@ -0,0 +1,70 @@
+---
+title: Cache Storage API Calls
+impact: LOW-MEDIUM
+impactDescription: reduces expensive I/O
+tags: javascript, localStorage, storage, caching, performance
+---
+
+## Cache Storage API Calls
+
+`localStorage`, `sessionStorage`, and `document.cookie` are synchronous and expensive. Cache reads in memory.
+
+**Incorrect (reads storage on every call):**
+
+```typescript
+function getTheme() {
+  return localStorage.getItem('theme') ?? 'light'
+}
+// Called 10 times = 10 storage reads
+```
+
+**Correct (Map cache):**
+
+```typescript
+const storageCache = new Map<string, string | null>()
+
+function getLocalStorage(key: string) {
+  if (!storageCache.has(key)) {
+    storageCache.set(key, localStorage.getItem(key))
+  }
+  return storageCache.get(key)
+}
+
+function setLocalStorage(key: string, value: string) {
+  localStorage.setItem(key, value)
+  storageCache.set(key, value)  // keep cache in sync
+}
+```
+
+Use a Map (not a hook) so it works everywhere: utilities, event handlers, not just React components.
+
+**Cookie caching:**
+
+```typescript
+let cookieCache: Record<string, string> | null = null
+
+function getCookie(name: string) {
+  if (!cookieCache) {
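+    cookieCache = Object.fromEntries(
+      document.cookie.split('; ').map(c => {
+        // Split on the first '=' only; cookie values may contain '='
+        const i = c.indexOf('=')
+        return i === -1 ? [c, ''] : [c.slice(0, i), c.slice(i + 1)]
+      })
+    )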
+  }
+  return cookieCache[name]
+}
+```
+
+**Important (invalidate on external changes):**
+
+If storage can change externally (another tab, server-set cookies), invalidate cache:
+
+```typescript
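+window.addEventListener('storage', (e) => {
+  // e.key is null when another tab calls localStorage.clear()
+  if (e.key === null) storageCache.clear()
+  else storageCache.delete(e.key)
+})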
+
+document.addEventListener('visibilitychange', () => {
+  if (document.visibilityState === 'visible') {
+    storageCache.clear()
+  }
+})
+```
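+
+The read cache, the write-through setter, and the invalidation hook can be bundled into one object. The following is a minimal sketch, not the rule's prescribed API: `KeyValueStore`, `createCachedStore`, and the in-memory `fakeStore` are hypothetical names introduced so the example runs outside a browser. In a real page you would pass `localStorage` and call `invalidate(e.key)` from a `storage` event handler.
+
+```typescript
+// Minimal storage shape so the sketch runs outside the browser (assumption).
+interface KeyValueStore {
+  getItem(key: string): string | null
+  setItem(key: string, value: string): void
+}
+
+function createCachedStore(store: KeyValueStore) {
+  const cache = new Map<string, string | null>()
+  return {
+    get(key: string): string | null {
+      if (!cache.has(key)) {
+        cache.set(key, store.getItem(key))
+      }
+      return cache.get(key) ?? null
+    },
+    set(key: string, value: string): void {
+      store.setItem(key, value)
+      cache.set(key, value) // keep cache in sync
+    },
+    // A null key means the whole store changed, so drop everything.
+    invalidate(key: string | null): void {
+      if (key === null) cache.clear()
+      else cache.delete(key)
+    },
+  }
+}
+
+// In-memory fake standing in for localStorage.
+const backing = new Map<string, string>()
+let reads = 0
+const fakeStore: KeyValueStore = {
+  getItem: (key) => {
+    reads++
+    return backing.get(key) ?? null
+  },
+  setItem: (key, value) => {
+    backing.set(key, value)
+  },
+}
+const cached = createCachedStore(fakeStore)
+```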
--- a/.claude/skills/vercel-react-best-practices/rules/js-combine-iterations.md
+++ b/.claude/skills/vercel-react-best-practices/rules/js-combine-iterations.md
@@ -0,0 +1,32 @@
+---
+title: Combine Multiple Array Iterations
+impact: LOW-MEDIUM
+impactDescription: reduces iterations
+tags: javascript, arrays, loops, performance
+---
+
+## Combine Multiple Array Iterations
+
+Multiple `.filter()` or `.map()` calls iterate the array multiple times. Combine into one loop.
+
+**Incorrect (3 iterations):**
+
+```typescript
+const admins = users.filter(u => u.isAdmin)
+const testers = users.filter(u => u.isTester)
+const inactive = users.filter(u => !u.isActive)
+```
+
+**Correct (1 iteration):**
+
+```typescript
+const admins: User[] = []
+const testers: User[] = []
+const inactive: User[] = []
+
+for (const user of users) {
+  if (user.isAdmin) admins.push(user)
+  if (user.isTester) testers.push(user)
+  if (!user.isActive) inactive.push(user)
+}
+```
--- a/.claude/skills/vercel-react-best-practices/rules/js-early-exit.md
+++ b/.claude/skills/vercel-react-best-practices/rules/js-early-exit.md
@@ -0,0 +1,50 @@
+---
+title: Early Return from Functions
+impact: LOW-MEDIUM
+impactDescription: avoids unnecessary computation
+tags: javascript, functions, optimization, early-return
+---
+
+## Early Return from Functions
+
+Return early once the result is determined, so the remaining work is skipped.
+
+**Incorrect (processes all items even after finding answer):**
+
+```typescript
+function validateUsers(users: User[]) {
+  let hasError = false
+  let errorMessage = ''
+  
+  for (const user of users) {
+    if (!user.email) {
+      hasError = true
+      errorMessage = 'Email required'
+    }
+    if (!user.name) {
+      hasError = true
+      errorMessage = 'Name required'
+    }
+    // Continues checking all users even after error found
+  }
+  
+  return hasError ? { valid: false, error: errorMessage } : { valid: true }
+}
+```
+
+**Correct (returns immediately on first error):**
+
+```typescript
+function validateUsers(users: User[]) {
+  for (const user of users) {
+    if (!user.email) {
+      return { valid: false, error: 'Email required' }
+    }
+    if (!user.name) {
+      return { valid: false, error: 'Name required' }
+    }
+  }
+
+  return { valid: true }
+}
+```
--- a/.claude/skills/vercel-react-best-practices/rules/js-hoist-regexp.md
+++ b/.claude/skills/vercel-react-best-practices/rules/js-hoist-regexp.md
@@ -0,0 +1,45 @@
+---
+title: Hoist RegExp Creation
+impact: LOW-MEDIUM
+impactDescription: avoids recreation
+tags: javascript, regexp, optimization, memoization
+---
+
+## Hoist RegExp Creation
+
+Don't create RegExp inside render. Hoist to module scope or memoize with `useMemo()`.
+
+**Incorrect (new RegExp every render):**
+
+```tsx
+function Highlighter({ text, query }: Props) {
+  const regex = new RegExp(`(${query})`, 'gi')
+  const parts = text.split(regex)
+  return <>{parts.map((part, i) => ...)}</>
+}
+```
+
+**Correct (memoize or hoist):**
+
+```tsx
+const EMAIL_REGEX = /^[^\s@]+@[^\s@]+\.[^\s@]+$/
+
+function Highlighter({ text, query }: Props) {
+  const regex = useMemo(
+    () => new RegExp(`(${escapeRegex(query)})`, 'gi'),
+    [query]
+  )
+  const parts = text.split(regex)
+  return <>{parts.map((part, i) => ...)}</>
+}
+```
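+
+The example above calls `escapeRegex`, which is not defined in this rule. One possible implementation is sketched below, using the commonly cited character class for escaping regex metacharacters. `splitByQuery` is a hypothetical wrapper added only to show the effect. Note that an empty `query` would split the text into single characters, so callers may want to guard against it.
+
+```typescript
+// Escapes characters that have special meaning in a regular expression.
+function escapeRegex(input: string): string {
+  return input.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
+}
+
+// Splits text around case-insensitive matches of a literal query.
+// The capturing group keeps the matched text in the result array.
+function splitByQuery(text: string, query: string): string[] {
+  const regex = new RegExp(`(${escapeRegex(query)})`, 'gi')
+  return text.split(regex)
+}
+
+// Without escaping, the '.' in 'a.b' would also match the 'x' in 'axb'.
+const parts = splitByQuery('axb a.b', 'a.b')
+```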
+
+**Warning (global regex has mutable state):**
+
+Global regex (`/g`) has mutable `lastIndex` state:
+
+```typescript
+const regex = /foo/g
+regex.test('foo')  // true, lastIndex = 3
+regex.test('foo')  // false, lastIndex = 0
+```
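+
+If a hoisted global regex is reused with `.test()`, one way to avoid the alternating results is to reset `lastIndex` before each use. Dropping the `g` flag when only a yes/no answer is needed is another option. The function names in this sketch are hypothetical.
+
+```typescript
+const FOO = /foo/g
+
+function hasFooUnsafe(text: string): boolean {
+  return FOO.test(text) // result depends on the leftover lastIndex
+}
+
+function hasFooSafe(text: string): boolean {
+  FOO.lastIndex = 0 // reset shared state before each use
+  return FOO.test(text)
+}
+
+// Same input twice: the unsafe version disagrees with itself.
+const unsafeResults = [hasFooUnsafe('foo'), hasFooUnsafe('foo')]
+const safeResults = [hasFooSafe('foo'), hasFooSafe('foo')]
+```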
--- a/.claude/skills/vercel-react-best-practices/rules/js-index-maps.md
+++ b/.claude/skills/vercel-react-best-practices/rules/js-index-maps.md
@@ -0,0 +1,37 @@
+---
+title: Build Index Maps for Repeated Lookups
+impact: LOW-MEDIUM
+impactDescription: 1M ops to 2K ops
+tags: javascript, map, indexing, optimization, performance
+---
+
+## Build Index Maps for Repeated Lookups
+
+Multiple `.find()` calls by the same key should use a Map.
+
+**Incorrect (O(n) per lookup):**
+
+```typescript
+function processOrders(orders: Order[], users: User[]) {
+  return orders.map(order => ({
+    ...order,
+    user: users.find(u => u.id === order.userId)
+  }))
+}
+```
+
+**Correct (O(1) per lookup):**
+
+```typescript
+function processOrders(orders: Order[], users: User[]) {
+  const userById = new Map(users.map(u => [u.id, u]))
+
+  return orders.map(order => ({
+    ...order,
+    user: userById.get(order.userId)
+  }))
+}
+```
+
+Build map once (O(n)), then all lookups are O(1).
+For 1000 orders × 1000 users: 1M ops → 2K ops.
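+
+When several call sites need the same kind of index, the map construction can be factored out. `indexBy` below is a hypothetical helper, and the `User` and `Order` shapes are assumptions made for illustration. If two items share a key, the later one overwrites the earlier one.
+
+```typescript
+// Hypothetical helper: builds a Map keyed by the value returned from getKey.
+function indexBy<T, K>(items: readonly T[], getKey: (item: T) => K): Map<K, T> {
+  const index = new Map<K, T>()
+  for (const item of items) {
+    index.set(getKey(item), item)
+  }
+  return index
+}
+
+interface User { id: string; name: string }
+interface Order { id: string; userId: string }
+
+const users: User[] = [{ id: 'u1', name: 'Ada' }, { id: 'u2', name: 'Lin' }]
+const orders: Order[] = [{ id: 'o1', userId: 'u2' }, { id: 'o2', userId: 'u9' }]
+
+// One O(n) pass to build the index, then O(1) per order.
+const userById = indexBy(users, u => u.id)
+const enriched = orders.map(order => ({ ...order, user: userById.get(order.userId) }))
+```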
--- a/.claude/skills/vercel-react-best-practices/rules/js-length-check-first.md
+++ b/.claude/skills/vercel-react-best-practices/rules/js-length-check-first.md
@@ -0,0 +1,49 @@
+---
+title: Early Length Check for Array Comparisons
+impact: MEDIUM-HIGH
+impactDescription: avoids expensive operations when lengths differ
+tags: javascript, arrays, performance, optimization, comparison
+---
+
+## Early Length Check for Array Comparisons
+
+When comparing arrays with expensive operations (sorting, deep equality, serialization), check lengths first. If lengths differ, the arrays cannot be equal.
+
+In real-world applications, this optimization is especially valuable when the comparison runs in hot paths (event handlers, render loops).
+
+**Incorrect (always runs expensive comparison):**
+
+```typescript
+function hasChanges(current: string[], original: string[]) {
+  // Always sorts and joins, even when lengths differ
+  return current.sort().join() !== original.sort().join()
+}
+```
+
+Two O(n log n) sorts run even when `current.length` is 5 and `original.length` is 100. There is also overhead of joining the arrays and comparing the strings.
+
+**Correct (O(1) length check first):**
+
+```typescript
+function hasChanges(current: string[], original: string[]) {
+  // Early return if lengths differ
+  if (current.length !== original.length) {
+    return true
+  }
+  // Only sort and compare when lengths match
+  const currentSorted = current.toSorted()
+  const originalSorted = original.toSorted()
+  for (let i = 0; i < currentSorted.length; i++) {
+    if (currentSorted[i] !== originalSorted[i]) {
+      return true
+    }
+  }
+  return false
+}
+```
+
+This new approach is more efficient because:
+- It avoids the overhead of sorting and joining the arrays when lengths differ
+- It avoids consuming memory for the joined strings (especially important for large arrays)
+- It avoids mutating the original arrays
+- It returns early when a difference is found
--- a/.claude/skills/vercel-react-best-practices/rules/js-min-max-loop.md
+++ b/.claude/skills/vercel-react-best-practices/rules/js-min-max-loop.md
@@ -0,0 +1,82 @@
+---
+title: Use Loop for Min/Max Instead of Sort
+impact: LOW
+impactDescription: O(n) instead of O(n log n)
+tags: javascript, arrays, performance, sorting, algorithms
+---
+
+## Use Loop for Min/Max Instead of Sort
+
+Finding the smallest or largest element only requires a single pass through the array. Sorting is wasteful and slower.
+
+**Incorrect (O(n log n) - sort to find latest):**
+
+```typescript
+interface Project {
+  id: string
+  name: string
+  updatedAt: number
+}
+
+function getLatestProject(projects: Project[]) {
+  const sorted = [...projects].sort((a, b) => b.updatedAt - a.updatedAt)
+  return sorted[0]
+}
+```
+
+Sorts the entire array just to find the maximum value.
+
+**Incorrect (O(n log n) - sort for oldest and newest):**
+
+```typescript
+function getOldestAndNewest(projects: Project[]) {
+  const sorted = [...projects].sort((a, b) => a.updatedAt - b.updatedAt)
+  return { oldest: sorted[0], newest: sorted[sorted.length - 1] }
+}
+```
+
+Still sorts unnecessarily when only min/max are needed.
+
+**Correct (O(n) - single loop):**
+
+```typescript
+function getLatestProject(projects: Project[]) {
+  if (projects.length === 0) return null
+  
+  let latest = projects[0]
+  
+  for (let i = 1; i < projects.length; i++) {
+    if (projects[i].updatedAt > latest.updatedAt) {
+      latest = projects[i]
+    }
+  }
+  
+  return latest
+}
+
+function getOldestAndNewest(projects: Project[]) {
+  if (projects.length === 0) return { oldest: null, newest: null }
+  
+  let oldest = projects[0]
+  let newest = projects[0]
+  
+  for (let i = 1; i < projects.length; i++) {
+    if (projects[i].updatedAt < oldest.updatedAt) oldest = projects[i]
+    if (projects[i].updatedAt > newest.updatedAt) newest = projects[i]
+  }
+  
+  return { oldest, newest }
+}
+```
+
+Single pass through the array, no copying, no sorting.
+
+**Alternative (Math.min/Math.max for small arrays):**
+
+```typescript
+const numbers = [5, 2, 8, 1, 9]
+const min = Math.min(...numbers)
+const max = Math.max(...numbers)
+```
+
+This works for small arrays. For very large arrays it can be slower or throw a `RangeError`, because spreading passes every element as a separate function argument and engines limit the argument count. Use the loop approach for reliability.
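+
+For plain number arrays, the loop approach might look like the sketch below. `minMax` is a hypothetical helper, and the generated test array is arbitrary. It never spreads the array into function arguments, so it does not depend on engine argument limits.
+
+```typescript
+function minMax(values: readonly number[]): { min: number; max: number } | null {
+  if (values.length === 0) return null
+
+  let min = values[0]
+  let max = values[0]
+
+  // Single pass, no copies and no function-argument spreading
+  for (let i = 1; i < values.length; i++) {
+    const v = values[i]
+    if (v < min) min = v
+    if (v > max) max = v
+  }
+
+  return { min, max }
+}
+
+const small = minMax([5, 2, 8, 1, 9])
+const large = minMax(Array.from({ length: 500000 }, (_, i) => (i * 7919) % 1000003))
+```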
--- a/.claude/skills/vercel-react-best-practices/rules/js-set-map-lookups.md
+++ b/.claude/skills/vercel-react-best-practices/rules/js-set-map-lookups.md
@@ -0,0 +1,24 @@
+---
+title: Use Set/Map for O(1) Lookups
+impact: LOW-MEDIUM
+impactDescription: O(n) to O(1)
+tags: javascript, set, map, data-structures, performance
+---
+
+## Use Set/Map for O(1) Lookups
+
+Convert arrays to Set/Map for repeated membership checks.
+
+**Incorrect (O(n) per check):**
+
+```typescript
+const allowedIds = ['a', 'b', 'c', ...]
+items.filter(item => allowedIds.includes(item.id))
+```
+
+**Correct (O(1) per check):**
+
+```typescript
+const allowedIds = new Set(['a', 'b', 'c', ...])
+items.filter(item => allowedIds.has(item.id))
+```
--- a/.claude/skills/vercel-react-best-practices/rules/js-tosorted-immutable.md
+++ b/.claude/skills/vercel-react-best-practices/rules/js-tosorted-immutable.md
@@ -0,0 +1,57 @@
+---
+title: Use toSorted() Instead of sort() for Immutability
+impact: MEDIUM-HIGH
+impactDescription: prevents mutation bugs in React state
+tags: javascript, arrays, immutability, react, state, mutation
+---
+
+## Use toSorted() Instead of sort() for Immutability
+
+`.sort()` mutates the array in place, which can cause bugs with React state and props. Use `.toSorted()` to create a new sorted array without mutation.
+
+**Incorrect (mutates original array):**
+
+```tsx
+function UserList({ users }: { users: User[] }) {
+  // Mutates the users prop array!
+  const sorted = useMemo(
+    () => users.sort((a, b) => a.name.localeCompare(b.name)),
+    [users]
+  )
+  return <div>{sorted.map(renderUser)}</div>
+}
+```
+
+**Correct (creates new array):**
+
+```tsx
+function UserList({ users }: { users: User[] }) {
+  // Creates new sorted array, original unchanged
+  const sorted = useMemo(
+    () => users.toSorted((a, b) => a.name.localeCompare(b.name)),
+    [users]
+  )
+  return <div>{sorted.map(renderUser)}</div>
+}
+```
+
+**Why this matters in React:**
+
+1. Props/state mutations break React's immutability model - React expects props and state to be treated as read-only
+2. Causes stale closure bugs - Mutating arrays inside closures (callbacks, effects) can lead to unexpected behavior
+
+**Browser support (fallback for older browsers):**
+
+`.toSorted()` is available in all modern browsers (Chrome 110+, Safari 16+, Firefox 115+) and in Node.js 20+. For older environments, copy the array with the spread operator before sorting:
+
+```typescript
+// Fallback for older browsers
+const sorted = [...items].sort((a, b) => a.value - b.value)
+```
+
+**Other immutable array methods:**
+
+- `.toSorted()` - immutable sort
+- `.toReversed()` - immutable reverse
+- `.toSpliced()` - immutable splice
+- `.with()` - immutable element replacement
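+
+Where these methods are unavailable, each one can be approximated with the same copy-first approach as the `sort()` fallback. The variable names below are illustrative only.
+
+```typescript
+const original = [3, 1, 2]
+
+// Equivalent of original.toReversed()
+const reversed = [...original].reverse()
+
+// Equivalent of original.toSpliced(1, 1, 9): copy first, then splice the copy
+const spliced = [...original]
+spliced.splice(1, 1, 9)
+
+// Equivalent of original.with(0, 7): copy first, then assign
+const replaced = [...original]
+replaced[0] = 7
+```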
--- a/.claude/skills/vercel-react-best-practices/rules/rendering-activity.md
+++ b/.claude/skills/vercel-react-best-practices/rules/rendering-activity.md
@@ -0,0 +1,26 @@
+---
+title: Use Activity Component for Show/Hide
+impact: MEDIUM
+impactDescription: preserves state/DOM
+tags: rendering, activity, visibility, state-preservation
+---
+
+## Use Activity Component for Show/Hide
+
+Use React's `<Activity>` to preserve state/DOM for expensive components that frequently toggle visibility.
+
+**Usage:**
+
+```tsx
+import { Activity } from 'react'
+
+function Dropdown({ isOpen }: Props) {
+  return (
+    <Activity mode={isOpen ? 'visible' : 'hidden'}>
+      <ExpensiveMenu />
+    </Activity>
+  )
+}
+```
+
+Avoids expensive re-renders and state loss.
--- a/.claude/skills/vercel-react-best-practices/rules/rendering-animate-svg-wrapper.md
+++ b/.claude/skills/vercel-react-best-practices/rules/rendering-animate-svg-wrapper.md
@@ -0,0 +1,47 @@
+---
+title: Animate SVG Wrapper Instead of SVG Element
+impact: LOW
+impactDescription: enables hardware acceleration
+tags: rendering, svg, css, animation, performance
+---
+
+## Animate SVG Wrapper Instead of SVG Element
+
+Many browsers don't have hardware acceleration for CSS3 animations on SVG elements. Wrap SVG in a `<div>` and animate the wrapper instead.
+
+**Incorrect (animating SVG directly - no hardware acceleration):**
+
+```tsx
+function LoadingSpinner() {
+  return (
+    <svg 
+      className="animate-spin"
+      width="24" 
+      height="24" 
+      viewBox="0 0 24 24"
+    >
+      <circle cx="12" cy="12" r="10" stroke="currentColor" />
+    </svg>
+  )
+}
+```
+
+**Correct (animating wrapper div - hardware accelerated):**
+
+```tsx
+function LoadingSpinner() {
+  return (
+    <div className="animate-spin">
+      <svg 
+        width="24" 
+        height="24" 
+        viewBox="0 0 24 24"
+      >
+        <circle cx="12" cy="12" r="10" stroke="currentColor" />
+      </svg>
+    </div>
+  )
+}
+```
+
+This applies to all CSS transforms and transitions (`transform`, `opacity`, `translate`, `scale`, `rotate`). The wrapper div allows browsers to use GPU acceleration for smoother animations.
--- a/.claude/skills/vercel-react-best-practices/rules/rendering-conditional-render.md
+++ b/.claude/skills/vercel-react-best-practices/rules/rendering-conditional-render.md
@@ -0,0 +1,40 @@
+---
+title: Use Explicit Conditional Rendering
+impact: LOW
+impactDescription: prevents rendering 0 or NaN
+tags: rendering, conditional, jsx, falsy-values
+---
+
+## Use Explicit Conditional Rendering
+
+Use explicit ternary operators (`? :`) instead of `&&` for conditional rendering when the condition can be `0`, `NaN`, or other falsy values that render.
+
+**Incorrect (renders "0" when count is 0):**
+
+```tsx
+function Badge({ count }: { count: number }) {
+  return (
+    <div>
+      {count && <span className="badge">{count}</span>}
+    </div>
+  )
+}
+
+// When count = 0, renders: <div>0</div>
+// When count = 5, renders: <div><span class="badge">5</span></div>
+```
+
+**Correct (renders nothing when count is 0):**
+
+```tsx
+function Badge({ count }: { count: number }) {
+  return (
+    <div>
+      {count > 0 ? <span className="badge">{count}</span> : null}
+    </div>
+  )
+}
+
+// When count = 0, renders: <div></div>
+// When count = 5, renders: <div><span class="badge">5</span></div>
+```
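+
+The behavior comes from plain JavaScript rather than JSX: `&&` returns its left operand when that operand is falsy, and React renders numbers such as `0` while skipping `false`, `null`, and `undefined`. The framework-free sketch below shows the values each form produces. `withAnd` and `withTernary` are hypothetical names, and the strings stand in for the JSX elements.
+
+```typescript
+// `&&` yields the left operand itself when it is falsy,
+// so the number 0 (not `false`) is what would reach the render output.
+function withAnd(count: number): number | string {
+  return count && `badge:${count}`
+}
+
+// The ternary always yields either the content or null.
+function withTernary(count: number): string | null {
+  return count > 0 ? `badge:${count}` : null
+}
+
+const andZero = withAnd(0)
+const ternaryZero = withTernary(0)
+```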
--- a/.claude/skills/vercel-react-best-practices/rules/rendering-content-visibility.md
+++ b/.claude/skills/vercel-react-best-practices/rules/rendering-content-visibility.md
@@ -0,0 +1,38 @@
+---
+title: CSS content-visibility for Long Lists
+impact: HIGH
+impactDescription: faster initial render
+tags: rendering, css, content-visibility, long-lists
+---
+
+## CSS content-visibility for Long Lists
+
+Apply `content-visibility: auto` to defer off-screen rendering.
+
+**CSS:**
+
+```css
+.message-item {
+  content-visibility: auto;
+  contain-intrinsic-size: 0 80px;
+}
+```
+
+**Example:**
+
+```tsx
+function MessageList({ messages }: { messages: Message[] }) {
+  return (
+    <div className="overflow-y-auto h-screen">
+      {messages.map(msg => (
+        <div key={msg.id} className="message-item">
+          <Avatar user={msg.author} />
+          <div>{msg.content}</div>
+        </div>
+      ))}
+    </div>
+  )
+}
+```
+
+For 1000 messages, browser skips layout/paint for ~990 off-screen items (10× faster initial render).
--- a/.claude/skills/vercel-react-best-practices/rules/rendering-hoist-jsx.md
+++ b/.claude/skills/vercel-react-best-practices/rules/rendering-hoist-jsx.md
@@ -0,0 +1,46 @@
+---
+title: Hoist Static JSX Elements
+impact: LOW
+impactDescription: avoids re-creation
+tags: rendering, jsx, static, optimization
+---
+
+## Hoist Static JSX Elements
+
+Extract static JSX outside components to avoid re-creation.
+
+**Incorrect (recreates element every render):**
+
+```tsx
+function LoadingSkeleton() {
+  return <div className="animate-pulse h-20 bg-gray-200" />
+}
+
+function Container() {
+  return (
+    <div>
+      {loading && <LoadingSkeleton />}
+    </div>
+  )
+}
+```
+
+**Correct (reuses same element):**
+
+```tsx
+const loadingSkeleton = (
+  <div className="animate-pulse h-20 bg-gray-200" />
+)
+
+function Container() {
+  return (
+    <div>
+      {loading && loadingSkeleton}
+    </div>
+  )
+}
+```
+
+This is especially helpful for large and static SVG nodes, which can be expensive to recreate on every render.
+
+**Note:** If your project has [React Compiler](https://react.dev/learn/react-compiler) enabled, the compiler automatically hoists static JSX elements and optimizes component re-renders, making manual hoisting unnecessary.
--- a/.claude/skills/vercel-react-best-practices/rules/rendering-hydration-no-flicker.md
+++ b/.claude/skills/vercel-react-best-practices/rules/rendering-hydration-no-flicker.md
@@ -0,0 +1,82 @@
+---
+title: Prevent Hydration Mismatch Without Flickering
+impact: MEDIUM
+impactDescription: avoids visual flicker and hydration errors
+tags: rendering, ssr, hydration, localStorage, flicker
+---
+
+## Prevent Hydration Mismatch Without Flickering
+
+When rendering content that depends on client-side storage (localStorage, cookies), avoid both SSR breakage and post-hydration flickering by injecting a synchronous script that updates the DOM before React hydrates.
+
+**Incorrect (breaks SSR):**
+
+```tsx
+function ThemeWrapper({ children }: { children: ReactNode }) {
+  // localStorage is not available on server - throws error
+  const theme = localStorage.getItem('theme') || 'light'
+  
+  return (
+    <div className={theme}>
+      {children}
+    </div>
+  )
+}
+```
+
+Server-side rendering will fail because `localStorage` is undefined.
+
+**Incorrect (visual flickering):**
+
+```tsx
+function ThemeWrapper({ children }: { children: ReactNode }) {
+  const [theme, setTheme] = useState('light')
+  
+  useEffect(() => {
+    // Runs after hydration - causes visible flash
+    const stored = localStorage.getItem('theme')
+    if (stored) {
+      setTheme(stored)
+    }
+  }, [])
+  
+  return (
+    <div className={theme}>
+      {children}
+    </div>
+  )
+}
+```
+
+Component first renders with default value (`light`), then updates after hydration, causing a visible flash of incorrect content.
+
+**Correct (no flicker, no hydration mismatch):**
+
+```tsx
+function ThemeWrapper({ children }: { children: ReactNode }) {
+  return (
+    <>
+      <div id="theme-wrapper">
+        {children}
+      </div>
+      <script
+        dangerouslySetInnerHTML={{
+          __html: `
+            (function() {
+              try {
+                var theme = localStorage.getItem('theme') || 'light';
+                var el = document.getElementById('theme-wrapper');
+                if (el) el.className = theme;
+              } catch (e) {}
+            })();
+          `,
+        }}
+      />
+    </>
+  )
+}
+```
+
+The inline script executes synchronously before showing the element, ensuring the DOM already has the correct value. No flickering, no hydration mismatch.
+
+This pattern is especially useful for theme toggles, user preferences, authentication states, and any client-only data that should render immediately without flashing default values.
--- a/.claude/skills/vercel-react-best-practices/rules/rendering-svg-precision.md
+++ b/.claude/skills/vercel-react-best-practices/rules/rendering-svg-precision.md
@@ -0,0 +1,28 @@
+---
+title: Optimize SVG Precision
+impact: LOW
+impactDescription: reduces file size
+tags: rendering, svg, optimization, svgo
+---
+
+## Optimize SVG Precision
+
+Reduce SVG coordinate precision to decrease file size. The optimal precision depends on the viewBox size, but in general reducing precision should be considered.
+
+**Incorrect (excessive precision):**
+
+```svg
+<path d="M 10.293847 20.847362 L 30.938472 40.192837" />
+```
+
+**Correct (1 decimal place):**
+
+```svg
+<path d="M 10.3 20.8 L 30.9 40.2" />
+```
+
+**Automate with SVGO:**
+
+```bash
+npx svgo --precision=1 --multipass icon.svg
+```
--- a/.claude/skills/vercel-react-best-practices/rules/rerender-defer-reads.md
+++ b/.claude/skills/vercel-react-best-practices/rules/rerender-defer-reads.md
@@ -0,0 +1,39 @@
+---
+title: Defer State Reads to Usage Point
+impact: MEDIUM
+impactDescription: avoids unnecessary subscriptions
+tags: rerender, searchParams, localStorage, optimization
+---
+
+## Defer State Reads to Usage Point
+
+Don't subscribe to dynamic state (searchParams, localStorage) if you only read it inside callbacks.
+
+**Incorrect (subscribes to all searchParams changes):**
+
+```tsx
+function ShareButton({ chatId }: { chatId: string }) {
+  const searchParams = useSearchParams()
+
+  const handleShare = () => {
+    const ref = searchParams.get('ref')
+    shareChat(chatId, { ref })
+  }
+
+  return <button onClick={handleShare}>Share</button>
+}
+```
+
+**Correct (reads on demand, no subscription):**
+
+```tsx
+function ShareButton({ chatId }: { chatId: string }) {
+  const handleShare = () => {
+    const params = new URLSearchParams(window.location.search)
+    const ref = params.get('ref')
+    shareChat(chatId, { ref })
+  }
+
+  return <button onClick={handleShare}>Share</button>
+}
+```
--- a/.claude/skills/vercel-react-best-practices/rules/rerender-dependencies.md
+++ b/.claude/skills/vercel-react-best-practices/rules/rerender-dependencies.md
@@ -0,0 +1,45 @@
+---
+title: Narrow Effect Dependencies
+impact: LOW
+impactDescription: minimizes effect re-runs
+tags: rerender, useEffect, dependencies, optimization
+---
+
+## Narrow Effect Dependencies
+
+Specify primitive dependencies instead of objects to minimize effect re-runs.
+
+**Incorrect (re-runs on any user field change):**
+
+```tsx
+useEffect(() => {
+  console.log(user.id)
+}, [user])
+```
+
+**Correct (re-runs only when id changes):**
+
+```tsx
+useEffect(() => {
+  console.log(user.id)
+}, [user.id])
+```
+
+**For derived state, compute outside effect:**
+
+```tsx
+// Incorrect: runs on width=767, 766, 765...
+useEffect(() => {
+  if (width < 768) {
+    enableMobileMode()
+  }
+}, [width])
+
+// Correct: runs only on boolean transition
+const isMobile = width < 768
+useEffect(() => {
+  if (isMobile) {
+    enableMobileMode()
+  }
+}, [isMobile])
+```
--- a/.claude/skills/vercel-react-best-practices/rules/rerender-derived-state.md
+++ b/.claude/skills/vercel-react-best-practices/rules/rerender-derived-state.md
@@ -0,0 +1,29 @@
+---
+title: Subscribe to Derived State
+impact: MEDIUM
+impactDescription: reduces re-render frequency
+tags: rerender, derived-state, media-query, optimization
+---
+
+## Subscribe to Derived State
+
+Subscribe to derived boolean state instead of continuous values to reduce re-render frequency.
+
+**Incorrect (re-renders on every pixel change):**
+
+```tsx
+function Sidebar() {
+  const width = useWindowWidth()  // updates continuously
+  const isMobile = width < 768
+  return <nav className={isMobile ? 'mobile' : 'desktop'} />
+}
+```
+
+**Correct (re-renders only when boolean changes):**
+
+```tsx
+function Sidebar() {
+  const isMobile = useMediaQuery('(max-width: 767px)')
+  return <nav className={isMobile ? 'mobile' : 'desktop'} />
+}
+```
--- a/.claude/skills/vercel-react-best-practices/rules/rerender-functional-setstate.md
+++ b/.claude/skills/vercel-react-best-practices/rules/rerender-functional-setstate.md
@@ -0,0 +1,74 @@
+---
+title: Use Functional setState Updates
+impact: MEDIUM
+impactDescription: prevents stale closures and unnecessary callback recreations
+tags: react, hooks, useState, useCallback, callbacks, closures
+---
+
+## Use Functional setState Updates
+
+When updating state based on the current state value, use the functional update form of setState instead of directly referencing the state variable. This prevents stale closures, eliminates unnecessary dependencies, and creates stable callback references.
+
+**Incorrect (requires state as dependency):**
+
+```tsx
+function TodoList() {
+  const [items, setItems] = useState(initialItems)
+  
+  // Callback must depend on items, recreated on every items change
+  const addItems = useCallback((newItems: Item[]) => {
+    setItems([...items, ...newItems])
+  }, [items])  // ❌ items dependency causes recreations
+  
+  // Risk of stale closure if dependency is forgotten
+  const removeItem = useCallback((id: string) => {
+    setItems(items.filter(item => item.id !== id))
+  }, [])  // ❌ Missing items dependency - will use stale items!
+  
+  return <ItemsEditor items={items} onAdd={addItems} onRemove={removeItem} />
+}
+```
+
+The first callback is recreated every time `items` changes, which can cause child components to re-render unnecessarily. The second callback has a stale closure bug—it will always reference the initial `items` value.
+
+**Correct (stable callbacks, no stale closures):**
+
+```tsx
+function TodoList() {
+  const [items, setItems] = useState(initialItems)
+  
+  // Stable callback, never recreated
+  const addItems = useCallback((newItems: Item[]) => {
+    setItems(curr => [...curr, ...newItems])
+  }, [])  // ✅ No dependencies needed
+  
+  // Always uses latest state, no stale closure risk
+  const removeItem = useCallback((id: string) => {
+    setItems(curr => curr.filter(item => item.id !== id))
+  }, [])  // ✅ Safe and stable
+  
+  return <ItemsEditor items={items} onAdd={addItems} onRemove={removeItem} />
+}
+```
+
+**Benefits:**
+
+1. **Stable callback references** - Callbacks don't need to be recreated when state changes
+2. **No stale closures** - Always operates on the latest state value
+3. **Fewer dependencies** - Simplifies dependency arrays and keeps callbacks from capturing outdated state values
+4. **Prevents bugs** - Eliminates the most common source of React closure bugs
+
+**When to use functional updates:**
+
+- Any setState that depends on the current state value
+- Inside useCallback/useMemo when state is needed
+- Event handlers that reference state
+- Async operations that update state
+
+**When direct updates are fine:**
+
+- Setting state to a static value: `setCount(0)`
+- Setting state from props/arguments only: `setName(newName)`
+- State doesn't depend on previous value
+
+**Note:** If your project has [React Compiler](https://react.dev/learn/react-compiler) enabled, the compiler can automatically optimize some cases, but functional updates are still recommended for correctness and to prevent stale closure bugs.
--- a/.claude/skills/vercel-react-best-practices/rules/rerender-lazy-state-init.md
+++ b/.claude/skills/vercel-react-best-practices/rules/rerender-lazy-state-init.md
@@ -0,0 +1,58 @@
+---
+title: Use Lazy State Initialization
+impact: MEDIUM
+impactDescription: wasted computation on every render
+tags: react, hooks, useState, performance, initialization
+---
+
+## Use Lazy State Initialization
+
+Pass a function to `useState` for expensive initial values. Without the function form, the initializer runs on every render even though the value is only used once.
+
+**Incorrect (runs on every render):**
+
+```tsx
+function FilteredList({ items }: { items: Item[] }) {
+  // buildSearchIndex() runs on EVERY render, even after initialization
+  const [searchIndex, setSearchIndex] = useState(buildSearchIndex(items))
+  const [query, setQuery] = useState('')
+  
+  // When query changes, buildSearchIndex runs again unnecessarily
+  return <SearchResults index={searchIndex} query={query} />
+}
+
+function UserProfile() {
+  // JSON.parse runs on every render
+  const [settings, setSettings] = useState(
+    JSON.parse(localStorage.getItem('settings') || '{}')
+  )
+  
+  return <SettingsForm settings={settings} onChange={setSettings} />
+}
+```
+
+**Correct (runs only once):**
+
+```tsx
+function FilteredList({ items }: { items: Item[] }) {
+  // buildSearchIndex() runs ONLY on initial render
+  const [searchIndex, setSearchIndex] = useState(() => buildSearchIndex(items))
+  const [query, setQuery] = useState('')
+  
+  return <SearchResults index={searchIndex} query={query} />
+}
+
+function UserProfile() {
+  // JSON.parse runs only on initial render
+  const [settings, setSettings] = useState(() => {
+    const stored = localStorage.getItem('settings')
+    return stored ? JSON.parse(stored) : {}
+  })
+  
+  return <SettingsForm settings={settings} onChange={setSettings} />
+}
+```
+
+Use lazy initialization when computing initial values from localStorage/sessionStorage, building data structures (indexes, maps), reading from the DOM, or performing heavy transformations.
+
+For simple primitives (`useState(0)`), direct references (`useState(props.value)`), or cheap literals (`useState({})`), the function form is unnecessary.
--- a/.claude/skills/vercel-react-best-practices/rules/rerender-memo.md
+++ b/.claude/skills/vercel-react-best-practices/rules/rerender-memo.md
@@ -0,0 +1,44 @@
+---
+title: Extract to Memoized Components
+impact: MEDIUM
+impactDescription: enables early returns
+tags: rerender, memo, useMemo, optimization
+---
+
+## Extract to Memoized Components
+
+Extract expensive work into memoized components to enable early returns before computation.
+
+**Incorrect (computes avatar even when loading):**
+
+```tsx
+function Profile({ user, loading }: Props) {
+  const avatar = useMemo(() => {
+    const id = computeAvatarId(user)
+    return <Avatar id={id} />
+  }, [user])
+
+  if (loading) return <Skeleton />
+  return <div>{avatar}</div>
+}
+```
+
+**Correct (skips computation when loading):**
+
+```tsx
+const UserAvatar = memo(function UserAvatar({ user }: { user: User }) {
+  const id = useMemo(() => computeAvatarId(user), [user])
+  return <Avatar id={id} />
+})
+
+function Profile({ user, loading }: Props) {
+  if (loading) return <Skeleton />
+  return (
+    <div>
+      <UserAvatar user={user} />
+    </div>
+  )
+}
+```
+
+**Note:** If your project has [React Compiler](https://react.dev/learn/react-compiler) enabled, manual memoization with `memo()` and `useMemo()` is not necessary. The compiler automatically optimizes re-renders.
--- a/.claude/skills/vercel-react-best-practices/rules/rerender-transitions.md
+++ b/.claude/skills/vercel-react-best-practices/rules/rerender-transitions.md
@@ -0,0 +1,40 @@
+---
+title: Use Transitions for Non-Urgent Updates
+impact: MEDIUM
+impactDescription: maintains UI responsiveness
+tags: rerender, transitions, startTransition, performance
+---
+
+## Use Transitions for Non-Urgent Updates
+
+Mark frequent, non-urgent state updates as transitions to maintain UI responsiveness.
+
+**Incorrect (blocks UI on every scroll):**
+
+```tsx
+function ScrollTracker() {
+  const [scrollY, setScrollY] = useState(0)
+  useEffect(() => {
+    const handler = () => setScrollY(window.scrollY)
+    window.addEventListener('scroll', handler, { passive: true })
+    return () => window.removeEventListener('scroll', handler)
+  }, [])
+}
+```
+
+**Correct (non-blocking updates):**
+
+```tsx
+import { startTransition } from 'react'
+
+function ScrollTracker() {
+  const [scrollY, setScrollY] = useState(0)
+  useEffect(() => {
+    const handler = () => {
+      startTransition(() => setScrollY(window.scrollY))
+    }
+    window.addEventListener('scroll', handler, { passive: true })
+    return () => window.removeEventListener('scroll', handler)
+  }, [])
+}
+```
--- a/.claude/skills/vercel-react-best-practices/rules/server-after-nonblocking.md
+++ b/.claude/skills/vercel-react-best-practices/rules/server-after-nonblocking.md
@@ -0,0 +1,73 @@
+---
+title: Use after() for Non-Blocking Operations
+impact: MEDIUM
+impactDescription: faster response times
+tags: server, async, logging, analytics, side-effects
+---
+
+## Use after() for Non-Blocking Operations
+
+Use Next.js's `after()` to schedule work that should execute after a response is sent. This prevents logging, analytics, and other side effects from blocking the response.
+
+**Incorrect (blocks response):**
+
+```tsx
+import { logUserAction } from '@/app/utils'
+
+export async function POST(request: Request) {
+  // Perform mutation
+  await updateDatabase(request)
+  
+  // Logging blocks the response
+  const userAgent = request.headers.get('user-agent') || 'unknown'
+  await logUserAction({ userAgent })
+  
+  return new Response(JSON.stringify({ status: 'success' }), {
+    status: 200,
+    headers: { 'Content-Type': 'application/json' }
+  })
+}
+```
+
+**Correct (non-blocking):**
+
+```tsx
+import { after } from 'next/server'
+import { headers, cookies } from 'next/headers'
+import { logUserAction } from '@/app/utils'
+
+export async function POST(request: Request) {
+  // Perform mutation
+  await updateDatabase(request)
+  
+  // Log after response is sent
+  after(async () => {
+    const userAgent = (await headers()).get('user-agent') || 'unknown'
+    const sessionCookie = (await cookies()).get('session-id')?.value || 'anonymous'
+    
+    await logUserAction({ sessionCookie, userAgent })
+  })
+  
+  return new Response(JSON.stringify({ status: 'success' }), {
+    status: 200,
+    headers: { 'Content-Type': 'application/json' }
+  })
+}
+```
+
+The response is sent immediately while logging happens in the background.
+
+**Common use cases:**
+
+- Analytics tracking
+- Audit logging
+- Sending notifications
+- Cache invalidation
+- Cleanup tasks
+
+**Important notes:**
+
+- `after()` runs even if the response fails or redirects
+- Works in Server Actions, Route Handlers, and Server Components
+
+Reference: [https://nextjs.org/docs/app/api-reference/functions/after](https://nextjs.org/docs/app/api-reference/functions/after)
--- a/.claude/skills/vercel-react-best-practices/rules/server-cache-lru.md
+++ b/.claude/skills/vercel-react-best-practices/rules/server-cache-lru.md
@@ -0,0 +1,41 @@
+---
+title: Cross-Request LRU Caching
+impact: HIGH
+impactDescription: caches across requests
+tags: server, cache, lru, cross-request
+---
+
+## Cross-Request LRU Caching
+
+`React.cache()` only works within one request. For data shared across sequential requests (user clicks button A then button B), use an LRU cache.
+
+**Implementation:**
+
+```typescript
+import { LRUCache } from 'lru-cache'
+
+const cache = new LRUCache<string, any>({
+  max: 1000,
+  ttl: 5 * 60 * 1000  // 5 minutes
+})
+
+export async function getUser(id: string) {
+  const cached = cache.get(id)
+  if (cached) return cached
+
+  const user = await db.user.findUnique({ where: { id } })
+  cache.set(id, user)
+  return user
+}
+
+// Request 1: DB query, result cached
+// Request 2: cache hit, no DB query
+```
+
+Use when sequential user actions hit multiple endpoints needing the same data within seconds.
+
+**With Vercel's [Fluid Compute](https://vercel.com/docs/fluid-compute):** LRU caching is especially effective because multiple concurrent requests can share the same function instance and cache. This means the cache persists across requests without needing external storage like Redis.
+
+**In traditional serverless:** Each invocation runs in isolation, so consider Redis for cross-process caching.
+
+Reference: [https://github.com/isaacs/node-lru-cache](https://github.com/isaacs/node-lru-cache)
--- a/.claude/skills/vercel-react-best-practices/rules/server-cache-react.md
+++ b/.claude/skills/vercel-react-best-practices/rules/server-cache-react.md
@@ -0,0 +1,26 @@
+---
+title: Per-Request Deduplication with React.cache()
+impact: MEDIUM
+impactDescription: deduplicates within request
+tags: server, cache, react-cache, deduplication
+---
+
+## Per-Request Deduplication with React.cache()
+
+Use `React.cache()` for server-side request deduplication. Authentication and database queries benefit most.
+
+**Usage:**
+
+```typescript
+import { cache } from 'react'
+
+export const getCurrentUser = cache(async () => {
+  const session = await auth()
+  if (!session?.user?.id) return null
+  return await db.user.findUnique({
+    where: { id: session.user.id }
+  })
+})
+```
+
+Within a single request, multiple calls to `getCurrentUser()` execute the query only once.
--- a/.claude/skills/vercel-react-best-practices/rules/server-parallel-fetching.md
+++ b/.claude/skills/vercel-react-best-practices/rules/server-parallel-fetching.md
@@ -0,0 +1,79 @@
+---
+title: Parallel Data Fetching with Component Composition
+impact: CRITICAL
+impactDescription: eliminates server-side waterfalls
+tags: server, rsc, parallel-fetching, composition
+---
+
+## Parallel Data Fetching with Component Composition
+
+Within a tree, an async Server Component finishes its own `await`s before its children start rendering, so nested fetches run sequentially. Restructure with composition to parallelize data fetching.
+
+**Incorrect (Sidebar waits for Page's fetch to complete):**
+
+```tsx
+export default async function Page() {
+  const header = await fetchHeader()
+  return (
+    <div>
+      <div>{header}</div>
+      <Sidebar />
+    </div>
+  )
+}
+
+async function Sidebar() {
+  const items = await fetchSidebarItems()
+  return <nav>{items.map(renderItem)}</nav>
+}
+```
+
+**Correct (both fetch simultaneously):**
+
+```tsx
+async function Header() {
+  const data = await fetchHeader()
+  return <div>{data}</div>
+}
+
+async function Sidebar() {
+  const items = await fetchSidebarItems()
+  return <nav>{items.map(renderItem)}</nav>
+}
+
+export default function Page() {
+  return (
+    <div>
+      <Header />
+      <Sidebar />
+    </div>
+  )
+}
+```
+
+**Alternative with children prop:**
+
+```tsx
+async function Layout({ children }: { children: ReactNode }) {
+  const header = await fetchHeader()
+  return (
+    <div>
+      <div>{header}</div>
+      {children}
+    </div>
+  )
+}
+
+async function Sidebar() {
+  const items = await fetchSidebarItems()
+  return <nav>{items.map(renderItem)}</nav>
+}
+
+export default function Page() {
+  return (
+    <Layout>
+      <Sidebar />
+    </Layout>
+  )
+}
+```
--- a/.claude/skills/vercel-react-best-practices/rules/server-serialization.md
+++ b/.claude/skills/vercel-react-best-practices/rules/server-serialization.md
@@ -0,0 +1,38 @@
+---
+title: Minimize Serialization at RSC Boundaries
+impact: HIGH
+impactDescription: reduces data transfer size
+tags: server, rsc, serialization, props
+---
+
+## Minimize Serialization at RSC Boundaries
+
+The React Server/Client boundary serializes all object properties into strings and embeds them in the HTML response and subsequent RSC requests. This serialized data directly impacts page weight and load time, so **size matters a lot**. Only pass fields that the client actually uses.
+
+**Incorrect (serializes all 50 fields):**
+
+```tsx
+async function Page() {
+  const user = await fetchUser()  // 50 fields
+  return <Profile user={user} />
+}
+
+'use client' // in practice, the first statement of a separate client component file
+function Profile({ user }: { user: User }) {
+  return <div>{user.name}</div>  // uses 1 field
+}
+```
+
+**Correct (serializes only 1 field):**
+
+```tsx
+async function Page() {
+  const user = await fetchUser()
+  return <Profile name={user.name} />
+}
+
+'use client' // in practice, the first statement of a separate client component file
+function Profile({ name }: { name: string }) {
+  return <div>{name}</div>
+}
+```
--- a/.github/workflows/claude-ci-failure-auto-fix.yml
+++ b/.github/workflows/claude-ci-failure-auto-fix.yml
@@ -93,5 +93,5 @@ jobs:

            Error logs:
            ${{ toJSON(fromJSON(steps.failure_details.outputs.result).errorLogs) }}
-          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
+          claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
          claude_args: "--allowedTools 'Edit,MultiEdit,Write,Read,Glob,Grep,LS,Bash(git:*),Bash(bun:*),Bash(npm:*),Bash(npx:*),Bash(gh:*)'"
--- a/.github/workflows/claude-dependabot.yml
+++ b/.github/workflows/claude-dependabot.yml
@@ -7,7 +7,7 @@
 # - Provide actionable recommendations for the development team
 #
 # Triggered on: Dependabot PRs (opened, synchronize)
-# Requirements: ANTHROPIC_API_KEY secret must be configured
+# Requirements: CLAUDE_CODE_OAUTH_TOKEN secret must be configured

 name: Claude Dependabot PR Review

@@ -308,7 +308,7 @@ jobs:
        id: claude_review
        uses: anthropics/claude-code-action@v1
        with:
-          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
+          claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
          claude_args: |
            --allowedTools "Bash(npm:*),Bash(pnpm:*),Bash(poetry:*),Bash(git:*),Edit,Replace,NotebookEditCell,mcp__github_inline_comment__create_inline_comment,Bash(gh pr comment:*), Bash(gh pr diff:*), Bash(gh pr view:*)"
          prompt: |
--- a/.github/workflows/claude.yml
+++ b/.github/workflows/claude.yml
@@ -323,7 +323,7 @@ jobs:
        id: claude
        uses: anthropics/claude-code-action@v1
        with:
-          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
+          claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
          claude_args: |
            --allowedTools "Bash(npm:*),Bash(pnpm:*),Bash(poetry:*),Bash(git:*),Edit,Replace,NotebookEditCell,mcp__github_inline_comment__create_inline_comment,Bash(gh pr comment:*), Bash(gh pr diff:*), Bash(gh pr view:*), Bash(gh pr edit:*)"
            --model opus
--- a/.github/workflows/docs-block-sync.yml
+++ b/.github/workflows/docs-block-sync.yml
@@ -0,0 +1,78 @@
+name: Block Documentation Sync Check
+
+on:
+  push:
+    branches: [master, dev]
+    paths:
+      - "autogpt_platform/backend/backend/blocks/**"
+      - "docs/integrations/**"
+      - "autogpt_platform/backend/scripts/generate_block_docs.py"
+      - ".github/workflows/docs-block-sync.yml"
+  pull_request:
+    branches: [master, dev]
+    paths:
+      - "autogpt_platform/backend/backend/blocks/**"
+      - "docs/integrations/**"
+      - "autogpt_platform/backend/scripts/generate_block_docs.py"
+      - ".github/workflows/docs-block-sync.yml"
+
+jobs:
+  check-docs-sync:
+    runs-on: ubuntu-latest
+    timeout-minutes: 15
+
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+        with:
+          fetch-depth: 1
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.11"
+
+      - name: Set up Python dependency cache
+        uses: actions/cache@v4
+        with:
+          path: ~/.cache/pypoetry
+          key: poetry-${{ runner.os }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
+          restore-keys: |
+            poetry-${{ runner.os }}-
+
+      - name: Install Poetry
+        run: |
+          cd autogpt_platform/backend
+          HEAD_POETRY_VERSION=$(python3 ../../.github/workflows/scripts/get_package_version_from_lockfile.py poetry)
+          echo "Found Poetry version ${HEAD_POETRY_VERSION} in backend/poetry.lock"
+          curl -sSL https://install.python-poetry.org | POETRY_VERSION=$HEAD_POETRY_VERSION python3 -
+          echo "$HOME/.local/bin" >> $GITHUB_PATH
+
+      - name: Install dependencies
+        working-directory: autogpt_platform/backend
+        run: |
+          poetry install --only main
+          poetry run prisma generate
+
+      - name: Check block documentation is in sync
+        working-directory: autogpt_platform/backend
+        run: |
+          echo "Checking if block documentation is in sync with code..."
+          poetry run python scripts/generate_block_docs.py --check
+
+      - name: Show diff if out of sync
+        if: failure()
+        working-directory: autogpt_platform/backend
+        run: |
+          echo "::error::Block documentation is out of sync with code!"
+          echo ""
+          echo "To fix this, run the following command locally:"
+          echo "  cd autogpt_platform/backend && poetry run python scripts/generate_block_docs.py"
+          echo ""
+          echo "Then commit the updated documentation files."
+          echo ""
+          echo "Regenerating docs to show diff..."
+          poetry run python scripts/generate_block_docs.py
+          echo ""
+          echo "Changes detected:"
+          git diff ../../docs/integrations/ || true
--- a/.github/workflows/docs-claude-review.yml
+++ b/.github/workflows/docs-claude-review.yml
@@ -0,0 +1,95 @@
+name: Claude Block Docs Review
+
+on:
+  pull_request:
+    types: [opened, synchronize]
+    paths:
+      - "docs/integrations/**"
+      - "autogpt_platform/backend/backend/blocks/**"
+
+jobs:
+  claude-review:
+    # Only run for PRs from members/collaborators
+    if: |
+      github.event.pull_request.author_association == 'OWNER' ||
+      github.event.pull_request.author_association == 'MEMBER' ||
+      github.event.pull_request.author_association == 'COLLABORATOR'
+    runs-on: ubuntu-latest
+    timeout-minutes: 15
+    permissions:
+      contents: read
+      pull-requests: write
+      id-token: write
+
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.11"
+
+      - name: Set up Python dependency cache
+        uses: actions/cache@v4
+        with:
+          path: ~/.cache/pypoetry
+          key: poetry-${{ runner.os }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
+          restore-keys: |
+            poetry-${{ runner.os }}-
+
+      - name: Install Poetry
+        run: |
+          cd autogpt_platform/backend
+          HEAD_POETRY_VERSION=$(python3 ../../.github/workflows/scripts/get_package_version_from_lockfile.py poetry)
+          curl -sSL https://install.python-poetry.org | POETRY_VERSION=$HEAD_POETRY_VERSION python3 -
+          echo "$HOME/.local/bin" >> $GITHUB_PATH
+
+      - name: Install dependencies
+        working-directory: autogpt_platform/backend
+        run: |
+          poetry install --only main
+          poetry run prisma generate
+
+      - name: Run Claude Code Review
+        uses: anthropics/claude-code-action@v1
+        with:
+          claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
+          claude_args: |
+            --allowedTools "Read,Glob,Grep,Bash(gh pr comment:*),Bash(gh pr diff:*),Bash(gh pr view:*)"
+          prompt: |
+            You are reviewing a PR that modifies block documentation or block code for AutoGPT.
+
+            ## Your Task
+            Review the changes in this PR and provide constructive feedback. Focus on:
+
+            1. **Documentation Accuracy**: For any block code changes, verify that:
+               - Input/output tables in docs match the actual block schemas
+               - Description text accurately reflects what the block does
+               - Any new blocks have corresponding documentation
+
+            2. **Manual Content Quality**: Check manual sections (marked with `<!-- MANUAL: -->` markers):
+               - "How it works" sections should have clear technical explanations
+               - "Possible use case" sections should have practical, real-world examples
+               - Content should be helpful for users trying to understand the blocks
+
+            3. **Template Compliance**: Ensure docs follow the standard template:
+               - What it is (brief intro)
+               - What it does (description)
+               - How it works (technical explanation)
+               - Inputs table
+               - Outputs table
+               - Possible use case
+
+            4. **Cross-references**: Check that links and anchors are correct
+
+            ## Review Process
+            1. First, get the PR diff to see what changed: `gh pr diff ${{ github.event.pull_request.number }}`
+            2. Read any modified block files to understand the implementation
+            3. Read corresponding documentation files to verify accuracy
+            4. Provide your feedback as a PR comment
+
+            Be constructive and specific. If everything looks good, say so!
+            If there are issues, explain what's wrong and suggest how to fix it.
--- a/.github/workflows/docs-enhance.yml
+++ b/.github/workflows/docs-enhance.yml
@@ -0,0 +1,194 @@
+name: Enhance Block Documentation
+
+on:
+  workflow_dispatch:
+    inputs:
+      block_pattern:
+        description: 'Block file pattern to enhance (e.g., "google/*.md" or "*" for all blocks)'
+        required: true
+        default: '*'
+        type: string
+      dry_run:
+        description: 'Dry run mode - show proposed changes without committing'
+        type: boolean
+        default: true
+      max_blocks:
+        description: 'Maximum number of blocks to process (0 for unlimited)'
+        type: number
+        default: 10
+
+jobs:
+  enhance-docs:
+    runs-on: ubuntu-latest
+    timeout-minutes: 45
+    permissions:
+      contents: write
+      pull-requests: write
+      id-token: write
+
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+        with:
+          fetch-depth: 1
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.11"
+
+      - name: Set up Python dependency cache
+        uses: actions/cache@v4
+        with:
+          path: ~/.cache/pypoetry
+          key: poetry-${{ runner.os }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
+          restore-keys: |
+            poetry-${{ runner.os }}-
+
+      - name: Install Poetry
+        run: |
+          cd autogpt_platform/backend
+          HEAD_POETRY_VERSION=$(python3 ../../.github/workflows/scripts/get_package_version_from_lockfile.py poetry)
+          curl -sSL https://install.python-poetry.org | POETRY_VERSION=$HEAD_POETRY_VERSION python3 -
+          echo "$HOME/.local/bin" >> $GITHUB_PATH
+
+      - name: Install dependencies
+        working-directory: autogpt_platform/backend
+        run: |
+          poetry install --only main
+          poetry run prisma generate
+
+      - name: Run Claude Enhancement
+        uses: anthropics/claude-code-action@v1
+        with:
+          claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
+          claude_args: |
+            --allowedTools "Read,Edit,Glob,Grep,Write,Bash(git:*),Bash(gh:*),Bash(find:*),Bash(ls:*)"
+          prompt: |
+            You are enhancing block documentation for AutoGPT. Your task is to improve the MANUAL sections
+            of block documentation files by reading the actual block implementations and writing helpful content.
+
+            ## Configuration
+            - Block pattern: ${{ inputs.block_pattern }}
+            - Dry run: ${{ inputs.dry_run }}
+            - Max blocks to process: ${{ inputs.max_blocks }}
+
+            ## Your Task
+
+            1. **Find Documentation Files**
+               Find block documentation files matching the pattern in `docs/integrations/`
+               Pattern: ${{ inputs.block_pattern }}
+
+               Use: `find docs/integrations -name "*.md" -type f`
+
+            2. **For Each Documentation File** (up to ${{ inputs.max_blocks }} files):
+
+               a. Read the documentation file
+
+               b. Identify which block(s) it documents (look for the block class name)
+
+               c. Find and read the corresponding block implementation in `autogpt_platform/backend/backend/blocks/`
+
+               d. Improve the MANUAL sections:
+
+                  **"How it works" section** (within `<!-- MANUAL: how_it_works -->` markers):
+                  - Explain the technical flow of the block
+                  - Describe what APIs or services it connects to
+                  - Note any important configuration or prerequisites
+                  - Keep it concise but informative (2-4 paragraphs)
+
+                  **"Possible use case" section** (within `<!-- MANUAL: use_case -->` markers):
+                  - Provide 2-3 practical, real-world examples
+                  - Make them specific and actionable
+                  - Show how this block could be used in an automation workflow
+
+            3. **Important Rules**
+               - ONLY modify content within `<!-- MANUAL: -->` and `<!-- END MANUAL -->` markers
+               - Do NOT modify auto-generated sections (inputs/outputs tables, descriptions)
+               - Keep content accurate based on the actual block implementation
+               - Write for users who may not be technical experts
+
+            4. **Output**
+               ${{ inputs.dry_run == true && 'DRY RUN MODE: Show proposed changes for each file but do NOT actually edit the files. Describe what you would change.' || 'LIVE MODE: Actually edit the files to improve the documentation.' }}
+
+            ## Example Improvements
+
+            **Before (How it works):**
+            ```
+            _Add technical explanation here._
+            ```
+
+            **After (How it works):**
+            ```
+            This block connects to the GitHub API to retrieve issue information. When executed,
+            it authenticates using your GitHub credentials and fetches issue details including
+            title, body, labels, and assignees.
+
+            The block requires a valid GitHub OAuth connection with repository access permissions.
+            It supports both public and private repositories you have access to.
+            ```
+
+            **Before (Possible use case):**
+            ```
+            _Add practical use case examples here._
+            ```
+
+            **After (Possible use case):**
+            ```
+            **Customer Support Automation**: Monitor a GitHub repository for new issues with
+            the "bug" label, then automatically create a ticket in your support system and
+            notify the on-call engineer via Slack.
+
+            **Release Notes Generation**: When a new release is published, gather all closed
+            issues since the last release and generate a summary for your changelog.
+            ```
+
+            Begin by finding and listing the documentation files to process.
+
+      - name: Create PR with enhanced documentation
+        if: ${{ inputs.dry_run == false }}
+        env:
+          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+        run: |
+          # Check if there are changes
+          if git diff --quiet docs/integrations/; then
+            echo "No changes to commit"
+            exit 0
+          fi
+
+          # Configure git
+          git config user.name "github-actions[bot]"
+          git config user.email "github-actions[bot]@users.noreply.github.com"
+
+          # Create branch and commit
+          BRANCH_NAME="docs/enhance-blocks-$(date +%Y%m%d-%H%M%S)"
+          git checkout -b "$BRANCH_NAME"
+          git add docs/integrations/
+          git commit -m "docs: enhance block documentation with LLM-generated content
+
+          Pattern: ${{ inputs.block_pattern }}
+          Max blocks: ${{ inputs.max_blocks }}
+
+          🤖 Generated with [Claude Code](https://claude.com/claude-code)
+
+          Co-Authored-By: Claude <noreply@anthropic.com>"
+
+          # Push and create PR
+          git push -u origin "$BRANCH_NAME"
+          gh pr create \
+            --title "docs: LLM-enhanced block documentation" \
+            --body "## Summary
+          This PR contains LLM-enhanced documentation for block files matching pattern: \`${{ inputs.block_pattern }}\`
+
+          The following manual sections were improved:
+          - **How it works**: Technical explanations based on block implementations
+          - **Possible use case**: Practical, real-world examples
+
+          ## Review Checklist
+          - [ ] Content is accurate based on block implementations
+          - [ ] Examples are practical and helpful
+          - [ ] No auto-generated sections were modified
+
+          ---
+          🤖 Generated with [Claude Code](https://claude.com/claude-code)" \
+            --base dev
--- a/.github/workflows/platform-frontend-ci.yml
+++ b/.github/workflows/platform-frontend-ci.yml
@@ -128,7 +128,7 @@ jobs:
          token: ${{ secrets.GITHUB_TOKEN }}
          exitOnceUploaded: true

-  test:
+  e2e_test:
    runs-on: big-boi
    needs: setup
    strategy:
@@ -258,3 +258,39 @@ jobs:
      - name: Print Final Docker Compose logs
        if: always()
        run: docker compose -f ../docker-compose.yml logs
+
+  integration_test:
+    runs-on: ubuntu-latest
+    needs: setup
+
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+        with:
+          submodules: recursive
+
+      - name: Set up Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: "22.18.0"
+
+      - name: Enable corepack
+        run: corepack enable
+
+      - name: Restore dependencies cache
+        uses: actions/cache@v4
+        with:
+          path: ~/.pnpm-store
+          key: ${{ needs.setup.outputs.cache-key }}
+          restore-keys: |
+            ${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
+            ${{ runner.os }}-pnpm-
+
+      - name: Install dependencies
+        run: pnpm install --frozen-lockfile
+
+      - name: Generate API client
+        run: pnpm generate:api
+
+      - name: Run Integration Tests
+        run: pnpm test:unit
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -16,6 +16,32 @@ See `docs/content/platform/getting-started.md` for setup instructions.
 - Format Python code with `poetry run format`.
 - Format frontend code using `pnpm format`.

+
+## Frontend guidelines:
+
+See `/frontend/CONTRIBUTING.md` for complete patterns. Quick reference:
+
+1. **Pages**: Create in `src/app/(platform)/feature-name/page.tsx`
+   - Add `usePageName.ts` hook for logic
+   - Put sub-components in local `components/` folder
+2. **Components**: Structure as `ComponentName/ComponentName.tsx` + `useComponentName.ts` + `helpers.ts`
+   - Use design system components from `src/components/` (atoms, molecules, organisms)
+   - Never use `src/components/__legacy__/*`
+3. **Data fetching**: Use generated API hooks from `@/app/api/__generated__/endpoints/`
+   - Regenerate with `pnpm generate:api`
+   - Pattern: `use{Method}{Version}{OperationName}`
+4. **Styling**: Tailwind CSS only, use design tokens, Phosphor Icons only
+5. **Testing**: Add Storybook stories for new components, Playwright for E2E
+6. **Code conventions**: Function declarations (not arrow functions) for components/handlers
+   - Type component props as `interface Props { ... }` (not exported) unless the interface is needed outside the component
+   - Separate render logic from business logic (`component.tsx` + `useComponent.ts` + `helpers.ts`)
+   - Colocate state where possible and avoid large components; split into sub-components (in a local `/components` folder next to the parent component) when sensible
+   - Avoid large hooks; move logic into `helpers.ts` files when sensible
+   - Use function declarations for components, arrow functions only for callbacks
+   - No barrel files or `index.ts` re-exports
+   - Do not use `useCallback` or `useMemo` unless strictly needed
+   - Avoid comments unless the code is very complex
+
 ## Testing

 - Backend: `poetry run test` (runs pytest with a docker based postgres + prisma).
--- a/autogpt_platform/CLAUDE.md
+++ b/autogpt_platform/CLAUDE.md
@@ -201,7 +201,7 @@ If you get any pushback or hit complex block conditions check the new_blocks gui
 3. Write tests alongside the route file
 4. Run `poetry run test` to verify

-**Frontend feature development:**
+### Frontend guidelines:

 See `/frontend/CONTRIBUTING.md` for complete patterns. Quick reference:

@@ -217,6 +217,14 @@ See `/frontend/CONTRIBUTING.md` for complete patterns. Quick reference:
 4. **Styling**: Tailwind CSS only, use design tokens, Phosphor Icons only
 5. **Testing**: Add Storybook stories for new components, Playwright for E2E
 6. **Code conventions**: Function declarations (not arrow functions) for components/handlers
+- Component props should be `interface Props { ... }` (not exported) unless the interface needs to be used outside the component
+- Separate render logic from business logic (component.tsx + useComponent.ts + helpers.ts)
+- Colocate state when possible and avoid creating large components; use sub-components (in a local `/components` folder next to the parent component) when sensible
+- Avoid large hooks; abstract logic into `helpers.ts` files when sensible
+- Use function declarations for components, arrow functions only for callbacks
+- No barrel files or `index.ts` re-exports
+- Do not use `useCallback` or `useMemo` unless strictly needed
+- Avoid comments at all times unless the code is very complex

 ### Security Implementation

--- a/autogpt_platform/backend/backend/api/features/admin/execution_analytics_routes.py
+++ b/autogpt_platform/backend/backend/api/features/admin/execution_analytics_routes.py
@@ -28,6 +28,7 @@ from backend.executor.manager import get_db_async_client
 from backend.util.settings import Settings

 logger = logging.getLogger(__name__)
+settings = Settings()


 class ExecutionAnalyticsRequest(BaseModel):
@@ -63,6 +64,8 @@ class ExecutionAnalyticsResult(BaseModel):
    score: Optional[float]
    status: str  # "success", "failed", "skipped"
    error_message: Optional[str] = None
+    started_at: Optional[datetime] = None
+    ended_at: Optional[datetime] = None


 class ExecutionAnalyticsResponse(BaseModel):
@@ -224,11 +227,6 @@ async def generate_execution_analytics(
    )

    try:
-        # Validate model configuration
-        settings = Settings()
-        if not settings.secrets.openai_internal_api_key:
-            raise HTTPException(status_code=500, detail="OpenAI API key not configured")
-
        # Get database client
        db_client = get_db_async_client()

@@ -320,6 +318,8 @@ async def generate_execution_analytics(
                    ),
                    status="skipped",
                    error_message=None,  # Not an error - just already processed
+                    started_at=execution.started_at,
+                    ended_at=execution.ended_at,
                )
            )

@@ -349,6 +349,9 @@ async def _process_batch(
 ) -> list[ExecutionAnalyticsResult]:
    """Process a batch of executions concurrently."""

+    if not settings.secrets.openai_internal_api_key:
+        raise HTTPException(status_code=500, detail="OpenAI API key not configured")
+
    async def process_single_execution(execution) -> ExecutionAnalyticsResult:
        try:
            # Generate activity status and score using the specified model
@@ -387,6 +390,8 @@ async def _process_batch(
                    score=None,
                    status="skipped",
                    error_message="Activity generation returned None",
+                    started_at=execution.started_at,
+                    ended_at=execution.ended_at,
                )

            # Update the execution stats
@@ -416,6 +421,8 @@ async def _process_batch(
                summary_text=activity_response["activity_status"],
                score=activity_response["correctness_score"],
                status="success",
+                started_at=execution.started_at,
+                ended_at=execution.ended_at,
            )

        except Exception as e:
@@ -429,6 +436,8 @@ async def _process_batch(
                score=None,
                status="failed",
                error_message=str(e),
+                started_at=execution.started_at,
+                ended_at=execution.ended_at,
            )

    # Process all executions in the batch concurrently
--- a/autogpt_platform/backend/backend/api/features/chat/model.py
+++ b/autogpt_platform/backend/backend/api/features/chat/model.py
@@ -290,6 +290,11 @@ async def _cache_session(session: ChatSession) -> None:
    await async_redis.setex(redis_key, config.session_ttl, session.model_dump_json())


+async def cache_chat_session(session: ChatSession) -> None:
+    """Cache a chat session without persisting to the database."""
+    await _cache_session(session)
+
+
 async def _get_session_from_db(session_id: str) -> ChatSession | None:
    """Get a chat session from the database."""
    prisma_session = await chat_db.get_chat_session(session_id)
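
A cache-only write such as `cache_chat_session` is cheap enough to call while a response is still streaming, provided the caller throttles it. The following is a minimal sketch of such a throttle, assuming a hypothetical `CacheThrottle` helper that is not part of this change. The one-second default mirrors the interval used by the streaming code in this change, and the injectable `clock` exists only to make the sketch testable.

```python
import time
from typing import Callable


class CacheThrottle:
    """Decide whether partial content is worth caching again (hypothetical helper)."""

    def __init__(
        self,
        min_interval: float = 1.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.min_interval = min_interval
        self.clock = clock
        self.last_time = 0.0
        self.last_len = 0

    def should_cache(self, content_len: int) -> bool:
        """Return True only if enough time has passed and the content has grown."""
        now = self.clock()
        if now - self.last_time >= self.min_interval and content_len > self.last_len:
            # Record the attempt so the next call is measured from here.
            self.last_time = now
            self.last_len = content_len
            return True
        return False
```

A caller would check `should_cache(len(content))` on each text delta and await the cache write only when it returns `True`.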
--- a/autogpt_platform/backend/backend/api/features/chat/routes.py
+++ b/autogpt_platform/backend/backend/api/features/chat/routes.py
@@ -172,12 +172,12 @@ async def get_session(
        user_id: The optional authenticated user ID, or None for anonymous access.

    Returns:
-        SessionDetailResponse: Details for the requested session; raises NotFoundError if not found.
+        SessionDetailResponse: Details for the requested session. Raises NotFoundError if not found.

    """
    session = await get_chat_session(session_id, user_id)
    if not session:
-        raise NotFoundError(f"Session {session_id} not found")
+        raise NotFoundError(f"Session {session_id} not found.")

    messages = [message.model_dump() for message in session.messages]
    logger.info(
@@ -222,6 +222,8 @@ async def stream_chat_post(
    session = await _validate_and_get_session(session_id, user_id)

    async def event_generator() -> AsyncGenerator[str, None]:
+        chunk_count = 0
+        first_chunk_type: str | None = None
        async for chunk in chat_service.stream_chat_completion(
            session_id,
            request.message,
@@ -230,7 +232,26 @@ async def stream_chat_post(
            session=session,  # Pass pre-fetched session to avoid double-fetch
            context=request.context,
        ):
+            if chunk_count < 3:
+                logger.info(
+                    "Chat stream chunk",
+                    extra={
+                        "session_id": session_id,
+                        "chunk_type": str(chunk.type),
+                    },
+                )
+            if not first_chunk_type:
+                first_chunk_type = str(chunk.type)
+            chunk_count += 1
            yield chunk.to_sse()
+        logger.info(
+            "Chat stream completed",
+            extra={
+                "session_id": session_id,
+                "chunk_count": chunk_count,
+                "first_chunk_type": first_chunk_type,
+            },
+        )
        # AI SDK protocol termination
        yield "data: [DONE]\n\n"

@@ -275,6 +296,8 @@ async def stream_chat_get(
    session = await _validate_and_get_session(session_id, user_id)

    async def event_generator() -> AsyncGenerator[str, None]:
+        chunk_count = 0
+        first_chunk_type: str | None = None
        async for chunk in chat_service.stream_chat_completion(
            session_id,
            message,
@@ -282,7 +305,26 @@ async def stream_chat_get(
            user_id=user_id,
            session=session,  # Pass pre-fetched session to avoid double-fetch
        ):
+            if chunk_count < 3:
+                logger.info(
+                    "Chat stream chunk",
+                    extra={
+                        "session_id": session_id,
+                        "chunk_type": str(chunk.type),
+                    },
+                )
+            if not first_chunk_type:
+                first_chunk_type = str(chunk.type)
+            chunk_count += 1
            yield chunk.to_sse()
+        logger.info(
+            "Chat stream completed",
+            extra={
+                "session_id": session_id,
+                "chunk_count": chunk_count,
+                "first_chunk_type": first_chunk_type,
+            },
+        )
        # AI SDK protocol termination
        yield "data: [DONE]\n\n"
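
The chunk logging added to both `event_generator` functions above is identical, so it could be factored into a pass-through wrapper. The following is a minimal sketch, assuming a hypothetical `log_stream` helper and a stand-in `Chunk` class with a `type` attribute; neither name is part of this change.

```python
import asyncio
import logging
from collections.abc import AsyncGenerator, AsyncIterator
from dataclasses import dataclass

logger = logging.getLogger(__name__)


@dataclass
class Chunk:
    """Stand-in for the real stream chunk classes (hypothetical)."""

    type: str


async def log_stream(
    chunks: AsyncIterator[Chunk],
    session_id: str,
    stats: dict,
    log_first: int = 3,
) -> AsyncGenerator[Chunk, None]:
    """Pass chunks through unchanged, logging the first few and recording totals."""
    stats["chunk_count"] = 0
    stats["first_chunk_type"] = None
    async for chunk in chunks:
        if stats["chunk_count"] < log_first:
            logger.info(
                "Chat stream chunk",
                extra={"session_id": session_id, "chunk_type": str(chunk.type)},
            )
        if stats["first_chunk_type"] is None:
            stats["first_chunk_type"] = str(chunk.type)
        stats["chunk_count"] += 1
        yield chunk
    logger.info("Chat stream completed", extra={"session_id": session_id, **stats})


async def fake_source() -> AsyncGenerator[Chunk, None]:
    """Emit a fixed chunk sequence for demonstration."""
    for chunk_type in ("start", "text-delta", "text-delta", "finish"):
        yield Chunk(chunk_type)


async def collect() -> tuple[list[str], dict]:
    """Drain the wrapped stream and return the chunk types plus the stats."""
    stats: dict = {}
    seen = [c.type async for c in log_stream(fake_source(), "s1", stats)]
    return seen, stats
```

Running `asyncio.run(collect())` drains the wrapped stream. Each route would then only need to convert the yielded chunks to SSE.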

--- a/autogpt_platform/backend/backend/api/features/chat/service.py
+++ b/autogpt_platform/backend/backend/api/features/chat/service.py
@@ -1,15 +1,18 @@
 import asyncio
 import logging
+import time
+from asyncio import CancelledError
 from collections.abc import AsyncGenerator
 from typing import Any

 import orjson
-from langfuse import Langfuse
+from langfuse import get_client, propagate_attributes
+from langfuse.openai import openai  # type: ignore
 from openai import (
    APIConnectionError,
    APIError,
    APIStatusError,
-    AsyncOpenAI,
+    PermissionDeniedError,
    RateLimitError,
 )
 from openai.types.chat import ChatCompletionChunk, ChatCompletionToolParam
@@ -21,12 +24,12 @@ from backend.data.understanding import (
 from backend.util.exceptions import NotFoundError
 from backend.util.settings import Settings

-from . import db as chat_db
 from .config import ChatConfig
 from .model import (
    ChatMessage,
    ChatSession,
    Usage,
+    cache_chat_session,
    get_chat_session,
    update_session_title,
    upsert_chat_session,
@@ -50,10 +53,10 @@ logger = logging.getLogger(__name__)

 config = ChatConfig()
 settings = Settings()
-client = AsyncOpenAI(api_key=config.api_key, base_url=config.base_url)
+client = openai.AsyncOpenAI(api_key=config.api_key, base_url=config.base_url)

-# Langfuse client (lazy initialization)
-_langfuse_client: Langfuse | None = None
+
+langfuse = get_client()


 class LangfuseNotConfiguredError(Exception):
@@ -69,65 +72,6 @@ def _is_langfuse_configured() -> bool:
    )


-def _get_langfuse_client() -> Langfuse:
-    """Get or create the Langfuse client for prompt management and tracing."""
-    global _langfuse_client
-    if _langfuse_client is None:
-        if not _is_langfuse_configured():
-            raise LangfuseNotConfiguredError(
-                "Langfuse is not configured. The chat feature requires Langfuse for prompt management. "
-                "Please set the LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY environment variables."
-            )
-        _langfuse_client = Langfuse(
-            public_key=settings.secrets.langfuse_public_key,
-            secret_key=settings.secrets.langfuse_secret_key,
-            host=settings.secrets.langfuse_host or "https://cloud.langfuse.com",
-        )
-    return _langfuse_client
-
-
-def _get_environment() -> str:
-    """Get the current environment name for Langfuse tagging."""
-    return settings.config.app_env.value
-
-
-def _get_langfuse_prompt() -> str:
-    """Fetch the latest production prompt from Langfuse.
-
-    Returns:
-        The compiled prompt text from Langfuse.
-
-    Raises:
-        Exception: If Langfuse is unavailable or prompt fetch fails.
-    """
-    try:
-        langfuse = _get_langfuse_client()
-        # cache_ttl_seconds=0 disables SDK caching to always get the latest prompt
-        prompt = langfuse.get_prompt(config.langfuse_prompt_name, cache_ttl_seconds=0)
-        compiled = prompt.compile()
-        logger.info(
-            f"Fetched prompt '{config.langfuse_prompt_name}' from Langfuse "
-            f"(version: {prompt.version})"
-        )
-        return compiled
-    except Exception as e:
-        logger.error(f"Failed to fetch prompt from Langfuse: {e}")
-        raise
-
-
-async def _is_first_session(user_id: str) -> bool:
-    """Check if this is the user's first chat session.
-
-    Returns True if the user has 1 or fewer sessions (meaning this is their first).
-    """
-    try:
-        session_count = await chat_db.get_user_session_count(user_id)
-        return session_count <= 1
-    except Exception as e:
-        logger.warning(f"Failed to check session count for user {user_id}: {e}")
-        return False  # Default to non-onboarding if we can't check
-
-
 async def _build_system_prompt(user_id: str | None) -> tuple[str, Any]:
    """Build the full system prompt including business understanding if available.

@@ -139,8 +83,6 @@ async def _build_system_prompt(user_id: str | None) -> tuple[str, Any]:
        Tuple of (compiled prompt string, Langfuse prompt object for tracing)
    """

-    langfuse = _get_langfuse_client()
-
    # cache_ttl_seconds=0 disables SDK caching to always get the latest prompt
    prompt = langfuse.get_prompt(config.langfuse_prompt_name, cache_ttl_seconds=0)

@@ -158,7 +100,7 @@ async def _build_system_prompt(user_id: str | None) -> tuple[str, Any]:
        context = "This is the first time you are meeting the user. Greet them and introduce them to the platform"

    compiled = prompt.compile(users_information=context)
-    return compiled, prompt
+    return compiled, understanding


 async def _generate_session_title(message: str) -> str | None:
@@ -217,6 +159,7 @@ async def assign_user_to_session(
 async def stream_chat_completion(
    session_id: str,
    message: str | None = None,
+    tool_call_response: str | None = None,
    is_user_message: bool = True,
    user_id: str | None = None,
    retry_count: int = 0,
@@ -256,11 +199,6 @@ async def stream_chat_completion(
        yield StreamFinish()
        return

-    # Langfuse observations will be created after session is loaded (need messages for input)
-    # Initialize to None so finally block can safely check and end them
-    trace = None
-    generation = None
-
    # Only fetch from Redis if session not provided (initial call)
    if session is None:
        session = await get_chat_session(session_id, user_id)
@@ -299,9 +237,6 @@ async def stream_chat_completion(
            f"new message_count={len(session.messages)}"
        )

-    if len(session.messages) > config.max_context_messages:
-        raise ValueError(f"Max messages exceeded: {config.max_context_messages}")
-
    logger.info(
        f"Upserting session: {session.session_id} with user id {session.user_id}, "
        f"message_count={len(session.messages)}"
@@ -339,297 +274,349 @@ async def stream_chat_completion(
            asyncio.create_task(_update_title())

    # Build system prompt with business understanding
-    system_prompt, langfuse_prompt = await _build_system_prompt(user_id)
-
-    # Build input messages including system prompt for complete Langfuse logging
-    trace_input_messages = [{"role": "system", "content": system_prompt}] + [
-        m.model_dump() for m in session.messages
-    ]
+    system_prompt, understanding = await _build_system_prompt(user_id)

    # Create Langfuse trace for this LLM call (each call gets its own trace, grouped by session_id)
    # Using v3 SDK: start_observation creates a root span, update_trace sets trace-level attributes
-    try:
-        langfuse = _get_langfuse_client()
-        env = _get_environment()
-        trace = langfuse.start_observation(
-            name="chat_completion",
-            input={"messages": trace_input_messages},
-            metadata={
-                "environment": env,
-                "model": config.model,
-                "message_count": len(session.messages),
-                "prompt_name": langfuse_prompt.name if langfuse_prompt else None,
-                "prompt_version": langfuse_prompt.version if langfuse_prompt else None,
-            },
-        )
-        # Set trace-level attributes (session_id, user_id, tags)
-        trace.update_trace(
+    input = message
+    if not message and tool_call_response:
+        input = tool_call_response
+
+    langfuse = get_client()
+    with langfuse.start_as_current_observation(
+        as_type="span",
+        name="user-copilot-request",
+        input=input,
+    ) as span:
+        with propagate_attributes(
            session_id=session_id,
            user_id=user_id,
-            tags=[env, "copilot"],
-        )
-    except Exception as e:
-        logger.warning(f"Failed to create Langfuse trace: {e}")
+            tags=["copilot"],
+            metadata={
+                "users_information": format_understanding_for_prompt(understanding)[
+                    :200
+                ]  # Langfuse only accepts up to 200 chars
+            },
+        ):

-    # Initialize variables that will be used in finally block (must be defined before try)
-    assistant_response = ChatMessage(
-        role="assistant",
-        content="",
-    )
-    accumulated_tool_calls: list[dict[str, Any]] = []
-
-    # Wrap main logic in try/finally to ensure Langfuse observations are always ended
-    try:
-        has_yielded_end = False
-        has_yielded_error = False
-        has_done_tool_call = False
-        has_received_text = False
-        text_streaming_ended = False
-        tool_response_messages: list[ChatMessage] = []
-        should_retry = False
-
-        # Generate unique IDs for AI SDK protocol
-        import uuid as uuid_module
-
-        message_id = str(uuid_module.uuid4())
-        text_block_id = str(uuid_module.uuid4())
-
-        # Yield message start
-        yield StreamStart(messageId=message_id)
-
-        # Create Langfuse generation for each LLM call, linked to the prompt
-        # Using v3 SDK: start_observation with as_type="generation"
-        generation = (
-            trace.start_observation(
-                as_type="generation",
-                name="llm_call",
-                model=config.model,
-                input={"messages": trace_input_messages},
-                prompt=langfuse_prompt,
+            # Initialize variables that will be used in finally block (must be defined before try)
+            assistant_response = ChatMessage(
+                role="assistant",
+                content="",
            )
-            if trace
-            else None
-        )
+            accumulated_tool_calls: list[dict[str, Any]] = []
+            has_saved_assistant_message = False
+            has_appended_streaming_message = False
+            last_cache_time = 0.0
+            last_cache_content_len = 0

-        try:
-            async for chunk in _stream_chat_chunks(
-                session=session,
-                tools=tools,
-                system_prompt=system_prompt,
-                text_block_id=text_block_id,
-            ):
+            # Streaming state flags; partial progress is saved by the except handlers below
+            has_yielded_end = False
+            has_yielded_error = False
+            has_done_tool_call = False
+            has_received_text = False
+            text_streaming_ended = False
+            tool_response_messages: list[ChatMessage] = []
+            should_retry = False

-                if isinstance(chunk, StreamTextStart):
-                    # Emit text-start before first text delta
-                    if not has_received_text:
+            # Generate unique IDs for AI SDK protocol
+            import uuid as uuid_module
+
+            message_id = str(uuid_module.uuid4())
+            text_block_id = str(uuid_module.uuid4())
+
+            # Yield message start
+            yield StreamStart(messageId=message_id)
+
+            try:
+                async for chunk in _stream_chat_chunks(
+                    session=session,
+                    tools=tools,
+                    system_prompt=system_prompt,
+                    text_block_id=text_block_id,
+                ):
+
+                    if isinstance(chunk, StreamTextStart):
+                        # Emit text-start before first text delta
+                        if not has_received_text:
+                            yield chunk
+                    elif isinstance(chunk, StreamTextDelta):
+                        delta = chunk.delta or ""
+                        assert assistant_response.content is not None
+                        assistant_response.content += delta
+                        has_received_text = True
+                        if not has_appended_streaming_message:
+                            session.messages.append(assistant_response)
+                            has_appended_streaming_message = True
+                        current_time = time.monotonic()
+                        content_len = len(assistant_response.content)
+                        if (
+                            current_time - last_cache_time >= 1.0
+                            and content_len > last_cache_content_len
+                        ):
+                            try:
+                                await cache_chat_session(session)
+                            except Exception as e:
+                                logger.warning(
+                                    f"Failed to cache partial session {session.session_id}: {e}"
+                                )
+                            last_cache_time = current_time
+                            last_cache_content_len = content_len
                        yield chunk
-                elif isinstance(chunk, StreamTextDelta):
-                    delta = chunk.delta or ""
-                    assert assistant_response.content is not None
-                    assistant_response.content += delta
-                    has_received_text = True
-                    yield chunk
-                elif isinstance(chunk, StreamTextEnd):
-                    # Emit text-end after text completes
-                    if has_received_text and not text_streaming_ended:
-                        text_streaming_ended = True
-                        yield chunk
-                elif isinstance(chunk, StreamToolInputStart):
-                    # Emit text-end before first tool call, but only if we've received text
-                    if has_received_text and not text_streaming_ended:
-                        yield StreamTextEnd(id=text_block_id)
-                        text_streaming_ended = True
-                    yield chunk
-                elif isinstance(chunk, StreamToolInputAvailable):
-                    # Accumulate tool calls in OpenAI format
-                    accumulated_tool_calls.append(
-                        {
-                            "id": chunk.toolCallId,
-                            "type": "function",
-                            "function": {
-                                "name": chunk.toolName,
-                                "arguments": orjson.dumps(chunk.input).decode("utf-8"),
-                            },
-                        }
-                    )
-                elif isinstance(chunk, StreamToolOutputAvailable):
-                    result_content = (
-                        chunk.output
-                        if isinstance(chunk.output, str)
-                        else orjson.dumps(chunk.output).decode("utf-8")
-                    )
-                    tool_response_messages.append(
-                        ChatMessage(
-                            role="tool",
-                            content=result_content,
-                            tool_call_id=chunk.toolCallId,
-                        )
-                    )
-                    has_done_tool_call = True
-                    # Track if any tool execution failed
-                    if not chunk.success:
-                        logger.warning(
-                            f"Tool {chunk.toolName} (ID: {chunk.toolCallId}) execution failed"
-                        )
-                    yield chunk
-                elif isinstance(chunk, StreamFinish):
-                    if not has_done_tool_call:
-                        # Emit text-end before finish if we received text but haven't closed it
+                    elif isinstance(chunk, StreamTextEnd):
+                        # Emit text-end after text completes
+                        if has_received_text and not text_streaming_ended:
+                            text_streaming_ended = True
+                            if assistant_response.content:
+                                logger.warning(
+                                    f"StreamTextEnd: Attempting to set output {assistant_response.content}"
+                                )
+                                span.update_trace(output=assistant_response.content)
+                                span.update(output=assistant_response.content)
+                            yield chunk
+                    elif isinstance(chunk, StreamToolInputStart):
+                        # Emit text-end before first tool call, but only if we've received text
                        if has_received_text and not text_streaming_ended:
                            yield StreamTextEnd(id=text_block_id)
                            text_streaming_ended = True
-                        has_yielded_end = True
                        yield chunk
-                elif isinstance(chunk, StreamError):
-                    has_yielded_error = True
-                elif isinstance(chunk, StreamUsage):
-                    session.usage.append(
-                        Usage(
-                            prompt_tokens=chunk.promptTokens,
-                            completion_tokens=chunk.completionTokens,
-                            total_tokens=chunk.totalTokens,
+                    elif isinstance(chunk, StreamToolInputAvailable):
+                        # Accumulate tool calls in OpenAI format
+                        accumulated_tool_calls.append(
+                            {
+                                "id": chunk.toolCallId,
+                                "type": "function",
+                                "function": {
+                                    "name": chunk.toolName,
+                                    "arguments": orjson.dumps(chunk.input).decode(
+                                        "utf-8"
+                                    ),
+                                },
+                            }
                        )
-                    )
-                else:
-                    logger.error(f"Unknown chunk type: {type(chunk)}", exc_info=True)
-        except Exception as e:
-            logger.error(f"Error during stream: {e!s}", exc_info=True)
+                    elif isinstance(chunk, StreamToolOutputAvailable):
+                        result_content = (
+                            chunk.output
+                            if isinstance(chunk.output, str)
+                            else orjson.dumps(chunk.output).decode("utf-8")
+                        )
+                        tool_response_messages.append(
+                            ChatMessage(
+                                role="tool",
+                                content=result_content,
+                                tool_call_id=chunk.toolCallId,
+                            )
+                        )
+                        has_done_tool_call = True
+                        # Track if any tool execution failed
+                        if not chunk.success:
+                            logger.warning(
+                                f"Tool {chunk.toolName} (ID: {chunk.toolCallId}) execution failed"
+                            )
+                        yield chunk
+                    elif isinstance(chunk, StreamFinish):
+                        if not has_done_tool_call:
+                            # Emit text-end before finish if we received text but haven't closed it
+                            if has_received_text and not text_streaming_ended:
+                                yield StreamTextEnd(id=text_block_id)
+                                text_streaming_ended = True

-            # Check if this is a retryable error (JSON parsing, incomplete tool calls, etc.)
-            is_retryable = isinstance(e, (orjson.JSONDecodeError, KeyError, TypeError))
+                            # Save assistant message before yielding finish to ensure it's persisted
+                            # even if client disconnects immediately after receiving StreamFinish
+                            if not has_saved_assistant_message:
+                                messages_to_save_early: list[ChatMessage] = []
+                                if accumulated_tool_calls:
+                                    assistant_response.tool_calls = (
+                                        accumulated_tool_calls
+                                    )
+                                if not has_appended_streaming_message and (
+                                    assistant_response.content
+                                    or assistant_response.tool_calls
+                                ):
+                                    messages_to_save_early.append(assistant_response)
+                                messages_to_save_early.extend(tool_response_messages)

-            if is_retryable and retry_count < config.max_retries:
-                logger.info(
-                    f"Retryable error encountered. Attempt {retry_count + 1}/{config.max_retries}"
+                                if messages_to_save_early:
+                                    session.messages.extend(messages_to_save_early)
+                                    logger.info(
+                                        f"Saving assistant message before StreamFinish: "
+                                        f"content_len={len(assistant_response.content or '')}, "
+                                        f"tool_calls={len(assistant_response.tool_calls or [])}, "
+                                        f"tool_responses={len(tool_response_messages)}"
+                                    )
+                                if (
+                                    messages_to_save_early
+                                    or has_appended_streaming_message
+                                ):
+                                    await upsert_chat_session(session)
+                                    has_saved_assistant_message = True
+
+                            has_yielded_end = True
+                            yield chunk
+                    elif isinstance(chunk, StreamError):
+                        has_yielded_error = True
+                        yield chunk
+                    elif isinstance(chunk, StreamUsage):
+                        session.usage.append(
+                            Usage(
+                                prompt_tokens=chunk.promptTokens,
+                                completion_tokens=chunk.completionTokens,
+                                total_tokens=chunk.totalTokens,
+                            )
+                        )
+                    else:
+                        logger.error(
+                            f"Unknown chunk type: {type(chunk)}", exc_info=True
+                        )
+                if assistant_response.content:
+                    langfuse.update_current_trace(output=assistant_response.content)
+                    langfuse.update_current_span(output=assistant_response.content)
+                elif tool_response_messages:
+                    langfuse.update_current_trace(output=str(tool_response_messages))
+                    langfuse.update_current_span(output=str(tool_response_messages))
+
+            except CancelledError:
+                if not has_saved_assistant_message:
+                    if accumulated_tool_calls:
+                        assistant_response.tool_calls = accumulated_tool_calls
+                    if assistant_response.content:
+                        assistant_response.content = (
+                            f"{assistant_response.content}\n\n[interrupted]"
+                        )
+                    else:
+                        assistant_response.content = "[interrupted]"
+                    if not has_appended_streaming_message:
+                        session.messages.append(assistant_response)
+                    if tool_response_messages:
+                        session.messages.extend(tool_response_messages)
+                    try:
+                        await upsert_chat_session(session)
+                    except Exception as e:
+                        logger.warning(
+                            f"Failed to save interrupted session {session.session_id}: {e}"
+                        )
+                raise
+            except Exception as e:
+                logger.error(f"Error during stream: {e!s}", exc_info=True)
+
+                # Check if this is a retryable error (JSON parsing, incomplete tool calls, etc.)
+                is_retryable = isinstance(
+                    e, (orjson.JSONDecodeError, KeyError, TypeError)
                )
-                should_retry = True
-            else:
-                # Non-retryable error or max retries exceeded
-                # Save any partial progress before reporting error
+
+                if is_retryable and retry_count < config.max_retries:
+                    logger.info(
+                        f"Retryable error encountered. Attempt {retry_count + 1}/{config.max_retries}"
+                    )
+                    should_retry = True
+                else:
+                    # Non-retryable error or max retries exceeded
+                    # Save any partial progress before reporting error
+                    messages_to_save: list[ChatMessage] = []
+
+                    # Add assistant message if it has content or tool calls
+                    if accumulated_tool_calls:
+                        assistant_response.tool_calls = accumulated_tool_calls
+                    if not has_appended_streaming_message and (
+                        assistant_response.content or assistant_response.tool_calls
+                    ):
+                        messages_to_save.append(assistant_response)
+
+                    # Add tool response messages after assistant message
+                    messages_to_save.extend(tool_response_messages)
+
+                    if not has_saved_assistant_message:
+                        if messages_to_save:
+                            session.messages.extend(messages_to_save)
+                        if messages_to_save or has_appended_streaming_message:
+                            await upsert_chat_session(session)
+
+                    if not has_yielded_error:
+                        error_message = str(e)
+                        if not is_retryable:
+                            error_message = f"Non-retryable error: {error_message}"
+                        elif retry_count >= config.max_retries:
+                            error_message = f"Max retries ({config.max_retries}) exceeded: {error_message}"
+
+                        error_response = StreamError(errorText=error_message)
+                        yield error_response
+                    if not has_yielded_end:
+                        yield StreamFinish()
+                    return
+
+            # Handle retry outside of exception handler to avoid nesting
+            if should_retry and retry_count < config.max_retries:
+                logger.info(
+                    f"Retrying stream_chat_completion for session {session_id}, attempt {retry_count + 1}"
+                )
+                async for chunk in stream_chat_completion(
+                    session_id=session.session_id,
+                    user_id=user_id,
+                    retry_count=retry_count + 1,
+                    session=session,
+                    context=context,
+                ):
+                    yield chunk
+                return  # Exit after retry so the completion path below does not save again
+
+            # Normal completion path - save session and handle tool call continuation
+            # Only save if we haven't already saved when StreamFinish was received
+            if not has_saved_assistant_message:
+                logger.info(
+                    f"Normal completion path: session={session.session_id}, "
+                    f"current message_count={len(session.messages)}"
+                )
+
+                # Build the messages list in the correct order
                messages_to_save: list[ChatMessage] = []

-                # Add assistant message if it has content or tool calls
+                # Add assistant message with tool_calls if any
                if accumulated_tool_calls:
                    assistant_response.tool_calls = accumulated_tool_calls
-                if assistant_response.content or assistant_response.tool_calls:
+                    logger.info(
+                        f"Added {len(accumulated_tool_calls)} tool calls to assistant message"
+                    )
+                if not has_appended_streaming_message and (
+                    assistant_response.content or assistant_response.tool_calls
+                ):
                    messages_to_save.append(assistant_response)
+                    logger.info(
+                        f"Saving assistant message with content_len={len(assistant_response.content or '')}, tool_calls={len(assistant_response.tool_calls or [])}"
+                    )

                # Add tool response messages after assistant message
                messages_to_save.extend(tool_response_messages)
-
-                session.messages.extend(messages_to_save)
-                await upsert_chat_session(session)
-
-                if not has_yielded_error:
-                    error_message = str(e)
-                    if not is_retryable:
-                        error_message = f"Non-retryable error: {error_message}"
-                    elif retry_count >= config.max_retries:
-                        error_message = f"Max retries ({config.max_retries}) exceeded: {error_message}"
-
-                    error_response = StreamError(errorText=error_message)
-                    yield error_response
-                if not has_yielded_end:
-                    yield StreamFinish()
-                return
-
-        # Handle retry outside of exception handler to avoid nesting
-        if should_retry and retry_count < config.max_retries:
-            logger.info(
-                f"Retrying stream_chat_completion for session {session_id}, attempt {retry_count + 1}"
-            )
-            async for chunk in stream_chat_completion(
-                session_id=session.session_id,
-                user_id=user_id,
-                retry_count=retry_count + 1,
-                session=session,
-                context=context,
-            ):
-                yield chunk
-            return  # Exit after retry to avoid double-saving in finally block
-
-        # Normal completion path - save session and handle tool call continuation
-        logger.info(
-            f"Normal completion path: session={session.session_id}, "
-            f"current message_count={len(session.messages)}"
-        )
-
-        # Build the messages list in the correct order
-        messages_to_save: list[ChatMessage] = []
-
-        # Add assistant message with tool_calls if any
-        if accumulated_tool_calls:
-            assistant_response.tool_calls = accumulated_tool_calls
-            logger.info(
-                f"Added {len(accumulated_tool_calls)} tool calls to assistant message"
-            )
-        if assistant_response.content or assistant_response.tool_calls:
-            messages_to_save.append(assistant_response)
-            logger.info(
-                f"Saving assistant message with content_len={len(assistant_response.content or '')}, tool_calls={len(assistant_response.tool_calls or [])}"
-            )
-
-        # Add tool response messages after assistant message
-        messages_to_save.extend(tool_response_messages)
-        logger.info(
-            f"Saving {len(tool_response_messages)} tool response messages, "
-            f"total_to_save={len(messages_to_save)}"
-        )
-
-        session.messages.extend(messages_to_save)
-        logger.info(
-            f"Extended session messages, new message_count={len(session.messages)}"
-        )
-        await upsert_chat_session(session)
-
-        # If we did a tool call, stream the chat completion again to get the next response
-        if has_done_tool_call:
-            logger.info(
-                "Tool call executed, streaming chat completion again to get assistant response"
-            )
-            async for chunk in stream_chat_completion(
-                session_id=session.session_id,
-                user_id=user_id,
-                session=session,  # Pass session object to avoid Redis refetch
-                context=context,
-            ):
-                yield chunk
-
-    finally:
-        # Always end Langfuse observations to prevent resource leaks
-        # Guard against None and catch errors to avoid masking original exceptions
-        if generation is not None:
-            try:
-                latest_usage = session.usage[-1] if session.usage else None
-                generation.update(
-                    model=config.model,
-                    output={
-                        "content": assistant_response.content,
-                        "tool_calls": accumulated_tool_calls or None,
-                    },
-                    usage_details=(
-                        {
-                            "input": latest_usage.prompt_tokens,
-                            "output": latest_usage.completion_tokens,
-                            "total": latest_usage.total_tokens,
-                        }
-                        if latest_usage
-                        else None
-                    ),
+                logger.info(
+                    f"Saving {len(tool_response_messages)} tool response messages, "
+                    f"total_to_save={len(messages_to_save)}"
                )
-                generation.end()
-            except Exception as e:
-                logger.warning(f"Failed to end Langfuse generation: {e}")

-        if trace is not None:
-            try:
-                if accumulated_tool_calls:
-                    trace.update_trace(output={"tool_calls": accumulated_tool_calls})
-                else:
-                    trace.update_trace(output={"response": assistant_response.content})
-                trace.end()
-            except Exception as e:
-                logger.warning(f"Failed to end Langfuse trace: {e}")
+                if messages_to_save:
+                    session.messages.extend(messages_to_save)
+                    logger.info(
+                        f"Extended session messages, new message_count={len(session.messages)}"
+                    )
+                if messages_to_save or has_appended_streaming_message:
+                    await upsert_chat_session(session)
+            else:
+                logger.info(
+                    "Assistant message already saved when StreamFinish was received, "
+                    "skipping duplicate save"
+                )
+
+            # If we did a tool call, stream the chat completion again to get the next response
+            if has_done_tool_call:
+                logger.info(
+                    "Tool call executed, streaming chat completion again to get assistant response"
+                )
+                async for chunk in stream_chat_completion(
+                    session_id=session.session_id,
+                    user_id=user_id,
+                    session=session,  # Pass session object to avoid Redis refetch
+                    context=context,
+                    tool_call_response=str(tool_response_messages),
+                ):
+                    yield chunk


 # Retry configuration for OpenAI API calls
@@ -657,6 +644,12 @@ def _is_retryable_error(error: Exception) -> bool:
    return False


+def _is_region_blocked_error(error: Exception) -> bool:
+    # The message check is identical for every exception type (including
+    # PermissionDeniedError), so no isinstance branch is needed.
+    return "not available in your region" in str(error).lower()
+
+
 async def _stream_chat_chunks(
    session: ChatSession,
    tools: list[ChatCompletionToolParam],
@@ -849,7 +842,18 @@ async def _stream_chat_chunks(
                        f"Error in stream (not retrying): {e!s}",
                        exc_info=True,
                    )
-                    error_response = StreamError(errorText=str(e))
+                    error_code = None
+                    error_text = str(e)
+                    if _is_region_blocked_error(e):
+                        error_code = "MODEL_NOT_AVAILABLE_REGION"
+                        error_text = (
+                            "This model is not available in your region. "
+                            "Please connect via VPN and try again."
+                        )
+                    error_response = StreamError(
+                        errorText=error_text,
+                        code=error_code,
+                    )
                    yield error_response
                    yield StreamFinish()
                    return
@@ -903,5 +907,4 @@ async def _yield_tool_call(
        session=session,
    )

-    logger.info(f"Yielding Tool execution response: {tool_execution_response}")
    yield tool_execution_response
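The retry decision in the `except Exception` branch of `stream_chat_completion` above can be read in isolation as a small pure function. The following is only a sketch under stated assumptions: `classify_stream_error` is a hypothetical helper that does not exist in the codebase, and the standard library's `json.JSONDecodeError` stands in for `orjson.JSONDecodeError` so the example runs without extra dependencies.

```python
import json
from typing import Optional, Tuple

# Stand-in for (orjson.JSONDecodeError, KeyError, TypeError) used in the diff.
RETRYABLE_ERRORS = (json.JSONDecodeError, KeyError, TypeError)


def classify_stream_error(
    exc: Exception, retry_count: int, max_retries: int
) -> Tuple[bool, Optional[str]]:
    """Hypothetical helper: return (should_retry, error_message).

    Mirrors the branch order in the diff: retry while attempts remain,
    otherwise build the message that would be sent in a StreamError.
    """
    is_retryable = isinstance(exc, RETRYABLE_ERRORS)
    if is_retryable and retry_count < max_retries:
        return True, None

    message = str(exc)
    if not is_retryable:
        message = f"Non-retryable error: {message}"
    elif retry_count >= max_retries:
        message = f"Max retries ({max_retries}) exceeded: {message}"
    return False, message


# A retryable error with attempts left is retried without a message.
print(classify_stream_error(TypeError("bad chunk"), 0, 3))
# A retryable error with no attempts left reports the retry limit.
print(classify_stream_error(TypeError("bad chunk"), 3, 3))
# Any other exception type is reported as non-retryable immediately.
print(classify_stream_error(RuntimeError("boom"), 0, 3))
```

Keeping the classification free of side effects would make the retry rules easy to unit-test separately from the session-saving logic, though the diff itself keeps both inline.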
--- a/autogpt_platform/backend/backend/api/features/chat/tools/__init__.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/__init__.py
@@ -7,9 +7,15 @@ from backend.api.features.chat.model import ChatSession
 from .add_understanding import AddUnderstandingTool
 from .agent_output import AgentOutputTool
 from .base import BaseTool
+from .create_agent import CreateAgentTool
+from .edit_agent import EditAgentTool
 from .find_agent import FindAgentTool
+from .find_block import FindBlockTool
 from .find_library_agent import FindLibraryAgentTool
+from .get_doc_page import GetDocPageTool
 from .run_agent import RunAgentTool
+from .run_block import RunBlockTool
+from .search_docs import SearchDocsTool

 if TYPE_CHECKING:
    from backend.api.features.chat.response_model import StreamToolOutputAvailable
@@ -17,10 +23,16 @@ if TYPE_CHECKING:
 # Single source of truth for all tools
 TOOL_REGISTRY: dict[str, BaseTool] = {
    "add_understanding": AddUnderstandingTool(),
+    "create_agent": CreateAgentTool(),
+    "edit_agent": EditAgentTool(),
    "find_agent": FindAgentTool(),
+    "find_block": FindBlockTool(),
    "find_library_agent": FindLibraryAgentTool(),
    "run_agent": RunAgentTool(),
-    "agent_output": AgentOutputTool(),
+    "run_block": RunBlockTool(),
+    "view_agent_output": AgentOutputTool(),
+    "search_docs": SearchDocsTool(),
+    "get_doc_page": GetDocPageTool(),
 }

 # Export individual tool instances for backwards compatibility
--- a/autogpt_platform/backend/backend/api/features/chat/tools/add_understanding.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/add_understanding.py
@@ -3,6 +3,8 @@
 import logging
 from typing import Any

+from langfuse import observe
+
 from backend.api.features.chat.model import ChatSession
 from backend.data.understanding import (
    BusinessUnderstandingInput,
@@ -59,6 +61,7 @@ and automations for the user's specific needs."""
        """Requires authentication to store user-specific data."""
        return True

+    @observe(as_type="tool", name="add_understanding")
    async def _execute(
        self,
        user_id: str | None,
--- a/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/__init__.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/__init__.py
@@ -0,0 +1,28 @@
+"""Agent generator package - Creates agents from natural language."""
+
+from .core import (
+    AgentGeneratorNotConfiguredError,
+    decompose_goal,
+    generate_agent,
+    generate_agent_patch,
+    get_agent_as_json,
+    json_to_graph,
+    save_agent_to_library,
+)
+from .service import health_check as check_external_service_health
+from .service import is_external_service_configured
+
+__all__ = [
+    # Core functions
+    "decompose_goal",
+    "generate_agent",
+    "generate_agent_patch",
+    "save_agent_to_library",
+    "get_agent_as_json",
+    "json_to_graph",
+    # Exceptions
+    "AgentGeneratorNotConfiguredError",
+    # Service
+    "is_external_service_configured",
+    "check_external_service_health",
+]
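The `decompose_goal` function exported above returns a dict whose `"type"` field is one of `"instructions"`, `"clarifying_questions"`, `"unachievable_goal"`, or `"vague_goal"`, or `None` on error, according to the service client added in this change. A caller might branch on that result roughly as follows. This is a sketch only: `next_action` and the action labels it returns are hypothetical and are not part of the package.

```python
from typing import Any, Dict, Optional


def next_action(result: Optional[Dict[str, Any]]) -> str:
    """Hypothetical dispatcher over a decompose_goal result."""
    if result is None:
        # The service client returns None on any HTTP or payload error.
        return "report_error"

    kind = result.get("type")
    if kind == "instructions":
        return "generate_agent"  # steps are ready to pass to generate_agent
    if kind == "clarifying_questions":
        return "ask_user"  # relay result["questions"] to the user
    if kind in ("unachievable_goal", "vague_goal"):
        return "suggest_goal"  # both carry a "suggested_goal" field
    return "report_error"  # unknown type: treat like an error


print(next_action({"type": "instructions", "steps": ["fetch", "summarize"]}))
print(next_action({"type": "vague_goal", "suggested_goal": "Summarize RSS feeds"}))
print(next_action(None))
```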
--- a/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/core.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/core.py
@@ -0,0 +1,277 @@
+"""Core agent generation functions."""
+
+import logging
+import uuid
+from typing import Any
+
+from backend.api.features.library import db as library_db
+from backend.data.graph import Graph, Link, Node, create_graph
+
+from .service import (
+    decompose_goal_external,
+    generate_agent_external,
+    generate_agent_patch_external,
+    is_external_service_configured,
+)
+
+logger = logging.getLogger(__name__)
+
+
+class AgentGeneratorNotConfiguredError(Exception):
+    """Raised when the external Agent Generator service is not configured."""
+
+    pass
+
+
+def _check_service_configured() -> None:
+    """Check if the external Agent Generator service is configured.
+
+    Raises:
+        AgentGeneratorNotConfiguredError: If the service is not configured.
+    """
+    if not is_external_service_configured():
+        raise AgentGeneratorNotConfiguredError(
+            "Agent Generator service is not configured. "
+            "Set AGENTGENERATOR_HOST environment variable to enable agent generation."
+        )
+
+
+async def decompose_goal(description: str, context: str = "") -> dict[str, Any] | None:
+    """Break down a goal into steps or return clarifying questions.
+
+    Args:
+        description: Natural language goal description
+        context: Additional context (e.g., answers to previous questions)
+
+    Returns:
+        Dict whose "type" is one of "clarifying_questions" (with "questions"),
+        "instructions" (with "steps"), "unachievable_goal", or "vague_goal"
+        (both with "suggested_goal"), as mapped by decompose_goal_external.
+        Or None on error
+
+    Raises:
+        AgentGeneratorNotConfiguredError: If the external service is not configured.
+    """
+    _check_service_configured()
+    logger.info("Calling external Agent Generator service for decompose_goal")
+    return await decompose_goal_external(description, context)
+
+
+async def generate_agent(instructions: dict[str, Any]) -> dict[str, Any] | None:
+    """Generate agent JSON from instructions.
+
+    Args:
+        instructions: Structured instructions from decompose_goal
+
+    Returns:
+        Agent JSON dict or None on error
+
+    Raises:
+        AgentGeneratorNotConfiguredError: If the external service is not configured.
+    """
+    _check_service_configured()
+    logger.info("Calling external Agent Generator service for generate_agent")
+    result = await generate_agent_external(instructions)
+    if result:
+        # Ensure required fields
+        if "id" not in result:
+            result["id"] = str(uuid.uuid4())
+        if "version" not in result:
+            result["version"] = 1
+        if "is_active" not in result:
+            result["is_active"] = True
+    return result
+
+
+def json_to_graph(agent_json: dict[str, Any]) -> Graph:
+    """Convert agent JSON dict to Graph model.
+
+    Args:
+        agent_json: Agent JSON with nodes and links
+
+    Returns:
+        Graph ready for saving
+    """
+    nodes = []
+    for n in agent_json.get("nodes", []):
+        node = Node(
+            id=n.get("id", str(uuid.uuid4())),
+            block_id=n["block_id"],
+            input_default=n.get("input_default", {}),
+            metadata=n.get("metadata", {}),
+        )
+        nodes.append(node)
+
+    links = []
+    for link_data in agent_json.get("links", []):
+        link = Link(
+            id=link_data.get("id", str(uuid.uuid4())),
+            source_id=link_data["source_id"],
+            sink_id=link_data["sink_id"],
+            source_name=link_data["source_name"],
+            sink_name=link_data["sink_name"],
+            is_static=link_data.get("is_static", False),
+        )
+        links.append(link)
+
+    return Graph(
+        id=agent_json.get("id", str(uuid.uuid4())),
+        version=agent_json.get("version", 1),
+        is_active=agent_json.get("is_active", True),
+        name=agent_json.get("name", "Generated Agent"),
+        description=agent_json.get("description", ""),
+        nodes=nodes,
+        links=links,
+    )
+
+
+def _reassign_node_ids(graph: Graph) -> None:
+    """Reassign all node and link IDs to new UUIDs.
+
+    This is needed when creating a new version to avoid unique constraint violations.
+    """
+    # Create mapping from old node IDs to new UUIDs
+    id_map = {node.id: str(uuid.uuid4()) for node in graph.nodes}
+
+    # Reassign node IDs
+    for node in graph.nodes:
+        node.id = id_map[node.id]
+
+    # Update link references to use new node IDs
+    for link in graph.links:
+        link.id = str(uuid.uuid4())  # Also give links new IDs
+        if link.source_id in id_map:
+            link.source_id = id_map[link.source_id]
+        if link.sink_id in id_map:
+            link.sink_id = id_map[link.sink_id]
+
+
+async def save_agent_to_library(
+    agent_json: dict[str, Any], user_id: str, is_update: bool = False
+) -> tuple[Graph, Any]:
+    """Save agent to database and user's library.
+
+    Args:
+        agent_json: Agent JSON dict
+        user_id: User ID
+        is_update: Whether this is an update to an existing agent
+
+    Returns:
+        Tuple of (created Graph, LibraryAgent)
+    """
+    from backend.data.graph import get_graph_all_versions
+
+    graph = json_to_graph(agent_json)
+
+    if is_update:
+        # For updates, keep the same graph ID but increment version
+        # and reassign node/link IDs to avoid conflicts
+        if graph.id:
+            existing_versions = await get_graph_all_versions(graph.id, user_id)
+            if existing_versions:
+                latest_version = max(v.version for v in existing_versions)
+                graph.version = latest_version + 1
+                # Reassign node IDs (but keep graph ID the same)
+                _reassign_node_ids(graph)
+                logger.info(f"Updating agent {graph.id} to version {graph.version}")
+    else:
+        # For new agents, always generate a fresh UUID to avoid collisions
+        graph.id = str(uuid.uuid4())
+        graph.version = 1
+        # Reassign all node IDs as well
+        _reassign_node_ids(graph)
+        logger.info(f"Creating new agent with ID {graph.id}")
+
+    # Save to database
+    created_graph = await create_graph(graph, user_id)
+
+    # Add to user's library (or update existing library agent)
+    library_agents = await library_db.create_library_agent(
+        graph=created_graph,
+        user_id=user_id,
+        sensitive_action_safe_mode=True,
+        create_library_agents_for_sub_graphs=False,
+    )
+
+    return created_graph, library_agents[0]
+
+
+async def get_agent_as_json(
+    graph_id: str, user_id: str | None
+) -> dict[str, Any] | None:
+    """Fetch an agent and convert to JSON format for editing.
+
+    Args:
+        graph_id: Graph ID or library agent ID
+        user_id: User ID
+
+    Returns:
+        Agent as JSON dict or None if not found
+    """
+    from backend.data.graph import get_graph
+
+    # Try to get the graph (version=None gets the active version)
+    graph = await get_graph(graph_id, version=None, user_id=user_id)
+    if not graph:
+        return None
+
+    # Convert to JSON format
+    nodes = []
+    for node in graph.nodes:
+        nodes.append(
+            {
+                "id": node.id,
+                "block_id": node.block_id,
+                "input_default": node.input_default,
+                "metadata": node.metadata,
+            }
+        )
+
+    links = []
+    for node in graph.nodes:
+        for link in node.output_links:
+            links.append(
+                {
+                    "id": link.id,
+                    "source_id": link.source_id,
+                    "sink_id": link.sink_id,
+                    "source_name": link.source_name,
+                    "sink_name": link.sink_name,
+                    "is_static": link.is_static,
+                }
+            )
+
+    return {
+        "id": graph.id,
+        "name": graph.name,
+        "description": graph.description,
+        "version": graph.version,
+        "is_active": graph.is_active,
+        "nodes": nodes,
+        "links": links,
+    }
+
+
+async def generate_agent_patch(
+    update_request: str, current_agent: dict[str, Any]
+) -> dict[str, Any] | None:
+    """Update an existing agent using natural language.
+
+    The external Agent Generator service handles:
+    - Generating the patch
+    - Applying the patch
+    - Fixing and validating the result
+
+    Args:
+        update_request: Natural language description of changes
+        current_agent: Current agent JSON
+
+    Returns:
+        Updated agent JSON, clarifying questions dict, or None on error
+
+    Raises:
+        AgentGeneratorNotConfiguredError: If the external service is not configured.
+    """
+    _check_service_configured()
+    logger.info("Calling external Agent Generator service for generate_agent_patch")
+    return await generate_agent_patch_external(update_request, current_agent)
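The ID remapping done by `_reassign_node_ids` in `core.py` above can be illustrated without the real `Graph`, `Node`, and `Link` models. In this sketch the `Sketch*` dataclasses are hypothetical stand-ins carrying only the ID fields, and the function returns the mapping purely so the example can inspect it (the function in the diff returns `None`).

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SketchNode:  # hypothetical stand-in for backend.data.graph.Node
    id: str


@dataclass
class SketchLink:  # hypothetical stand-in for backend.data.graph.Link
    id: str
    source_id: str
    sink_id: str


@dataclass
class SketchGraph:  # hypothetical stand-in for backend.data.graph.Graph
    nodes: List[SketchNode] = field(default_factory=list)
    links: List[SketchLink] = field(default_factory=list)


def reassign_node_ids(graph: SketchGraph) -> Dict[str, str]:
    """Give every node and link a fresh UUID, keeping links consistent."""
    # Build the old-to-new mapping first so links can be rewritten afterwards.
    id_map = {node.id: str(uuid.uuid4()) for node in graph.nodes}
    for node in graph.nodes:
        node.id = id_map[node.id]
    for link in graph.links:
        link.id = str(uuid.uuid4())
        # Endpoints that are not nodes of this graph are left untouched.
        link.source_id = id_map.get(link.source_id, link.source_id)
        link.sink_id = id_map.get(link.sink_id, link.sink_id)
    return id_map


graph = SketchGraph(
    nodes=[SketchNode("a"), SketchNode("b")],
    links=[SketchLink("l1", "a", "b")],
)
id_map = reassign_node_ids(graph)
# The link still connects the same two nodes, now under their new IDs.
print(graph.links[0].source_id == graph.nodes[0].id)
print(graph.links[0].sink_id == graph.nodes[1].id)
```

Building the full mapping before mutating anything matters here: rewriting node IDs and link endpoints in a single pass could look up an ID that has already been replaced.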
--- a/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/service.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/service.py
@@ -0,0 +1,269 @@
+"""External Agent Generator service client.
+
+This module provides a client for communicating with the external Agent Generator
+microservice. The agent generation functions in core.py delegate to this service and
+raise AgentGeneratorNotConfiguredError when AGENTGENERATOR_HOST is not configured.
+"""
+
+import logging
+from typing import Any
+
+import httpx
+
+from backend.util.settings import Settings
+
+logger = logging.getLogger(__name__)
+
+_client: httpx.AsyncClient | None = None
+_settings: Settings | None = None
+
+
+def _get_settings() -> Settings:
+    """Get or create settings singleton."""
+    global _settings
+    if _settings is None:
+        _settings = Settings()
+    return _settings
+
+
+def is_external_service_configured() -> bool:
+    """Check if external Agent Generator service is configured."""
+    settings = _get_settings()
+    return bool(settings.config.agentgenerator_host)
+
+
+def _get_base_url() -> str:
+    """Get the base URL for the external service."""
+    settings = _get_settings()
+    host = settings.config.agentgenerator_host
+    port = settings.config.agentgenerator_port
+    return f"http://{host}:{port}"
+
+
+def _get_client() -> httpx.AsyncClient:
+    """Get or create the HTTP client for the external service."""
+    global _client
+    if _client is None:
+        settings = _get_settings()
+        _client = httpx.AsyncClient(
+            base_url=_get_base_url(),
+            timeout=httpx.Timeout(settings.config.agentgenerator_timeout),
+        )
+    return _client
+
+
+async def decompose_goal_external(
+    description: str, context: str = ""
+) -> dict[str, Any] | None:
+    """Call the external service to decompose a goal.
+
+    Args:
+        description: Natural language goal description
+        context: Additional context (e.g., answers to previous questions)
+
+    Returns:
+        Dict with either:
+        - {"type": "clarifying_questions", "questions": [...]}
+        - {"type": "instructions", "steps": [...]}
+        - {"type": "unachievable_goal", ...}
+        - {"type": "vague_goal", ...}
+        Or None on error
+    """
+    client = _get_client()
+
+    # Build the request payload
+    payload: dict[str, Any] = {"description": description}
+    if context:
+        # The external service uses user_instruction for additional context
+        payload["user_instruction"] = context
+
+    try:
+        response = await client.post("/api/decompose-description", json=payload)
+        response.raise_for_status()
+        data = response.json()
+
+        if not data.get("success"):
+            logger.error(f"External service returned error: {data.get('error')}")
+            return None
+
+        # Map the response to the expected format
+        response_type = data.get("type")
+        if response_type == "instructions":
+            return {"type": "instructions", "steps": data.get("steps", [])}
+        elif response_type == "clarifying_questions":
+            return {
+                "type": "clarifying_questions",
+                "questions": data.get("questions", []),
+            }
+        elif response_type == "unachievable_goal":
+            return {
+                "type": "unachievable_goal",
+                "reason": data.get("reason"),
+                "suggested_goal": data.get("suggested_goal"),
+            }
+        elif response_type == "vague_goal":
+            return {
+                "type": "vague_goal",
+                "suggested_goal": data.get("suggested_goal"),
+            }
+        else:
+            logger.error(
+                f"Unknown response type from external service: {response_type}"
+            )
+            return None
+
+    except httpx.HTTPStatusError as e:
+        logger.error(f"HTTP error calling external agent generator: {e}")
+        return None
+    except httpx.RequestError as e:
+        logger.error(f"Request error calling external agent generator: {e}")
+        return None
+    except Exception as e:
+        logger.error(f"Unexpected error calling external agent generator: {e}")
+        return None
+
+
+async def generate_agent_external(
+    instructions: dict[str, Any]
+) -> dict[str, Any] | None:
+    """Call the external service to generate an agent from instructions.
+
+    Args:
+        instructions: Structured instructions from decompose_goal
+
+    Returns:
+        Agent JSON dict or None on error
+    """
+    client = _get_client()
+
+    try:
+        response = await client.post(
+            "/api/generate-agent", json={"instructions": instructions}
+        )
+        response.raise_for_status()
+        data = response.json()
+
+        if not data.get("success"):
+            logger.error(f"External service returned error: {data.get('error')}")
+            return None
+
+        return data.get("agent_json")
+
+    except httpx.HTTPStatusError as e:
+        logger.error(f"HTTP error calling external agent generator: {e}")
+        return None
+    except httpx.RequestError as e:
+        logger.error(f"Request error calling external agent generator: {e}")
+        return None
+    except Exception as e:
+        logger.error(f"Unexpected error calling external agent generator: {e}")
+        return None
+
+
+async def generate_agent_patch_external(
+    update_request: str, current_agent: dict[str, Any]
+) -> dict[str, Any] | None:
+    """Call the external service to generate a patch for an existing agent.
+
+    Args:
+        update_request: Natural language description of changes
+        current_agent: Current agent JSON
+
+    Returns:
+        Updated agent JSON, clarifying questions dict, or None on error
+    """
+    client = _get_client()
+
+    try:
+        response = await client.post(
+            "/api/update-agent",
+            json={
+                "update_request": update_request,
+                "current_agent_json": current_agent,
+            },
+        )
+        response.raise_for_status()
+        data = response.json()
+
+        if not data.get("success"):
+            logger.error(f"External service returned error: {data.get('error')}")
+            return None
+
+        # Check if it's clarifying questions
+        if data.get("type") == "clarifying_questions":
+            return {
+                "type": "clarifying_questions",
+                "questions": data.get("questions", []),
+            }
+
+        # Otherwise return the updated agent JSON
+        return data.get("agent_json")
+
+    except httpx.HTTPStatusError as e:
+        logger.error(f"HTTP error calling external agent generator: {e}")
+        return None
+    except httpx.RequestError as e:
+        logger.error(f"Request error calling external agent generator: {e}")
+        return None
+    except Exception as e:
+        logger.error(f"Unexpected error calling external agent generator: {e}")
+        return None
+
+
+async def get_blocks_external() -> list[dict[str, Any]] | None:
+    """Get available blocks from the external service.
+
+    Returns:
+        List of block info dicts or None on error
+    """
+    client = _get_client()
+
+    try:
+        response = await client.get("/api/blocks")
+        response.raise_for_status()
+        data = response.json()
+
+        if not data.get("success"):
+            logger.error("External service returned error getting blocks")
+            return None
+
+        return data.get("blocks", [])
+
+    except httpx.HTTPStatusError as e:
+        logger.error(f"HTTP error getting blocks from external service: {e}")
+        return None
+    except httpx.RequestError as e:
+        logger.error(f"Request error getting blocks from external service: {e}")
+        return None
+    except Exception as e:
+        logger.error(f"Unexpected error getting blocks from external service: {e}")
+        return None
+
+
+async def health_check() -> bool:
+    """Check if the external service is healthy.
+
+    Returns:
+        True if healthy, False otherwise
+    """
+    if not is_external_service_configured():
+        return False
+
+    client = _get_client()
+
+    try:
+        response = await client.get("/health")
+        response.raise_for_status()
+        data = response.json()
+        return data.get("status") == "healthy" and bool(data.get("blocks_loaded"))
+    except Exception as e:
+        logger.warning(f"External agent generator health check failed: {e}")
+        return False
+
+
+async def close_client() -> None:
+    """Close the HTTP client."""
+    global _client
+    if _client is not None:
+        await _client.aclose()
+        _client = None
--- a/autogpt_platform/backend/backend/api/features/chat/tools/agent_output.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/agent_output.py
@@ -5,6 +5,7 @@ import re
 from datetime import datetime, timedelta, timezone
 from typing import Any

+from langfuse import observe
 from pydantic import BaseModel, field_validator

 from backend.api.features.chat.model import ChatSession
@@ -103,7 +104,7 @@ class AgentOutputTool(BaseTool):

    @property
    def name(self) -> str:
-        return "agent_output"
+        return "view_agent_output"

    @property
    def description(self) -> str:
@@ -328,6 +329,7 @@ class AgentOutputTool(BaseTool):
            total_executions=len(available_executions) if available_executions else 1,
        )

+    @observe(as_type="tool", name="view_agent_output")
    async def _execute(
        self,
        user_id: str | None,
--- a/autogpt_platform/backend/backend/api/features/chat/tools/create_agent.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/create_agent.py
@@ -0,0 +1,238 @@
+"""CreateAgentTool - Creates agents from natural language descriptions."""
+
+import logging
+from typing import Any
+
+from langfuse import observe
+
+from backend.api.features.chat.model import ChatSession
+
+from .agent_generator import (
+    AgentGeneratorNotConfiguredError,
+    decompose_goal,
+    generate_agent,
+    save_agent_to_library,
+)
+from .base import BaseTool
+from .models import (
+    AgentPreviewResponse,
+    AgentSavedResponse,
+    ClarificationNeededResponse,
+    ClarifyingQuestion,
+    ErrorResponse,
+    ToolResponseBase,
+)
+
+logger = logging.getLogger(__name__)
+
+
+class CreateAgentTool(BaseTool):
+    """Tool for creating agents from natural language descriptions."""
+
+    @property
+    def name(self) -> str:
+        return "create_agent"
+
+    @property
+    def description(self) -> str:
+        return (
+            "Create a new agent workflow from a natural language description. "
+            "First generates a preview, then saves to library if save=true."
+        )
+
+    @property
+    def requires_auth(self) -> bool:
+        return True
+
+    @property
+    def parameters(self) -> dict[str, Any]:
+        return {
+            "type": "object",
+            "properties": {
+                "description": {
+                    "type": "string",
+                    "description": (
+                        "Natural language description of what the agent should do. "
+                        "Be specific about inputs, outputs, and the workflow steps."
+                    ),
+                },
+                "context": {
+                    "type": "string",
+                    "description": (
+                        "Additional context or answers to previous clarifying questions. "
+                        "Include any preferences or constraints mentioned by the user."
+                    ),
+                },
+                "save": {
+                    "type": "boolean",
+                    "description": (
+                        "Whether to save the agent to the user's library. "
+                        "Default is true. Set to false for preview only."
+                    ),
+                    "default": True,
+                },
+            },
+            "required": ["description"],
+        }
+
+    @observe(as_type="tool", name="create_agent")
+    async def _execute(
+        self,
+        user_id: str | None,
+        session: ChatSession,
+        **kwargs,
+    ) -> ToolResponseBase:
+        """Execute the create_agent tool.
+
+        Flow:
+        1. Decompose the description into steps (may return clarifying questions)
+        2. Generate agent JSON (external service handles fixing and validation)
+        3. Preview or save based on the save parameter
+        """
+        description = kwargs.get("description", "").strip()
+        context = kwargs.get("context", "")
+        save = kwargs.get("save", True)
+        session_id = session.session_id if session else None
+
+        if not description:
+            return ErrorResponse(
+                message="Please provide a description of what the agent should do.",
+                error="Missing description parameter",
+                session_id=session_id,
+            )
+
+        # Step 1: Decompose goal into steps
+        try:
+            decomposition_result = await decompose_goal(description, context)
+        except AgentGeneratorNotConfiguredError:
+            return ErrorResponse(
+                message=(
+                    "Agent generation is not available. "
+                    "The Agent Generator service is not configured."
+                ),
+                error="service_not_configured",
+                session_id=session_id,
+            )
+
+        if decomposition_result is None:
+            return ErrorResponse(
+                message="Failed to analyze the goal. Please try rephrasing.",
+                error="Decomposition failed",
+                session_id=session_id,
+            )
+
+        # Check if LLM returned clarifying questions
+        if decomposition_result.get("type") == "clarifying_questions":
+            questions = decomposition_result.get("questions", [])
+            return ClarificationNeededResponse(
+                message=(
+                    "I need some more information to create this agent. "
+                    "Please answer the following questions:"
+                ),
+                questions=[
+                    ClarifyingQuestion(
+                        question=q.get("question", ""),
+                        keyword=q.get("keyword", ""),
+                        example=q.get("example"),
+                    )
+                    for q in questions
+                ],
+                session_id=session_id,
+            )
+
+        # Check for unachievable/vague goals
+        if decomposition_result.get("type") == "unachievable_goal":
+            suggested = decomposition_result.get("suggested_goal", "")
+            reason = decomposition_result.get("reason", "")
+            return ErrorResponse(
+                message=(
+                    "This goal cannot be accomplished with the available blocks. "
+                    f"{reason} "
+                    f"Suggestion: {suggested}"
+                ),
+                error="unachievable_goal",
+                details={"suggested_goal": suggested, "reason": reason},
+                session_id=session_id,
+            )
+
+        if decomposition_result.get("type") == "vague_goal":
+            suggested = decomposition_result.get("suggested_goal", "")
+            return ErrorResponse(
+                message=(
+                    "The goal is too vague to create a specific workflow. "
+                    f"Suggestion: {suggested}"
+                ),
+                error="vague_goal",
+                details={"suggested_goal": suggested},
+                session_id=session_id,
+            )
+
+        # Step 2: Generate agent JSON (external service handles fixing and validation)
+        try:
+            agent_json = await generate_agent(decomposition_result)
+        except AgentGeneratorNotConfiguredError:
+            return ErrorResponse(
+                message=(
+                    "Agent generation is not available. "
+                    "The Agent Generator service is not configured."
+                ),
+                error="service_not_configured",
+                session_id=session_id,
+            )
+
+        if agent_json is None:
+            return ErrorResponse(
+                message="Failed to generate the agent. Please try again.",
+                error="Generation failed",
+                session_id=session_id,
+            )
+
+        agent_name = agent_json.get("name", "Generated Agent")
+        agent_description = agent_json.get("description", "")
+        node_count = len(agent_json.get("nodes", []))
+        link_count = len(agent_json.get("links", []))
+
+        # Step 3: Preview or save
+        if not save:
+            return AgentPreviewResponse(
+                message=(
+                    f"I've generated an agent called '{agent_name}' with {node_count} blocks. "
+                    "Review it and call create_agent with save=true to save it to your library."
+                ),
+                agent_json=agent_json,
+                agent_name=agent_name,
+                description=agent_description,
+                node_count=node_count,
+                link_count=link_count,
+                session_id=session_id,
+            )
+
+        # Save to library
+        if not user_id:
+            return ErrorResponse(
+                message="You must be logged in to save agents.",
+                error="auth_required",
+                session_id=session_id,
+            )
+
+        try:
+            created_graph, library_agent = await save_agent_to_library(
+                agent_json, user_id
+            )
+
+            return AgentSavedResponse(
+                message=f"Agent '{created_graph.name}' has been saved to your library!",
+                agent_id=created_graph.id,
+                agent_name=created_graph.name,
+                library_agent_id=library_agent.id,
+                library_agent_link=f"/library/{library_agent.id}",
+                agent_page_link=f"/build?flowID={created_graph.id}",
+                session_id=session_id,
+            )
+        except Exception as e:
+            return ErrorResponse(
+                message=f"Failed to save the agent: {str(e)}",
+                error="save_failed",
+                details={"exception": str(e)},
+                session_id=session_id,
+            )
--- a/autogpt_platform/backend/backend/api/features/chat/tools/edit_agent.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/edit_agent.py
@@ -0,0 +1,224 @@
+"""EditAgentTool - Edits existing agents using natural language."""
+
+import logging
+from typing import Any
+
+from langfuse import observe
+
+from backend.api.features.chat.model import ChatSession
+
+from .agent_generator import (
+    AgentGeneratorNotConfiguredError,
+    generate_agent_patch,
+    get_agent_as_json,
+    save_agent_to_library,
+)
+from .base import BaseTool
+from .models import (
+    AgentPreviewResponse,
+    AgentSavedResponse,
+    ClarificationNeededResponse,
+    ClarifyingQuestion,
+    ErrorResponse,
+    ToolResponseBase,
+)
+
+logger = logging.getLogger(__name__)
+
+
+class EditAgentTool(BaseTool):
+    """Tool for editing existing agents using natural language."""
+
+    @property
+    def name(self) -> str:
+        return "edit_agent"
+
+    @property
+    def description(self) -> str:
+        return (
+            "Edit an existing agent from the user's library using natural language. "
+            "Generates updates to the agent while preserving unchanged parts."
+        )
+
+    @property
+    def requires_auth(self) -> bool:
+        return True
+
+    @property
+    def parameters(self) -> dict[str, Any]:
+        return {
+            "type": "object",
+            "properties": {
+                "agent_id": {
+                    "type": "string",
+                    "description": (
+                        "The ID of the agent to edit. "
+                        "Can be a graph ID or library agent ID."
+                    ),
+                },
+                "changes": {
+                    "type": "string",
+                    "description": (
+                        "Natural language description of what changes to make. "
+                        "Be specific about what to add, remove, or modify."
+                    ),
+                },
+                "context": {
+                    "type": "string",
+                    "description": (
+                        "Additional context or answers to previous clarifying questions."
+                    ),
+                },
+                "save": {
+                    "type": "boolean",
+                    "description": (
+                        "Whether to save the changes. "
+                        "Default is true. Set to false for preview only."
+                    ),
+                    "default": True,
+                },
+            },
+            "required": ["agent_id", "changes"],
+        }
+
+    @observe(as_type="tool", name="edit_agent")
+    async def _execute(
+        self,
+        user_id: str | None,
+        session: ChatSession,
+        **kwargs,
+    ) -> ToolResponseBase:
+        """Execute the edit_agent tool.
+
+        Flow:
+        1. Fetch the current agent
+        2. Generate updated agent (external service handles fixing and validation)
+        3. Preview or save based on the save parameter
+        """
+        agent_id = kwargs.get("agent_id", "").strip()
+        changes = kwargs.get("changes", "").strip()
+        context = kwargs.get("context", "")
+        save = kwargs.get("save", True)
+        session_id = session.session_id if session else None
+
+        if not agent_id:
+            return ErrorResponse(
+                message="Please provide the agent ID to edit.",
+                error="Missing agent_id parameter",
+                session_id=session_id,
+            )
+
+        if not changes:
+            return ErrorResponse(
+                message="Please describe what changes you want to make.",
+                error="Missing changes parameter",
+                session_id=session_id,
+            )
+
+        # Step 1: Fetch current agent
+        current_agent = await get_agent_as_json(agent_id, user_id)
+
+        if current_agent is None:
+            return ErrorResponse(
+                message=f"Could not find agent with ID '{agent_id}' in your library.",
+                error="agent_not_found",
+                session_id=session_id,
+            )
+
+        # Build the update request with context
+        update_request = changes
+        if context:
+            update_request = f"{changes}\n\nAdditional context:\n{context}"
+
+        # Step 2: Generate updated agent (external service handles fixing and validation)
+        try:
+            result = await generate_agent_patch(update_request, current_agent)
+        except AgentGeneratorNotConfiguredError:
+            return ErrorResponse(
+                message=(
+                    "Agent editing is not available. "
+                    "The Agent Generator service is not configured."
+                ),
+                error="service_not_configured",
+                session_id=session_id,
+            )
+
+        if result is None:
+            return ErrorResponse(
+                message="Failed to generate changes. Please try rephrasing.",
+                error="Update generation failed",
+                session_id=session_id,
+            )
+
+        # Check if LLM returned clarifying questions
+        if result.get("type") == "clarifying_questions":
+            questions = result.get("questions", [])
+            return ClarificationNeededResponse(
+                message=(
+                    "I need some more information about the changes. "
+                    "Please answer the following questions:"
+                ),
+                questions=[
+                    ClarifyingQuestion(
+                        question=q.get("question", ""),
+                        keyword=q.get("keyword", ""),
+                        example=q.get("example"),
+                    )
+                    for q in questions
+                ],
+                session_id=session_id,
+            )
+
+        # Result is the updated agent JSON
+        updated_agent = result
+
+        agent_name = updated_agent.get("name", "Updated Agent")
+        agent_description = updated_agent.get("description", "")
+        node_count = len(updated_agent.get("nodes", []))
+        link_count = len(updated_agent.get("links", []))
+
+        # Step 3: Preview or save
+        if not save:
+            return AgentPreviewResponse(
+                message=(
+                    "I've updated the agent. "
+                    f"The agent now has {node_count} blocks. "
+                    "Review it and call edit_agent with save=true to save the changes."
+                ),
+                agent_json=updated_agent,
+                agent_name=agent_name,
+                description=agent_description,
+                node_count=node_count,
+                link_count=link_count,
+                session_id=session_id,
+            )
+
+        # Save to library (creates a new version)
+        if not user_id:
+            return ErrorResponse(
+                message="You must be logged in to save agents.",
+                error="auth_required",
+                session_id=session_id,
+            )
+
+        try:
+            created_graph, library_agent = await save_agent_to_library(
+                updated_agent, user_id, is_update=True
+            )
+
+            return AgentSavedResponse(
+                message=f"Updated agent '{created_graph.name}' has been saved to your library!",
+                agent_id=created_graph.id,
+                agent_name=created_graph.name,
+                library_agent_id=library_agent.id,
+                library_agent_link=f"/library/{library_agent.id}",
+                agent_page_link=f"/build?flowID={created_graph.id}",
+                session_id=session_id,
+            )
+        except Exception as e:
+            return ErrorResponse(
+                message=f"Failed to save the updated agent: {str(e)}",
+                error="save_failed",
+                details={"exception": str(e)},
+                session_id=session_id,
+            )
--- a/autogpt_platform/backend/backend/api/features/chat/tools/find_agent.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/find_agent.py
@@ -2,6 +2,8 @@

 from typing import Any

+from langfuse import observe
+
 from backend.api.features.chat.model import ChatSession

 from .agent_search import search_agents
@@ -35,6 +37,7 @@ class FindAgentTool(BaseTool):
            "required": ["query"],
        }

+    @observe(as_type="tool", name="find_agent")
    async def _execute(
        self, user_id: str | None, session: ChatSession, **kwargs
    ) -> ToolResponseBase:
--- a/autogpt_platform/backend/backend/api/features/chat/tools/find_block.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/find_block.py
@@ -0,0 +1,194 @@
+import logging
+from typing import Any
+
+from langfuse import observe
+from prisma.enums import ContentType
+
+from backend.api.features.chat.model import ChatSession
+from backend.api.features.chat.tools.base import BaseTool, ToolResponseBase
+from backend.api.features.chat.tools.models import (
+    BlockInfoSummary,
+    BlockInputFieldInfo,
+    BlockListResponse,
+    ErrorResponse,
+    NoResultsResponse,
+)
+from backend.api.features.store.hybrid_search import unified_hybrid_search
+from backend.data.block import get_block
+
+logger = logging.getLogger(__name__)
+
+
+class FindBlockTool(BaseTool):
+    """Tool for searching available blocks."""
+
+    @property
+    def name(self) -> str:
+        return "find_block"
+
+    @property
+    def description(self) -> str:
+        return (
+            "Search for available blocks by name or description. "
+            "Blocks are reusable components that perform specific tasks like "
+            "sending emails, making API calls, processing text, etc. "
+            "IMPORTANT: Use this tool FIRST to get the block's 'id' before calling run_block. "
+            "The response includes each block's id, required_inputs, and input_schema."
+        )
+
+    @property
+    def parameters(self) -> dict[str, Any]:
+        return {
+            "type": "object",
+            "properties": {
+                "query": {
+                    "type": "string",
+                    "description": (
+                        "Search query to find blocks by name or description. "
+                        "Use keywords like 'email', 'http', 'text', 'ai', etc."
+                    ),
+                },
+            },
+            "required": ["query"],
+        }
+
+    @property
+    def requires_auth(self) -> bool:
+        return True
+
+    @observe(as_type="tool", name="find_block")
+    async def _execute(
+        self,
+        user_id: str | None,
+        session: ChatSession,
+        **kwargs,
+    ) -> ToolResponseBase:
+        """Search for blocks matching the query.
+
+        Args:
+            user_id: User ID (required)
+            session: Chat session
+            query: Search query
+
+        Returns:
+            BlockListResponse: List of matching blocks
+            NoResultsResponse: No blocks found
+            ErrorResponse: Error message
+        """
+        query = kwargs.get("query", "").strip()
+        session_id = session.session_id
+
+        if not query:
+            return ErrorResponse(
+                message="Please provide a search query",
+                session_id=session_id,
+            )
+
+        try:
+            # Search for blocks using hybrid search
+            results, total = await unified_hybrid_search(
+                query=query,
+                content_types=[ContentType.BLOCK],
+                page=1,
+                page_size=10,
+            )
+
+            if not results:
+                return NoResultsResponse(
+                    message=f"No blocks found for '{query}'",
+                    suggestions=[
+                        "Try broader keywords like 'email', 'http', 'text', 'ai'",
+                        "Check spelling of technical terms",
+                    ],
+                    session_id=session_id,
+                )
+
+            # Enrich results with full block information
+            blocks: list[BlockInfoSummary] = []
+            for result in results:
+                block_id = result["content_id"]
+                block = get_block(block_id)
+
+                if block:
+                    # Get input/output schemas
+                    input_schema = {}
+                    output_schema = {}
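+                    try:
+                        input_schema = block.input_schema.jsonschema()
+                    except Exception:
+                        logger.debug("Could not build input schema for %s", block_id)
+                    try:
+                        output_schema = block.output_schema.jsonschema()
+                    except Exception:
+                        logger.debug("Could not build output schema for %s", block_id)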
+
+                    # Get categories from block instance
+                    categories = []
+                    if hasattr(block, "categories") and block.categories:
+                        categories = [cat.value for cat in block.categories]
+
+                    # Extract required inputs for easier use
+                    required_inputs: list[BlockInputFieldInfo] = []
+                    if input_schema:
+                        properties = input_schema.get("properties", {})
+                        required_fields = set(input_schema.get("required", []))
+                        # Get credential field names to exclude from required inputs
+                        credentials_fields = set(
+                            block.input_schema.get_credentials_fields().keys()
+                        )
+
+                        for field_name, field_schema in properties.items():
+                            # Skip credential fields - they're handled separately
+                            if field_name in credentials_fields:
+                                continue
+
+                            required_inputs.append(
+                                BlockInputFieldInfo(
+                                    name=field_name,
+                                    type=field_schema.get("type", "string"),
+                                    description=field_schema.get("description", ""),
+                                    required=field_name in required_fields,
+                                    default=field_schema.get("default"),
+                                )
+                            )
+
+                    blocks.append(
+                        BlockInfoSummary(
+                            id=block_id,
+                            name=block.name,
+                            description=block.description or "",
+                            categories=categories,
+                            input_schema=input_schema,
+                            output_schema=output_schema,
+                            required_inputs=required_inputs,
+                        )
+                    )
+
+            if not blocks:
+                return NoResultsResponse(
+                    message=f"No blocks found for '{query}'",
+                    suggestions=[
+                        "Try broader keywords like 'email', 'http', 'text', 'ai'",
+                    ],
+                    session_id=session_id,
+                )
+
+            return BlockListResponse(
+                message=(
+                    f"Found {len(blocks)} block(s) matching '{query}'. "
+                    "To execute a block, use run_block with the block's 'id' field "
+                    "and provide 'input_data' matching the block's input_schema."
+                ),
+                blocks=blocks,
+                count=len(blocks),
+                query=query,
+                session_id=session_id,
+            )
+
+        except Exception as e:
+            logger.error(f"Error searching blocks: {e}", exc_info=True)
+            return ErrorResponse(
+                message="Failed to search blocks",
+                error=str(e),
+                session_id=session_id,
+            )
--- a/autogpt_platform/backend/backend/api/features/chat/tools/find_library_agent.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/find_library_agent.py
@@ -2,6 +2,8 @@

 from typing import Any

+from langfuse import observe
+
 from backend.api.features.chat.model import ChatSession

 from .agent_search import search_agents
@@ -41,6 +43,7 @@ class FindLibraryAgentTool(BaseTool):
    def requires_auth(self) -> bool:
        return True

+    @observe(as_type="tool", name="find_library_agent")
    async def _execute(
        self, user_id: str | None, session: ChatSession, **kwargs
    ) -> ToolResponseBase:
--- a/autogpt_platform/backend/backend/api/features/chat/tools/get_doc_page.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/get_doc_page.py
@@ -0,0 +1,151 @@
+"""GetDocPageTool - Fetch full content of a documentation page."""
+
+import logging
+from pathlib import Path
+from typing import Any
+
+from langfuse import observe
+
+from backend.api.features.chat.model import ChatSession
+from backend.api.features.chat.tools.base import BaseTool
+from backend.api.features.chat.tools.models import (
+    DocPageResponse,
+    ErrorResponse,
+    ToolResponseBase,
+)
+
+logger = logging.getLogger(__name__)
+
+# Base URL for documentation (can be configured)
+DOCS_BASE_URL = "https://docs.agpt.co"
+
+
+class GetDocPageTool(BaseTool):
+    """Tool for fetching full content of a documentation page."""
+
+    @property
+    def name(self) -> str:
+        return "get_doc_page"
+
+    @property
+    def description(self) -> str:
+        return (
+            "Get the full content of a documentation page by its path. "
+            "Use this after search_docs to read the complete content of a relevant page."
+        )
+
+    @property
+    def parameters(self) -> dict[str, Any]:
+        return {
+            "type": "object",
+            "properties": {
+                "path": {
+                    "type": "string",
+                    "description": (
+                        "The path to the documentation file, as returned by search_docs. "
+                        "Example: 'platform/block-sdk-guide.md'"
+                    ),
+                },
+            },
+            "required": ["path"],
+        }
+
+    @property
+    def requires_auth(self) -> bool:
+        return False  # Documentation is public
+
+    def _get_docs_root(self) -> Path:
+        """Get the documentation root directory."""
+        this_file = Path(__file__)
+        project_root = this_file.parents[7]  # repository root, eight levels up
+        return project_root / "docs"
+
+    def _extract_title(self, content: str, fallback: str) -> str:
+        """Extract title from markdown content."""
+        lines = content.split("\n")
+        for line in lines:
+            if line.startswith("# "):
+                return line[2:].strip()
+        return fallback
+
+    def _make_doc_url(self, path: str) -> str:
+        """Create a URL for a documentation page."""
+        url_path = path.rsplit(".", 1)[0] if "." in path else path
+        return f"{DOCS_BASE_URL}/{url_path}"
+
+    @observe(as_type="tool", name="get_doc_page")
+    async def _execute(
+        self,
+        user_id: str | None,
+        session: ChatSession,
+        **kwargs,
+    ) -> ToolResponseBase:
+        """Fetch full content of a documentation page.
+
+        Args:
+            user_id: User ID (not required for docs)
+            session: Chat session
+            path: Path to the documentation file
+
+        Returns:
+            DocPageResponse: Full document content
+            ErrorResponse: Error message
+        """
+        path = kwargs.get("path", "").strip()
+        session_id = session.session_id if session else None
+
+        if not path:
+            return ErrorResponse(
+                message="Please provide a documentation path.",
+                error="Missing path parameter",
+                session_id=session_id,
+            )
+
+        # Reject absolute paths and any path containing ".." to prevent directory traversal
+        if ".." in path or path.startswith("/"):
+            return ErrorResponse(
+                message="Invalid documentation path.",
+                error="invalid_path",
+                session_id=session_id,
+            )
+
+        docs_root = self._get_docs_root()
+        full_path = docs_root / path
+
+        if not full_path.exists():
+            return ErrorResponse(
+                message=f"Documentation page not found: {path}",
+                error="not_found",
+                session_id=session_id,
+            )
+
+        # Ensure the resolved path (symlinks followed) is still within docs root
+        try:
+            full_path.resolve().relative_to(docs_root.resolve())
+        except ValueError:
+            return ErrorResponse(
+                message="Invalid documentation path.",
+                error="invalid_path",
+                session_id=session_id,
+            )
+
+        try:
+            content = full_path.read_text(encoding="utf-8")
+            title = self._extract_title(content, path)
+
+            return DocPageResponse(
+                message=f"Retrieved documentation page: {title}",
+                title=title,
+                path=path,
+                content=content,
+                doc_url=self._make_doc_url(path),
+                session_id=session_id,
+            )
+
+        except Exception as e:
+            logger.error(f"Failed to read documentation page {path}: {e}")
+            return ErrorResponse(
+                message=f"Failed to read documentation page: {str(e)}",
+                error="read_failed",
+                session_id=session_id,
+            )
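The containment check used by `get_doc_page` above can be tried in isolation. The following is a minimal sketch, not the tool's actual code: `resolve_inside` is a hypothetical helper name, and it assumes the same two-step idea as the diff (reject obviously bad input first, then confirm the resolved path still sits under the docs root).

```python
import tempfile
from pathlib import Path
from typing import Optional


def resolve_inside(root: Path, relative: str) -> Optional[Path]:
    """Return the resolved path if it stays inside root, otherwise None.

    Hypothetical helper for illustration; mirrors the checks in the diff.
    """
    # Cheap early rejection of empty, absolute, or parent-traversing input.
    if not relative or relative.startswith("/") or ".." in relative:
        return None

    # resolve() follows symlinks, so a link pointing outside root is caught.
    candidate = (root / relative).resolve()
    try:
        candidate.relative_to(root.resolve())
    except ValueError:
        return None
    return candidate


if __name__ == "__main__":
    docs_root = Path(tempfile.mkdtemp())
    print(resolve_inside(docs_root, "platform/block-sdk-guide.md"))
    print(resolve_inside(docs_root, "../secrets.txt"))
```

The string check alone would miss a symlink that escapes the root, and the `relative_to` check alone would still work without it, so the early rejection mainly serves to return a clear error quickly.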
--- a/autogpt_platform/backend/backend/api/features/chat/tools/models.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/models.py
@@ -21,6 +21,13 @@ class ResponseType(str, Enum):
    NO_RESULTS = "no_results"
    AGENT_OUTPUT = "agent_output"
    UNDERSTANDING_UPDATED = "understanding_updated"
+    AGENT_PREVIEW = "agent_preview"
+    AGENT_SAVED = "agent_saved"
+    CLARIFICATION_NEEDED = "clarification_needed"
+    BLOCK_LIST = "block_list"
+    BLOCK_OUTPUT = "block_output"
+    DOC_SEARCH_RESULTS = "doc_search_results"
+    DOC_PAGE = "doc_page"


 # Base response model
@@ -209,3 +216,121 @@ class UnderstandingUpdatedResponse(ToolResponseBase):
    type: ResponseType = ResponseType.UNDERSTANDING_UPDATED
    updated_fields: list[str] = Field(default_factory=list)
    current_understanding: dict[str, Any] = Field(default_factory=dict)
+
+
+# Agent generation models
+class ClarifyingQuestion(BaseModel):
+    """A question that needs user clarification."""
+
+    question: str
+    keyword: str
+    example: str | None = None
+
+
+class AgentPreviewResponse(ToolResponseBase):
+    """Response for previewing a generated agent before saving."""
+
+    type: ResponseType = ResponseType.AGENT_PREVIEW
+    agent_json: dict[str, Any]
+    agent_name: str
+    description: str
+    node_count: int
+    link_count: int = 0
+
+
+class AgentSavedResponse(ToolResponseBase):
+    """Response when an agent is saved to the library."""
+
+    type: ResponseType = ResponseType.AGENT_SAVED
+    agent_id: str
+    agent_name: str
+    library_agent_id: str
+    library_agent_link: str
+    agent_page_link: str  # Link to the agent builder/editor page
+
+
+class ClarificationNeededResponse(ToolResponseBase):
+    """Response when the LLM needs more information from the user."""
+
+    type: ResponseType = ResponseType.CLARIFICATION_NEEDED
+    questions: list[ClarifyingQuestion] = Field(default_factory=list)
+
+
+# Documentation search models
+class DocSearchResult(BaseModel):
+    """A single documentation search result."""
+
+    title: str
+    path: str
+    section: str
+    snippet: str  # Short excerpt for UI display
+    score: float
+    doc_url: str | None = None
+
+
+class DocSearchResultsResponse(ToolResponseBase):
+    """Response for search_docs tool."""
+
+    type: ResponseType = ResponseType.DOC_SEARCH_RESULTS
+    results: list[DocSearchResult]
+    count: int
+    query: str
+
+
+class DocPageResponse(ToolResponseBase):
+    """Response for get_doc_page tool."""
+
+    type: ResponseType = ResponseType.DOC_PAGE
+    title: str
+    path: str
+    content: str  # Full document content
+    doc_url: str | None = None
+
+
+# Block models
+class BlockInputFieldInfo(BaseModel):
+    """Information about a block input field."""
+
+    name: str
+    type: str
+    description: str = ""
+    required: bool = False
+    default: Any | None = None
+
+
+class BlockInfoSummary(BaseModel):
+    """Summary of a block for search results."""
+
+    id: str
+    name: str
+    description: str
+    categories: list[str]
+    input_schema: dict[str, Any]
+    output_schema: dict[str, Any]
+    required_inputs: list[BlockInputFieldInfo] = Field(
+        default_factory=list,
+        description="List of required input fields for this block",
+    )
+
+
+class BlockListResponse(ToolResponseBase):
+    """Response for find_block tool."""
+
+    type: ResponseType = ResponseType.BLOCK_LIST
+    blocks: list[BlockInfoSummary]
+    count: int
+    query: str
+    usage_hint: str = Field(
+        default="To execute a block, call run_block with block_id set to the block's "
+        "'id' field and input_data containing the required fields from input_schema."
+    )
+
+
+class BlockOutputResponse(ToolResponseBase):
+    """Response for run_block tool."""
+
+    type: ResponseType = ResponseType.BLOCK_OUTPUT
+    block_id: str
+    block_name: str
+    outputs: dict[str, list[Any]]
+    success: bool = True
--- a/autogpt_platform/backend/backend/api/features/chat/tools/run_agent.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/run_agent.py
@@ -3,6 +3,7 @@
 import logging
 from typing import Any

+from langfuse import observe
 from pydantic import BaseModel, Field, field_validator

 from backend.api.features.chat.config import ChatConfig
@@ -32,7 +33,7 @@ from .models import (
    UserReadiness,
 )
 from .utils import (
-    check_user_has_required_credentials,
+    build_missing_credentials_from_graph,
    extract_credentials_from_schema,
    fetch_graph_from_store_slug,
    get_or_create_library_agent,
@@ -154,6 +155,7 @@ class RunAgentTool(BaseTool):
        """All operations require authentication."""
        return True

+    @observe(as_type="tool", name="run_agent")
    async def _execute(
        self,
        user_id: str | None,
@@ -235,15 +237,13 @@ class RunAgentTool(BaseTool):
                # Return credentials needed response with input data info
                # The UI handles credential setup automatically, so the message
                # focuses on asking about input data
-                credentials = extract_credentials_from_schema(
-                    graph.credentials_input_schema
+                requirements_creds_dict = build_missing_credentials_from_graph(
+                    graph, None
                )
-                missing_creds_check = await check_user_has_required_credentials(
-                    user_id, credentials
+                missing_credentials_dict = build_missing_credentials_from_graph(
+                    graph, graph_credentials
                )
-                missing_credentials_dict = {
-                    c.id: c.model_dump() for c in missing_creds_check
-                }
+                requirements_creds_list = list(requirements_creds_dict.values())

                return SetupRequirementsResponse(
                    message=self._build_inputs_message(graph, MSG_WHAT_VALUES_TO_USE),
@@ -257,7 +257,7 @@ class RunAgentTool(BaseTool):
                            ready_to_run=False,
                        ),
                        requirements={
-                            "credentials": [c.model_dump() for c in credentials],
+                            "credentials": requirements_creds_list,
                            "inputs": self._get_inputs_list(graph.input_schema),
                            "execution_modes": self._get_execution_modes(graph),
                        },
--- a/autogpt_platform/backend/backend/api/features/chat/tools/run_agent_test.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/run_agent_test.py
@@ -29,7 +29,7 @@ def mock_embedding_functions():
        yield


-@pytest.mark.asyncio(scope="session")
+@pytest.mark.asyncio(loop_scope="session")
 async def test_run_agent(setup_test_data):
    """Test that the run_agent tool successfully executes an approved agent"""
    # Use test data from fixture
@@ -70,7 +70,7 @@ async def test_run_agent(setup_test_data):
    assert result_data["graph_name"] == "Test Agent"


-@pytest.mark.asyncio(scope="session")
+@pytest.mark.asyncio(loop_scope="session")
 async def test_run_agent_missing_inputs(setup_test_data):
    """Test that the run_agent tool returns error when inputs are missing"""
    # Use test data from fixture
@@ -106,7 +106,7 @@ async def test_run_agent_missing_inputs(setup_test_data):
    assert "message" in result_data


-@pytest.mark.asyncio(scope="session")
+@pytest.mark.asyncio(loop_scope="session")
 async def test_run_agent_invalid_agent_id(setup_test_data):
    """Test that the run_agent tool returns error for invalid agent ID"""
    # Use test data from fixture
@@ -141,7 +141,7 @@ async def test_run_agent_invalid_agent_id(setup_test_data):
    )


-@pytest.mark.asyncio(scope="session")
+@pytest.mark.asyncio(loop_scope="session")
 async def test_run_agent_with_llm_credentials(setup_llm_test_data):
    """Test that run_agent works with an agent requiring LLM credentials"""
    # Use test data from fixture
@@ -185,7 +185,7 @@ async def test_run_agent_with_llm_credentials(setup_llm_test_data):
    assert result_data["graph_name"] == "LLM Test Agent"


-@pytest.mark.asyncio(scope="session")
+@pytest.mark.asyncio(loop_scope="session")
 async def test_run_agent_shows_available_inputs_when_none_provided(setup_test_data):
    """Test that run_agent returns available inputs when called without inputs or use_defaults."""
    user = setup_test_data["user"]
@@ -219,7 +219,7 @@ async def test_run_agent_shows_available_inputs_when_none_provided(setup_test_da
    assert "inputs" in result_data["message"].lower()


-@pytest.mark.asyncio(scope="session")
+@pytest.mark.asyncio(loop_scope="session")
 async def test_run_agent_with_use_defaults(setup_test_data):
    """Test that run_agent executes successfully with use_defaults=True."""
    user = setup_test_data["user"]
@@ -251,7 +251,7 @@ async def test_run_agent_with_use_defaults(setup_test_data):
    assert result_data["graph_id"] == graph.id


-@pytest.mark.asyncio(scope="session")
+@pytest.mark.asyncio(loop_scope="session")
 async def test_run_agent_missing_credentials(setup_firecrawl_test_data):
    """Test that run_agent returns setup_requirements when credentials are missing."""
    user = setup_firecrawl_test_data["user"]
@@ -285,7 +285,7 @@ async def test_run_agent_missing_credentials(setup_firecrawl_test_data):
    assert len(setup_info["user_readiness"]["missing_credentials"]) > 0


-@pytest.mark.asyncio(scope="session")
+@pytest.mark.asyncio(loop_scope="session")
 async def test_run_agent_invalid_slug_format(setup_test_data):
    """Test that run_agent returns error for invalid slug format (no slash)."""
    user = setup_test_data["user"]
@@ -313,7 +313,7 @@ async def test_run_agent_invalid_slug_format(setup_test_data):
    assert "username/agent-name" in result_data["message"]


-@pytest.mark.asyncio(scope="session")
+@pytest.mark.asyncio(loop_scope="session")
 async def test_run_agent_unauthenticated():
    """Test that run_agent returns need_login for unauthenticated users."""
    tool = RunAgentTool()
@@ -340,7 +340,7 @@ async def test_run_agent_unauthenticated():
    assert "sign in" in result_data["message"].lower()


-@pytest.mark.asyncio(scope="session")
+@pytest.mark.asyncio(loop_scope="session")
 async def test_run_agent_schedule_without_cron(setup_test_data):
    """Test that run_agent returns error when scheduling without cron expression."""
    user = setup_test_data["user"]
@@ -372,7 +372,7 @@ async def test_run_agent_schedule_without_cron(setup_test_data):
    assert "cron" in result_data["message"].lower()


-@pytest.mark.asyncio(scope="session")
+@pytest.mark.asyncio(loop_scope="session")
 async def test_run_agent_schedule_without_name(setup_test_data):
    """Test that run_agent returns error when scheduling without schedule_name."""
    user = setup_test_data["user"]
--- a/autogpt_platform/backend/backend/api/features/chat/tools/run_block.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/run_block.py
@@ -0,0 +1,305 @@
+"""Tool for executing blocks directly."""
+
+import logging
+from collections import defaultdict
+from typing import Any
+
+from langfuse import observe
+
+from backend.api.features.chat.model import ChatSession
+from backend.data.block import get_block
+from backend.data.execution import ExecutionContext
+from backend.data.model import CredentialsMetaInput
+from backend.integrations.creds_manager import IntegrationCredentialsManager
+from backend.util.exceptions import BlockError
+
+from .base import BaseTool
+from .models import (
+    BlockOutputResponse,
+    ErrorResponse,
+    SetupInfo,
+    SetupRequirementsResponse,
+    ToolResponseBase,
+    UserReadiness,
+)
+from .utils import build_missing_credentials_from_field_info
+
+logger = logging.getLogger(__name__)
+
+
+class RunBlockTool(BaseTool):
+    """Tool for executing a block and returning its outputs."""
+
+    @property
+    def name(self) -> str:
+        return "run_block"
+
+    @property
+    def description(self) -> str:
+        return (
+            "Execute a specific block with the provided input data. "
+            "IMPORTANT: You MUST call find_block first to get the block's 'id' - "
+            "do NOT guess or make up block IDs. "
+            "Use the 'id' from find_block results and provide input_data "
+            "matching the block's required_inputs."
+        )
+
+    @property
+    def parameters(self) -> dict[str, Any]:
+        return {
+            "type": "object",
+            "properties": {
+                "block_id": {
+                    "type": "string",
+                    "description": (
+                        "The block's 'id' field from find_block results. "
+                        "NEVER guess this - always get it from find_block first."
+                    ),
+                },
+                "input_data": {
+                    "type": "object",
+                    "description": (
+                        "Input values for the block. Use the 'required_inputs' field "
+                        "from find_block to see what fields are needed."
+                    ),
+                },
+            },
+            "required": ["block_id", "input_data"],
+        }
+
+    @property
+    def requires_auth(self) -> bool:
+        return True
+
+    async def _check_block_credentials(
+        self,
+        user_id: str,
+        block: Any,
+    ) -> tuple[dict[str, CredentialsMetaInput], list[CredentialsMetaInput]]:
+        """
+        Check if user has required credentials for a block.
+
+        Returns:
+            tuple[matched_credentials, missing_credentials]
+        """
+        matched_credentials: dict[str, CredentialsMetaInput] = {}
+        missing_credentials: list[CredentialsMetaInput] = []
+
+        # Get credential field info from block's input schema
+        credentials_fields_info = block.input_schema.get_credentials_fields_info()
+
+        if not credentials_fields_info:
+            return matched_credentials, missing_credentials
+
+        # Get user's available credentials
+        creds_manager = IntegrationCredentialsManager()
+        available_creds = await creds_manager.store.get_all_creds(user_id)
+
+        for field_name, field_info in credentials_fields_info.items():
+            # field_info.provider is a frozenset of acceptable providers
+            # field_info.supported_types is a frozenset of acceptable types
+            matching_cred = next(
+                (
+                    cred
+                    for cred in available_creds
+                    if cred.provider in field_info.provider
+                    and cred.type in field_info.supported_types
+                ),
+                None,
+            )
+
+            if matching_cred:
+                matched_credentials[field_name] = CredentialsMetaInput(
+                    id=matching_cred.id,
+                    provider=matching_cred.provider,  # type: ignore
+                    type=matching_cred.type,
+                    title=matching_cred.title,
+                )
+            else:
+                # Create a placeholder for the missing credential
+                provider = next(iter(field_info.provider), "unknown")
+                cred_type = next(iter(field_info.supported_types), "api_key")
+                missing_credentials.append(
+                    CredentialsMetaInput(
+                        id=field_name,
+                        provider=provider,  # type: ignore
+                        type=cred_type,  # type: ignore
+                        title=field_name.replace("_", " ").title(),
+                    )
+                )
+
+        return matched_credentials, missing_credentials
+
+    @observe(as_type="tool", name="run_block")
+    async def _execute(
+        self,
+        user_id: str | None,
+        session: ChatSession,
+        **kwargs,
+    ) -> ToolResponseBase:
+        """Execute a block with the given input data.
+
+        Args:
+            user_id: User ID (required)
+            session: Chat session
+            block_id: Block UUID to execute
+            input_data: Input values for the block
+
+        Returns:
+            BlockOutputResponse: Block execution outputs
+            SetupRequirementsResponse: Missing credentials
+            ErrorResponse: Error message
+        """
+        block_id = kwargs.get("block_id", "").strip()
+        input_data = kwargs.get("input_data", {})
+        session_id = session.session_id
+
+        if not block_id:
+            return ErrorResponse(
+                message="Please provide a block_id",
+                session_id=session_id,
+            )
+
+        if not isinstance(input_data, dict):
+            return ErrorResponse(
+                message="input_data must be an object",
+                session_id=session_id,
+            )
+
+        if not user_id:
+            return ErrorResponse(
+                message="Authentication required",
+                session_id=session_id,
+            )
+
+        # Get the block
+        block = get_block(block_id)
+        if not block:
+            return ErrorResponse(
+                message=f"Block '{block_id}' not found",
+                session_id=session_id,
+            )
+
+        logger.info(f"Executing block {block.name} ({block_id}) for user {user_id}")
+
+        # Check credentials
+        creds_manager = IntegrationCredentialsManager()
+        matched_credentials, missing_credentials = await self._check_block_credentials(
+            user_id, block
+        )
+
+        if missing_credentials:
+            # Return setup requirements response with missing credentials
+            credentials_fields_info = block.input_schema.get_credentials_fields_info()
+            missing_creds_dict = build_missing_credentials_from_field_info(
+                credentials_fields_info, set(matched_credentials.keys())
+            )
+            missing_creds_list = list(missing_creds_dict.values())
+
+            return SetupRequirementsResponse(
+                message=(
+                    f"Block '{block.name}' requires credentials that are not configured. "
+                    "Please set up the required credentials before running this block."
+                ),
+                session_id=session_id,
+                setup_info=SetupInfo(
+                    agent_id=block_id,
+                    agent_name=block.name,
+                    user_readiness=UserReadiness(
+                        has_all_credentials=False,
+                        missing_credentials=missing_creds_dict,
+                        ready_to_run=False,
+                    ),
+                    requirements={
+                        "credentials": missing_creds_list,
+                        "inputs": self._get_inputs_list(block),
+                        "execution_modes": ["immediate"],
+                    },
+                ),
+                graph_id=None,
+                graph_version=None,
+            )
+
+        try:
+            # Fetch actual credentials and prepare kwargs for block execution
+            # Create execution context with defaults (blocks may require it)
+            exec_kwargs: dict[str, Any] = {
+                "user_id": user_id,
+                "execution_context": ExecutionContext(),
+            }
+
+            for field_name, cred_meta in matched_credentials.items():
+                # Inject metadata into input_data (for validation)
+                if field_name not in input_data:
+                    input_data[field_name] = cred_meta.model_dump()
+
+                # Fetch actual credentials and pass as kwargs (for execution)
+                actual_credentials = await creds_manager.get(
+                    user_id, cred_meta.id, lock=False
+                )
+                if actual_credentials:
+                    exec_kwargs[field_name] = actual_credentials
+                else:
+                    return ErrorResponse(
+                        message=f"Failed to retrieve credentials for {field_name}",
+                        session_id=session_id,
+                    )
+
+            # Execute the block and collect outputs
+            outputs: dict[str, list[Any]] = defaultdict(list)
+            async for output_name, output_data in block.execute(
+                input_data,
+                **exec_kwargs,
+            ):
+                outputs[output_name].append(output_data)
+
+            return BlockOutputResponse(
+                message=f"Block '{block.name}' executed successfully",
+                block_id=block_id,
+                block_name=block.name,
+                outputs=dict(outputs),
+                success=True,
+                session_id=session_id,
+            )
+
+        except BlockError as e:
+            logger.warning(f"Block execution failed: {e}")
+            return ErrorResponse(
+                message=f"Block execution failed: {e}",
+                error=str(e),
+                session_id=session_id,
+            )
+        except Exception as e:
+            logger.error(f"Unexpected error executing block: {e}", exc_info=True)
+            return ErrorResponse(
+                message=f"Failed to execute block: {str(e)}",
+                error=str(e),
+                session_id=session_id,
+            )
+
+    def _get_inputs_list(self, block: Any) -> list[dict[str, Any]]:
+        """Extract non-credential inputs from block schema."""
+        inputs_list = []
+        schema = block.input_schema.jsonschema()
+        properties = schema.get("properties", {})
+        required_fields = set(schema.get("required", []))
+
+        # Get credential field names to exclude
+        credentials_fields = set(block.input_schema.get_credentials_fields().keys())
+
+        for field_name, field_schema in properties.items():
+            # Skip credential fields
+            if field_name in credentials_fields:
+                continue
+
+            inputs_list.append(
+                {
+                    "name": field_name,
+                    "title": field_schema.get("title", field_name),
+                    "type": field_schema.get("type", "string"),
+                    "description": field_schema.get("description", ""),
+                    "required": field_name in required_fields,
+                }
+            )
+
+        return inputs_list
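The output-collection loop in `run_block` above groups every yielded `(output_name, output_data)` pair into a list per output name. Below is a minimal sketch of that pattern under stated assumptions: `fake_block_execute` is a hypothetical stand-in for `block.execute()`, and `collect_outputs` is an illustrative helper that does not exist in the PR.

```python
import asyncio
from collections import defaultdict
from typing import Any, AsyncIterator


async def fake_block_execute(
    input_data: dict[str, Any],
) -> AsyncIterator[tuple[str, Any]]:
    """Hypothetical stand-in for block.execute(); yields (name, value) pairs."""
    for item in input_data["items"]:
        yield "item", item
    yield "count", len(input_data["items"])


async def collect_outputs(
    stream: AsyncIterator[tuple[str, Any]],
) -> dict[str, list[Any]]:
    """Group streamed outputs by name, preserving the order they were yielded."""
    outputs: dict[str, list[Any]] = defaultdict(list)
    async for name, value in stream:
        outputs[name].append(value)
    # Convert to a plain dict so missing keys raise instead of creating lists.
    return dict(outputs)


if __name__ == "__main__":
    result = asyncio.run(collect_outputs(fake_block_execute({"items": ["a", "b"]})))
    print(result)
```

Returning a plain `dict` rather than the `defaultdict` matches the `outputs=dict(outputs)` call in the diff and keeps the response free of implicit key creation.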
--- a/autogpt_platform/backend/backend/api/features/chat/tools/search_docs.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/search_docs.py
@@ -0,0 +1,210 @@
+"""SearchDocsTool - Search documentation using hybrid search."""
+
+import logging
+from typing import Any
+
+from langfuse import observe
+from prisma.enums import ContentType
+
+from backend.api.features.chat.model import ChatSession
+from backend.api.features.chat.tools.base import BaseTool
+from backend.api.features.chat.tools.models import (
+    DocSearchResult,
+    DocSearchResultsResponse,
+    ErrorResponse,
+    NoResultsResponse,
+    ToolResponseBase,
+)
+from backend.api.features.store.hybrid_search import unified_hybrid_search
+
+logger = logging.getLogger(__name__)
+
+# Base URL for documentation (can be configured)
+DOCS_BASE_URL = "https://docs.agpt.co"
+
+# Maximum number of results to return
+MAX_RESULTS = 5
+
+# Snippet length for preview
+SNIPPET_LENGTH = 200
+
+
+class SearchDocsTool(BaseTool):
+    """Tool for searching AutoGPT platform documentation."""
+
+    @property
+    def name(self) -> str:
+        return "search_docs"
+
+    @property
+    def description(self) -> str:
+        return (
+            "Search the AutoGPT platform documentation for information about "
+            "how to use the platform, build agents, configure blocks, and more. "
+            "Returns relevant documentation sections. Use get_doc_page to read full content."
+        )
+
+    @property
+    def parameters(self) -> dict[str, Any]:
+        return {
+            "type": "object",
+            "properties": {
+                "query": {
+                    "type": "string",
+                    "description": (
+                        "Search query to find relevant documentation. "
+                        "Use natural language to describe what you're looking for."
+                    ),
+                },
+            },
+            "required": ["query"],
+        }
+
+    @property
+    def requires_auth(self) -> bool:
+        return False  # Documentation is public
+
+    def _create_snippet(self, content: str, max_length: int = SNIPPET_LENGTH) -> str:
+        """Create a short snippet from content for preview."""
+        # Remove markdown formatting for cleaner snippet
+        clean_content = content.replace("#", "").replace("*", "").replace("`", "")
+        # Remove extra whitespace
+        clean_content = " ".join(clean_content.split())
+
+        if len(clean_content) <= max_length:
+            return clean_content
+
+        # Truncate at word boundary
+        truncated = clean_content[:max_length]
+        last_space = truncated.rfind(" ")
+        if last_space > max_length // 2:
+            truncated = truncated[:last_space]
+
+        return truncated + "..."
+
+    def _make_doc_url(self, path: str) -> str:
+        """Create a URL for a documentation page."""
+        # Remove file extension for URL
+        url_path = path.rsplit(".", 1)[0] if "." in path else path
+        return f"{DOCS_BASE_URL}/{url_path}"
+
+    @observe(as_type="tool", name="search_docs")
+    async def _execute(
+        self,
+        user_id: str | None,
+        session: ChatSession,
+        **kwargs,
+    ) -> ToolResponseBase:
+        """Search documentation and return relevant sections.
+
+        Args:
+            user_id: User ID (not required for docs)
+            session: Chat session
+            query: Search query
+
+        Returns:
+            DocSearchResultsResponse: List of matching documentation sections
+            NoResultsResponse: No results found
+            ErrorResponse: Error message
+        """
+        query = kwargs.get("query", "").strip()
+        session_id = session.session_id if session else None
+
+        if not query:
+            return ErrorResponse(
+                message="Please provide a search query.",
+                error="Missing query parameter",
+                session_id=session_id,
+            )
+
+        try:
+            # Search using hybrid search for DOCUMENTATION content type only
+            results, total = await unified_hybrid_search(
+                query=query,
+                content_types=[ContentType.DOCUMENTATION],
+                page=1,
+                page_size=MAX_RESULTS * 2,  # Fetch extra for deduplication
+                min_score=0.1,  # Lower threshold for docs
+            )
+
+            if not results:
+                return NoResultsResponse(
+                    message=f"No documentation found for '{query}'.",
+                    suggestions=[
+                        "Try different keywords",
+                        "Use more general terms",
+                        "Check for typos in your query",
+                    ],
+                    session_id=session_id,
+                )
+
+            # Deduplicate by document path (keep highest scoring section per doc)
+            seen_docs: dict[str, dict[str, Any]] = {}
+            for result in results:
+                metadata = result.get("metadata", {})
+                doc_path = metadata.get("path", "")
+
+                if not doc_path:
+                    continue
+
+                # Keep the highest scoring result for each document
+                if doc_path not in seen_docs:
+                    seen_docs[doc_path] = result
+                elif result.get("combined_score", 0) > seen_docs[doc_path].get(
+                    "combined_score", 0
+                ):
+                    seen_docs[doc_path] = result
+
+            # Sort by score and take top MAX_RESULTS
+            deduplicated = sorted(
+                seen_docs.values(),
+                key=lambda x: x.get("combined_score", 0),
+                reverse=True,
+            )[:MAX_RESULTS]
+
+            if not deduplicated:
+                return NoResultsResponse(
+                    message=f"No documentation found for '{query}'.",
+                    suggestions=[
+                        "Try different keywords",
+                        "Use more general terms",
+                    ],
+                    session_id=session_id,
+                )
+
+            # Build response
+            doc_results: list[DocSearchResult] = []
+            for result in deduplicated:
+                metadata = result.get("metadata", {})
+                doc_path = metadata.get("path", "")
+                doc_title = metadata.get("doc_title", "")
+                section_title = metadata.get("section_title", "")
+                searchable_text = result.get("searchable_text", "")
+                score = result.get("combined_score", 0)
+
+                doc_results.append(
+                    DocSearchResult(
+                        title=doc_title or section_title or doc_path,
+                        path=doc_path,
+                        section=section_title,
+                        snippet=self._create_snippet(searchable_text),
+                        score=round(score, 3),
+                        doc_url=self._make_doc_url(doc_path),
+                    )
+                )
+
+            return DocSearchResultsResponse(
+                message=f"Found {len(doc_results)} relevant documentation sections.",
+                results=doc_results,
+                count=len(doc_results),
+                query=query,
+                session_id=session_id,
+            )
+
+        except Exception as e:
+            logger.error(f"Documentation search failed: {e}")
+            return ErrorResponse(
+                message=f"Failed to search documentation: {str(e)}",
+                error="search_failed",
+                session_id=session_id,
+            )
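
The tail of the documentation search tool above ranks deduplicated hits by `combined_score` and caps them at `MAX_RESULTS`. The following is a minimal sketch of that dedupe-then-rank step, not the tool's actual code. It assumes hits are plain dicts, that the dedup key is `metadata["path"]`, and that the best-scoring hit per document wins; the helper name `dedupe_and_rank`, the value of `MAX_RESULTS`, and the sample data are all illustrative.

```python
from __future__ import annotations

from typing import Any

MAX_RESULTS = 5  # assumed cap; the real constant's value is not shown in this diff


def dedupe_and_rank(
    hits: list[dict[str, Any]], limit: int = MAX_RESULTS
) -> list[dict[str, Any]]:
    """Keep the best-scoring hit per document path, then return the top `limit`."""
    seen_docs: dict[str, dict[str, Any]] = {}
    for hit in hits:
        # Assumption: the document path is the deduplication key
        path = hit.get("metadata", {}).get("path", "")
        best = seen_docs.get(path)
        if best is None or hit.get("combined_score", 0) > best.get("combined_score", 0):
            seen_docs[path] = hit

    # Same ranking expression as the diff: highest combined_score first, then cap
    return sorted(
        seen_docs.values(),
        key=lambda x: x.get("combined_score", 0),
        reverse=True,
    )[:limit]


hits = [
    {"metadata": {"path": "a.md"}, "combined_score": 0.4},
    {"metadata": {"path": "a.md"}, "combined_score": 0.9},
    {"metadata": {"path": "b.md"}, "combined_score": 0.7},
]
ranked = dedupe_and_rank(hits)
```

Using `.get("combined_score", 0)` in both the comparison and the sort key means a hit with no score is ranked last rather than raising a `KeyError`.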
--- a/autogpt_platform/backend/backend/api/features/chat/tools/utils.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/utils.py
@@ -8,7 +8,7 @@ from backend.api.features.library import model as library_model
 from backend.api.features.store import db as store_db
 from backend.data import graph as graph_db
 from backend.data.graph import GraphModel
-from backend.data.model import CredentialsMetaInput
+from backend.data.model import CredentialsFieldInfo, CredentialsMetaInput
 from backend.integrations.creds_manager import IntegrationCredentialsManager
 from backend.util.exceptions import NotFoundError

@@ -89,6 +89,59 @@ def extract_credentials_from_schema(
    return credentials


+def _serialize_missing_credential(
+    field_key: str, field_info: CredentialsFieldInfo
+) -> dict[str, Any]:
+    """
+    Convert credential field info into a serializable dict that preserves all supported
+    credential types (e.g., api_key + oauth2) so the UI can offer multiple options.
+    """
+    supported_types = sorted(field_info.supported_types)
+    provider = next(iter(field_info.provider), "unknown")
+    scopes = sorted(field_info.required_scopes or [])
+
+    return {
+        "id": field_key,
+        "title": field_key.replace("_", " ").title(),
+        "provider": provider,
+        "provider_name": provider.replace("_", " ").title(),
+        "type": supported_types[0] if supported_types else "api_key",
+        "types": supported_types,
+        "scopes": scopes,
+    }
+
+
+def build_missing_credentials_from_graph(
+    graph: GraphModel, matched_credentials: dict[str, CredentialsMetaInput] | None
+) -> dict[str, Any]:
+    """
+    Build a missing_credentials mapping from a graph's aggregated credentials inputs,
+    preserving all supported credential types for each field.
+    """
+    matched_keys = set(matched_credentials.keys()) if matched_credentials else set()
+    aggregated_fields = graph.aggregate_credentials_inputs()
+
+    return {
+        field_key: _serialize_missing_credential(field_key, field_info)
+        for field_key, (field_info, _node_fields) in aggregated_fields.items()
+        if field_key not in matched_keys
+    }
+
+
+def build_missing_credentials_from_field_info(
+    credential_fields: dict[str, CredentialsFieldInfo],
+    matched_keys: set[str],
+) -> dict[str, Any]:
+    """
+    Build a missing_credentials mapping from a plain dictionary of credentials field info.
+    """
+    return {
+        field_key: _serialize_missing_credential(field_key, field_info)
+        for field_key, field_info in credential_fields.items()
+        if field_key not in matched_keys
+    }
+
+
 def extract_credentials_as_dict(
    credentials_input_schema: dict[str, Any] | None,
 ) -> dict[str, CredentialsMetaInput]:
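
The helpers added to `utils.py` above can be exercised without the backend. This sketch mirrors `_serialize_missing_credential` and `build_missing_credentials_from_field_info`, assuming a hypothetical stand-in class `FakeCredentialsFieldInfo` that exposes only the three attributes the helper reads; the real `CredentialsFieldInfo` model is not reproduced here.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FakeCredentialsFieldInfo:
    """Hypothetical stand-in with only the attributes the serializer reads."""

    provider: frozenset
    supported_types: frozenset
    required_scopes: Optional[frozenset] = None


def serialize_missing_credential(
    field_key: str, field_info: FakeCredentialsFieldInfo
) -> dict[str, Any]:
    # Sorting makes the output deterministic even though the inputs are sets
    supported_types = sorted(field_info.supported_types)
    provider = next(iter(field_info.provider), "unknown")
    scopes = sorted(field_info.required_scopes or [])
    return {
        "id": field_key,
        "title": field_key.replace("_", " ").title(),
        "provider": provider,
        "provider_name": provider.replace("_", " ").title(),
        "type": supported_types[0] if supported_types else "api_key",
        "types": supported_types,
        "scopes": scopes,
    }


def build_missing(
    credential_fields: dict[str, FakeCredentialsFieldInfo], matched_keys: set
) -> dict[str, Any]:
    # Only fields without a matched credential are reported as missing
    return {
        key: serialize_missing_credential(key, info)
        for key, info in credential_fields.items()
        if key not in matched_keys
    }


fields = {
    "github_credentials": FakeCredentialsFieldInfo(
        provider=frozenset({"github"}),
        supported_types=frozenset({"oauth2", "api_key"}),
        required_scopes=frozenset({"repo"}),
    ),
    "openai_credentials": FakeCredentialsFieldInfo(
        provider=frozenset({"openai"}),
        supported_types=frozenset({"api_key"}),
    ),
}
missing = build_missing(fields, matched_keys={"openai_credentials"})
```

Note that `type` is the alphabetically first supported type, so a field that supports both `api_key` and `oauth2` reports `api_key` there, while `types` carries the full list for the UI.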
--- a/autogpt_platform/backend/backend/api/features/executions/review/model.py
+++ b/autogpt_platform/backend/backend/api/features/executions/review/model.py
@@ -23,6 +23,7 @@ class PendingHumanReviewModel(BaseModel):
        id: Unique identifier for the review record
        user_id: ID of the user who must perform the review
        node_exec_id: ID of the node execution that created this review
+        node_id: ID of the node definition (for grouping reviews from same node)
        graph_exec_id: ID of the graph execution containing the node
        graph_id: ID of the graph template being executed
        graph_version: Version number of the graph template
@@ -37,6 +38,10 @@ class PendingHumanReviewModel(BaseModel):
    """

    node_exec_id: str = Field(description="Node execution ID (primary key)")
+    node_id: str = Field(
+        description="Node definition ID (for grouping)",
+        default="",  # Temporary default for test compatibility
+    )
    user_id: str = Field(description="User ID associated with the review")
    graph_exec_id: str = Field(description="Graph execution ID")
    graph_id: str = Field(description="Graph ID")
@@ -66,7 +71,9 @@ class PendingHumanReviewModel(BaseModel):
    )

    @classmethod
-    def from_db(cls, review: "PendingHumanReview") -> "PendingHumanReviewModel":
+    def from_db(
+        cls, review: "PendingHumanReview", node_id: str
+    ) -> "PendingHumanReviewModel":
        """
        Convert a database model to a response model.

@@ -74,9 +81,14 @@ class PendingHumanReviewModel(BaseModel):
        payload, instructions, and editable flag.

        Handles invalid data gracefully by using safe defaults.
+
+        Args:
+            review: Database review object
+            node_id: Node definition ID (fetched from NodeExecution)
        """
        return cls(
            node_exec_id=review.nodeExecId,
+            node_id=node_id,
            user_id=review.userId,
            graph_exec_id=review.graphExecId,
            graph_id=review.graphId,
@@ -107,6 +119,13 @@ class ReviewItem(BaseModel):
    reviewed_data: SafeJsonData | None = Field(
        None, description="Optional edited data (ignored if approved=False)"
    )
+    auto_approve_future: bool = Field(
+        default=False,
+        description=(
+            "If true and this review is approved, future executions of this same "
+            "block (node) will be automatically approved. This only affects approved reviews."
+        ),
+    )

    @field_validator("reviewed_data")
    @classmethod
@@ -174,6 +193,9 @@ class ReviewRequest(BaseModel):
    This request must include ALL pending reviews for a graph execution.
    Each review will be either approved (with optional data modifications)
    or rejected (data ignored). The execution will resume only after ALL reviews are processed.
+
+    Each review item can individually specify whether to auto-approve future executions
+    of the same block via the `auto_approve_future` field on ReviewItem.
    """

    reviews: List[ReviewItem] = Field(
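
The `auto_approve_future` flag documented above is consumed by the review route in this same change set: when a review item sets it, any edited data on that item is dropped so the original payload is used. A minimal sketch of that decision-building step follows, assuming a hypothetical dataclass `ReviewItemSketch` in place of the pydantic `ReviewItem` and plain strings in place of the `ReviewStatus` enum.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ReviewItemSketch:
    """Hypothetical stand-in for ReviewItem with the fields the route reads."""

    node_exec_id: str
    approved: bool
    reviewed_data: Optional[Any] = None
    message: Optional[str] = None
    auto_approve_future: bool = False


def build_decisions(reviews: list[ReviewItemSketch]):
    decisions: dict[str, tuple] = {}
    auto_approve_requests: dict[str, bool] = {}
    for review in reviews:
        status = "APPROVED" if review.approved else "REJECTED"
        # Requesting auto-approval discards edits: the original data is kept
        data = None if review.auto_approve_future else review.reviewed_data
        decisions[review.node_exec_id] = (status, data, review.message)
        auto_approve_requests[review.node_exec_id] = review.auto_approve_future
    return decisions, auto_approve_requests


decisions, auto_approve_requests = build_decisions(
    [
        ReviewItemSketch("exec-1", True, reviewed_data={"x": 1}, auto_approve_future=True),
        ReviewItemSketch("exec-2", True, reviewed_data={"x": 2}),
        ReviewItemSketch("exec-3", False, message="not safe"),
    ]
)
```

In the route, the flag only takes effect for reviews that end up approved; a rejected item with `auto_approve_future=True` is recorded in the map but never produces an auto-approval record.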
--- a/autogpt_platform/backend/backend/api/features/executions/review/review_routes_test.py
+++ b/autogpt_platform/backend/backend/api/features/executions/review/review_routes_test.py
--- a/autogpt_platform/backend/backend/api/features/executions/review/routes.py
+++ b/autogpt_platform/backend/backend/api/features/executions/review/routes.py
@@ -1,17 +1,27 @@
+import asyncio
 import logging
-from typing import List
+from typing import Any, List

 import autogpt_libs.auth as autogpt_auth_lib
 from fastapi import APIRouter, HTTPException, Query, Security, status
 from prisma.enums import ReviewStatus

-from backend.data.execution import get_graph_execution_meta
+from backend.data.execution import (
+    ExecutionContext,
+    ExecutionStatus,
+    get_graph_execution_meta,
+)
+from backend.data.graph import get_graph_settings
 from backend.data.human_review import (
+    create_auto_approval_record,
+    get_pending_reviews_by_node_exec_ids,
    get_pending_reviews_for_execution,
    get_pending_reviews_for_user,
    has_pending_reviews_for_graph_exec,
    process_all_reviews_for_execution,
 )
+from backend.data.model import USER_TIMEZONE_NOT_SET
+from backend.data.user import get_user_by_id
 from backend.executor.utils import add_graph_execution

 from .model import PendingHumanReviewModel, ReviewRequest, ReviewResponse
@@ -127,17 +137,70 @@ async def process_review_action(
            detail="At least one review must be provided",
        )

-    # Build review decisions map
+    # Batch fetch all requested reviews
+    reviews_map = await get_pending_reviews_by_node_exec_ids(
+        list(all_request_node_ids), user_id
+    )
+
+    # Validate all reviews were found
+    missing_ids = all_request_node_ids - set(reviews_map.keys())
+    if missing_ids:
+        raise HTTPException(
+            status_code=status.HTTP_404_NOT_FOUND,
+            detail=f"No pending review found for node execution(s): {', '.join(missing_ids)}",
+        )
+
+    # Validate all reviews belong to the same execution
+    graph_exec_ids = {review.graph_exec_id for review in reviews_map.values()}
+    if len(graph_exec_ids) > 1:
+        raise HTTPException(
+            status_code=status.HTTP_409_CONFLICT,
+            detail="All reviews in a single request must belong to the same execution.",
+        )
+
+    graph_exec_id = next(iter(graph_exec_ids))
+
+    # Validate execution status before processing reviews
+    graph_exec_meta = await get_graph_execution_meta(
+        user_id=user_id, execution_id=graph_exec_id
+    )
+
+    if not graph_exec_meta:
+        raise HTTPException(
+            status_code=status.HTTP_404_NOT_FOUND,
+            detail=f"Graph execution #{graph_exec_id} not found",
+        )
+
+    # Only allow processing reviews if execution is paused for review
+    # or incomplete (partial execution with some reviews already processed)
+    if graph_exec_meta.status not in (
+        ExecutionStatus.REVIEW,
+        ExecutionStatus.INCOMPLETE,
+    ):
+        raise HTTPException(
+            status_code=status.HTTP_409_CONFLICT,
+            detail=f"Cannot process reviews while execution status is {graph_exec_meta.status}. "
+            "Reviews can only be processed when the execution is paused for review "
+            "(REVIEW) or partially processed (INCOMPLETE).",
+        )
+
+    # Build review decisions map and track which reviews requested auto-approval
+    # Auto-approved reviews use original data (no modifications allowed)
    review_decisions = {}
+    auto_approve_requests = {}  # Map node_exec_id -> auto_approve_future flag
+
    for review in request.reviews:
        review_status = (
            ReviewStatus.APPROVED if review.approved else ReviewStatus.REJECTED
        )
+        # If this review requested auto-approval, don't allow data modifications
+        reviewed_data = None if review.auto_approve_future else review.reviewed_data
        review_decisions[review.node_exec_id] = (
            review_status,
-            review.reviewed_data,
+            reviewed_data,
            review.message,
        )
+        auto_approve_requests[review.node_exec_id] = review.auto_approve_future

    # Process all reviews
    updated_reviews = await process_all_reviews_for_execution(
@@ -145,6 +208,87 @@ async def process_review_action(
        review_decisions=review_decisions,
    )

+    # Create auto-approval records for approved reviews that requested it
+    # Deduplicate by node_id to avoid race conditions when multiple reviews
+    # for the same node are processed in parallel
+    async def create_auto_approval_for_node(
+        node_id: str, review_result
+    ) -> tuple[str, bool]:
+        """
+        Create auto-approval record for a node.
+        Returns (node_id, success) tuple for tracking failures.
+        """
+        try:
+            await create_auto_approval_record(
+                user_id=user_id,
+                graph_exec_id=review_result.graph_exec_id,
+                graph_id=review_result.graph_id,
+                graph_version=review_result.graph_version,
+                node_id=node_id,
+                payload=review_result.payload,
+            )
+            return (node_id, True)
+        except Exception as e:
+            logger.error(
+                f"Failed to create auto-approval record for node {node_id}",
+                exc_info=e,
+            )
+            return (node_id, False)
+
+    # Collect node_exec_ids that need auto-approval
+    node_exec_ids_needing_auto_approval = [
+        node_exec_id
+        for node_exec_id, review_result in updated_reviews.items()
+        if review_result.status == ReviewStatus.APPROVED
+        and auto_approve_requests.get(node_exec_id, False)
+    ]
+
+    # Batch-fetch node executions to get node_ids
+    nodes_needing_auto_approval: dict[str, Any] = {}
+    if node_exec_ids_needing_auto_approval:
+        from backend.data.execution import get_node_executions
+
+        node_execs = await get_node_executions(
+            graph_exec_id=graph_exec_id, include_exec_data=False
+        )
+        node_exec_map = {node_exec.node_exec_id: node_exec for node_exec in node_execs}
+
+        for node_exec_id in node_exec_ids_needing_auto_approval:
+            node_exec = node_exec_map.get(node_exec_id)
+            if node_exec:
+                review_result = updated_reviews[node_exec_id]
+                # Use the first approved review for this node (deduplicate by node_id)
+                if node_exec.node_id not in nodes_needing_auto_approval:
+                    nodes_needing_auto_approval[node_exec.node_id] = review_result
+            else:
+                logger.error(
+                    f"Failed to create auto-approval record for {node_exec_id}: "
+                    f"Node execution not found. This may indicate a race condition "
+                    f"or data inconsistency."
+                )
+
+    # Execute all auto-approval creations in parallel (deduplicated by node_id)
+    auto_approval_results = await asyncio.gather(
+        *[
+            create_auto_approval_for_node(node_id, review_result)
+            for node_id, review_result in nodes_needing_auto_approval.items()
+        ],
+        return_exceptions=True,
+    )
+
+    # Count auto-approval failures
+    auto_approval_failed_count = 0
+    for result in auto_approval_results:
+        if isinstance(result, Exception):
+            # Unexpected exception during auto-approval creation
+            auto_approval_failed_count += 1
+            logger.error(
+                f"Unexpected exception during auto-approval creation: {result}"
+            )
+        elif isinstance(result, tuple) and len(result) == 2 and not result[1]:
+            # Auto-approval creation failed (returned False)
+            auto_approval_failed_count += 1
+
    # Count results
    approved_count = sum(
        1
@@ -157,30 +301,53 @@ async def process_review_action(
        if review.status == ReviewStatus.REJECTED
    )

-    # Resume execution if we processed some reviews
+    # Resume execution only if ALL pending reviews for this execution have been processed
    if updated_reviews:
-        # Get graph execution ID from any processed review
-        first_review = next(iter(updated_reviews.values()))
-        graph_exec_id = first_review.graph_exec_id
-
-        # Check if any pending reviews remain for this execution
        still_has_pending = await has_pending_reviews_for_graph_exec(graph_exec_id)

        if not still_has_pending:
-            # Resume execution
+            # Get the graph_id from any processed review
+            first_review = next(iter(updated_reviews.values()))
+
            try:
+                # Fetch user and settings to build complete execution context
+                user = await get_user_by_id(user_id)
+                settings = await get_graph_settings(
+                    user_id=user_id, graph_id=first_review.graph_id
+                )
+
+                # Preserve user's timezone preference when resuming execution
+                user_timezone = (
+                    user.timezone if user.timezone != USER_TIMEZONE_NOT_SET else "UTC"
+                )
+
+                execution_context = ExecutionContext(
+                    human_in_the_loop_safe_mode=settings.human_in_the_loop_safe_mode,
+                    sensitive_action_safe_mode=settings.sensitive_action_safe_mode,
+                    user_timezone=user_timezone,
+                )
+
                await add_graph_execution(
                    graph_id=first_review.graph_id,
                    user_id=user_id,
                    graph_exec_id=graph_exec_id,
+                    execution_context=execution_context,
                )
                logger.info(f"Resumed execution {graph_exec_id}")
            except Exception as e:
                logger.error(f"Failed to resume execution {graph_exec_id}: {str(e)}")

+    # Build error message if auto-approvals failed
+    error_message = None
+    if auto_approval_failed_count > 0:
+        error_message = (
+            f"{auto_approval_failed_count} auto-approval setting(s) could not be saved. "
+            f"You may need to manually approve these reviews in future executions."
+        )
+
    return ReviewResponse(
        approved_count=approved_count,
        rejected_count=rejected_count,
-        failed_count=0,
-        error=None,
+        failed_count=auto_approval_failed_count,
+        error=error_message,
    )
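
The auto-approval logic in `routes.py` above follows a pattern of deduplicating by `node_id`, running the writes concurrently, and counting failures instead of raising. Below is a standalone sketch of that pattern under stated assumptions: `create_record` is a hypothetical stand-in for `create_auto_approval_record`, failures are simulated through a `fail_on` set, and the module-level `attempted` list exists only so the deduplication can be observed.

```python
import asyncio

attempted: list[str] = []  # illustration only: records which node_ids were written


async def create_record(node_id: str, fail_on: set) -> tuple:
    """Hypothetical writer: returns (node_id, success) instead of raising."""
    attempted.append(node_id)
    try:
        if node_id in fail_on:
            raise RuntimeError("simulated storage failure")
        await asyncio.sleep(0)  # placeholder for the real database call
        return (node_id, True)
    except Exception:
        return (node_id, False)


async def save_auto_approvals(exec_to_node: dict, fail_on: set) -> int:
    # Deduplicate by node_id: the first node execution seen for a node wins
    nodes: dict = {}
    for node_exec_id, node_id in exec_to_node.items():
        nodes.setdefault(node_id, node_exec_id)

    results = await asyncio.gather(
        *[create_record(node_id, fail_on) for node_id in nodes],
        return_exceptions=True,
    )

    failed = 0
    for result in results:
        if isinstance(result, Exception):
            failed += 1  # unexpected exception escaped the writer
        elif not result[1]:
            failed += 1  # writer reported failure
    return failed


failed_count = asyncio.run(
    save_auto_approvals(
        {"exec-1": "node-a", "exec-2": "node-a", "exec-3": "node-b"},
        fail_on={"node-b"},
    )
)
```

Because the writer already catches its own exceptions, `return_exceptions=True` acts as a second safety net: one failing write cannot cancel the others or abort the response, and the failure count is surfaced to the caller afterwards.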
--- a/autogpt_platform/backend/backend/api/features/library/db.py
+++ b/autogpt_platform/backend/backend/api/features/library/db.py
@@ -401,27 +401,11 @@ async def add_generated_agent_image(
    )


-def _initialize_graph_settings(graph: graph_db.GraphModel) -> GraphSettings:
-    """
-    Initialize GraphSettings based on graph content.
-
-    Args:
-        graph: The graph to analyze
-
-    Returns:
-        GraphSettings with appropriate human_in_the_loop_safe_mode value
-    """
-    if graph.has_human_in_the_loop:
-        # Graph has HITL blocks - set safe mode to True by default
-        return GraphSettings(human_in_the_loop_safe_mode=True)
-    else:
-        # Graph has no HITL blocks - keep None
-        return GraphSettings(human_in_the_loop_safe_mode=None)
-
-
 async def create_library_agent(
    graph: graph_db.GraphModel,
    user_id: str,
+    hitl_safe_mode: bool = True,
+    sensitive_action_safe_mode: bool = False,
    create_library_agents_for_sub_graphs: bool = True,
 ) -> list[library_model.LibraryAgent]:
    """
@@ -430,6 +414,8 @@ async def create_library_agent(
    Args:
        agent: The agent/Graph to add to the library.
        user_id: The user to whom the agent will be added.
+        hitl_safe_mode: Whether HITL blocks require manual review (default True).
+        sensitive_action_safe_mode: Whether sensitive action blocks require review.
        create_library_agents_for_sub_graphs: If True, creates LibraryAgent records for sub-graphs as well.

    Returns:
@@ -465,7 +451,11 @@ async def create_library_agent(
                            }
                        },
                        settings=SafeJson(
-                            _initialize_graph_settings(graph_entry).model_dump()
+                            GraphSettings.from_graph(
+                                graph_entry,
+                                hitl_safe_mode=hitl_safe_mode,
+                                sensitive_action_safe_mode=sensitive_action_safe_mode,
+                            ).model_dump()
                        ),
                    ),
                    include=library_agent_include(
@@ -593,7 +583,13 @@ async def update_library_agent(
            )
        update_fields["isDeleted"] = is_deleted
    if settings is not None:
-        update_fields["settings"] = SafeJson(settings.model_dump())
+        existing_agent = await get_library_agent(id=library_agent_id, user_id=user_id)
+        current_settings_dict = (
+            existing_agent.settings.model_dump() if existing_agent.settings else {}
+        )
+        new_settings = settings.model_dump(exclude_unset=True)
+        merged_settings = {**current_settings_dict, **new_settings}
+        update_fields["settings"] = SafeJson(merged_settings)

    try:
        # If graph_version is provided, update to that specific version
@@ -627,33 +623,6 @@ async def update_library_agent(
        raise DatabaseError("Failed to update library agent") from e


-async def update_library_agent_settings(
-    user_id: str,
-    agent_id: str,
-    settings: GraphSettings,
-) -> library_model.LibraryAgent:
-    """
-    Updates the settings for a specific LibraryAgent.
-
-    Args:
-        user_id: The owner of the LibraryAgent.
-        agent_id: The ID of the LibraryAgent to update.
-        settings: New GraphSettings to apply.
-
-    Returns:
-        The updated LibraryAgent.
-
-    Raises:
-        NotFoundError: If the specified LibraryAgent does not exist.
-        DatabaseError: If there's an error in the update operation.
-    """
-    return await update_library_agent(
-        library_agent_id=agent_id,
-        user_id=user_id,
-        settings=settings,
-    )
-
-
 async def delete_library_agent(
    library_agent_id: str, user_id: str, soft_delete: bool = True
 ) -> None:
@@ -838,7 +807,7 @@ async def add_store_agent_to_library(
                "isCreatedByUser": False,
                "useGraphIsActiveVersion": False,
                "settings": SafeJson(
-                    _initialize_graph_settings(graph_model).model_dump()
+                    GraphSettings.from_graph(graph_model).model_dump()
                ),
            },
            include=library_agent_include(
@@ -1228,8 +1197,15 @@ async def fork_library_agent(
        )
        new_graph = await on_graph_activate(new_graph, user_id=user_id)

-        # Create a library agent for the new graph
-        return (await create_library_agent(new_graph, user_id))[0]
+        # Create a library agent for the new graph, preserving safe mode settings
+        return (
+            await create_library_agent(
+                new_graph,
+                user_id,
+                hitl_safe_mode=original_agent.settings.human_in_the_loop_safe_mode,
+                sensitive_action_safe_mode=original_agent.settings.sensitive_action_safe_mode,
+            )
+        )[0]
    except prisma.errors.PrismaError as e:
        logger.error(f"Database error cloning library agent: {e}")
        raise DatabaseError("Failed to fork library agent") from e
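
The change to `update_library_agent` above merges new settings over stored ones rather than replacing them. In pydantic, `model_dump(exclude_unset=True)` returns only the fields that were explicitly set, so unset fields no longer overwrite stored values with their defaults. A small sketch of the merge, modelled with plain dicts rather than the real `GraphSettings` model (the helper name `merge_settings` and the sample values are illustrative assumptions):

```python
from typing import Any


def merge_settings(
    current: dict[str, Any], explicitly_set: dict[str, Any]
) -> dict[str, Any]:
    """Overlay only the explicitly provided fields on the stored settings."""
    return {**current, **explicitly_set}


# Settings as currently stored for the library agent
current = {
    "human_in_the_loop_safe_mode": True,
    "sensitive_action_safe_mode": True,
}

# What a partial update would contribute when only one field was set,
# i.e. the shape model_dump(exclude_unset=True) would produce
update = {"sensitive_action_safe_mode": False}

merged = merge_settings(current, update)
```

With a full dump instead of `exclude_unset=True`, the update dict would also carry the default for `human_in_the_loop_safe_mode` and silently reset the stored value.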
--- a/autogpt_platform/backend/backend/api/features/library/model.py
+++ b/autogpt_platform/backend/backend/api/features/library/model.py
@@ -73,6 +73,12 @@ class LibraryAgent(pydantic.BaseModel):
    has_external_trigger: bool = pydantic.Field(
        description="Whether the agent has an external trigger (e.g. webhook) node"
    )
+    has_human_in_the_loop: bool = pydantic.Field(
+        description="Whether the agent has human-in-the-loop blocks"
+    )
+    has_sensitive_action: bool = pydantic.Field(
+        description="Whether the agent has sensitive action blocks"
+    )
    trigger_setup_info: Optional[GraphTriggerInfo] = None

    # Indicates whether there's a new output (based on recent runs)
@@ -180,6 +186,8 @@ class LibraryAgent(pydantic.BaseModel):
                graph.credentials_input_schema if sub_graphs is not None else None
            ),
            has_external_trigger=graph.has_external_trigger,
+            has_human_in_the_loop=graph.has_human_in_the_loop,
+            has_sensitive_action=graph.has_sensitive_action,
            trigger_setup_info=graph.trigger_setup_info,
            new_output=new_output,
            can_access_graph=can_access_graph,
--- a/autogpt_platform/backend/backend/api/features/library/routes_test.py
+++ b/autogpt_platform/backend/backend/api/features/library/routes_test.py
@@ -52,6 +52,8 @@ async def test_get_library_agents_success(
                output_schema={"type": "object", "properties": {}},
                credentials_input_schema={"type": "object", "properties": {}},
                has_external_trigger=False,
+                has_human_in_the_loop=False,
+                has_sensitive_action=False,
                status=library_model.LibraryAgentStatus.COMPLETED,
                recommended_schedule_cron=None,
                new_output=False,
@@ -75,6 +77,8 @@ async def test_get_library_agents_success(
                output_schema={"type": "object", "properties": {}},
                credentials_input_schema={"type": "object", "properties": {}},
                has_external_trigger=False,
+                has_human_in_the_loop=False,
+                has_sensitive_action=False,
                status=library_model.LibraryAgentStatus.COMPLETED,
                recommended_schedule_cron=None,
                new_output=False,
@@ -150,6 +154,8 @@ async def test_get_favorite_library_agents_success(
                output_schema={"type": "object", "properties": {}},
                credentials_input_schema={"type": "object", "properties": {}},
                has_external_trigger=False,
+                has_human_in_the_loop=False,
+                has_sensitive_action=False,
                status=library_model.LibraryAgentStatus.COMPLETED,
                recommended_schedule_cron=None,
                new_output=False,
@@ -218,6 +224,8 @@ def test_add_agent_to_library_success(
        output_schema={"type": "object", "properties": {}},
        credentials_input_schema={"type": "object", "properties": {}},
        has_external_trigger=False,
+        has_human_in_the_loop=False,
+        has_sensitive_action=False,
        status=library_model.LibraryAgentStatus.COMPLETED,
        new_output=False,
        can_access_graph=True,
--- a/autogpt_platform/backend/backend/api/features/oauth_test.py
+++ b/autogpt_platform/backend/backend/api/features/oauth_test.py
@@ -20,6 +20,7 @@ from typing import AsyncGenerator

 import httpx
 import pytest
+import pytest_asyncio
 from autogpt_libs.api_key.keysmith import APIKeySmith
 from prisma.enums import APIKeyPermission
 from prisma.models import OAuthAccessToken as PrismaOAuthAccessToken
@@ -38,13 +39,13 @@ keysmith = APIKeySmith()
 # ============================================================================


-@pytest.fixture
+@pytest.fixture(scope="session")
 def test_user_id() -> str:
    """Test user ID for OAuth tests."""
    return str(uuid.uuid4())


-@pytest.fixture
+@pytest_asyncio.fixture(scope="session", loop_scope="session")
 async def test_user(server, test_user_id: str):
    """Create a test user in the database."""
    await PrismaUser.prisma().create(
@@ -67,7 +68,7 @@ async def test_user(server, test_user_id: str):
    await PrismaUser.prisma().delete(where={"id": test_user_id})


-@pytest.fixture
+@pytest_asyncio.fixture
 async def test_oauth_app(test_user: str):
    """Create a test OAuth application in the database."""
    app_id = str(uuid.uuid4())
@@ -122,7 +123,7 @@ def pkce_credentials() -> tuple[str, str]:
    return generate_pkce()


-@pytest.fixture
+@pytest_asyncio.fixture
 async def client(server, test_user: str) -> AsyncGenerator[httpx.AsyncClient, None]:
    """
    Create an async HTTP client that talks directly to the FastAPI app.
@@ -287,7 +288,7 @@ async def test_authorize_invalid_client_returns_error(
    assert query_params["error"][0] == "invalid_client"


-@pytest.fixture
+@pytest_asyncio.fixture
 async def inactive_oauth_app(test_user: str):
    """Create an inactive test OAuth application in the database."""
    app_id = str(uuid.uuid4())
@@ -1004,7 +1005,7 @@ async def test_token_refresh_revoked(
    assert "revoked" in response.json()["detail"].lower()


-@pytest.fixture
+@pytest_asyncio.fixture
 async def other_oauth_app(test_user: str):
    """Create a second OAuth application for cross-app tests."""
    app_id = str(uuid.uuid4())
--- a/autogpt_platform/backend/backend/api/features/store/content_handlers.py
+++ b/autogpt_platform/backend/backend/api/features/store/content_handlers.py
@@ -275,8 +275,22 @@ class BlockHandler(ContentHandler):
        }


+@dataclass
+class MarkdownSection:
+    """Represents a section of a markdown document."""
+
+    title: str  # Section heading text
+    content: str  # Section content (including the heading line)
+    level: int  # Heading level (1 for #, 2 for ##, etc.)
+    index: int  # Section index within the document
+
+
 class DocumentationHandler(ContentHandler):
-    """Handler for documentation files (.md/.mdx)."""
+    """Handler for documentation files (.md/.mdx).
+
+    Chunks documents by markdown headings to create multiple embeddings per file.
+    Each section (starting at a ## or deeper heading) becomes a separate embedding for better retrieval.
+    """

    @property
    def content_type(self) -> ContentType:
@@ -297,35 +311,162 @@ class DocumentationHandler(ContentHandler):
        docs_root = project_root / "docs"
        return docs_root

-    def _extract_title_and_content(self, file_path: Path) -> tuple[str, str]:
-        """Extract title and content from markdown file."""
+    def _extract_doc_title(self, file_path: Path) -> str:
+        """Extract the document title from a markdown file."""
        try:
            content = file_path.read_text(encoding="utf-8")
+            lines = content.split("\n")

            # Try to extract title from first # heading
-            lines = content.split("\n")
-            title = ""
-            body_lines = []
-
            for line in lines:
-                if line.startswith("# ") and not title:
-                    title = line[2:].strip()
-                else:
-                    body_lines.append(line)
+                if line.startswith("# "):
+                    return line[2:].strip()

            # If no title found, use filename
-            if not title:
-                title = file_path.stem.replace("-", " ").replace("_", " ").title()
+            return file_path.stem.replace("-", " ").replace("_", " ").title()
+        except Exception as e:
+            logger.warning(f"Failed to read title from {file_path}: {e}")
+            return file_path.stem.replace("-", " ").replace("_", " ").title()

-            body = "\n".join(body_lines)
+    def _chunk_markdown_by_headings(
+        self, file_path: Path, min_heading_level: int = 2
+    ) -> list[MarkdownSection]:
+        """
+        Split a markdown file into sections based on headings.

-            return title, body
+        Args:
+            file_path: Path to the markdown file
+            min_heading_level: Minimum heading level to split on (default: 2 for ##)
+
+        Returns:
+            List of MarkdownSection objects, one per section.
+            If no headings found, returns a single section with all content.
+        """
+        try:
+            content = file_path.read_text(encoding="utf-8")
        except Exception as e:
            logger.warning(f"Failed to read {file_path}: {e}")
-            return file_path.stem, ""
+            return []
+
+        lines = content.split("\n")
+        sections: list[MarkdownSection] = []
+        current_section_lines: list[str] = []
+        current_title = ""
+        current_level = 0
+        section_index = 0
+        doc_title = ""
+
+        for line in lines:
+            # Check if line is a heading
+            if line.startswith("#"):
+                # Count heading level
+                level = 0
+                for char in line:
+                    if char == "#":
+                        level += 1
+                    else:
+                        break
+
+                heading_text = line[level:].strip()
+
+                # Track document title (level 1 heading)
+                if level == 1 and not doc_title:
+                    doc_title = heading_text
+                    # Don't create a section for just the title - add it to first section
+                    current_section_lines.append(line)
+                    continue
+
+                # Check if this heading should start a new section
+                if level >= min_heading_level:
+                    # Save previous section if it has content
+                    if current_section_lines:
+                        section_content = "\n".join(current_section_lines).strip()
+                        if section_content:
+                            # Use doc title for first section if no specific title
+                            title = current_title if current_title else doc_title
+                            if not title:
+                                title = file_path.stem.replace("-", " ").replace(
+                                    "_", " "
+                                )
+                            sections.append(
+                                MarkdownSection(
+                                    title=title,
+                                    content=section_content,
+                                    level=current_level if current_level else 1,
+                                    index=section_index,
+                                )
+                            )
+                            section_index += 1
+
+                    # Start new section
+                    current_section_lines = [line]
+                    current_title = heading_text
+                    current_level = level
+                else:
+                    # Heading above the split level (e.g., a later # when splitting on ##): keep it in the current section
+                    current_section_lines.append(line)
+            else:
+                current_section_lines.append(line)
+
+        # Don't forget the last section
+        if current_section_lines:
+            section_content = "\n".join(current_section_lines).strip()
+            if section_content:
+                title = current_title if current_title else doc_title
+                if not title:
+                    title = file_path.stem.replace("-", " ").replace("_", " ")
+                sections.append(
+                    MarkdownSection(
+                        title=title,
+                        content=section_content,
+                        level=current_level if current_level else 1,
+                        index=section_index,
+                    )
+                )
+
+        # If no sections were created (no headings found), create one section with all content
+        if not sections and content.strip():
+            title = (
+                doc_title
+                if doc_title
+                else file_path.stem.replace("-", " ").replace("_", " ")
+            )
+            sections.append(
+                MarkdownSection(
+                    title=title,
+                    content=content.strip(),
+                    level=1,
+                    index=0,
+                )
+            )
+
+        return sections
+
+    def _make_section_content_id(self, doc_path: str, section_index: int) -> str:
+        """Create a unique content ID for a document section.
+
+        Format: doc_path::section_index
+        Example: 'platform/getting-started.md::0'
+        """
+        return f"{doc_path}::{section_index}"
+
+    def _parse_section_content_id(self, content_id: str) -> tuple[str, int]:
+        """Parse a section content ID back into doc_path and section_index.
+
+        Returns: (doc_path, section_index)
+        """
+        if "::" in content_id:
+            parts = content_id.rsplit("::", 1)
+            return parts[0], int(parts[1])
+        # Legacy format (whole document)
+        return content_id, 0

    async def get_missing_items(self, batch_size: int) -> list[ContentItem]:
-        """Fetch documentation files without embeddings."""
+        """Fetch documentation sections without embeddings.
+
+        Chunks each document by markdown headings and returns the sections that lack embeddings.
+        Content IDs use the format: 'path/to/doc.md::section_index'
+        """
        docs_root = self._get_docs_root()

        if not docs_root.exists():
@@ -335,14 +476,28 @@ class DocumentationHandler(ContentHandler):
        # Find all .md and .mdx files
        all_docs = list(docs_root.rglob("*.md")) + list(docs_root.rglob("*.mdx"))

-        # Get relative paths for content IDs
-        doc_paths = [str(doc.relative_to(docs_root)) for doc in all_docs]
-
-        if not doc_paths:
+        if not all_docs:
            return []

+        # Build list of all sections from all documents
+        all_sections: list[tuple[str, Path, MarkdownSection]] = []
+        for doc_file in all_docs:
+            doc_path = str(doc_file.relative_to(docs_root))
+            sections = self._chunk_markdown_by_headings(doc_file)
+            for section in sections:
+                all_sections.append((doc_path, doc_file, section))
+
+        if not all_sections:
+            return []
+
+        # Generate content IDs for all sections
+        section_content_ids = [
+            self._make_section_content_id(doc_path, section.index)
+            for doc_path, _, section in all_sections
+        ]
+
        # Check which ones have embeddings
-        placeholders = ",".join([f"${i+1}" for i in range(len(doc_paths))])
+        placeholders = ",".join([f"${i+1}" for i in range(len(section_content_ids))])
        existing_result = await query_raw_with_schema(
            f"""
            SELECT "contentId"
@@ -350,76 +505,100 @@ class DocumentationHandler(ContentHandler):
            WHERE "contentType" = 'DOCUMENTATION'::{{schema_prefix}}"ContentType"
            AND "contentId" = ANY(ARRAY[{placeholders}])
            """,
-            *doc_paths,
+            *section_content_ids,
        )

        existing_ids = {row["contentId"] for row in existing_result}
-        missing_docs = [
-            (doc_path, doc_file)
-            for doc_path, doc_file in zip(doc_paths, all_docs)
-            if doc_path not in existing_ids
+
+        # Filter to missing sections
+        missing_sections = [
+            (doc_path, doc_file, section, content_id)
+            for (doc_path, doc_file, section), content_id in zip(
+                all_sections, section_content_ids
+            )
+            if content_id not in existing_ids
        ]

-        # Convert to ContentItem
+        # Convert to ContentItem (up to batch_size)
        items = []
-        for doc_path, doc_file in missing_docs[:batch_size]:
+        for doc_path, doc_file, section, content_id in missing_sections[:batch_size]:
            try:
-                title, content = self._extract_title_and_content(doc_file)
+                # Get document title for context
+                doc_title = self._extract_doc_title(doc_file)

-                # Build searchable text
-                searchable_text = f"{title} {content}"
+                # Build searchable text with context
+                # Include doc title and section title for better search relevance
+                searchable_text = f"{doc_title} - {section.title}\n\n{section.content}"

                items.append(
                    ContentItem(
-                        content_id=doc_path,
+                        content_id=content_id,
                        content_type=ContentType.DOCUMENTATION,
                        searchable_text=searchable_text,
                        metadata={
-                            "title": title,
+                            "doc_title": doc_title,
+                            "section_title": section.title,
+                            "section_index": section.index,
+                            "heading_level": section.level,
                            "path": doc_path,
                        },
                        user_id=None,  # Documentation is public
                    )
                )
            except Exception as e:
-                logger.warning(f"Failed to process doc {doc_path}: {e}")
+                logger.warning(f"Failed to process section {content_id}: {e}")
                continue

        return items

+    def _get_all_section_content_ids(self, docs_root: Path) -> set[str]:
+        """Get all current section content IDs from the docs directory.
+
+        Used for stats and cleanup to know what sections should exist.
+        """
+        all_docs = list(docs_root.rglob("*.md")) + list(docs_root.rglob("*.mdx"))
+        content_ids = set()
+
+        for doc_file in all_docs:
+            doc_path = str(doc_file.relative_to(docs_root))
+            sections = self._chunk_markdown_by_headings(doc_file)
+            for section in sections:
+                content_ids.add(self._make_section_content_id(doc_path, section.index))
+
+        return content_ids
+
    async def get_stats(self) -> dict[str, int]:
-        """Get statistics about documentation embedding coverage."""
+        """Get statistics about documentation embedding coverage.
+
+        Counts sections (not documents) since each section gets its own embedding.
+        """
        docs_root = self._get_docs_root()

        if not docs_root.exists():
            return {"total": 0, "with_embeddings": 0, "without_embeddings": 0}

-        # Count all .md and .mdx files
-        all_docs = list(docs_root.rglob("*.md")) + list(docs_root.rglob("*.mdx"))
-        total_docs = len(all_docs)
+        # Get all section content IDs
+        all_section_ids = self._get_all_section_content_ids(docs_root)
+        total_sections = len(all_section_ids)

-        if total_docs == 0:
+        if total_sections == 0:
            return {"total": 0, "with_embeddings": 0, "without_embeddings": 0}

-        doc_paths = [str(doc.relative_to(docs_root)) for doc in all_docs]
-        placeholders = ",".join([f"${i+1}" for i in range(len(doc_paths))])
-
+        # Count embeddings in database for DOCUMENTATION type
        embedded_result = await query_raw_with_schema(
-            f"""
+            """
            SELECT COUNT(*) as count
-            FROM {{schema_prefix}}"UnifiedContentEmbedding"
-            WHERE "contentType" = 'DOCUMENTATION'::{{schema_prefix}}"ContentType"
-            AND "contentId" = ANY(ARRAY[{placeholders}])
-            """,
-            *doc_paths,
+            FROM {schema_prefix}"UnifiedContentEmbedding"
+            WHERE "contentType" = 'DOCUMENTATION'::{schema_prefix}"ContentType"
+            """
        )

        with_embeddings = embedded_result[0]["count"] if embedded_result else 0

        return {
-            "total": total_docs,
+            "total": total_sections,
            "with_embeddings": with_embeddings,
-            "without_embeddings": total_docs - with_embeddings,
+            "without_embeddings": max(0, total_sections - with_embeddings),
        }


--- a/autogpt_platform/backend/backend/api/features/store/content_handlers_test.py
+++ b/autogpt_platform/backend/backend/api/features/store/content_handlers_test.py
@@ -164,20 +164,20 @@ async def test_documentation_handler_get_missing_items(tmp_path, mocker):

            assert len(items) == 2

-            # Check guide.md
+            # Check guide.md (content_id format: doc_path::section_index)
            guide_item = next(
-                (item for item in items if item.content_id == "guide.md"), None
+                (item for item in items if item.content_id == "guide.md::0"), None
            )
            assert guide_item is not None
            assert guide_item.content_type == ContentType.DOCUMENTATION
            assert "Getting Started" in guide_item.searchable_text
            assert "This is a guide" in guide_item.searchable_text
-            assert guide_item.metadata["title"] == "Getting Started"
+            assert guide_item.metadata["doc_title"] == "Getting Started"
            assert guide_item.user_id is None

-            # Check api.mdx
+            # Check api.mdx (content_id format: doc_path::section_index)
            api_item = next(
-                (item for item in items if item.content_id == "api.mdx"), None
+                (item for item in items if item.content_id == "api.mdx::0"), None
            )
            assert api_item is not None
            assert "API Reference" in api_item.searchable_text
@@ -218,17 +218,74 @@ async def test_documentation_handler_title_extraction(tmp_path):
    # Test with heading
    doc_with_heading = tmp_path / "with_heading.md"
    doc_with_heading.write_text("# My Title\n\nContent here")
-    title, content = handler._extract_title_and_content(doc_with_heading)
+    title = handler._extract_doc_title(doc_with_heading)
    assert title == "My Title"
-    assert "# My Title" not in content
-    assert "Content here" in content

    # Test without heading
    doc_without_heading = tmp_path / "no-heading.md"
    doc_without_heading.write_text("Just content, no heading")
-    title, content = handler._extract_title_and_content(doc_without_heading)
+    title = handler._extract_doc_title(doc_without_heading)
    assert title == "No Heading"  # Uses filename
-    assert "Just content" in content
+
+
+@pytest.mark.asyncio(loop_scope="session")
+async def test_documentation_handler_markdown_chunking(tmp_path):
+    """Test DocumentationHandler chunks markdown by headings."""
+    handler = DocumentationHandler()
+
+    # Test document with multiple sections
+    doc_with_sections = tmp_path / "sections.md"
+    doc_with_sections.write_text(
+        "# Document Title\n\n"
+        "Intro paragraph.\n\n"
+        "## Section One\n\n"
+        "Content for section one.\n\n"
+        "## Section Two\n\n"
+        "Content for section two.\n"
+    )
+    sections = handler._chunk_markdown_by_headings(doc_with_sections)
+
+    # Should have 3 sections: intro (with doc title), section one, section two
+    assert len(sections) == 3
+    assert sections[0].title == "Document Title"
+    assert sections[0].index == 0
+    assert "Intro paragraph" in sections[0].content
+
+    assert sections[1].title == "Section One"
+    assert sections[1].index == 1
+    assert "Content for section one" in sections[1].content
+
+    assert sections[2].title == "Section Two"
+    assert sections[2].index == 2
+    assert "Content for section two" in sections[2].content
+
+    # Test document without headings
+    doc_no_sections = tmp_path / "no-sections.md"
+    doc_no_sections.write_text("Just plain content without any headings.")
+    sections = handler._chunk_markdown_by_headings(doc_no_sections)
+    assert len(sections) == 1
+    assert sections[0].index == 0
+    assert "Just plain content" in sections[0].content
+
+
+@pytest.mark.asyncio(loop_scope="session")
+async def test_documentation_handler_section_content_ids():
+    """Test DocumentationHandler creates and parses section content IDs."""
+    handler = DocumentationHandler()
+
+    # Test making content ID
+    content_id = handler._make_section_content_id("docs/guide.md", 2)
+    assert content_id == "docs/guide.md::2"
+
+    # Test parsing content ID
+    doc_path, section_index = handler._parse_section_content_id("docs/guide.md::2")
+    assert doc_path == "docs/guide.md"
+    assert section_index == 2
+
+    # Test parsing legacy format (no section index)
+    doc_path, section_index = handler._parse_section_content_id("docs/old-format.md")
+    assert doc_path == "docs/old-format.md"
+    assert section_index == 0


@pytest.mark.asyncio(loop_scope="session")
--- a/autogpt_platform/backend/backend/api/features/store/db.py
+++ b/autogpt_platform/backend/backend/api/features/store/db.py
@@ -1552,7 +1552,7 @@ async def review_store_submission(

                # Generate embedding for approved listing (blocking - admin operation)
                # Inside transaction: if embedding fails, entire transaction rolls back
-                embedding_success = await ensure_embedding(
+                await ensure_embedding(
                    version_id=store_listing_version_id,
                    name=store_listing_version.name,
                    description=store_listing_version.description,
@@ -1560,12 +1560,6 @@ async def review_store_submission(
                    categories=store_listing_version.categories or [],
                    tx=tx,
                )
-                if not embedding_success:
-                    raise ValueError(
-                        f"Failed to generate embedding for listing {store_listing_version_id}. "
-                        "This is likely due to OpenAI API being unavailable. "
-                        "Please try again later or contact support if the issue persists."
-                    )

                await prisma.models.StoreListing.prisma(tx).update(
                    where={"id": store_listing_version.StoreListing.id},
--- a/autogpt_platform/backend/backend/api/features/store/embeddings.py
+++ b/autogpt_platform/backend/backend/api/features/store/embeddings.py
@@ -21,7 +21,6 @@ from backend.util.json import dumps

 logger = logging.getLogger(__name__)

-
 # OpenAI embedding model configuration
 EMBEDDING_MODEL = "text-embedding-3-small"
 # Embedding dimension for the model above
@@ -63,49 +62,42 @@ def build_searchable_text(
    return " ".join(parts)


-async def generate_embedding(text: str) -> list[float] | None:
+async def generate_embedding(text: str) -> list[float]:
    """
    Generate embedding for text using OpenAI API.

-    Returns None if embedding generation fails.
-    Fail-fast: no retries to maintain consistency with approval flow.
+    Raises exceptions on failure - caller should handle.
    """
-    try:
-        client = get_openai_client()
-        if not client:
-            logger.error("openai_internal_api_key not set, cannot generate embedding")
-            return None
+    client = get_openai_client()
+    if not client:
+        raise RuntimeError("openai_internal_api_key not set, cannot generate embedding")

-        # Truncate text to token limit using tiktoken
-        # Character-based truncation is insufficient because token ratios vary by content type
-        enc = encoding_for_model(EMBEDDING_MODEL)
-        tokens = enc.encode(text)
-        if len(tokens) > EMBEDDING_MAX_TOKENS:
-            tokens = tokens[:EMBEDDING_MAX_TOKENS]
-            truncated_text = enc.decode(tokens)
-            logger.info(
-                f"Truncated text from {len(enc.encode(text))} to {len(tokens)} tokens"
-            )
-        else:
-            truncated_text = text
-
-        start_time = time.time()
-        response = await client.embeddings.create(
-            model=EMBEDDING_MODEL,
-            input=truncated_text,
-        )
-        latency_ms = (time.time() - start_time) * 1000
-
-        embedding = response.data[0].embedding
+    # Truncate text to token limit using tiktoken
+    # Character-based truncation is insufficient because token ratios vary by content type
+    enc = encoding_for_model(EMBEDDING_MODEL)
+    tokens = enc.encode(text)
+    if len(tokens) > EMBEDDING_MAX_TOKENS:
+        tokens = tokens[:EMBEDDING_MAX_TOKENS]
+        truncated_text = enc.decode(tokens)
        logger.info(
-            f"Generated embedding: {len(embedding)} dims, "
-            f"{len(tokens)} tokens, {latency_ms:.0f}ms"
+            f"Truncated text from {len(enc.encode(text))} to {len(tokens)} tokens"
        )
-        return embedding
+    else:
+        truncated_text = text

-    except Exception as e:
-        logger.error(f"Failed to generate embedding: {e}")
-        return None
+    start_time = time.time()
+    response = await client.embeddings.create(
+        model=EMBEDDING_MODEL,
+        input=truncated_text,
+    )
+    latency_ms = (time.time() - start_time) * 1000
+
+    embedding = response.data[0].embedding
+    logger.info(
+        f"Generated embedding: {len(embedding)} dims, "
+        f"{len(tokens)} tokens, {latency_ms:.0f}ms"
+    )
+    return embedding


 async def store_embedding(
@@ -144,48 +136,45 @@ async def store_content_embedding(

    New function for unified content embedding storage.
    Uses raw SQL since Prisma doesn't natively support pgvector.
+
+    Raises exceptions on failure - caller should handle.
    """
-    try:
-        client = tx if tx else prisma.get_client()
+    client = tx if tx else prisma.get_client()

-        # Convert embedding to PostgreSQL vector format
-        embedding_str = embedding_to_vector_string(embedding)
-        metadata_json = dumps(metadata or {})
+    # Convert embedding to PostgreSQL vector format
+    embedding_str = embedding_to_vector_string(embedding)
+    metadata_json = dumps(metadata or {})

-        # Upsert the embedding
-        # WHERE clause in DO UPDATE prevents PostgreSQL 15 bug with NULLS NOT DISTINCT
-        await execute_raw_with_schema(
-            """
-            INSERT INTO {schema_prefix}"UnifiedContentEmbedding" (
-                "id", "contentType", "contentId", "userId", "embedding", "searchableText", "metadata", "createdAt", "updatedAt"
-            )
-            VALUES (gen_random_uuid()::text, $1::{schema_prefix}"ContentType", $2, $3, $4::vector, $5, $6::jsonb, NOW(), NOW())
-            ON CONFLICT ("contentType", "contentId", "userId")
-            DO UPDATE SET
-                "embedding" = $4::vector,
-                "searchableText" = $5,
-                "metadata" = $6::jsonb,
-                "updatedAt" = NOW()
-            WHERE {schema_prefix}"UnifiedContentEmbedding"."contentType" = $1::{schema_prefix}"ContentType"
-                AND {schema_prefix}"UnifiedContentEmbedding"."contentId" = $2
-                AND ({schema_prefix}"UnifiedContentEmbedding"."userId" = $3 OR ($3 IS NULL AND {schema_prefix}"UnifiedContentEmbedding"."userId" IS NULL))
-            """,
-            content_type,
-            content_id,
-            user_id,
-            embedding_str,
-            searchable_text,
-            metadata_json,
-            client=client,
-            set_public_search_path=True,
+    # Upsert the embedding
+    # WHERE clause in DO UPDATE prevents PostgreSQL 15 bug with NULLS NOT DISTINCT
+    # Use unqualified ::vector - pgvector is in search_path on all environments
+    await execute_raw_with_schema(
+        """
+        INSERT INTO {schema_prefix}"UnifiedContentEmbedding" (
+            "id", "contentType", "contentId", "userId", "embedding", "searchableText", "metadata", "createdAt", "updatedAt"
        )
+        VALUES (gen_random_uuid()::text, $1::{schema_prefix}"ContentType", $2, $3, $4::vector, $5, $6::jsonb, NOW(), NOW())
+        ON CONFLICT ("contentType", "contentId", "userId")
+        DO UPDATE SET
+            "embedding" = $4::vector,
+            "searchableText" = $5,
+            "metadata" = $6::jsonb,
+            "updatedAt" = NOW()
+        WHERE {schema_prefix}"UnifiedContentEmbedding"."contentType" = $1::{schema_prefix}"ContentType"
+            AND {schema_prefix}"UnifiedContentEmbedding"."contentId" = $2
+            AND ({schema_prefix}"UnifiedContentEmbedding"."userId" = $3 OR ($3 IS NULL AND {schema_prefix}"UnifiedContentEmbedding"."userId" IS NULL))
+        """,
+        content_type,
+        content_id,
+        user_id,
+        embedding_str,
+        searchable_text,
+        metadata_json,
+        client=client,
+    )

-        logger.info(f"Stored embedding for {content_type}:{content_id}")
-        return True
-
-    except Exception as e:
-        logger.error(f"Failed to store embedding for {content_type}:{content_id}: {e}")
-        return False
+    logger.info(f"Stored embedding for {content_type}:{content_id}")
+    return True


 async def get_embedding(version_id: str) -> dict[str, Any] | None:
@@ -217,35 +206,31 @@ async def get_content_embedding(

    New function for unified content embedding retrieval.
    Returns dict with contentType, contentId, embedding, timestamps or None if not found.
+
+    Raises exceptions on failure - caller should handle.
    """
-    try:
-        result = await query_raw_with_schema(
-            """
-            SELECT
-                "contentType",
-                "contentId",
-                "userId",
-                "embedding"::text as "embedding",
-                "searchableText",
-                "metadata",
-                "createdAt",
-                "updatedAt"
-            FROM {schema_prefix}"UnifiedContentEmbedding"
-            WHERE "contentType" = $1::{schema_prefix}"ContentType" AND "contentId" = $2 AND ("userId" = $3 OR ($3 IS NULL AND "userId" IS NULL))
-            """,
-            content_type,
-            content_id,
-            user_id,
-            set_public_search_path=True,
-        )
+    result = await query_raw_with_schema(
+        """
+        SELECT
+            "contentType",
+            "contentId",
+            "userId",
+            "embedding"::text as "embedding",
+            "searchableText",
+            "metadata",
+            "createdAt",
+            "updatedAt"
+        FROM {schema_prefix}"UnifiedContentEmbedding"
+        WHERE "contentType" = $1::{schema_prefix}"ContentType" AND "contentId" = $2 AND ("userId" = $3 OR ($3 IS NULL AND "userId" IS NULL))
+        """,
+        content_type,
+        content_id,
+        user_id,
+    )

-        if result and len(result) > 0:
-            return result[0]
-        return None
-
-    except Exception as e:
-        logger.error(f"Failed to get embedding for {content_type}:{content_id}: {e}")
-        return None
+    if result and len(result) > 0:
+        return result[0]
+    return None


 async def ensure_embedding(
@@ -273,46 +258,38 @@ async def ensure_embedding(
        tx: Optional transaction client

    Returns:
-        True if embedding exists/was created, False on failure
+        True if embedding exists/was created
+
+    Raises exceptions on failure - caller should handle.
    """
-    try:
-        # Check if embedding already exists
-        if not force:
-            existing = await get_embedding(version_id)
-            if existing and existing.get("embedding"):
-                logger.debug(f"Embedding for version {version_id} already exists")
-                return True
+    # Check if embedding already exists
+    if not force:
+        existing = await get_embedding(version_id)
+        if existing and existing.get("embedding"):
+            logger.debug(f"Embedding for version {version_id} already exists")
+            return True

-        # Build searchable text for embedding
-        searchable_text = build_searchable_text(
-            name, description, sub_heading, categories
-        )
+    # Build searchable text for embedding
+    searchable_text = build_searchable_text(name, description, sub_heading, categories)

-        # Generate new embedding
-        embedding = await generate_embedding(searchable_text)
-        if embedding is None:
-            logger.warning(f"Could not generate embedding for version {version_id}")
-            return False
+    # Generate new embedding
+    embedding = await generate_embedding(searchable_text)

-        # Store the embedding with metadata using new function
-        metadata = {
-            "name": name,
-            "subHeading": sub_heading,
-            "categories": categories,
-        }
-        return await store_content_embedding(
-            content_type=ContentType.STORE_AGENT,
-            content_id=version_id,
-            embedding=embedding,
-            searchable_text=searchable_text,
-            metadata=metadata,
-            user_id=None,  # Store agents are public
-            tx=tx,
-        )
-
-    except Exception as e:
-        logger.error(f"Failed to ensure embedding for version {version_id}: {e}")
-        return False
+    # Store the embedding with metadata using new function
+    metadata = {
+        "name": name,
+        "subHeading": sub_heading,
+        "categories": categories,
+    }
+    return await store_content_embedding(
+        content_type=ContentType.STORE_AGENT,
+        content_id=version_id,
+        embedding=embedding,
+        searchable_text=searchable_text,
+        metadata=metadata,
+        user_id=None,  # Store agents are public
+        tx=tx,
+    )


 async def delete_embedding(version_id: str) -> bool:
@@ -522,6 +499,24 @@ async def backfill_all_content_types(batch_size: int = 10) -> dict[str, Any]:
            success = sum(1 for result in results if result is True)
            failed = len(results) - success

+            # Aggregate unique errors to avoid Sentry spam
+            if failed > 0:
+                # Group errors by type and message
+                error_summary: dict[str, int] = {}
+                for result in results:
+                    if isinstance(result, Exception):
+                        error_key = f"{type(result).__name__}: {str(result)}"
+                        error_summary[error_key] = error_summary.get(error_key, 0) + 1
+
+                # Log aggregated error summary
+                error_details = ", ".join(
+                    f"{error} ({count}x)" for error, count in error_summary.items()
+                )
+                logger.error(
+                    f"{content_type.value}: {failed}/{len(results)} embeddings failed. "
+                    f"Errors: {error_details}"
+                )
+
            results_by_type[content_type.value] = {
                "processed": len(missing_items),
                "success": success,
@@ -558,11 +553,12 @@ async def backfill_all_content_types(batch_size: int = 10) -> dict[str, Any]:
    }


-async def embed_query(query: str) -> list[float] | None:
+async def embed_query(query: str) -> list[float]:
    """
    Generate embedding for a search query.

    Same as generate_embedding but with clearer intent.
+    Raises exceptions on failure - caller should handle.
    """
    return await generate_embedding(query)

@@ -595,40 +591,30 @@ async def ensure_content_embedding(
        tx: Optional transaction client

    Returns:
-        True if embedding exists/was created, False on failure
+        True if embedding exists/was created
+
+    Raises exceptions on failure - caller should handle.
    """
-    try:
-        # Check if embedding already exists
-        if not force:
-            existing = await get_content_embedding(content_type, content_id, user_id)
-            if existing and existing.get("embedding"):
-                logger.debug(
-                    f"Embedding for {content_type}:{content_id} already exists"
-                )
-                return True
+    # Check if embedding already exists
+    if not force:
+        existing = await get_content_embedding(content_type, content_id, user_id)
+        if existing and existing.get("embedding"):
+            logger.debug(f"Embedding for {content_type}:{content_id} already exists")
+            return True

-        # Generate new embedding
-        embedding = await generate_embedding(searchable_text)
-        if embedding is None:
-            logger.warning(
-                f"Could not generate embedding for {content_type}:{content_id}"
-            )
-            return False
+    # Generate new embedding
+    embedding = await generate_embedding(searchable_text)

-        # Store the embedding
-        return await store_content_embedding(
-            content_type=content_type,
-            content_id=content_id,
-            embedding=embedding,
-            searchable_text=searchable_text,
-            metadata=metadata or {},
-            user_id=user_id,
-            tx=tx,
-        )
-
-    except Exception as e:
-        logger.error(f"Failed to ensure embedding for {content_type}:{content_id}: {e}")
-        return False
+    # Store the embedding
+    return await store_content_embedding(
+        content_type=content_type,
+        content_id=content_id,
+        embedding=embedding,
+        searchable_text=searchable_text,
+        metadata=metadata or {},
+        user_id=user_id,
+        tx=tx,
+    )


 async def cleanup_orphaned_embeddings() -> dict[str, Any]:
@@ -683,20 +669,20 @@ async def cleanup_orphaned_embeddings() -> dict[str, Any]:

                current_ids = set(get_blocks().keys())
            elif content_type == ContentType.DOCUMENTATION:
-                from pathlib import Path
-
-                # embeddings.py is at: backend/backend/api/features/store/embeddings.py
-                # Need to go up to project root then into docs/
-                this_file = Path(__file__)
-                project_root = (
-                    this_file.parent.parent.parent.parent.parent.parent.parent
+                # Use DocumentationHandler to get section-based content IDs
+                from backend.api.features.store.content_handlers import (
+                    DocumentationHandler,
                )
-                docs_root = project_root / "docs"
-                if docs_root.exists():
-                    all_docs = list(docs_root.rglob("*.md")) + list(
-                        docs_root.rglob("*.mdx")
-                    )
-                    current_ids = {str(doc.relative_to(docs_root)) for doc in all_docs}
+
+                doc_handler = CONTENT_HANDLERS.get(ContentType.DOCUMENTATION)
+                if isinstance(doc_handler, DocumentationHandler):
+                    docs_root = doc_handler._get_docs_root()
+                    if docs_root.exists():
+                        current_ids = doc_handler._get_all_section_content_ids(
+                            docs_root
+                        )
+                    else:
+                        current_ids = set()
                else:
                    current_ids = set()
            else:
@@ -855,9 +841,8 @@ async def semantic_search(
        limit = 100

    # Generate query embedding
-    query_embedding = await embed_query(query)
-
-    if query_embedding is not None:
+    try:
+        query_embedding = await embed_query(query)
        # Semantic search with embeddings
        embedding_str = embedding_to_vector_string(query_embedding)

@@ -871,47 +856,58 @@ async def semantic_search(
        # Add content type parameters and build placeholders dynamically
        content_type_start_idx = len(params) + 1
        content_type_placeholders = ", ".join(
-            f'${content_type_start_idx + i}::{{{{schema_prefix}}}}"ContentType"'
+            "$" + str(content_type_start_idx + i) + '::{schema_prefix}"ContentType"'
            for i in range(len(content_types))
        )
        params.extend([ct.value for ct in content_types])

-        sql = f"""
+        # Build min_similarity param index before appending
+        min_similarity_idx = len(params) + 1
+        params.append(min_similarity)
+
+        # Use unqualified ::vector and <=> operator - pgvector is in search_path on all environments
+        sql = (
+            """
            SELECT
                "contentId" as content_id,
                "contentType" as content_type,
                "searchableText" as searchable_text,
                metadata,
-                1 - (embedding <=> '{embedding_str}'::vector) as similarity
-            FROM {{{{schema_prefix}}}}"UnifiedContentEmbedding"
-            WHERE "contentType" IN ({content_type_placeholders})
-            {user_filter}
-            AND 1 - (embedding <=> '{embedding_str}'::vector) >= ${len(params) + 1}
+                1 - (embedding <=> '"""
+            + embedding_str
+            + """'::vector) as similarity
+            FROM {schema_prefix}"UnifiedContentEmbedding"
+            WHERE "contentType" IN ("""
+            + content_type_placeholders
+            + """)
+            """
+            + user_filter
+            + """
+            AND 1 - (embedding <=> '"""
+            + embedding_str
+            + """'::vector) >= $"""
+            + str(min_similarity_idx)
+            + """
            ORDER BY similarity DESC
            LIMIT $1
        """
-        params.append(min_similarity)
+        )

-        try:
-            results = await query_raw_with_schema(
-                sql, *params, set_public_search_path=True
-            )
-            return [
-                {
-                    "content_id": row["content_id"],
-                    "content_type": row["content_type"],
-                    "searchable_text": row["searchable_text"],
-                    "metadata": row["metadata"],
-                    "similarity": float(row["similarity"]),
-                }
-                for row in results
-            ]
-        except Exception as e:
-            logger.error(f"Semantic search failed: {e}")
-            # Fall through to lexical search below
+        results = await query_raw_with_schema(sql, *params)
+        return [
+            {
+                "content_id": row["content_id"],
+                "content_type": row["content_type"],
+                "searchable_text": row["searchable_text"],
+                "metadata": row["metadata"],
+                "similarity": float(row["similarity"]),
+            }
+            for row in results
+        ]
+    except Exception as e:
+        logger.warning(f"Semantic search failed, falling back to lexical search: {e}")

    # Fallback to lexical search if embeddings unavailable
-    logger.warning("Falling back to lexical search (embeddings unavailable)")

    params_lexical: list[Any] = [limit]
    user_filter = ""
@@ -922,31 +918,41 @@ async def semantic_search(
    # Add content type parameters and build placeholders dynamically
    content_type_start_idx = len(params_lexical) + 1
    content_type_placeholders_lexical = ", ".join(
-        f'${content_type_start_idx + i}::{{{{schema_prefix}}}}"ContentType"'
+        "$" + str(content_type_start_idx + i) + '::{schema_prefix}"ContentType"'
        for i in range(len(content_types))
    )
    params_lexical.extend([ct.value for ct in content_types])

-    sql_lexical = f"""
+    # Build query param index before appending
+    query_param_idx = len(params_lexical) + 1
+    params_lexical.append(f"%{query}%")
+
+    # Use regular string (not f-string) for template to preserve {schema_prefix} placeholders
+    sql_lexical = (
+        """
        SELECT
            "contentId" as content_id,
            "contentType" as content_type,
            "searchableText" as searchable_text,
            metadata,
            0.0 as similarity
-        FROM {{{{schema_prefix}}}}"UnifiedContentEmbedding"
-        WHERE "contentType" IN ({content_type_placeholders_lexical})
-        {user_filter}
-        AND "searchableText" ILIKE ${len(params_lexical) + 1}
+        FROM {schema_prefix}"UnifiedContentEmbedding"
+        WHERE "contentType" IN ("""
+        + content_type_placeholders_lexical
+        + """)
+        """
+        + user_filter
+        + """
+        AND "searchableText" ILIKE $"""
+        + str(query_param_idx)
+        + """
        ORDER BY "updatedAt" DESC
        LIMIT $1
    """
-    params_lexical.append(f"%{query}%")
+    )

    try:
-        results = await query_raw_with_schema(
-            sql_lexical, *params_lexical, set_public_search_path=True
-        )
+        results = await query_raw_with_schema(sql_lexical, *params_lexical)
        return [
            {
                "content_id": row["content_id"],
--- a/autogpt_platform/backend/backend/api/features/store/embeddings_schema_test.py
+++ b/autogpt_platform/backend/backend/api/features/store/embeddings_schema_test.py
@@ -298,17 +298,16 @@ async def test_schema_handling_error_cases():
            mock_client.execute_raw.side_effect = Exception("Database error")
            mock_get_client.return_value = mock_client

-            result = await embeddings.store_content_embedding(
-                content_type=ContentType.STORE_AGENT,
-                content_id="test-id",
-                embedding=[0.1] * EMBEDDING_DIM,
-                searchable_text="test",
-                metadata=None,
-                user_id=None,
-            )
-
-            # Should return False on error, not raise
-            assert result is False
+            # Should raise exception on error
+            with pytest.raises(Exception, match="Database error"):
+                await embeddings.store_content_embedding(
+                    content_type=ContentType.STORE_AGENT,
+                    content_id="test-id",
+                    embedding=[0.1] * EMBEDDING_DIM,
+                    searchable_text="test",
+                    metadata=None,
+                    user_id=None,
+                )


 if __name__ == "__main__":
--- a/autogpt_platform/backend/backend/api/features/store/embeddings_test.py
+++ b/autogpt_platform/backend/backend/api/features/store/embeddings_test.py
@@ -80,9 +80,8 @@ async def test_generate_embedding_no_api_key():
    ) as mock_get_client:
        mock_get_client.return_value = None

-        result = await embeddings.generate_embedding("test text")
-
-        assert result is None
+        with pytest.raises(RuntimeError, match="openai_internal_api_key not set"):
+            await embeddings.generate_embedding("test text")


@pytest.mark.asyncio(loop_scope="session")
@@ -97,9 +96,8 @@ async def test_generate_embedding_api_error():
    ) as mock_get_client:
        mock_get_client.return_value = mock_client

-        result = await embeddings.generate_embedding("test text")
-
-        assert result is None
+        with pytest.raises(Exception, match="API Error"):
+            await embeddings.generate_embedding("test text")


@pytest.mark.asyncio(loop_scope="session")
@@ -155,18 +153,14 @@ async def test_store_embedding_success(mocker):
    )

    assert result is True
-    # execute_raw is called twice: once for SET search_path, once for INSERT
-    assert mock_client.execute_raw.call_count == 2
+    # execute_raw is called once for INSERT (no separate SET search_path needed)
+    assert mock_client.execute_raw.call_count == 1

-    # First call: SET search_path
-    first_call_args = mock_client.execute_raw.call_args_list[0][0]
-    assert "SET search_path" in first_call_args[0]
-
-    # Second call: INSERT query with the actual data
-    second_call_args = mock_client.execute_raw.call_args_list[1][0]
-    assert "test-version-id" in second_call_args
-    assert "[0.1,0.2,0.3]" in second_call_args
-    assert None in second_call_args  # userId should be None for store agents
+    # Verify the INSERT query with the actual data
+    call_args = mock_client.execute_raw.call_args_list[0][0]
+    assert "test-version-id" in call_args
+    assert "[0.1,0.2,0.3]" in call_args
+    assert None in call_args  # userId should be None for store agents


@pytest.mark.asyncio(loop_scope="session")
@@ -177,11 +171,10 @@ async def test_store_embedding_database_error(mocker):

    embedding = [0.1, 0.2, 0.3]

-    result = await embeddings.store_embedding(
-        version_id="test-version-id", embedding=embedding, tx=mock_client
-    )
-
-    assert result is False
+    with pytest.raises(Exception, match="Database error"):
+        await embeddings.store_embedding(
+            version_id="test-version-id", embedding=embedding, tx=mock_client
+        )


@pytest.mark.asyncio(loop_scope="session")
@@ -281,17 +274,16 @@ async def test_ensure_embedding_create_new(mock_get, mock_store, mock_generate):
 async def test_ensure_embedding_generation_fails(mock_get, mock_generate):
    """Test ensure_embedding when generation fails."""
    mock_get.return_value = None
-    mock_generate.return_value = None
+    mock_generate.side_effect = Exception("Generation failed")

-    result = await embeddings.ensure_embedding(
-        version_id="test-id",
-        name="Test",
-        description="Test description",
-        sub_heading="Test heading",
-        categories=["test"],
-    )
-
-    assert result is False
+    with pytest.raises(Exception, match="Generation failed"):
+        await embeddings.ensure_embedding(
+            version_id="test-id",
+            name="Test",
+            description="Test description",
+            sub_heading="Test heading",
+            categories=["test"],
+        )


@pytest.mark.asyncio(loop_scope="session")
--- a/autogpt_platform/backend/backend/api/features/store/hybrid_search.py
+++ b/autogpt_platform/backend/backend/api/features/store/hybrid_search.py
@@ -3,13 +3,16 @@ Unified Hybrid Search

 Combines semantic (embedding) search with lexical (tsvector) search
 for improved relevance across all content types (agents, blocks, docs).
+Includes BM25 reranking for improved lexical relevance.
 """

 import logging
+import re
 from dataclasses import dataclass
 from typing import Any, Literal

 from prisma.enums import ContentType
+from rank_bm25 import BM25Okapi  # type: ignore[import-untyped]

 from backend.api.features.store.embeddings import (
    EMBEDDING_DIM,
@@ -21,6 +24,84 @@ from backend.data.db import query_raw_with_schema
 logger = logging.getLogger(__name__)


+# ============================================================================
+# BM25 Reranking
+# ============================================================================
+
+
+def tokenize(text: str) -> list[str]:
+    """Simple tokenizer for BM25: lowercase, then extract runs of word characters."""
+    if not text:
+        return []
+    # Lowercase and extract \w+ runs (letters, digits, and underscore)
+    tokens = re.findall(r"\b\w+\b", text.lower())
+    return tokens
+
+
+def bm25_rerank(
+    query: str,
+    results: list[dict[str, Any]],
+    text_field: str = "searchable_text",
+    bm25_weight: float = 0.3,
+    original_score_field: str = "combined_score",
+) -> list[dict[str, Any]]:
+    """
+    Rerank search results using BM25.
+
+    Combines the original combined_score with BM25 score for improved
+    lexical relevance, especially for exact term matches.
+
+    Args:
+        query: The search query
+        results: List of result dicts with text_field and original_score_field
+        text_field: Field name containing the text to score
+        bm25_weight: Weight for BM25 score (0-1). Original score gets (1 - bm25_weight)
+        original_score_field: Field name containing the original score
+
+    Returns:
+        Results list sorted by combined score (BM25 + original)
+    """
+    if not results or not query:
+        return results
+
+    # Extract texts and tokenize
+    corpus = [tokenize(r.get(text_field, "") or "") for r in results]
+
+    # Handle edge case where all documents are empty
+    if all(len(doc) == 0 for doc in corpus):
+        return results
+
+    # Build BM25 index
+    bm25 = BM25Okapi(corpus)
+
+    # Score query against corpus
+    query_tokens = tokenize(query)
+    if not query_tokens:
+        return results
+
+    bm25_scores = bm25.get_scores(query_tokens)
+
+    # Normalize BM25 scores to 0-1 range
+    max_bm25 = max(bm25_scores) if max(bm25_scores) > 0 else 1.0
+    normalized_bm25 = [s / max_bm25 for s in bm25_scores]
+
+    # Combine scores
+    original_weight = 1.0 - bm25_weight
+    for i, result in enumerate(results):
+        original_score = result.get(original_score_field, 0) or 0
+        result["bm25_score"] = normalized_bm25[i]
+        final_score = (
+            original_weight * original_score + bm25_weight * normalized_bm25[i]
+        )
+        result["final_score"] = final_score
+        result["relevance"] = final_score
+
+    # Sort by relevance descending
+    results.sort(key=lambda x: x.get("relevance", 0), reverse=True)
+
+    return results
+
+
@dataclass
 class UnifiedSearchWeights:
    """Weights for unified search (no popularity signal)."""
@@ -105,13 +186,12 @@ async def unified_hybrid_search(

    offset = (page - 1) * page_size

-    # Generate query embedding
-    query_embedding = await embed_query(query)
-
-    # Graceful degradation if embedding unavailable
-    if query_embedding is None or not query_embedding:
+    # Generate query embedding with graceful degradation
+    try:
+        query_embedding = await embed_query(query)
+    except Exception as e:
        logger.warning(
-            "Failed to generate query embedding - falling back to lexical-only search. "
+            f"Failed to generate query embedding - falling back to lexical-only search: {e}. "
            "Check that openai_internal_api_key is configured and OpenAI API is accessible."
        )
        query_embedding = [0.0] * EMBEDDING_DIM
@@ -273,9 +353,7 @@ async def unified_hybrid_search(
            FROM normalized
        ),
        filtered AS (
-            SELECT
-                *,
-                COUNT(*) OVER () as total_count
+            SELECT *, COUNT(*) OVER () as total_count
            FROM scored
            WHERE combined_score >= {min_score_param}
        )
@@ -284,11 +362,18 @@ async def unified_hybrid_search(
        LIMIT {limit_param} OFFSET {offset_param}
    """

-    results = await query_raw_with_schema(
-        sql_query, *params, set_public_search_path=True
-    )
+    results = await query_raw_with_schema(sql_query, *params)

    total = results[0]["total_count"] if results else 0
+    # Apply BM25 reranking
+    if results:
+        results = bm25_rerank(
+            query=query,
+            results=results,
+            text_field="searchable_text",
+            bm25_weight=0.3,
+            original_score_field="combined_score",
+        )

    # Clean up results
    for result in results:
@@ -378,13 +463,12 @@ async def hybrid_search(

    offset = (page - 1) * page_size

-    # Generate query embedding
-    query_embedding = await embed_query(query)
-
-    # Graceful degradation
-    if query_embedding is None or not query_embedding:
+    # Generate query embedding with graceful degradation
+    try:
+        query_embedding = await embed_query(query)
+    except Exception as e:
        logger.warning(
-            "Failed to generate query embedding - falling back to lexical-only search."
+            f"Failed to generate query embedding - falling back to lexical-only search: {e}"
        )
        query_embedding = [0.0] * EMBEDDING_DIM
        total_non_semantic = (
@@ -516,6 +600,8 @@ async def hybrid_search(
                sa.featured,
                sa.is_available,
                sa.updated_at,
+                -- Searchable text for BM25 reranking
+                COALESCE(sa.agent_name, '') || ' ' || COALESCE(sa.sub_heading, '') || ' ' || COALESCE(sa.description, '') as searchable_text,
                -- Semantic score
                COALESCE(1 - (uce.embedding <=> {embedding_param}::vector), 0) as semantic_score,
                -- Lexical score (raw, will normalize)
@@ -573,6 +659,7 @@ async def hybrid_search(
                featured,
                is_available,
                updated_at,
+                searchable_text,
                semantic_score,
                lexical_score,
                category_score,
@@ -597,14 +684,23 @@ async def hybrid_search(
        LIMIT {limit_param} OFFSET {offset_param}
    """

-    results = await query_raw_with_schema(
-        sql_query, *params, set_public_search_path=True
-    )
+    results = await query_raw_with_schema(sql_query, *params)

    total = results[0]["total_count"] if results else 0

+    # Apply BM25 reranking
+    if results:
+        results = bm25_rerank(
+            query=query,
+            results=results,
+            text_field="searchable_text",
+            bm25_weight=0.3,
+            original_score_field="combined_score",
+        )
+
    for result in results:
        result.pop("total_count", None)
+        result.pop("searchable_text", None)

    logger.info(f"Hybrid search (store agents): {len(results)} results, {total} total")
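
The BM25 reranking step added to hybrid_search.py blends each row's SQL-side combined_score with a max-normalized BM25 score using a 0.7/0.3 split. The following is a minimal, dependency-free sketch of that blending, not the code from the diff. The `simple_bm25_scores` helper and the `blend_scores` name are hypothetical: the helper stands in for `rank_bm25.BM25Okapi.get_scores` and uses a non-negative IDF variant, so its raw scores will differ from the library's. The sample rows and their `id` field are made up for illustration.

```python
import math
import re
from typing import Any


def tokenize(text: str) -> list[str]:
    """Lowercase the text and extract runs of word characters."""
    return re.findall(r"\w+", text.lower()) if text else []


def simple_bm25_scores(
    query_tokens: list[str],
    corpus: list[list[str]],
    k1: float = 1.5,
    b: float = 0.75,
) -> list[float]:
    """Hypothetical stand-in for BM25Okapi.get_scores (non-negative IDF)."""
    n_docs = len(corpus)
    avgdl = sum(len(doc) for doc in corpus) / n_docs if n_docs else 0.0
    if avgdl == 0:
        return [0.0] * n_docs
    scores = []
    for doc in corpus:
        score = 0.0
        for term in query_tokens:
            df = sum(1 for d in corpus if term in d)  # document frequency
            if df == 0:
                continue
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            tf = doc.count(term)
            denom = tf + k1 * (1 - b + b * len(doc) / avgdl)
            score += idf * tf * (k1 + 1) / denom
        scores.append(score)
    return scores


def blend_scores(
    query: str, results: list[dict[str, Any]], bm25_weight: float = 0.3
) -> list[dict[str, Any]]:
    """Blend combined_score with max-normalized BM25 and sort descending."""
    corpus = [tokenize(r.get("searchable_text") or "") for r in results]
    query_tokens = tokenize(query)
    # Nothing to rerank: no rows, no usable query tokens, or only empty texts
    if not results or not query_tokens or not any(corpus):
        return results
    raw = simple_bm25_scores(query_tokens, corpus)
    top = max(raw)
    top = top if top > 0 else 1.0  # avoid dividing by zero when nothing matches
    for row, score in zip(results, raw):
        row["bm25_score"] = score / top
        original = row.get("combined_score") or 0
        row["relevance"] = (1 - bm25_weight) * original + bm25_weight * row["bm25_score"]
    return sorted(results, key=lambda r: r["relevance"], reverse=True)


# Made-up rows: "a" has the higher SQL score, "b" matches the query terms.
hits = [
    {"id": "a", "searchable_text": "Weather forecast agent", "combined_score": 0.62},
    {"id": "b", "searchable_text": "Email summarizer for your inbox", "combined_score": 0.60},
]
ranked = blend_scores("email summarizer", hits)
```

Because reranking runs after LIMIT/OFFSET, it only reorders rows within the page that SQL returned. A row that would score well on BM25 but falls on a different page is not pulled forward.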

--- a/autogpt_platform/backend/backend/api/features/store/hybrid_search_test.py
+++ b/autogpt_platform/backend/backend/api/features/store/hybrid_search_test.py
@@ -172,8 +172,8 @@ async def test_hybrid_search_without_embeddings():
        with patch(
            "backend.api.features.store.hybrid_search.query_raw_with_schema"
        ) as mock_query:
-            # Simulate embedding failure
-            mock_embed.return_value = None
+            # Simulate embedding failure by raising exception
+            mock_embed.side_effect = Exception("Embedding generation failed")
            mock_query.return_value = mock_results

            # Should NOT raise - graceful degradation
@@ -311,11 +311,43 @@ async def test_hybrid_search_min_score_filtering():
@pytest.mark.asyncio(loop_scope="session")
@pytest.mark.integration
 async def test_hybrid_search_pagination():
-    """Test hybrid search pagination."""
+    """Test hybrid search pagination.
+
+    Pagination happens in SQL (LIMIT/OFFSET), then BM25 reranking is applied
+    to the paginated results.
+    """
+    # Create mock results that SQL would return for a page
+    mock_results = [
+        {
+            "slug": f"agent-{i}",
+            "agent_name": f"Agent {i}",
+            "agent_image": "test.png",
+            "creator_username": "test",
+            "creator_avatar": "avatar.png",
+            "sub_heading": "Test",
+            "description": "Test description",
+            "runs": 100 - i,
+            "rating": 4.5,
+            "categories": ["test"],
+            "featured": False,
+            "is_available": True,
+            "updated_at": "2024-01-01T00:00:00Z",
+            "searchable_text": f"Agent {i} test description",
+            "combined_score": 0.9 - (i * 0.01),
+            "semantic_score": 0.7,
+            "lexical_score": 0.6,
+            "category_score": 0.5,
+            "recency_score": 0.4,
+            "popularity_score": 0.3,
+            "total_count": 25,
+        }
+        for i in range(10)  # SQL returns page_size results
+    ]
+
    with patch(
        "backend.api.features.store.hybrid_search.query_raw_with_schema"
    ) as mock_query:
-        mock_query.return_value = []
+        mock_query.return_value = mock_results

        with patch(
            "backend.api.features.store.hybrid_search.embed_query"
@@ -329,16 +361,18 @@ async def test_hybrid_search_pagination():
                page_size=10,
            )

-            # Verify pagination parameters
+            # Verify results returned
+            assert len(results) == 10
+            assert total == 25  # Total from SQL COUNT(*) OVER()
+
+            # Verify the SQL query uses page_size and offset
            call_args = mock_query.call_args
            params = call_args[0]
-
-            # Last two params should be LIMIT and OFFSET
-            limit = params[-2]
-            offset = params[-1]
-
-            assert limit == 10  # page_size
-            assert offset == 10  # (page - 1) * page_size = (2 - 1) * 10
+            # Last two params are page_size and offset
+            page_size_param = params[-2]
+            offset_param = params[-1]
+            assert page_size_param == 10
+            assert offset_param == 10  # (page 2 - 1) * 10


@pytest.mark.asyncio(loop_scope="session")
@@ -579,7 +613,9 @@ async def test_unified_hybrid_search_graceful_degradation():
            "backend.api.features.store.hybrid_search.embed_query"
        ) as mock_embed:
            mock_query.return_value = mock_results
-            mock_embed.return_value = None  # Embedding failure
+            mock_embed.side_effect = Exception(
+                "Embedding generation failed"
+            )  # Embedding failure

            # Should NOT raise - graceful degradation
            results, total = await unified_hybrid_search(
@@ -609,14 +645,36 @@ async def test_unified_hybrid_search_empty_query():
@pytest.mark.asyncio(loop_scope="session")
@pytest.mark.integration
 async def test_unified_hybrid_search_pagination():
-    """Test unified search pagination."""
+    """Test unified search pagination with BM25 reranking.
+
+    Pagination happens in SQL (LIMIT/OFFSET), then BM25 reranking is applied
+    to the paginated results.
+    """
+    # Create mock results that SQL would return for a page
+    mock_results = [
+        {
+            "content_type": "STORE_AGENT",
+            "content_id": f"agent-{i}",
+            "searchable_text": f"Agent {i} description",
+            "metadata": {"name": f"Agent {i}"},
+            "updated_at": "2025-01-01T00:00:00Z",
+            "semantic_score": 0.7,
+            "lexical_score": 0.8 - (i * 0.01),
+            "category_score": 0.5,
+            "recency_score": 0.3,
+            "combined_score": 0.6 - (i * 0.01),
+            "total_count": 50,
+        }
+        for i in range(15)  # SQL returns page_size results
+    ]
+
    with patch(
        "backend.api.features.store.hybrid_search.query_raw_with_schema"
    ) as mock_query:
        with patch(
            "backend.api.features.store.hybrid_search.embed_query"
        ) as mock_embed:
-            mock_query.return_value = []
+            mock_query.return_value = mock_results
            mock_embed.return_value = [0.1] * embeddings.EMBEDDING_DIM

            results, total = await unified_hybrid_search(
@@ -625,15 +683,18 @@ async def test_unified_hybrid_search_pagination():
                page_size=15,
            )

-            # Verify pagination parameters (last two params are LIMIT and OFFSET)
+            # Verify results returned
+            assert len(results) == 15
+            assert total == 50  # Total from SQL COUNT(*) OVER()
+
+            # Verify the SQL query uses page_size and offset
            call_args = mock_query.call_args
            params = call_args[0]
-
-            limit = params[-2]
-            offset = params[-1]
-
-            assert limit == 15  # page_size
-            assert offset == 30  # (page - 1) * page_size = (3 - 1) * 15
+            # Last two params are page_size and offset
+            page_size_param = params[-2]
+            offset_param = params[-1]
+            assert page_size_param == 15
+            assert offset_param == 30  # (page 3 - 1) * 15


@pytest.mark.asyncio(loop_scope="session")
--- a/autogpt_platform/backend/backend/api/features/v1.py
+++ b/autogpt_platform/backend/backend/api/features/v1.py
@@ -761,10 +761,8 @@ async def create_new_graph(
    graph.reassign_ids(user_id=user_id, reassign_graph_id=True)
    graph.validate_graph(for_run=False)

-    # The return value of the create graph & library function is intentionally not used here,
-    # as the graph already valid and no sub-graphs are returned back.
    await graph_db.create_graph(graph, user_id=user_id)
-    await library_db.create_library_agent(graph, user_id=user_id)
+    await library_db.create_library_agent(graph, user_id)
    activated_graph = await on_graph_activate(graph, user_id=user_id)

    if create_graph.source == "builder":
@@ -888,21 +886,19 @@ async def set_graph_active_version(
 async def _update_library_agent_version_and_settings(
    user_id: str, agent_graph: graph_db.GraphModel
 ) -> library_model.LibraryAgent:
-    # Keep the library agent up to date with the new active version
    library = await library_db.update_agent_version_in_library(
        user_id, agent_graph.id, agent_graph.version
    )
-    # If the graph has HITL node, initialize the setting if it's not already set.
-    if (
-        agent_graph.has_human_in_the_loop
-        and library.settings.human_in_the_loop_safe_mode is None
-    ):
-        await library_db.update_library_agent_settings(
+    updated_settings = GraphSettings.from_graph(
+        graph=agent_graph,
+        hitl_safe_mode=library.settings.human_in_the_loop_safe_mode,
+        sensitive_action_safe_mode=library.settings.sensitive_action_safe_mode,
+    )
+    if updated_settings != library.settings:
+        library = await library_db.update_library_agent(
+            library_agent_id=library.id,
            user_id=user_id,
-            agent_id=library.id,
-            settings=library.settings.model_copy(
-                update={"human_in_the_loop_safe_mode": True}
-            ),
+            settings=updated_settings,
        )
    return library

@@ -919,21 +915,18 @@ async def update_graph_settings(
    user_id: Annotated[str, Security(get_user_id)],
 ) -> GraphSettings:
    """Update graph settings for the user's library agent."""
-    # Get the library agent for this graph
    library_agent = await library_db.get_library_agent_by_graph_id(
        graph_id=graph_id, user_id=user_id
    )
    if not library_agent:
        raise HTTPException(404, f"Graph #{graph_id} not found in user's library")

-    # Update the library agent settings
-    updated_agent = await library_db.update_library_agent_settings(
+    updated_agent = await library_db.update_library_agent(
+        library_agent_id=library_agent.id,
        user_id=user_id,
-        agent_id=library_agent.id,
        settings=settings,
    )

-    # Return the updated settings
    return GraphSettings.model_validate(updated_agent.settings)


--- a/autogpt_platform/backend/backend/blocks/ai_shortform_video_block.py
+++ b/autogpt_platform/backend/backend/blocks/ai_shortform_video_block.py
@@ -174,7 +174,7 @@ class AIShortformVideoCreatorBlock(Block):
        )
        frame_rate: int = SchemaField(description="Frame rate of the video", default=60)
        generation_preset: GenerationPreset = SchemaField(
-            description="Generation preset for visual style - only effects AI generated visuals",
+            description="Generation preset for visual style - only affects AI-generated visuals",
            default=GenerationPreset.LEONARDO,
            placeholder=GenerationPreset.LEONARDO,
        )
--- a/autogpt_platform/backend/backend/blocks/apollo/models.py
+++ b/autogpt_platform/backend/backend/blocks/apollo/models.py
@@ -381,7 +381,7 @@ Each range you add needs to be a string, with the upper and lower numbers of the
    organization_locations: Optional[list[str]] = SchemaField(
        description="""The location of the company headquarters. You can search across cities, US states, and countries.

-If a company has several office locations, results are still based on the headquarters location. For example, if you search chicago but a company's HQ location is in boston, any Boston-based companies will not appearch in your search results, even if they match other parameters.
+If a company has several office locations, results are still based on the headquarters location. For example, if you search chicago but a company's HQ location is in boston, any Boston-based companies will not appear in your search results, even if they match other parameters.

 To exclude companies based on location, use the organization_not_locations parameter.
 """,
--- a/autogpt_platform/backend/backend/blocks/apollo/organization.py
+++ b/autogpt_platform/backend/backend/blocks/apollo/organization.py
@@ -34,7 +34,7 @@ Each range you add needs to be a string, with the upper and lower numbers of the
        organization_locations: list[str] = SchemaField(
            description="""The location of the company headquarters. You can search across cities, US states, and countries.

-If a company has several office locations, results are still based on the headquarters location. For example, if you search chicago but a company's HQ location is in boston, any Boston-based companies will not appearch in your search results, even if they match other parameters.
+If a company has several office locations, results are still based on the headquarters location. For example, if you search chicago but a company's HQ location is in boston, any Boston-based companies will not appear in your search results, even if they match other parameters.

 To exclude companies based on location, use the organization_not_locations parameter.
 """,
--- a/autogpt_platform/backend/backend/blocks/basic.py
+++ b/autogpt_platform/backend/backend/blocks/basic.py
@@ -81,7 +81,7 @@ class StoreValueBlock(Block):
    def __init__(self):
        super().__init__(
            id="1ff065e9-88e8-4358-9d82-8dc91f622ba9",
-            description="This block forwards an input value as output, allowing reuse without change.",
+            description="A basic block that stores and forwards a value throughout workflows, allowing it to be reused without changes across multiple blocks.",
            categories={BlockCategory.BASIC},
            input_schema=StoreValueBlock.Input,
            output_schema=StoreValueBlock.Output,
@@ -111,11 +111,12 @@ class PrintToConsoleBlock(Block):
    def __init__(self):
        super().__init__(
            id="f3b1c1b2-4c4f-4f0d-8d2f-4c4f0d8d2f4c",
-            description="Print the given text to the console, this is used for a debugging purpose.",
+            description="A debugging block that outputs text to the console for monitoring and troubleshooting workflow execution.",
            categories={BlockCategory.BASIC},
            input_schema=PrintToConsoleBlock.Input,
            output_schema=PrintToConsoleBlock.Output,
            test_input={"text": "Hello, World!"},
+            is_sensitive_action=True,
            test_output=[
                ("output", "Hello, World!"),
                ("status", "printed"),
@@ -137,7 +138,7 @@ class NoteBlock(Block):
    def __init__(self):
        super().__init__(
            id="cc10ff7b-7753-4ff2-9af6-9399b1a7eddc",
-            description="This block is used to display a sticky note with the given text.",
+            description="A visual annotation block that displays a sticky note in the workflow editor for documentation and organization purposes.",
            categories={BlockCategory.BASIC},
            input_schema=NoteBlock.Input,
            output_schema=NoteBlock.Output,
--- a/autogpt_platform/backend/backend/blocks/claude_code.py
+++ b/autogpt_platform/backend/backend/blocks/claude_code.py
@@ -0,0 +1,659 @@
+import json
+import shlex
+import uuid
+from typing import Literal, Optional
+
+from e2b import AsyncSandbox as BaseAsyncSandbox
+from pydantic import BaseModel, SecretStr
+
+from backend.data.block import (
+    Block,
+    BlockCategory,
+    BlockOutput,
+    BlockSchemaInput,
+    BlockSchemaOutput,
+)
+from backend.data.model import (
+    APIKeyCredentials,
+    CredentialsField,
+    CredentialsMetaInput,
+    SchemaField,
+)
+from backend.integrations.providers import ProviderName
+
+
+class ClaudeCodeExecutionError(Exception):
+    """Exception raised when Claude Code execution fails.
+
+    Carries the sandbox_id so it can be returned to the user for cleanup
+    when dispose_sandbox=False.
+    """
+
+    def __init__(self, message: str, sandbox_id: str = ""):
+        super().__init__(message)
+        self.sandbox_id = sandbox_id
+
+
+# Test credentials for E2B
+TEST_E2B_CREDENTIALS = APIKeyCredentials(
+    id="01234567-89ab-cdef-0123-456789abcdef",
+    provider="e2b",
+    api_key=SecretStr("mock-e2b-api-key"),
+    title="Mock E2B API key",
+    expires_at=None,
+)
+TEST_E2B_CREDENTIALS_INPUT = {
+    "provider": TEST_E2B_CREDENTIALS.provider,
+    "id": TEST_E2B_CREDENTIALS.id,
+    "type": TEST_E2B_CREDENTIALS.type,
+    "title": TEST_E2B_CREDENTIALS.title,
+}
+
+# Test credentials for Anthropic
+TEST_ANTHROPIC_CREDENTIALS = APIKeyCredentials(
+    id="2e568a2b-b2ea-475a-8564-9a676bf31c56",
+    provider="anthropic",
+    api_key=SecretStr("mock-anthropic-api-key"),
+    title="Mock Anthropic API key",
+    expires_at=None,
+)
+TEST_ANTHROPIC_CREDENTIALS_INPUT = {
+    "provider": TEST_ANTHROPIC_CREDENTIALS.provider,
+    "id": TEST_ANTHROPIC_CREDENTIALS.id,
+    "type": TEST_ANTHROPIC_CREDENTIALS.type,
+    "title": TEST_ANTHROPIC_CREDENTIALS.title,
+}
+
+
+class ClaudeCodeBlock(Block):
+    """
+    Execute tasks using Claude Code (Anthropic's AI coding assistant) in an E2B sandbox.
+
+    Claude Code can create files, install tools, run commands, and perform complex
+    coding tasks autonomously within a secure sandbox environment.
+    """
+
+    # Use base template - we'll install Claude Code ourselves for latest version
+    DEFAULT_TEMPLATE = "base"
+
+    class Input(BlockSchemaInput):
+        e2b_credentials: CredentialsMetaInput[
+            Literal[ProviderName.E2B], Literal["api_key"]
+        ] = CredentialsField(
+            description=(
+                "API key for the E2B platform to create the sandbox. "
+                "Get one on the [e2b website](https://e2b.dev/docs)"
+            ),
+        )
+
+        anthropic_credentials: CredentialsMetaInput[
+            Literal[ProviderName.ANTHROPIC], Literal["api_key"]
+        ] = CredentialsField(
+            description=(
+                "API key for Anthropic to power Claude Code. "
+                "Get one at [Anthropic's website](https://console.anthropic.com)"
+            ),
+        )
+
+        prompt: str = SchemaField(
+            description=(
+                "The task or instruction for Claude Code to execute. "
+                "Claude Code can create files, install packages, run commands, "
+                "and perform complex coding tasks."
+            ),
+            placeholder="Create a hello world index.html file",
+            default="",
+            advanced=False,
+        )
+
+        timeout: int = SchemaField(
+            description=(
+                "Sandbox timeout in seconds. Claude Code tasks can take "
+                "a while, so set this appropriately for your task complexity. "
+                "Note: This only applies when creating a new sandbox. "
+                "When reconnecting to an existing sandbox via sandbox_id, "
+                "the original timeout is retained."
+            ),
+            default=300,  # 5 minutes default
+            advanced=True,
+        )
+
+        setup_commands: list[str] = SchemaField(
+            description=(
+                "Optional shell commands to run before executing Claude Code. "
+                "Useful for installing dependencies or setting up the environment."
+            ),
+            default_factory=list,
+            advanced=True,
+        )
+
+        working_directory: str = SchemaField(
+            description="Working directory for Claude Code to operate in.",
+            default="/home/user",
+            advanced=True,
+        )
+
+        # Session/continuation support
+        session_id: str = SchemaField(
+            description=(
+                "Session ID to resume a previous conversation. "
+                "Leave empty for a new conversation. "
+                "Use the session_id from a previous run to continue that conversation."
+            ),
+            default="",
+            advanced=True,
+        )
+
+        sandbox_id: str = SchemaField(
+            description=(
+                "Sandbox ID to reconnect to an existing sandbox. "
+                "Required when resuming a session (along with session_id). "
+                "Use the sandbox_id from a previous run where dispose_sandbox was False."
+            ),
+            default="",
+            advanced=True,
+        )
+
+        conversation_history: str = SchemaField(
+            description=(
+                "Previous conversation history to continue from. "
+                "Use this to restore context on a fresh sandbox if the previous one timed out. "
+                "Pass the conversation_history output from a previous run."
+            ),
+            default="",
+            advanced=True,
+        )
+
+        dispose_sandbox: bool = SchemaField(
+            description=(
+                "Whether to dispose of the sandbox immediately after execution. "
+                "Set to False if you want to continue the conversation later "
+                "(you'll need both sandbox_id and session_id from the output)."
+            ),
+            default=True,
+            advanced=True,
+        )
+
+    class FileOutput(BaseModel):
+        """A file extracted from the sandbox."""
+
+        path: str
+        relative_path: str  # Path relative to working directory (for GitHub, etc.)
+        name: str
+        content: str
+
+    class Output(BlockSchemaOutput):
+        response: str = SchemaField(
+            description="The output/response from Claude Code execution"
+        )
+        files: list["ClaudeCodeBlock.FileOutput"] = SchemaField(
+            description=(
+                "List of text files created/modified by Claude Code during this execution. "
+                "Each file has 'path', 'relative_path', 'name', and 'content' fields."
+            )
+        )
+        conversation_history: str = SchemaField(
+            description=(
+                "Full conversation history including this turn. "
+                "Pass this to conversation_history input to continue on a fresh sandbox "
+                "if the previous sandbox timed out."
+            )
+        )
+        session_id: str = SchemaField(
+            description=(
+                "Session ID for this conversation. "
+                "Pass this back along with sandbox_id to continue the conversation."
+            )
+        )
+        sandbox_id: Optional[str] = SchemaField(
+            description=(
+                "ID of the sandbox instance. "
+                "Pass this back along with session_id to continue the conversation. "
+                "This is None if dispose_sandbox was True (sandbox was disposed)."
+            ),
+            default=None,
+        )
+        error: str = SchemaField(description="Error message if execution failed")
+
+    def __init__(self):
+        super().__init__(
+            id="4e34f4a5-9b89-4326-ba77-2dd6750b7194",
+            description=(
+                "Execute tasks using Claude Code in an E2B sandbox. "
+                "Claude Code can create files, install tools, run commands, "
+                "and perform complex coding tasks autonomously."
+            ),
+            categories={BlockCategory.DEVELOPER_TOOLS, BlockCategory.AI},
+            input_schema=ClaudeCodeBlock.Input,
+            output_schema=ClaudeCodeBlock.Output,
+            test_credentials={
+                "e2b_credentials": TEST_E2B_CREDENTIALS,
+                "anthropic_credentials": TEST_ANTHROPIC_CREDENTIALS,
+            },
+            test_input={
+                "e2b_credentials": TEST_E2B_CREDENTIALS_INPUT,
+                "anthropic_credentials": TEST_ANTHROPIC_CREDENTIALS_INPUT,
+                "prompt": "Create a hello world HTML file",
+                "timeout": 300,
+                "setup_commands": [],
+                "working_directory": "/home/user",
+                "session_id": "",
+                "sandbox_id": "",
+                "conversation_history": "",
+                "dispose_sandbox": True,
+            },
+            test_output=[
+                ("response", "Created index.html with hello world content"),
+                (
+                    "files",
+                    [
+                        {
+                            "path": "/home/user/index.html",
+                            "relative_path": "index.html",
+                            "name": "index.html",
+                            "content": "<html>Hello World</html>",
+                        }
+                    ],
+                ),
+                (
+                    "conversation_history",
+                    "User: Create a hello world HTML file\n"
+                    "Claude: Created index.html with hello world content",
+                ),
+                ("session_id", str),
+                ("sandbox_id", None),  # None because dispose_sandbox=True in test_input
+            ],
+            test_mock={
+                "execute_claude_code": lambda *args, **kwargs: (
+                    "Created index.html with hello world content",  # response
+                    [
+                        ClaudeCodeBlock.FileOutput(
+                            path="/home/user/index.html",
+                            relative_path="index.html",
+                            name="index.html",
+                            content="<html>Hello World</html>",
+                        )
+                    ],  # files
+                    "User: Create a hello world HTML file\n"
+                    "Claude: Created index.html with hello world content",  # conversation_history
+                    "test-session-id",  # session_id
+                    "sandbox_id",  # sandbox_id
+                ),
+            },
+        )
+
+    async def execute_claude_code(
+        self,
+        e2b_api_key: str,
+        anthropic_api_key: str,
+        prompt: str,
+        timeout: int,
+        setup_commands: list[str],
+        working_directory: str,
+        session_id: str,
+        existing_sandbox_id: str,
+        conversation_history: str,
+        dispose_sandbox: bool,
+    ) -> tuple[str, list["ClaudeCodeBlock.FileOutput"], str, str, str]:
+        """
+        Execute Claude Code in an E2B sandbox.
+
+        Returns:
+            Tuple of (response, files, conversation_history, session_id, sandbox_id)
+        """
+
+        # Validate that sandbox_id is provided when resuming a session
+        if session_id and not existing_sandbox_id:
+            raise ValueError(
+                "sandbox_id is required when resuming a session with session_id. "
+                "The session state is stored in the original sandbox. "
+                "If the sandbox has timed out, use conversation_history instead "
+                "to restore context on a fresh sandbox."
+            )
+
+        sandbox = None
+        sandbox_id = ""
+
+        try:
+            # Either reconnect to existing sandbox or create a new one
+            if existing_sandbox_id:
+                # Reconnect to existing sandbox for conversation continuation
+                sandbox = await BaseAsyncSandbox.connect(
+                    sandbox_id=existing_sandbox_id,
+                    api_key=e2b_api_key,
+                )
+            else:
+                # Create new sandbox
+                sandbox = await BaseAsyncSandbox.create(
+                    template=self.DEFAULT_TEMPLATE,
+                    api_key=e2b_api_key,
+                    timeout=timeout,
+                    envs={"ANTHROPIC_API_KEY": anthropic_api_key},
+                )
+
+                # Install Claude Code from npm (ensures we get the latest version)
+                install_result = await sandbox.commands.run(
+                    "npm install -g @anthropic-ai/claude-code@latest",
+                    timeout=120,  # 2 min timeout for install
+                )
+                if install_result.exit_code != 0:
+                    raise RuntimeError(
+                        f"Failed to install Claude Code: {install_result.stderr}"
+                    )
+
+                # Run any user-provided setup commands
+                for cmd in setup_commands:
+                    setup_result = await sandbox.commands.run(cmd)
+                    if setup_result.exit_code != 0:
+                        raise RuntimeError(
+                            f"Setup command failed: {cmd}\n"
+                            f"Exit code: {setup_result.exit_code}\n"
+                            f"Stdout: {setup_result.stdout}\n"
+                            f"Stderr: {setup_result.stderr}"
+                        )
+
+            # Capture sandbox_id immediately after creation/connection
+            # so it's available for error recovery if dispose_sandbox=False
+            sandbox_id = sandbox.sandbox_id
+
+            # Generate or use provided session ID
+            current_session_id = session_id if session_id else str(uuid.uuid4())
+
+            # Build base Claude flags
+            base_flags = "-p --dangerously-skip-permissions --output-format json"
+
+            # Add conversation history context if provided (for fresh sandbox continuation)
+            history_flag = ""
+            if conversation_history and not session_id:
+                # Inject previous conversation as context via system prompt
+                # Use consistent escaping via _escape_prompt helper
+                escaped_history = self._escape_prompt(
+                    f"Previous conversation context: {conversation_history}"
+                )
+                history_flag = f" --append-system-prompt {escaped_history}"
+
+            # Build Claude command based on whether we're resuming or starting new
+            # Use shlex.quote for working_directory and session IDs to prevent injection
+            safe_working_dir = shlex.quote(working_directory)
+            if session_id:
+                # Resuming existing session (sandbox still alive)
+                safe_session_id = shlex.quote(session_id)
+                claude_command = (
+                    f"cd {safe_working_dir} && "
+                    f"echo {self._escape_prompt(prompt)} | "
+                    f"claude --resume {safe_session_id} {base_flags}"
+                )
+            else:
+                # New session with specific ID
+                safe_current_session_id = shlex.quote(current_session_id)
+                claude_command = (
+                    f"cd {safe_working_dir} && "
+                    f"echo {self._escape_prompt(prompt)} | "
+                    f"claude --session-id {safe_current_session_id} {base_flags}{history_flag}"
+                )
+
+            # Capture a timestamp before running Claude Code so files can be filtered later.
+            # Use a time 1 second in the past to avoid a race condition with file creation.
+            timestamp_result = await sandbox.commands.run(
+                "date -u -d '1 second ago' +%Y-%m-%dT%H:%M:%S"
+            )
+            if timestamp_result.exit_code != 0:
+                raise RuntimeError(
+                    f"Failed to capture timestamp: {timestamp_result.stderr}"
+                )
+            start_timestamp = (
+                timestamp_result.stdout.strip() if timestamp_result.stdout else None
+            )
+
+            result = await sandbox.commands.run(
+                claude_command,
+                timeout=0,  # No command timeout - let sandbox timeout handle it
+            )
+
+            # Check for command failure
+            if result.exit_code != 0:
+                error_msg = result.stderr or result.stdout or "Unknown error"
+                raise RuntimeError(
+                    f"Claude Code command failed with exit code {result.exit_code}:\n"
+                    f"{error_msg}"
+                )
+
+            raw_output = result.stdout or ""
+
+            # Parse JSON output to extract response and build conversation history
+            response = ""
+            new_conversation_history = conversation_history or ""
+
+            try:
+                # The JSON output contains the result
+                output_data = json.loads(raw_output)
+                response = output_data.get("result", raw_output)
+
+                # Build conversation history entry
+                turn_entry = f"User: {prompt}\nClaude: {response}"
+                if new_conversation_history:
+                    new_conversation_history = (
+                        f"{new_conversation_history}\n\n{turn_entry}"
+                    )
+                else:
+                    new_conversation_history = turn_entry
+
+            except (json.JSONDecodeError, AttributeError):
+                # If the output is not valid JSON (or not a JSON object), use raw output
+                response = raw_output
+                turn_entry = f"User: {prompt}\nClaude: {response}"
+                if new_conversation_history:
+                    new_conversation_history = (
+                        f"{new_conversation_history}\n\n{turn_entry}"
+                    )
+                else:
+                    new_conversation_history = turn_entry
+
+            # Extract files created/modified during this run
+            files = await self._extract_files(
+                sandbox, working_directory, start_timestamp
+            )
+
+            return (
+                response,
+                files,
+                new_conversation_history,
+                current_session_id,
+                sandbox_id,
+            )
+
+        except Exception as e:
+            # Wrap exception with sandbox_id so caller can access/cleanup
+            # the preserved sandbox when dispose_sandbox=False
+            raise ClaudeCodeExecutionError(str(e), sandbox_id) from e
+
+        finally:
+            if dispose_sandbox and sandbox:
+                await sandbox.kill()
+
+    async def _extract_files(
+        self,
+        sandbox: BaseAsyncSandbox,
+        working_directory: str,
+        since_timestamp: str | None = None,
+    ) -> list["ClaudeCodeBlock.FileOutput"]:
+        """
+        Extract text files created/modified during this Claude Code execution.
+
+        Args:
+            sandbox: The E2B sandbox instance
+            working_directory: Directory to search for files
+            since_timestamp: ISO timestamp - only return files modified after this time
+
+        Returns:
+            List of FileOutput objects with path, relative_path, name, and content
+        """
+        files: list[ClaudeCodeBlock.FileOutput] = []
+
+        # Text file extensions we can safely read as text
+        text_extensions = {
+            ".txt",
+            ".md",
+            ".html",
+            ".htm",
+            ".css",
+            ".js",
+            ".ts",
+            ".jsx",
+            ".tsx",
+            ".json",
+            ".xml",
+            ".yaml",
+            ".yml",
+            ".toml",
+            ".ini",
+            ".cfg",
+            ".conf",
+            ".py",
+            ".rb",
+            ".php",
+            ".java",
+            ".c",
+            ".cpp",
+            ".h",
+            ".hpp",
+            ".cs",
+            ".go",
+            ".rs",
+            ".swift",
+            ".kt",
+            ".scala",
+            ".sh",
+            ".bash",
+            ".zsh",
+            ".sql",
+            ".graphql",
+            ".env",
+            ".gitignore",
+            ".dockerfile",
+            "Dockerfile",
+            ".vue",
+            ".svelte",
+            ".astro",
+            ".mdx",
+            ".rst",
+            ".tex",
+            ".csv",
+            ".log",
+        }
+
+        try:
+            # List files recursively using find command
+            # Exclude node_modules and .git directories, but allow hidden files
+            # like .env and .gitignore (they're filtered by text_extensions later)
+            # Filter by timestamp to only get files created/modified during this run
+            safe_working_dir = shlex.quote(working_directory)
+            timestamp_filter = ""
+            if since_timestamp:
+                timestamp_filter = f"-newermt {shlex.quote(since_timestamp)} "
+            find_result = await sandbox.commands.run(
+                f"find {safe_working_dir} -type f "
+                f"{timestamp_filter}"
+                f"-not -path '*/node_modules/*' "
+                f"-not -path '*/.git/*' "
+                f"2>/dev/null"
+            )
+
+            if find_result.stdout:
+                for file_path in find_result.stdout.strip().split("\n"):
+                    if not file_path:
+                        continue
+
+                    # Check if it's a text file we can read
+                    is_text = any(
+                        file_path.endswith(ext) for ext in text_extensions
+                    ) or file_path.endswith("Dockerfile")
+
+                    if is_text:
+                        try:
+                            content = await sandbox.files.read(file_path)
+                            # Handle bytes or string
+                            if isinstance(content, bytes):
+                                content = content.decode("utf-8", errors="replace")
+
+                            # Extract filename from path
+                            file_name = file_path.split("/")[-1]
+
+                            # Calculate relative path by stripping working directory
+                            relative_path = file_path
+                            if file_path.startswith(working_directory):
+                                relative_path = file_path[len(working_directory) :]
+                                # Remove leading slash if present
+                                if relative_path.startswith("/"):
+                                    relative_path = relative_path[1:]
+
+                            files.append(
+                                ClaudeCodeBlock.FileOutput(
+                                    path=file_path,
+                                    relative_path=relative_path,
+                                    name=file_name,
+                                    content=content,
+                                )
+                            )
+                        except Exception:
+                            # Skip files that can't be read
+                            pass
+
+        except Exception:
+            # If file extraction fails, return empty results
+            pass
+
+        return files
+
+    def _escape_prompt(self, prompt: str) -> str:
+        """Escape the prompt for safe shell execution."""
+        # Use single quotes and escape any single quotes in the prompt
+        escaped = prompt.replace("'", "'\"'\"'")
+        return f"'{escaped}'"
+
+    async def run(
+        self,
+        input_data: Input,
+        *,
+        e2b_credentials: APIKeyCredentials,
+        anthropic_credentials: APIKeyCredentials,
+        **kwargs,
+    ) -> BlockOutput:
+        try:
+            (
+                response,
+                files,
+                conversation_history,
+                session_id,
+                sandbox_id,
+            ) = await self.execute_claude_code(
+                e2b_api_key=e2b_credentials.api_key.get_secret_value(),
+                anthropic_api_key=anthropic_credentials.api_key.get_secret_value(),
+                prompt=input_data.prompt,
+                timeout=input_data.timeout,
+                setup_commands=input_data.setup_commands,
+                working_directory=input_data.working_directory,
+                session_id=input_data.session_id,
+                existing_sandbox_id=input_data.sandbox_id,
+                conversation_history=input_data.conversation_history,
+                dispose_sandbox=input_data.dispose_sandbox,
+            )
+
+            yield "response", response
+            # Always yield files (empty list if none) to match Output schema
+            yield "files", [f.model_dump() for f in files]
+            # Always yield conversation_history so user can restore context on fresh sandbox
+            yield "conversation_history", conversation_history
+            # Always yield session_id so user can continue conversation
+            yield "session_id", session_id
+            # Always yield sandbox_id (None if disposed) to match Output schema
+            yield "sandbox_id", sandbox_id if not input_data.dispose_sandbox else None
+
+        except ClaudeCodeExecutionError as e:
+            yield "error", str(e)
+            # If sandbox was preserved (dispose_sandbox=False), yield sandbox_id
+            # so user can reconnect to or clean up the orphaned sandbox
+            if not input_data.dispose_sandbox and e.sandbox_id:
+                yield "sandbox_id", e.sandbox_id
+        except Exception as e:
+            yield "error", str(e)
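The `_escape_prompt` helper in the file above wraps the prompt in single quotes and rewrites each embedded single quote as `'"'"'`. The following is a minimal sketch, not part of the diff, that checks this scheme against a POSIX-style tokenizer. The names `escape_prompt` and `roundtrip` are hypothetical standalone helpers introduced only for illustration, and `shlex.split` stands in for the shell that would run inside the sandbox.

```python
import shlex


def escape_prompt(prompt: str) -> str:
    """Standalone copy of the block's _escape_prompt logic (illustration only)."""
    # Close the quoted string, emit a double-quoted single quote, reopen it.
    escaped = prompt.replace("'", "'\"'\"'")
    return f"'{escaped}'"


def roundtrip(quoted: str) -> str:
    """Tokenize a quoted word with POSIX rules and return the single token."""
    tokens = shlex.split(quoted)
    if len(tokens) != 1:
        raise ValueError(f"expected exactly one token, got {len(tokens)}")
    return tokens[0]


# Inputs containing characters a shell would otherwise interpret.
samples = [
    "hello",
    "it's $HOME; rm -rf /",
    'say "hi" `whoami`',
    "line one\nline two",
]

for sample in samples:
    # Both quoting schemes should give back the original text unchanged.
    assert roundtrip(escape_prompt(sample)) == sample
    assert roundtrip(shlex.quote(sample)) == sample
```

Since both schemes round-trip the same inputs here, using `shlex.quote` for the prompt as well (as the diff already does for `working_directory` and the session IDs) would likely be an equivalent and more conventional choice.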
--- a/autogpt_platform/backend/backend/blocks/data_manipulation.py
+++ b/autogpt_platform/backend/backend/blocks/data_manipulation.py
@@ -159,7 +159,7 @@ class FindInDictionaryBlock(Block):
    def __init__(self):
        super().__init__(
            id="0e50422c-6dee-4145-83d6-3a5a392f65de",
-            description="Lookup the given key in the input dictionary/object/list and return the value.",
+            description="A block that looks up a value in a dictionary, list, or object by key or index and returns the corresponding value.",
            input_schema=FindInDictionaryBlock.Input,
            output_schema=FindInDictionaryBlock.Output,
            test_input=[
@@ -680,3 +680,58 @@ class ListIsEmptyBlock(Block):

    async def run(self, input_data: Input, **kwargs) -> BlockOutput:
        yield "is_empty", len(input_data.list) == 0
+
+
+class ConcatenateListsBlock(Block):
+    class Input(BlockSchemaInput):
+        lists: List[List[Any]] = SchemaField(
+            description="A list of lists to concatenate together. All lists will be combined in order into a single list.",
+            placeholder="e.g., [[1, 2], [3, 4], [5, 6]]",
+        )
+
+    class Output(BlockSchemaOutput):
+        concatenated_list: List[Any] = SchemaField(
+            description="The concatenated list containing all elements from all input lists in order."
+        )
+        error: str = SchemaField(
+            description="Error message if concatenation failed due to invalid input types."
+        )
+
+    def __init__(self):
+        super().__init__(
+            id="3cf9298b-5817-4141-9d80-7c2cc5199c8e",
+            description="Concatenates multiple lists into a single list. All elements from all input lists are combined in order.",
+            categories={BlockCategory.BASIC},
+            input_schema=ConcatenateListsBlock.Input,
+            output_schema=ConcatenateListsBlock.Output,
+            test_input=[
+                {"lists": [[1, 2, 3], [4, 5, 6]]},
+                {"lists": [["a", "b"], ["c"], ["d", "e", "f"]]},
+                {"lists": [[1, 2], []]},
+                {"lists": []},
+            ],
+            test_output=[
+                ("concatenated_list", [1, 2, 3, 4, 5, 6]),
+                ("concatenated_list", ["a", "b", "c", "d", "e", "f"]),
+                ("concatenated_list", [1, 2]),
+                ("concatenated_list", []),
+            ],
+        )
+
+    async def run(self, input_data: Input, **kwargs) -> BlockOutput:
+        concatenated = []
+        for idx, lst in enumerate(input_data.lists):
+            if lst is None:
+                # Skip None values to avoid errors
+                continue
+            if not isinstance(lst, list):
+                # Type validation: each item must be a list
+                # Strings are iterable and would cause extend() to iterate character-by-character
+                # Non-iterable types would raise TypeError
+                yield "error", (
+                    f"Invalid input at index {idx}: expected a list, got {type(lst).__name__}. "
+                    f"All items in 'lists' must be lists (e.g., [[1, 2], [3, 4]])."
+                )
+                return
+            concatenated.extend(lst)
+        yield "concatenated_list", concatenated
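The concatenation logic in `ConcatenateListsBlock.run` can be exercised outside the block framework. The sketch below mirrors its control flow as a plain function. `concatenate_lists` is a hypothetical name, and the function returns a single `(output_name, value)` pair instead of yielding block outputs. Whether schema validation would reject `None` or string entries before `run` is reached is not shown in the diff, so the sketch covers only the loop itself.

```python
from typing import Any


def concatenate_lists(lists: list[Any]) -> tuple[str, Any]:
    """Mirror the loop in run(): skip None, reject non-lists, extend in order."""
    concatenated: list[Any] = []
    for idx, lst in enumerate(lists):
        if lst is None:
            # None entries are skipped rather than treated as errors.
            continue
        if not isinstance(lst, list):
            # A string would otherwise be extended character by character.
            return "error", (
                f"Invalid input at index {idx}: expected a list, "
                f"got {type(lst).__name__}."
            )
        concatenated.extend(lst)
    return "concatenated_list", concatenated


print(concatenate_lists([[1, 2], None, [3]]))  # ('concatenated_list', [1, 2, 3])
print(concatenate_lists([[1], "ab"]))  # ('error', 'Invalid input at index 1: ...')
```

The explicit `isinstance` check matters because `list.extend` accepts any iterable: without it, `"ab"` would be appended as `"a"` and `"b"` instead of producing an error.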