updated kb to support 1536 dimension vectors for models other than text embedding 3 small

fix(docs): update requirements to be more accurate for deploying the app
2026-01-25 14:58:14 -05:00 · 2026-01-25 10:11:34 -08:00 · 2026-01-25 09:56:20 -08:00
14 changed files with 120 additions and 176 deletions
--- a/.devcontainer/docker-compose.yml
+++ b/.devcontainer/docker-compose.yml
@@ -44,7 +44,7 @@ services:
    deploy:
      resources:
        limits:
-          memory: 4G
+          memory: 1G
    environment:
      - NODE_ENV=development
      - DATABASE_URL=postgresql://postgres:postgres@db:5432/simstudio
--- a/apps/docs/content/docs/de/self-hosting/index.mdx
+++ b/apps/docs/content/docs/de/self-hosting/index.mdx
@@ -10,12 +10,20 @@ Stellen Sie Sim auf Ihrer eigenen Infrastruktur mit Docker oder Kubernetes berei

 ## Anforderungen

-| Ressource | Minimum | Empfohlen |
-|----------|---------|-------------|
-| CPU | 2 Kerne | 4+ Kerne |
-| RAM | 12 GB | 16+ GB |
-| Speicher | 20 GB SSD | 50+ GB SSD |
-| Docker | 20.10+ | Neueste Version |
+| Ressource | Klein | Standard | Produktion |
+|----------|-------|----------|------------|
+| CPU | 2 Kerne | 4 Kerne | 8+ Kerne |
+| RAM | 12 GB | 16 GB | 32+ GB |
+| Speicher | 20 GB SSD | 50 GB SSD | 100+ GB SSD |
+| Docker | 20.10+ | 20.10+ | Neueste Version |
+
+**Klein**: Entwicklung, Tests, Einzelnutzer (1-5 Nutzer)
+**Standard**: Teams (5-50 Nutzer), moderate Arbeitslasten
+**Produktion**: Große Teams (50+ Nutzer), Hochverfügbarkeit, intensive Workflow-Ausführung
+
+<Callout type="info">
+Die Ressourcenanforderungen werden durch Workflow-Ausführung (isolated-vm Sandboxing), Dateiverarbeitung (In-Memory-Dokumentenparsing) und Vektoroperationen (pgvector) bestimmt. Arbeitsspeicher ist typischerweise der limitierende Faktor, nicht CPU. Produktionsdaten zeigen, dass die Hauptanwendung durchschnittlich 4-8 GB und bei hoher Last bis zu 12 GB benötigt.
+</Callout>

 ## Schnellstart

--- a/apps/docs/content/docs/en/self-hosting/index.mdx
+++ b/apps/docs/content/docs/en/self-hosting/index.mdx
@@ -16,12 +16,20 @@ Deploy Sim on your own infrastructure with Docker or Kubernetes.

 ## Requirements

-| Resource | Minimum | Recommended |
-|----------|---------|-------------|
-| CPU | 2 cores | 4+ cores |
-| RAM | 12 GB | 16+ GB |
-| Storage | 20 GB SSD | 50+ GB SSD |
-| Docker | 20.10+ | Latest |
+| Resource | Small | Standard | Production |
+|----------|-------|----------|------------|
+| CPU | 2 cores | 4 cores | 8+ cores |
+| RAM | 12 GB | 16 GB | 32+ GB |
+| Storage | 20 GB SSD | 50 GB SSD | 100+ GB SSD |
+| Docker | 20.10+ | 20.10+ | Latest |
+
+**Small**: Development, testing, single user (1-5 users)
+**Standard**: Teams (5-50 users), moderate workloads
+**Production**: Large teams (50+ users), high availability, heavy workflow execution
+
+<Callout type="info">
+Resource requirements are driven by workflow execution (isolated-vm sandboxing), file processing (in-memory document parsing), and vector operations (pgvector). Memory is typically the constraining factor rather than CPU. Production telemetry shows the main app uses 4-8 GB average with peaks up to 12 GB under heavy load.
+</Callout>

 ## Quick Start

--- a/apps/docs/content/docs/es/self-hosting/index.mdx
+++ b/apps/docs/content/docs/es/self-hosting/index.mdx
@@ -10,12 +10,20 @@ Despliega Sim en tu propia infraestructura con Docker o Kubernetes.

 ## Requisitos

-| Recurso | Mínimo | Recomendado |
-|----------|---------|-------------|
-| CPU | 2 núcleos | 4+ núcleos |
-| RAM | 12 GB | 16+ GB |
-| Almacenamiento | 20 GB SSD | 50+ GB SSD |
-| Docker | 20.10+ | Última versión |
+| Recurso | Pequeño | Estándar | Producción |
+|----------|---------|----------|------------|
+| CPU | 2 núcleos | 4 núcleos | 8+ núcleos |
+| RAM | 12 GB | 16 GB | 32+ GB |
+| Almacenamiento | 20 GB SSD | 50 GB SSD | 100+ GB SSD |
+| Docker | 20.10+ | 20.10+ | Última versión |
+
+**Pequeño**: Desarrollo, pruebas, usuario único (1-5 usuarios)
+**Estándar**: Equipos (5-50 usuarios), cargas de trabajo moderadas
+**Producción**: Equipos grandes (50+ usuarios), alta disponibilidad, ejecución intensiva de workflows
+
+<Callout type="info">
+Los requisitos de recursos están determinados por la ejecución de workflows (sandboxing isolated-vm), procesamiento de archivos (análisis de documentos en memoria) y operaciones vectoriales (pgvector). La memoria suele ser el factor limitante, no la CPU. La telemetría de producción muestra que la aplicación principal usa 4-8 GB en promedio con picos de hasta 12 GB bajo carga pesada.
+</Callout>

 ## Inicio rápido

--- a/apps/docs/content/docs/fr/self-hosting/index.mdx
+++ b/apps/docs/content/docs/fr/self-hosting/index.mdx
@@ -10,12 +10,20 @@ Déployez Sim sur votre propre infrastructure avec Docker ou Kubernetes.

 ## Prérequis

-| Ressource | Minimum | Recommandé |
-|----------|---------|-------------|
-| CPU | 2 cœurs | 4+ cœurs |
-| RAM | 12 Go | 16+ Go |
-| Stockage | 20 Go SSD | 50+ Go SSD |
-| Docker | 20.10+ | Dernière version |
+| Ressource | Petit | Standard | Production |
+|----------|-------|----------|------------|
+| CPU | 2 cœurs | 4 cœurs | 8+ cœurs |
+| RAM | 12 Go | 16 Go | 32+ Go |
+| Stockage | 20 Go SSD | 50 Go SSD | 100+ Go SSD |
+| Docker | 20.10+ | 20.10+ | Dernière version |
+
+**Petit** : Développement, tests, utilisateur unique (1-5 utilisateurs)
+**Standard** : Équipes (5-50 utilisateurs), charges de travail modérées
+**Production** : Grandes équipes (50+ utilisateurs), haute disponibilité, exécution intensive de workflows
+
+<Callout type="info">
+Les besoins en ressources sont déterminés par l'exécution des workflows (sandboxing isolated-vm), le traitement des fichiers (analyse de documents en mémoire) et les opérations vectorielles (pgvector). La mémoire est généralement le facteur limitant, pas le CPU. La télémétrie de production montre que l'application principale utilise 4-8 Go en moyenne avec des pics jusqu'à 12 Go sous forte charge.
+</Callout>

 ## Démarrage rapide

--- a/apps/docs/content/docs/ja/self-hosting/index.mdx
+++ b/apps/docs/content/docs/ja/self-hosting/index.mdx
@@ -10,12 +10,20 @@ DockerまたはKubernetesを使用して、自社のインフラストラクチ

 ## 要件

-| リソース | 最小 | 推奨 |
-|----------|---------|-------------|
-| CPU | 2コア | 4+コア |
-| RAM | 12 GB | 16+ GB |
-| ストレージ | 20 GB SSD | 50+ GB SSD |
-| Docker | 20.10+ | 最新版 |
+| リソース | スモール | スタンダード | プロダクション |
+|----------|---------|-------------|----------------|
+| CPU | 2コア | 4コア | 8+コア |
+| RAM | 12 GB | 16 GB | 32+ GB |
+| ストレージ | 20 GB SSD | 50 GB SSD | 100+ GB SSD |
+| Docker | 20.10+ | 20.10+ | 最新版 |
+
+**スモール**: 開発、テスト、シングルユーザー（1-5ユーザー）
+**スタンダード**: チーム（5-50ユーザー）、中程度のワークロード
+**プロダクション**: 大規模チーム（50+ユーザー）、高可用性、高負荷ワークフロー実行
+
+<Callout type="info">
+リソース要件は、ワークフロー実行（isolated-vmサンドボックス）、ファイル処理（メモリ内ドキュメント解析）、ベクトル演算（pgvector）によって決まります。CPUよりもメモリが制約要因となることが多いです。本番環境のテレメトリによると、メインアプリは平均4-8 GB、高負荷時は最大12 GBを使用します。
+</Callout>

 ## クイックスタート

--- a/apps/docs/content/docs/zh/self-hosting/index.mdx
+++ b/apps/docs/content/docs/zh/self-hosting/index.mdx
@@ -10,12 +10,20 @@ import { Callout } from 'fumadocs-ui/components/callout'

 ## 要求

-| 资源 | 最低要求 | 推荐配置 |
-|----------|---------|-------------|
-| CPU | 2 核 | 4 核及以上 |
-| 内存 | 12 GB | 16 GB 及以上 |
-| 存储 | 20 GB SSD | 50 GB 及以上 SSD |
-| Docker | 20.10+ | 最新版本 |
+| 资源 | 小型 | 标准 | 生产环境 |
+|----------|------|------|----------|
+| CPU | 2 核 | 4 核 | 8+ 核 |
+| 内存 | 12 GB | 16 GB | 32+ GB |
+| 存储 | 20 GB SSD | 50 GB SSD | 100+ GB SSD |
+| Docker | 20.10+ | 20.10+ | 最新版本 |
+
+**小型**: 开发、测试、单用户（1-5 用户）
+**标准**: 团队（5-50 用户）、中等工作负载
+**生产环境**: 大型团队（50+ 用户）、高可用性、密集工作流执行
+
+<Callout type="info">
+资源需求由工作流执行（isolated-vm 沙箱）、文件处理（内存中文档解析）和向量运算（pgvector）决定。内存通常是限制因素，而不是 CPU。生产遥测数据显示，主应用平均使用 4-8 GB，高负载时峰值可达 12 GB。
+</Callout>

 ## 快速开始

--- a/apps/sim/lib/copilot/tools/server/workflow/edit-workflow.ts
+++ b/apps/sim/lib/copilot/tools/server/workflow/edit-workflow.ts
@@ -2508,10 +2508,6 @@ async function validateWorkflowSelectorIds(
    for (const subBlockConfig of blockConfig.subBlocks) {
      if (!SELECTOR_TYPES.has(subBlockConfig.type)) continue

-      // Skip oauth-input - credentials are pre-validated before edit application
-      // This allows existing collaborator credentials to remain untouched
-      if (subBlockConfig.type === 'oauth-input') continue
-
      const subBlockValue = blockData.subBlocks?.[subBlockConfig.id]?.value
      if (!subBlockValue) continue

@@ -2577,105 +2573,6 @@ async function validateWorkflowSelectorIds(
  return errors
 }

-/**
- * Pre-validates oauth-input (credential) values in operations before they are applied.
- * Removes invalid credential inputs from operations so they are never applied.
- * Returns validation errors for any removed credentials.
- */
-async function preValidateCredentialInputs(
-  operations: EditWorkflowOperation[],
-  context: { userId: string }
-): Promise<{ filteredOperations: EditWorkflowOperation[]; errors: ValidationError[] }> {
-  const logger = createLogger('PreValidateCredentials')
-  const errors: ValidationError[] = []
-
-  // Collect all credential values from operations that need validation
-  const credentialInputs: Array<{
-    operationIndex: number
-    blockId: string
-    blockType: string
-    fieldName: string
-    value: string
-  }> = []
-
-  operations.forEach((op, opIndex) => {
-    if (!op.params?.inputs || !op.params?.type) return
-
-    const blockConfig = getBlock(op.params.type)
-    if (!blockConfig) return
-
-    // Find oauth-input subblocks in this block type
-    for (const subBlockConfig of blockConfig.subBlocks) {
-      if (subBlockConfig.type !== 'oauth-input') continue
-
-      const inputValue = op.params.inputs[subBlockConfig.id]
-      if (!inputValue || typeof inputValue !== 'string' || inputValue.trim() === '') continue
-
-      credentialInputs.push({
-        operationIndex: opIndex,
-        blockId: op.block_id,
-        blockType: op.params.type,
-        fieldName: subBlockConfig.id,
-        value: inputValue,
-      })
-    }
-  })
-
-  if (credentialInputs.length === 0) {
-    return { filteredOperations: operations, errors }
-  }
-
-  logger.info('Pre-validating credential inputs', {
-    credentialCount: credentialInputs.length,
-    userId: context.userId,
-  })
-
-  // Validate all credential IDs at once
-  const allCredentialIds = credentialInputs.map((c) => c.value)
-  const validationResult = await validateSelectorIds('oauth-input', allCredentialIds, context)
-  const invalidSet = new Set(validationResult.invalid)
-
-  if (invalidSet.size === 0) {
-    return { filteredOperations: operations, errors }
-  }
-
-  // Deep clone operations so we can modify them
-  const filteredOperations = JSON.parse(JSON.stringify(operations)) as EditWorkflowOperation[]
-
-  // Remove invalid credential inputs from operations
-  for (const credInput of credentialInputs) {
-    if (!invalidSet.has(credInput.value)) continue
-
-    // Remove this credential input from the operation
-    const op = filteredOperations[credInput.operationIndex]
-    if (op.params?.inputs?.[credInput.fieldName]) {
-      delete op.params.inputs[credInput.fieldName]
-      logger.info('Removed invalid credential from operation', {
-        blockId: credInput.blockId,
-        field: credInput.fieldName,
-        invalidValue: credInput.value,
-      })
-    }
-
-    // Add error for LLM feedback
-    const warningInfo = validationResult.warning ? `. ${validationResult.warning}` : ''
-    errors.push({
-      blockId: credInput.blockId,
-      blockType: credInput.blockType,
-      field: credInput.fieldName,
-      value: credInput.value,
-      error: `Invalid credential ID "${credInput.value}" - credential does not exist or user doesn't have access${warningInfo}`,
-    })
-  }
-
-  logger.warn('Filtered out invalid credentials from operations', {
-    invalidCount: invalidSet.size,
-    errors: errors.map((e) => ({ blockId: e.blockId, field: e.field })),
-  })
-
-  return { filteredOperations, errors }
-}
-
 async function getCurrentWorkflowStateFromDb(
  workflowId: string
 ): Promise<{ workflowState: any; subBlockValues: Record<string, Record<string, any>> }> {
@@ -2760,28 +2657,12 @@ export const editWorkflowServerTool: BaseServerTool<EditWorkflowParams, any> = {
    // Get permission config for the user
    const permissionConfig = context?.userId ? await getUserPermissionConfig(context.userId) : null

-    // Pre-validate credential inputs before applying operations
-    // This filters out invalid credentials so they never get applied
-    let operationsToApply = operations
-    const credentialErrors: ValidationError[] = []
-    if (context?.userId) {
-      const { filteredOperations, errors: credErrors } = await preValidateCredentialInputs(
-        operations,
-        { userId: context.userId }
-      )
-      operationsToApply = filteredOperations
-      credentialErrors.push(...credErrors)
-    }
-
    // Apply operations directly to the workflow state
    const {
      state: modifiedWorkflowState,
      validationErrors,
      skippedItems,
-    } = applyOperationsToWorkflowState(workflowState, operationsToApply, permissionConfig)
-
-    // Add credential validation errors
-    validationErrors.push(...credentialErrors)
+    } = applyOperationsToWorkflowState(workflowState, operations, permissionConfig)

    // Get workspaceId for selector validation
    let workspaceId: string | undefined
--- a/apps/sim/lib/knowledge/embeddings.ts
+++ b/apps/sim/lib/knowledge/embeddings.ts
@@ -8,6 +8,17 @@ const logger = createLogger('EmbeddingUtils')

 const MAX_TOKENS_PER_REQUEST = 8000
 const MAX_CONCURRENT_BATCHES = env.KB_CONFIG_CONCURRENCY_LIMIT || 50
+const EMBEDDING_DIMENSIONS = 1536
+
+/**
+ * Check if the model supports custom dimensions.
+ * text-embedding-3-* models support the dimensions parameter.
+ * Checks for 'embedding-3' to handle Azure deployments with custom naming conventions.
+ */
+function supportsCustomDimensions(modelName: string): boolean {
+  const name = modelName.toLowerCase()
+  return name.includes('embedding-3') && !name.includes('ada')
+}

 export class EmbeddingAPIError extends Error {
  public status: number
@@ -93,15 +104,19 @@ async function getEmbeddingConfig(
 async function callEmbeddingAPI(inputs: string[], config: EmbeddingConfig): Promise<number[][]> {
  return retryWithExponentialBackoff(
    async () => {
+      const useDimensions = supportsCustomDimensions(config.modelName)
+
      const requestBody = config.useAzure
        ? {
            input: inputs,
            encoding_format: 'float',
+            ...(useDimensions && { dimensions: EMBEDDING_DIMENSIONS }),
          }
        : {
            input: inputs,
            model: config.modelName,
            encoding_format: 'float',
+            ...(useDimensions && { dimensions: EMBEDDING_DIMENSIONS }),
          }

      const response = await fetch(config.apiUrl, {
--- a/docker-compose.local.yml
+++ b/docker-compose.local.yml
@@ -52,7 +52,7 @@ services:
    deploy:
      resources:
        limits:
-          memory: 8G
+          memory: 1G
    healthcheck:
      test: ['CMD', 'wget', '--spider', '--quiet', 'http://127.0.0.1:3002/health']
      interval: 90s
--- a/docker-compose.ollama.yml
+++ b/docker-compose.ollama.yml
@@ -56,7 +56,7 @@ services:
    deploy:
      resources:
        limits:
-          memory: 8G
+          memory: 1G
    healthcheck:
      test: ['CMD', 'wget', '--spider', '--quiet', 'http://127.0.0.1:3002/health']
      interval: 90s
--- a/docker-compose.prod.yml
+++ b/docker-compose.prod.yml
@@ -42,7 +42,7 @@ services:
    deploy:
      resources:
        limits:
-          memory: 4G
+          memory: 1G
    environment:
      - DATABASE_URL=postgresql://${POSTGRES_USER:-postgres}:${POSTGRES_PASSWORD:-postgres}@db:5432/${POSTGRES_DB:-simstudio}
      - NEXT_PUBLIC_APP_URL=${NEXT_PUBLIC_APP_URL:-http://localhost:3000}
--- a/helm/sim/examples/values-production.yaml
+++ b/helm/sim/examples/values-production.yaml
@@ -10,13 +10,13 @@ global:
 app:
  enabled: true
  replicaCount: 2
-  
+
  resources:
    limits:
-      memory: "6Gi"
+      memory: "8Gi"
      cpu: "2000m"
    requests:
-      memory: "4Gi"
+      memory: "6Gi"
      cpu: "1000m"
  
  # Production URLs (REQUIRED - update with your actual domain names)
@@ -49,14 +49,14 @@ app:
 realtime:
  enabled: true
  replicaCount: 2
-  
+
  resources:
    limits:
-      memory: "4Gi"
-      cpu: "1000m"
-    requests:
-      memory: "2Gi"
+      memory: "1Gi"
      cpu: "500m"
+    requests:
+      memory: "512Mi"
+      cpu: "250m"
  
  env:
    NEXT_PUBLIC_APP_URL: "https://sim.acme.ai"
--- a/helm/sim/values.yaml
+++ b/helm/sim/values.yaml
@@ -29,10 +29,10 @@ app:
  # Resource limits and requests
  resources:
    limits:
-      memory: "4Gi"
+      memory: "8Gi"
      cpu: "2000m"
    requests:
-      memory: "2Gi"
+      memory: "4Gi"
      cpu: "1000m"

  # Node selector for pod scheduling (leave empty to allow scheduling on any node)
@@ -232,24 +232,24 @@ app:
 realtime:
  # Enable/disable the realtime service
  enabled: true
-  
+
  # Image configuration
  image:
    repository: simstudioai/realtime
    tag: latest
    pullPolicy: Always
-  
+
  # Number of replicas
  replicaCount: 1
-  
+
  # Resource limits and requests
  resources:
    limits:
-      memory: "2Gi"
-      cpu: "1000m"
-    requests:
      memory: "1Gi"
      cpu: "500m"
+    requests:
+      memory: "512Mi"
+      cpu: "250m"
  
  # Node selector for pod scheduling (leave empty to allow scheduling on any node)
  nodeSelector: {}
Author	SHA1	Message	Date
Waleed Latif	d26c8f2369	updated kb to support 1536 dimension vectors for models other than text embedding 3 small	2026-01-25 10:11:34 -08:00
Waleed Latif	399e632f8f	fix(docs): update requirements to be more accurate for deploying the app	2026-01-25 09:56:20 -08:00
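
Review note on the `apps/sim/lib/knowledge/embeddings.ts` hunk: the sketch below isolates the new request-body logic so it can be checked without calling the API. `supportsCustomDimensions` and `EMBEDDING_DIMENSIONS` are copied from the diff. `EmbeddingConfigSketch` and `buildEmbeddingRequestBody` are hypothetical names introduced here for illustration only, since the diff builds the body inline inside `callEmbeddingAPI`. The apparent intent, going by the commit message, is that models such as `text-embedding-3-large` (whose default output is 3072 dimensions) are asked for 1536-dimension vectors, while models that do not match the name check get no `dimensions` field.

```typescript
// Constant and name check copied from the diff.
const EMBEDDING_DIMENSIONS = 1536

function supportsCustomDimensions(modelName: string): boolean {
  const name = modelName.toLowerCase()
  return name.includes('embedding-3') && !name.includes('ada')
}

// Hypothetical, trimmed-down stand-in for the real EmbeddingConfig,
// which carries more fields (the diff also reads config.apiUrl).
interface EmbeddingConfigSketch {
  useAzure: boolean
  modelName: string
}

// Hypothetical helper (not in the diff): produces the same fields as the two
// inline branches. Azure requests omit `model`; OpenAI requests include it.
function buildEmbeddingRequestBody(
  inputs: string[],
  config: EmbeddingConfigSketch
): Record<string, unknown> {
  const useDimensions = supportsCustomDimensions(config.modelName)
  return {
    input: inputs,
    ...(config.useAzure ? {} : { model: config.modelName }),
    encoding_format: 'float',
    // Only sent when the model name suggests the parameter is supported.
    ...(useDimensions ? { dimensions: EMBEDDING_DIMENSIONS } : {}),
  }
}
```

One thing worth confirming in review: for Azure, `config.modelName` may be a deployment name chosen by the operator. A deployment of a `text-embedding-3-*` model whose name does not contain `embedding-3` would not get the `dimensions` field under this check.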