github/sim

mirror of https://github.com/simstudioai/sim.git synced 2026-01-07 22:24:06 -05:00

Files

Waleed 096af4fdfa feat(imap): added support for imap trigger (#2663 )

* feat(tools): added support for imap trigger

* feat(imap): added parity, tested

* ack PR comments

* final cleanup

2026-01-02 15:28:00 -08:00

examples

feat(og): add opengraph images for templates, blogs, and updated existing opengraph image for all other pages (#2466 )

2025-12-18 19:15:06 -08:00

templates

fix(docker): resolve @sim/logger module not found in realtime container (#2637 )

2025-12-29 23:06:28 -08:00

.helmignore

feat(helm): added helm charts for self-hosting (#813 )

2025-07-28 18:03:47 -07:00

Chart.yaml

feat(domain): drop the 'studio' (#818 )

2025-07-29 12:51:43 -07:00

README.md

improvement(helm): added additional envvars to helm charts (#1695 )

2025-10-21 12:02:51 -07:00

values.schema.json

improvement(helm): added more to helm charts, remove instance selector for various cloud providers (#2412 )

2025-12-16 18:24:00 -08:00

values.yaml

feat(imap): added support for imap trigger (#2663 )

2026-01-02 15:28:00 -08:00

README.md

Sim Helm Chart

This Helm chart deploys Sim, a lightweight AI agent workflow platform, on Kubernetes.

Prerequisites

Kubernetes 1.19+
Helm 3.0+
PV provisioner support in the underlying infrastructure (for persistent storage)

Installation

Quick Start

Install the chart from this repository:

# From the repository root
helm install sim ./helm/sim

Custom Configuration

Install with custom values:

helm install sim ./helm/sim -f custom-values.yaml

Configuration Examples

The chart includes several pre-configured values files for different scenarios:

Example File	Description	Use Case
`values-development.yaml`	Minimal resources, no SSL	Local development and testing
`values-production.yaml`	High availability, security-focused	Generic production deployment
`values-external-db.yaml`	External database configuration	Production with managed database
`values-azure.yaml`	Azure AKS optimized	Azure Kubernetes Service
`values-aws.yaml`	AWS EKS optimized	Amazon Elastic Kubernetes Service
`values-gcp.yaml`	GCP GKE optimized	Google Kubernetes Engine

Development Environment

helm install sim-dev ./helm/sim \
  --values ./helm/sim/examples/values-development.yaml \
  --namespace simstudio-dev --create-namespace

Production Environment

helm install sim-prod ./helm/sim \
  --values ./helm/sim/examples/values-production.yaml \
  --namespace simstudio-prod --create-namespace

Azure Environment

helm install sim-azure ./helm/sim \
  --values ./helm/sim/examples/values-azure.yaml \
  --namespace simstudio --create-namespace

AWS Environment (EKS)

helm install sim-aws ./helm/sim \
  --values ./helm/sim/examples/values-aws.yaml \
  --namespace simstudio --create-namespace

GCP Environment (GKE)

helm install sim-gcp ./helm/sim \
  --values ./helm/sim/examples/values-gcp.yaml \
  --namespace simstudio --create-namespace

External Database (Managed Services)

helm install sim-prod ./helm/sim \
  --values ./helm/sim/examples/values-external-db.yaml \
  --set externalDatabase.host="your-rds-endpoint.com" \
  --set externalDatabase.username="simstudio_user" \
  --set externalDatabase.password="secure-password" \
  --set externalDatabase.database="simstudio_prod" \
  --namespace simstudio --create-namespace

Cloud-Specific Features

Each cloud platform example includes optimized configurations:

Azure (AKS)

Storage: Premium managed disks (managed-csi-premium)
Node Selectors: Role-based node targeting (node-role: application, node-role: datalake)
GPU Support: NVIDIA GPU nodes with tolerations
Ingress: NGINX ingress controller with SSL redirect

AWS (EKS)

Storage: EBS GP3 volumes for optimal performance
EBS CSI Driver: Required for persistent storage (install as EKS add-on)
Node Selectors: Instance type targeting (t3.large, r5.large, g4dn.xlarge)
GPU Support: GPU-optimized instances (G4, P3 families)
Ingress: Application Load Balancer (ALB) with AWS Certificate Manager
IAM: Service Account annotations for IAM roles

Prerequisites for AWS:

# Install EBS CSI driver add-on
aws eks create-addon --cluster-name your-cluster --addon-name aws-ebs-csi-driver

# Create IAM role for EBS CSI driver (if using IRSA)
aws iam create-role --role-name AmazonEKS_EBS_CSI_DriverRole \
  --assume-role-policy-document file://ebs-csi-trust-policy.json
aws iam attach-role-policy --role-name AmazonEKS_EBS_CSI_DriverRole \
  --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy

GCP (GKE)

Storage: Persistent Disk with standard and premium options
Node Selectors: Node pool and machine family targeting
GPU Support: Tesla T4/V100 GPUs with GKE accelerator labels
Ingress: Google Cloud Load Balancer with managed certificates
Workload Identity: Service Account annotations for GCP IAM

Configuration

The following table lists the configurable parameters and their default values.

Global Parameters

Parameter	Description	Default
`global.imageRegistry`	Global Docker image registry	`"ghcr.io"`
`global.useRegistryForAllImages`	Use custom registry for all images (not just simstudioai/*)	`false`
`global.imagePullSecrets`	Global Docker registry secret names	`[]`
`global.storageClass`	Global storage class for PVCs	`""`
`global.commonLabels`	Common labels to add to all resources	`{}`

Application Parameters

Parameter	Description	Default
`app.enabled`	Enable the main application	`true`
`app.replicaCount`	Number of app replicas	`1`
`app.image.repository`	App image repository	`simstudioai/sim`
`app.image.tag`	App image tag	`latest`
`app.image.pullPolicy`	App image pull policy	`Always`
`app.resources`	App resource limits and requests	See values.yaml
`app.nodeSelector`	App node selector	`{}`
`app.podSecurityContext`	App pod security context	`fsGroup: 1001`
`app.securityContext`	App container security context	`runAsNonRoot: true, runAsUser: 1001`
`app.service.type`	App service type	`ClusterIP`
`app.service.port`	App service port	`3000`
`app.service.targetPort`	App service target port	`3000`
`app.livenessProbe`	App liveness probe configuration	See values.yaml
`app.readinessProbe`	App readiness probe configuration	See values.yaml
`app.env`	App environment variables	See values.yaml

Realtime Service Parameters

Parameter	Description	Default
`realtime.enabled`	Enable the realtime service	`true`
`realtime.replicaCount`	Number of realtime replicas	`1`
`realtime.image.repository`	Realtime image repository	`simstudioai/realtime`
`realtime.image.tag`	Realtime image tag	`latest`
`realtime.image.pullPolicy`	Realtime image pull policy	`Always`
`realtime.resources`	Realtime resource limits and requests	See values.yaml
`realtime.nodeSelector`	Realtime node selector	`{}`
`realtime.podSecurityContext`	Realtime pod security context	`fsGroup: 1001`
`realtime.securityContext`	Realtime container security context	`runAsNonRoot: true, runAsUser: 1001`
`realtime.service.type`	Realtime service type	`ClusterIP`
`realtime.service.port`	Realtime service port	`3002`
`realtime.service.targetPort`	Realtime service target port	`3002`
`realtime.livenessProbe`	Realtime liveness probe configuration	See values.yaml
`realtime.readinessProbe`	Realtime readiness probe configuration	See values.yaml
`realtime.env`	Realtime environment variables	See values.yaml

PostgreSQL Parameters

Parameter	Description	Default
`postgresql.enabled`	Enable internal PostgreSQL	`true`
`postgresql.image.repository`	PostgreSQL image repository	`pgvector/pgvector`
`postgresql.image.tag`	PostgreSQL image tag	`pg17`
`postgresql.image.pullPolicy`	PostgreSQL image pull policy	`IfNotPresent`
`postgresql.auth.username`	PostgreSQL username	`postgres`
`postgresql.auth.password`	PostgreSQL password	`""` (REQUIRED)
`postgresql.auth.database`	PostgreSQL database name	`sim`
`postgresql.nodeSelector`	PostgreSQL node selector	`{}`
`postgresql.resources`	PostgreSQL resource limits and requests	See values.yaml
`postgresql.podSecurityContext`	PostgreSQL pod security context	`fsGroup: 999`
`postgresql.securityContext`	PostgreSQL container security context	`runAsUser: 999`
`postgresql.persistence.enabled`	Enable PostgreSQL persistence	`true`
`postgresql.persistence.storageClass`	PostgreSQL storage class	`""`
`postgresql.persistence.size`	PostgreSQL PVC size	`10Gi`
`postgresql.persistence.accessModes`	PostgreSQL PVC access modes	`["ReadWriteOnce"]`
`postgresql.tls.enabled`	Enable PostgreSQL SSL/TLS	`false`
`postgresql.tls.certificatesSecret`	PostgreSQL TLS certificates secret	`postgres-tls-secret`
`postgresql.config.maxConnections`	PostgreSQL max connections	`1000`
`postgresql.config.sharedBuffers`	PostgreSQL shared buffers	`"1280MB"`
`postgresql.config.maxWalSize`	PostgreSQL max WAL size	`"4GB"`
`postgresql.config.minWalSize`	PostgreSQL min WAL size	`"80MB"`
`postgresql.service.type`	PostgreSQL service type	`ClusterIP`
`postgresql.service.port`	PostgreSQL service port	`5432`
`postgresql.service.targetPort`	PostgreSQL service target port	`5432`
`postgresql.livenessProbe`	PostgreSQL liveness probe configuration	See values.yaml
`postgresql.readinessProbe`	PostgreSQL readiness probe configuration	See values.yaml

External Database Parameters

Parameter	Description	Default
`externalDatabase.enabled`	Use external database instead of internal PostgreSQL	`false`
`externalDatabase.host`	External database host	`"external-db.example.com"`
`externalDatabase.port`	External database port	`5432`
`externalDatabase.username`	External database username	`postgres`
`externalDatabase.password`	External database password	`""`
`externalDatabase.database`	External database name	`sim`
`externalDatabase.sslMode`	External database SSL mode	`require`

Ollama Parameters

Parameter	Description	Default
`ollama.enabled`	Enable Ollama for local AI models	`false`
`ollama.image.repository`	Ollama image repository	`ollama/ollama`
`ollama.image.tag`	Ollama image tag	`latest`
`ollama.image.pullPolicy`	Ollama image pull policy	`Always`
`ollama.replicaCount`	Number of Ollama replicas	`1`
`ollama.gpu.enabled`	Enable GPU support for Ollama	`false`
`ollama.gpu.count`	Number of GPUs to allocate	`1`
`ollama.nodeSelector`	Ollama node selector	`accelerator: nvidia`
`ollama.tolerations`	Ollama tolerations for GPU nodes	See values.yaml
`ollama.resources`	Ollama resource limits and requests	See values.yaml
`ollama.env`	Ollama environment variables	See values.yaml
`ollama.persistence.enabled`	Enable Ollama persistence	`true`
`ollama.persistence.storageClass`	Ollama storage class	`""`
`ollama.persistence.size`	Ollama PVC size	`100Gi`
`ollama.persistence.accessModes`	Ollama PVC access modes	`["ReadWriteOnce"]`
`ollama.service.type`	Ollama service type	`ClusterIP`
`ollama.service.port`	Ollama service port	`11434`
`ollama.service.targetPort`	Ollama service target port	`11434`
`ollama.startupProbe`	Ollama startup probe configuration	See values.yaml
`ollama.livenessProbe`	Ollama liveness probe configuration	See values.yaml
`ollama.readinessProbe`	Ollama readiness probe configuration	See values.yaml

Ingress Parameters

Parameter	Description	Default
`ingress.enabled`	Enable ingress	`false`
`ingress.className`	Ingress class name	`nginx`
`ingress.annotations`	Ingress annotations	See values.yaml
`ingress.app.host`	App ingress hostname	`sim.local`
`ingress.app.paths`	App ingress paths	`[{path: "/", pathType: "Prefix"}]`
`ingress.realtime.host`	Realtime ingress hostname	`sim-ws.local`
`ingress.realtime.paths`	Realtime ingress paths	`[{path: "/", pathType: "Prefix"}]`
`ingress.tls.enabled`	Enable TLS for ingress	`false`
`ingress.tls.secretName`	TLS secret name	`sim-tls-secret`

Autoscaling Parameters

Parameter	Description	Default
`autoscaling.enabled`	Enable Horizontal Pod Autoscaler	`false`
`autoscaling.minReplicas`	Minimum number of replicas	`1`
`autoscaling.maxReplicas`	Maximum number of replicas	`10`
`autoscaling.targetCPUUtilizationPercentage`	Target CPU utilization	`80`
`autoscaling.targetMemoryUtilizationPercentage`	Target memory utilization	`80`
`autoscaling.customMetrics`	Custom metrics for scaling	`[]`
`autoscaling.behavior`	Scaling behavior configuration	`{}`

Monitoring Parameters

Parameter	Description	Default
`monitoring.serviceMonitor.enabled`	Enable ServiceMonitor for Prometheus	`false`
`monitoring.serviceMonitor.labels`	Additional labels for ServiceMonitor	`{}`
`monitoring.serviceMonitor.annotations`	Additional annotations for ServiceMonitor	`{}`
`monitoring.serviceMonitor.path`	Metrics endpoint path	`/metrics`
`monitoring.serviceMonitor.interval`	Scrape interval	`30s`
`monitoring.serviceMonitor.scrapeTimeout`	Scrape timeout	`10s`
`monitoring.serviceMonitor.targetLabels`	Target labels to add to scraped metrics	`[]`
`monitoring.serviceMonitor.metricRelabelings`	Metric relabeling configurations	`[]`
`monitoring.serviceMonitor.relabelings`	Relabeling configurations	`[]`

Security Parameters

Parameter	Description	Default
`networkPolicy.enabled`	Enable network policies	`false`
`networkPolicy.ingress`	Custom ingress rules	`[]`
`networkPolicy.egress`	Custom egress rules	`[]`
`podDisruptionBudget.enabled`	Enable pod disruption budget	`false`
`podDisruptionBudget.minAvailable`	Minimum available pods	`1`

Migration Parameters

Parameter	Description	Default
`migrations.enabled`	Enable database migrations job	`true`
`migrations.image.repository`	Migrations image repository	`simstudioai/migrations`
`migrations.image.tag`	Migrations image tag	`latest`
`migrations.image.pullPolicy`	Migrations image pull policy	`Always`
`migrations.resources`	Migrations resource limits and requests	See values.yaml
`migrations.podSecurityContext`	Migrations pod security context	`fsGroup: 1001`
`migrations.securityContext`	Migrations container security context	`runAsNonRoot: true, runAsUser: 1001`

CronJob Parameters

Parameter	Description	Default
`cronjobs.enabled`	Enable all scheduled cron jobs	`true`
`cronjobs.image.repository`	CronJob image repository for HTTP requests	`curlimages/curl`
`cronjobs.image.tag`	CronJob image tag	`8.5.0`
`cronjobs.image.pullPolicy`	CronJob image pull policy	`IfNotPresent`
`cronjobs.resources`	CronJob resource limits and requests	See values.yaml
`cronjobs.restartPolicy`	CronJob pod restart policy	`OnFailure`
`cronjobs.activeDeadlineSeconds`	CronJob active deadline in seconds	`300`
`cronjobs.startingDeadlineSeconds`	CronJob starting deadline in seconds	`60`
`cronjobs.podSecurityContext`	CronJob pod security context	`fsGroup: 1001`
`cronjobs.securityContext`	CronJob container security context	`runAsNonRoot: true, runAsUser: 1001`
`cronjobs.jobs.scheduleExecution.enabled`	Enable schedule execution cron job	`true`
`cronjobs.jobs.scheduleExecution.name`	Schedule execution job name	`schedule-execution`
`cronjobs.jobs.scheduleExecution.schedule`	Schedule execution cron schedule	`"/1 * * *"`
`cronjobs.jobs.scheduleExecution.path`	Schedule execution API path	`"/api/schedules/execute"`
`cronjobs.jobs.scheduleExecution.concurrencyPolicy`	Schedule execution concurrency policy	`Forbid`
`cronjobs.jobs.scheduleExecution.successfulJobsHistoryLimit`	Schedule execution successful jobs history	`3`
`cronjobs.jobs.scheduleExecution.failedJobsHistoryLimit`	Schedule execution failed jobs history	`1`
`cronjobs.jobs.gmailWebhookPoll.enabled`	Enable Gmail webhook polling cron job	`true`
`cronjobs.jobs.gmailWebhookPoll.name`	Gmail webhook polling job name	`gmail-webhook-poll`
`cronjobs.jobs.gmailWebhookPoll.schedule`	Gmail webhook polling cron schedule	`"/1 * * *"`
`cronjobs.jobs.gmailWebhookPoll.path`	Gmail webhook polling API path	`"/api/webhooks/poll/gmail"`
`cronjobs.jobs.gmailWebhookPoll.concurrencyPolicy`	Gmail webhook polling concurrency policy	`Forbid`
`cronjobs.jobs.gmailWebhookPoll.successfulJobsHistoryLimit`	Gmail webhook polling successful jobs history	`3`
`cronjobs.jobs.gmailWebhookPoll.failedJobsHistoryLimit`	Gmail webhook polling failed jobs history	`1`
`cronjobs.jobs.outlookWebhookPoll.enabled`	Enable Outlook webhook polling cron job	`true`
`cronjobs.jobs.outlookWebhookPoll.name`	Outlook webhook polling job name	`outlook-webhook-poll`
`cronjobs.jobs.outlookWebhookPoll.schedule`	Outlook webhook polling cron schedule	`"/1 * * *"`
`cronjobs.jobs.outlookWebhookPoll.path`	Outlook webhook polling API path	`"/api/webhooks/poll/outlook"`
`cronjobs.jobs.outlookWebhookPoll.concurrencyPolicy`	Outlook webhook polling concurrency policy	`Forbid`
`cronjobs.jobs.outlookWebhookPoll.successfulJobsHistoryLimit`	Outlook webhook polling successful jobs history	`3`
`cronjobs.jobs.outlookWebhookPoll.failedJobsHistoryLimit`	Outlook webhook polling failed jobs history	`1`

Shared Storage Parameters

Parameter	Description	Default
`sharedStorage.enabled`	Enable shared storage for multi-pod data sharing	`false`
`sharedStorage.storageClass`	Storage class for shared volumes (must support ReadWriteMany)	`""`
`sharedStorage.defaultAccessModes`	Default access modes for shared volumes	`["ReadWriteMany"]`
`sharedStorage.volumes`	Array of shared volume definitions	`[]`
`sharedStorage.volumes[].name`	Shared volume name	Required
`sharedStorage.volumes[].size`	Shared volume size	Required
`sharedStorage.volumes[].accessModes`	Shared volume access modes	Uses default
`sharedStorage.volumes[].storageClass`	Shared volume storage class	Uses global
`sharedStorage.volumes[].annotations`	Shared volume annotations	`{}`
`sharedStorage.volumes[].selector`	Shared volume selector	`{}`

Telemetry Parameters

Parameter	Description	Default
`telemetry.enabled`	Enable telemetry and observability collection	`false`
`telemetry.replicaCount`	Number of telemetry collector replicas	`1`
`telemetry.image.repository`	Telemetry collector image repository	`otel/opentelemetry-collector-contrib`
`telemetry.image.tag`	Telemetry collector image tag	`0.91.0`
`telemetry.image.pullPolicy`	Telemetry collector image pull policy	`IfNotPresent`
`telemetry.resources`	Telemetry collector resource limits and requests	See values.yaml
`telemetry.nodeSelector`	Telemetry collector node selector	`{}`
`telemetry.tolerations`	Telemetry collector tolerations	`[]`
`telemetry.affinity`	Telemetry collector affinity	`{}`
`telemetry.service.type`	Telemetry collector service type	`ClusterIP`
`telemetry.jaeger.enabled`	Enable Jaeger tracing backend	`false`
`telemetry.jaeger.endpoint`	Jaeger collector endpoint	`"http://jaeger-collector:14250"`
`telemetry.jaeger.tls.enabled`	Enable TLS for Jaeger connection	`false`
`telemetry.prometheus.enabled`	Enable Prometheus metrics backend	`false`
`telemetry.prometheus.endpoint`	Prometheus remote write endpoint	`"http://prometheus-server/api/v1/write"`
`telemetry.prometheus.auth`	Prometheus authentication header	`""`
`telemetry.otlp.enabled`	Enable generic OTLP backend	`false`
`telemetry.otlp.endpoint`	OTLP collector endpoint	`"http://otlp-collector:4317"`
`telemetry.otlp.tls.enabled`	Enable TLS for OTLP connection	`false`

Service Account Parameters

Parameter	Description	Default
`serviceAccount.create`	Create a service account	`true`
`serviceAccount.annotations`	Service account annotations	`{}`
`serviceAccount.name`	Service account name (auto-generated if empty)	`""`

Common Parameters

Parameter	Description	Default
`nameOverride`	Override the name of the chart	`""`
`fullnameOverride`	Override the fullname of the chart	`""`
`extraVolumes`	Additional volumes for all pods	`[]`
`extraVolumeMounts`	Additional volume mounts for all containers	`[]`
`extraEnvVars`	Additional environment variables for all containers	`[]`
`podAnnotations`	Additional annotations for all pods	`{}`
`podLabels`	Additional labels for all pods	`{}`
`affinity`	Affinity settings for all pods	`{}`
`tolerations`	Tolerations for all pods	`[]`

Enterprise Features

Autoscaling

Enable automatic horizontal scaling based on CPU and memory usage:

autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 20
  targetCPUUtilizationPercentage: 70
  targetMemoryUtilizationPercentage: 80

Shared Storage

Enable shared storage for multi-pod data sharing and enterprise workflows:

sharedStorage:
  enabled: true
  storageClass: "managed-csi-premium"
  volumes:
    - name: output-share
      size: 100Gi
      accessModes:
        - ReadWriteMany
    - name: model-share
      size: 200Gi
      accessModes:
        - ReadWriteMany
    - name: logs-share
      size: 50Gi
      accessModes:
        - ReadWriteMany

This creates persistent volume claims that can be shared across multiple pods for:

Output data sharing between workflow steps
Model storage and caching
Centralized logging and audit trails
Temporary data exchange

Telemetry and Observability

Enable comprehensive telemetry collection with OpenTelemetry:

telemetry:
  enabled: true
  resources:
    limits:
      memory: "1Gi"
      cpu: "500m"
    requests:
      memory: "512Mi"
      cpu: "200m"
  
  # Enable Jaeger for distributed tracing
  jaeger:
    enabled: true
    endpoint: "http://jaeger-collector:14250"
  
  # Enable Prometheus for metrics
  prometheus:
    enabled: true
    endpoint: "http://prometheus-server/api/v1/write"
    auth: "Bearer your-prometheus-token"
  
  # Enable generic OTLP for flexibility
  otlp:
    enabled: true
    endpoint: "http://otlp-collector:4317"

This automatically configures:

OpenTelemetry Collector for metrics, traces, and logs
Automatic service discovery for Sim components
Environment variable injection for applications
Support for multiple observability backends

GPU Support

Enable GPU device plugin support for AI workloads:

ollama:
  enabled: true
  gpu:
    enabled: true
    count: 1
  nodeSelector:
    accelerator: nvidia
  tolerations:
    - key: "sku"
      operator: "Equal"
      value: "gpu"
      effect: "NoSchedule"

This deploys:

NVIDIA Device Plugin DaemonSet
RuntimeClass for NVIDIA container runtime
Proper node scheduling and resource allocation

Monitoring Integration

Enable Prometheus monitoring with ServiceMonitor:

monitoring:
  serviceMonitor:
    enabled: true
    labels:
      monitoring: "prometheus"
    interval: 15s

Network Security

Enable network policies for micro-segmentation:

networkPolicy:
  enabled: true

This creates network policies that:

Allow communication between Sim components
Restrict unnecessary network access
Permit DNS resolution and HTTPS egress
Support custom ingress/egress rules

CronJobs for Scheduled Tasks

Enable automated scheduled tasks functionality:

cronjobs:
  enabled: true
  
  # Customize individual jobs
  jobs:
    scheduleExecution:
      enabled: true
      schedule: "*/1 * * * *"  # Every minute
    
    gmailWebhookPoll:
      enabled: true
      schedule: "*/1 * * * *"  # Every minute
    
    outlookWebhookPoll:
      enabled: true
      schedule: "*/1 * * * *"  # Every minute
    
      
  # Global job configuration
  resources:
    limits:
      memory: "256Mi"
      cpu: "200m"
    requests:
      memory: "128Mi"
      cpu: "100m"

This creates Kubernetes CronJob resources that:

Execute HTTP requests to your application's API endpoints
Handle retries and error logging automatically
Use minimal resources with curl-based containers
Support individual enable/disable per job
Follow Kubernetes security best practices

High Availability

Configure pod disruption budgets and anti-affinity:

podDisruptionBudget:
  enabled: true
  minAvailable: 1

affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
              - key: app.kubernetes.io/name
                operator: In
                values: ["simstudio"]
          topologyKey: kubernetes.io/hostname

Upgrading

To upgrade your release:

helm upgrade sim ./helm/sim

Uninstalling

To uninstall/delete the release:

helm uninstall sim

Security Considerations

Production Secrets

For production deployments, make sure to:

Change default secrets: Update BETTER_AUTH_SECRET, ENCRYPTION_KEY, and INTERNAL_API_SECRET with secure, randomly generated values using openssl rand -hex 32
Use strong database passwords: Set postgresql.auth.password to a strong password
Enable TLS: Configure postgresql.tls.enabled=true and provide proper certificates
Configure ingress TLS: Enable HTTPS with proper SSL certificates

Required Secrets:

BETTER_AUTH_SECRET: Authentication JWT signing (minimum 32 characters)
ENCRYPTION_KEY: Encrypts sensitive data like environment variables (minimum 32 characters)
INTERNAL_API_SECRET: Internal service-to-service authentication (minimum 32 characters)

Optional Security (Recommended for Production):

CRON_SECRET: Authenticates scheduled job requests to API endpoints (required only if cronjobs.enabled=true)
API_ENCRYPTION_KEY: Encrypts API keys at rest in database (must be exactly 64 hex characters). If not set, API keys are stored in plain text. Generate using: openssl rand -hex 32 (outputs 64 hex chars representing 32 bytes)

Example secure values:

app:
  env:
    BETTER_AUTH_SECRET: "your-secure-random-string-here"
    ENCRYPTION_KEY: "your-secure-encryption-key-here"
    INTERNAL_API_SECRET: "your-secure-internal-api-secret-here"
    CRON_SECRET: "your-secure-cron-secret-here"
    API_ENCRYPTION_KEY: "your-64-char-hex-string-for-api-key-encryption"  # Optional but recommended

postgresql:
  auth:
    password: "your-secure-database-password"
  tls:
    enabled: true
    certificatesSecret: "postgres-tls-secret"

ingress:
  enabled: true
  tls:
    enabled: true
    secretName: "simstudio-tls-secret"

Troubleshooting

Common Issues

Database Connection Issues
- Check if PostgreSQL pod is running: kubectl get pods -l app.kubernetes.io/component=postgresql
- Verify database credentials in the secret: kubectl get secret <release>-postgresql-secret -o yaml
Migration Issues
- Check migration job logs: kubectl logs job/<release>-migrations
- Ensure database is accessible from the migration job
Image Pull Issues
- Verify image names and tags in values.yaml
- Check if image pull secrets are configured correctly

Getting Logs

# App logs
kubectl logs deployment/<release>-app

# Realtime logs
kubectl logs deployment/<release>-realtime

# PostgreSQL logs
kubectl logs statefulset/<release>-postgresql

# Migration logs
kubectl logs job/<release>-migrations

Support

Documentation: https://docs.sim.ai
GitHub Issues: https://github.com/simstudioai/sim/issues
Discord: https://discord.gg/Hr4UWYEcTT