Waleed dbef14ba26 feat(knowledge): connectors, user exclusions, expanded tools & airtable integration (#3230)
* feat(knowledge): connectors, user exclusions, expanded tools & airtable integration

* improvements

* removed redundant util

* ack PR comments

* remove module level cache, use syncContext between paginated calls to avoid redundant schema fetches

* regen migrations, ack PR comments

* ack PR comment

* added tests

* ack comments

* ack comments

* feat(db): add knowledge connector migration after merge

Generated migration 0162 for knowledge_connector and
knowledge_connector_sync_log tables after resolving merge
conflicts with feat/mothership-copilot.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(connectors): audit fixes for sync engine, connectors, and knowledge tools

- Extract shared computeContentHash to connectors/utils.ts (dedup across 7 connectors)
- Include error'd connectors in cron auto-retry query
- Add syncContext caching for Confluence (cloudId, spaceId)
- Batch Confluence label fetches with concurrency limit of 10
- Enforce maxPages in Confluence v2 path
- Clean up stale storage files on document update
- Retry stuck documents (pending/failed) after sync completes
- Soft-delete documents and reclaim tag slots on connector deletion
- Add incremental sync support to ConnectorConfig interface
- Fix offset:0 falsy check in list_documents tool
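
The shared content hash mentioned in the first bullet might look roughly like the sketch below. This is an illustration only: the hash algorithm (SHA-256) and the `hasContentChanged` helper are assumptions, and the actual `computeContentHash` in connectors/utils.ts may differ.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape of the shared helper: a hex-encoded SHA-256 digest
// of the document content. The real implementation may use another algorithm.
function computeContentHash(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex");
}

// Illustrative helper (not from the source): a sync run can skip
// re-processing a document when the stored hash matches the fetched content.
function hasContentChanged(previousHash: string | undefined, content: string): boolean {
  return previousHash !== computeContentHash(content);
}
```

Keeping one implementation means every connector agrees on what counts as "unchanged", which matters once incremental sync compares hashes across runs.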

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* perf(connectors): deep audit — extract shared utils, fix pagination, optimize API calls

- Extract shared htmlToPlainText to connectors/utils.ts (dedup Confluence + Google Drive)
- Add syncContext caching for Jira cloudId, Notion/Linear/Google Drive cumulative limits
- Fix cumulative maxPages/maxIssues/maxFiles enforcement across pagination pages
- Bump Notion page_size from 20 to 100 (5x fewer API round-trips)
- Batch Notion child page fetching with concurrency=5 (was serial N+1)
- Bump Confluence v2 limit from 50 to 250 (v2 API supports it)
- Pass syncContext through Confluence CQL path for cumulative tracking
- Upgrade GitHub tree truncation warning to error level
- Fix sync-engine test mock to include inArray export
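
The cumulative limit enforcement described above could be sketched as follows. `SyncContext`, `totalFetched`, and `takeWithinLimit` are hypothetical names rather than the connector code from this commit, and the sketch assumes the same context object is passed to every paginated call within one sync run.

```typescript
// Hypothetical per-run state shared between paginated calls.
interface SyncContext {
  totalFetched?: number;
}

// Trim a page so the running total never exceeds maxItems, and record
// the new total on the shared context for the next page.
function takeWithinLimit<T>(
  pageItems: T[],
  maxItems: number,
  ctx: SyncContext
): { items: T[]; limitReached: boolean } {
  const already = ctx.totalFetched ?? 0;
  const remaining = Math.max(0, maxItems - already);
  const items = pageItems.slice(0, remaining);
  ctx.totalFetched = already + items.length;
  return { items, limitReached: ctx.totalFetched >= maxItems };
}
```

Counting against the shared context instead of a per-page local counter is what keeps a limit such as maxPages from being applied to each page separately.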

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(connectors): extract tag helpers, fix Notion maxPages, rewrite broken tests

- Add parseTagDate and joinTagArray helpers to connectors/utils.ts
- Update all 7 connectors to use shared tag mapping helpers (removes 12+ duplication instances)
- Fix Notion listFromParentPage cumulative maxPages check (was using local count)
- Rewrite 3 broken connector route test files to use vi.hoisted() + static vi.mock()
  pattern instead of deprecated vi.doMock/vi.resetModules (all 86 tests now pass)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(connectors): add loading skeletons, delete pending state, and pause feedback

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(knowledge): escape LIKE wildcards, guard restore from un-deleting, fix offset=0

- Escape %, _, \ in tag filter LIKE patterns to prevent incorrect matches
- Add isNull(deletedAt) guard to restore operation to prevent un-deleting soft-deleted docs
- Change offset check from falsy to != null so offset=0 is not dropped
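
A minimal sketch of the two input-handling fixes above, assuming backslash is the LIKE escape character (the PostgreSQL default). The function names are illustrative and are not taken from the repository.

```typescript
// Escape the characters that are special inside a LIKE pattern (\, %, _)
// so that user-supplied tag values only match literally.
function escapeLikePattern(value: string): string {
  return value.replace(/[\\%_]/g, "\\$&");
}

// Build a "contains" pattern from untrusted input.
function buildContainsPattern(value: string): string {
  return `%${escapeLikePattern(value)}%`;
}

// Use `!= null` rather than a truthiness check so that offset = 0 is kept.
function buildListParams(args: { limit?: number | null; offset?: number | null }): Record<string, string> {
  const params: Record<string, string> = {};
  if (args.limit != null) params.limit = String(args.limit);
  if (args.offset != null) params.offset = String(args.offset);
  return params;
}
```

With a plain `if (args.offset)` check, a request for the first page with an explicit offset of 0 would silently lose the parameter; `!= null` only drops `null` and `undefined`.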

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 15:40:00 -08:00
2026-03-04 11:12:51 -08:00
2026-03-03 14:46:09 -08:00

Sim Logo

The open-source platform to build AI agents and run your agentic workforce. Connect 1,000+ integrations and LLMs to orchestrate agentic workflows.


Build Workflows with Ease

Design agent workflows visually on a canvas—connect agents, tools, and blocks, then run them instantly.

Workflow Builder Demo

Supercharge with Copilot

Leverage Copilot to generate nodes, fix errors, and iterate on flows directly from natural language.

Copilot Demo

Integrate Vector Databases

Upload documents to a vector store and let agents answer questions grounded in your specific content.

Knowledge Uploads and Retrieval Demo

Quickstart

Cloud-hosted: sim.ai

Self-hosted: NPM Package

npx simstudio

Then open http://localhost:3000

Note

Docker must be installed and running on your machine.

Options

| Flag | Description |
| --- | --- |
| -p, --port <port> | Port to run Sim on (default 3000) |
| --no-pull | Skip pulling latest Docker images |

Self-hosted: Docker Compose

git clone https://github.com/simstudioai/sim.git && cd sim
docker compose -f docker-compose.prod.yml up -d

Open http://localhost:3000

Using Local Models with Ollama

Run Sim with local AI models using Ollama, with no external APIs required:

# Start with GPU support (automatically downloads gemma3:4b model)
docker compose -f docker-compose.ollama.yml --profile setup up -d

# For CPU-only systems:
docker compose -f docker-compose.ollama.yml --profile cpu --profile setup up -d

Wait for the model to download, then visit http://localhost:3000. Add more models with:

docker compose -f docker-compose.ollama.yml exec ollama ollama pull llama3.1:8b

Using an External Ollama Instance

If Ollama is running on your host machine, use host.docker.internal instead of localhost:

OLLAMA_URL=http://host.docker.internal:11434 docker compose -f docker-compose.prod.yml up -d

On Linux, use your host's IP address or add extra_hosts: ["host.docker.internal:host-gateway"] to the compose file.

Using vLLM

Sim supports vLLM for self-hosted models. Set VLLM_BASE_URL and optionally VLLM_API_KEY in your environment.

Self-hosted: Dev Containers

  1. Open VS Code with the Dev Containers extension (formerly Remote - Containers)
  2. Open the project and click "Reopen in Container" when prompted
  3. Run bun run dev:full in the terminal or use the sim-start alias
    • This starts both the main application and the realtime socket server

Self-hosted: Manual Setup

Requirements: Bun, Node.js v20+, PostgreSQL 12+ with pgvector

  1. Clone and install:
git clone https://github.com/simstudioai/sim.git
cd sim
bun install
  2. Set up PostgreSQL with pgvector:
docker run --name simstudio-db -e POSTGRES_PASSWORD=your_password -e POSTGRES_DB=simstudio -p 5432:5432 -d pgvector/pgvector:pg17

Or install manually via the pgvector guide.

  3. Configure environment:
cp apps/sim/.env.example apps/sim/.env
cp packages/db/.env.example packages/db/.env
# Edit both .env files to set DATABASE_URL="postgresql://postgres:your_password@localhost:5432/simstudio"
  4. Run migrations:
cd packages/db && bunx drizzle-kit migrate --config=./drizzle.config.ts
  5. Start development servers:
bun run dev:full  # Starts both Next.js app and realtime socket server

Or run separately: bun run dev (Next.js) and cd apps/sim && bun run dev:sockets (realtime).
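
If migrations fail to connect, a quick sanity check of the DATABASE_URL value from the environment step can help. The sketch below is illustrative and not part of Sim: `parseDatabaseUrl` and `DatabaseTarget` are hypothetical names, and it relies only on the standard `URL` class.

```typescript
interface DatabaseTarget {
  host: string;
  port: number;
  database: string;
  user: string;
}

// Parse a PostgreSQL connection string and fail early on obvious mistakes.
function parseDatabaseUrl(raw: string): DatabaseTarget {
  const url = new URL(raw);
  if (url.protocol !== "postgresql:" && url.protocol !== "postgres:") {
    throw new Error(`Unexpected protocol: ${url.protocol}`);
  }
  const database = url.pathname.replace(/^\//, "");
  if (!database) {
    throw new Error("DATABASE_URL is missing a database name");
  }
  return {
    host: url.hostname,
    // Fall back to the PostgreSQL default port when none is given.
    port: url.port ? Number(url.port) : 5432,
    database,
    user: decodeURIComponent(url.username),
  };
}
```

For the example value above, this yields host `localhost`, port `5432`, database `simstudio`, and user `postgres`, which should match the `docker run` command used to start the database.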

Copilot API Keys

Copilot is a Sim-managed service. To use Copilot on a self-hosted instance:

  • Go to https://sim.ai → Settings → Copilot and generate a Copilot API key
  • Set the COPILOT_API_KEY environment variable in your self-hosted apps/sim/.env file to that value

Environment Variables

Key environment variables for self-hosted deployments. See .env.example for defaults or env.ts for the full list.

| Variable | Required | Description |
| --- | --- | --- |
| DATABASE_URL | Yes | PostgreSQL connection string with pgvector |
| BETTER_AUTH_SECRET | Yes | Auth secret (openssl rand -hex 32) |
| BETTER_AUTH_URL | Yes | Your app URL (e.g., http://localhost:3000) |
| NEXT_PUBLIC_APP_URL | Yes | Public app URL (same as above) |
| ENCRYPTION_KEY | Yes | Encrypts environment variables (openssl rand -hex 32) |
| INTERNAL_API_SECRET | Yes | Encrypts internal API routes (openssl rand -hex 32) |
| API_ENCRYPTION_KEY | Yes | Encrypts API keys (openssl rand -hex 32) |
| COPILOT_API_KEY | No | API key from sim.ai for Copilot features |
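
As an alternative to `openssl rand -hex 32`, the secrets can be generated from a script, and the required variables checked before startup. This is a rough sketch, not code from the Sim repository: `generateSecret` and `findMissingVars` are hypothetical helpers, and the list of names simply mirrors the "Required: Yes" rows above.

```typescript
import { randomBytes } from "node:crypto";

// 32 random bytes, hex-encoded (64 characters), the same shape of value
// that `openssl rand -hex 32` produces.
function generateSecret(bytes = 32): string {
  return randomBytes(bytes).toString("hex");
}

// Variables marked as required in the table above.
const REQUIRED_VARS = [
  "DATABASE_URL",
  "BETTER_AUTH_SECRET",
  "BETTER_AUTH_URL",
  "NEXT_PUBLIC_APP_URL",
  "ENCRYPTION_KEY",
  "INTERNAL_API_SECRET",
  "API_ENCRYPTION_KEY",
] as const;

// Return the names of required variables that are unset or empty.
function findMissingVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_VARS.filter((name) => !env[name]);
}
```

Calling `findMissingVars(process.env)` at startup and logging the result is one way to catch an incomplete .env file before the app fails later with a less obvious error.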

Tech Stack

Contributing

We welcome contributions! Please see our Contributing Guide for details.

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Made with ❤️ by the Sim Team
