<aside> 📊 Standing weekly audit (Thursdays) — initiated by Aki Balogh 2026-05-14. Goal: evaluate when third-party platforms beat in-house builds on cost, quality, or efficiency. Pinecone/Notion docs already indexed — skip, call Notion AI directly.

</aside>

Methodology & Scope

Scanned all in-house components in /workspace/skills/ + /workspace/global/. Web research on current best-in-class alternatives (vendor docs, pricing, benchmarks, May 2026). Moat components excluded from audit: skill system core, swarm/IPC, agent containers, egress firewall, credential proxy, scheduler engine.

Tier criteria: Red = clear win on cost+capability+low lock-in, act now. Yellow = meaningful improvement, needs trade-off analysis or Aki/Jesse input. Green = our build is better or vendor not mature.

In-House Component Inventory

Voice transcription (local Whisper) | Hugging Face Transformers.js on CPU | Medium maintenance (model updates, container RAM)

SQLite FTS5 caches (Slack, Notion, Meet, Fathom, etc.) | Local full-text search, <100ms | Low maintenance (sync crons)

Notion QA skill (notion-qa) | Polls Kadeem's Notion AI agent, 8-min latency | High maintenance (fragile agent dependency)

Memory system (MEMORY.md + notion-brain-append) | File-based cross-session memory + Notion Brain | Low maintenance (stable, agent-managed)

Email reader (Gmail OAuth) | Per-user Gmail read/search with injection protection | Low maintenance

Zapier integration (ninety-api) | Routes Ninety.io through Zapier MCP, per-execution pricing | Medium maintenance

Knowledge compiler | Multi-source synthesis across Slack/Notion/Meet/Fathom | Low maintenance

Salesforce / Calendar / Fathom caches | SQLite mirrors of CRM, calendar, meeting transcripts | Low maintenance

Slack reader / monitor / exporter / team-digest | Channel reads, monitoring, digests with business logic | Low maintenance

🔴 High Impact Swap Candidates

1. Voice Transcription: Local Whisper → OpenAI Whisper API or Deepgram

Current: Local Whisper (Hugging Face Transformers.js) on CPU in containers. ~10-30s per clip. ~300MB model load per cold start. Mediocre accuracy on accented speech.

Alternatives: (a) OpenAI Whisper API: $0.006/min — cheapest, ~200ms latency. 100 min/month = $0.60. No maintenance. (b) Deepgram Nova-3: $0.022/min — 5.26% WER (best-in-class accuracy), native speaker diarization, 450ms streaming. (c) AssemblyAI Universal-2: best for noisy real-world audio.

Recommendation: Swap to OpenAI Whisper API (cost-optimized) or Deepgram Nova-3 (if diarization needed). Change is 1 API call in use-local-whisper skill. Reversible in <1 hour. Lock-in risk: low.

Assign: Jesse to implement.