<aside> 📊 Standing weekly audit (Thursdays) — initiated by Aki Balogh 2026-05-14. Goal: evaluate when third-party platforms beat in-house builds on cost, quality, or efficiency. Pinecone/Notion docs already indexed — skip, call Notion AI directly.
</aside>
Scanned all in-house components in /workspace/skills/ + /workspace/global/. Web research on current best-in-class alternatives (vendor docs, pricing, benchmarks, May 2026). Moat components excluded from audit: skill system core, swarm/IPC, agent containers, egress firewall, credential proxy, scheduler engine.
Tier criteria: Red = clear win on cost+capability+low lock-in, act now. Yellow = meaningful improvement, needs trade-off analysis or Aki/Jesse input. Green = our build is better or vendor not mature.
Voice transcription (local Whisper) | Hugging Face Transformers.js on CPU | Medium maintenance (model updates, container RAM)
SQLite FTS5 caches (Slack, Notion, Meet, Fathom, etc.) | Local full-text search, <100ms | Low maintenance (sync crons)
Notion QA skill (notion-qa) | Polls Kadeem's Notion AI agent, 8-min latency | High maintenance (fragile agent dependency)
Memory system (MEMORY.md + notion-brain-append) | File-based cross-session memory + Notion Brain | Low maintenance (stable, agent-managed)
Email reader (Gmail OAuth) | Per-user Gmail read/search with injection protection | Low maintenance
Zapier integration (ninety-api) | Routes Ninety.io through Zapier MCP, per-execution pricing | Medium maintenance
Knowledge compiler | Multi-source synthesis across Slack/Notion/Meet/Fathom | Low maintenance
Salesforce / Calendar / Fathom caches | SQLite mirrors of CRM, calendar, meeting transcripts | Low maintenance
Slack reader / monitor / exporter / team-digest | Channel reads, monitoring, digests with business logic | Low maintenance
Current: Local Whisper (Hugging Face Transformers.js) on CPU in containers. ~10-30s per clip. ~300MB model load per cold start. Mediocre accuracy on accented speech.
Alternatives: (a) OpenAI Whisper API: $0.006/min — cheapest, ~200ms latency. 100 min/month = $0.60. No maintenance. (b) Deepgram Nova-3: $0.022/min — 5.26% WER (best-in-class accuracy), native speaker diarization, 450ms streaming. (c) AssemblyAI Universal-2: best for noisy real-world audio.
Recommendation: Swap to OpenAI Whisper API (cost-optimized) or Deepgram Nova-3 (if diarization needed). Change is 1 API call in use-local-whisper skill. Reversible in <1 hour. Lock-in risk: low.
Assign: Jesse to implement.