NanoClaw isn't one agent — it's a fleet of short-lived Claude Code containers driven by a host-side scheduler, with three persistence layers stitched together: a task queue (what to run), a memory store (what was learned), and a context layer (what the agent sees at boot). Each container is ephemeral. The persistence is not.
NanoClaw is BitSafe's attempt to answer a specific question: what does a company look like when an AI agent has continuous organizational memory and the authority to act on it? Most AI tools today fall into one of two categories — Q&A systems (Notion AI, ChatGPT, Claude.ai) that wait for a human to ask, search a fixed corpus, and return text; or coding agents (Claude Code, Cursor) that operate on a single repo per session and lose context between runs. Neither category remembers the business across days, initiates work on its own schedule, or takes consequential actions on behalf of the team.
NanoClaw is built around three commitments those categories don't make:
Every agent run boots with awareness of (a) what the company is currently working on (WORK_CONTEXT.md, refreshed hourly), (b) recent thread history per channel/DM, (c) long-term memory per user and per group, and (d) 24 indexed knowledge caches covering Slack, Notion, email, GitHub, Canton/Splice/CIP docs, and source code. A new employee onboarded today gets the same domain expertise as a 2-year veteran; a question about a deal from 8 months ago is answered with the same fidelity as one from yesterday. This is what lets NanoClaw cover a chief-of-staff function — it doesn't need to be re-briefed on what the company does, who customers are, or what's blocked. It already knows.
NanoClaw runs ~80 scheduled tasks at any time — daily BD digests, hourly knowledge-compiler crons, heartbeat monitors, dev-pipeline auto-promotes, DR drills, customer-channel watchers. The system initiates work on its own schedule and surfaces results. A Q&A system can tell you the answer if you ask; NanoClaw notices the question is worth asking and brings the answer to you. This shifts the company from human-driven triage (someone has to remember to check) to agent-driven monitoring (the system surfaces only what changed). Cost-per-watch drops to near-zero, so the company can afford to monitor things it previously couldn't justify watching.
NanoClaw doesn't just answer — it ships code through CI to dev to prod, posts in Slack with named identities, drafts and sends email, files research items in the ARQ, creates Notion pages, runs database queries, manages the market-maker bot. It has graduated permission tiers: routine actions flow freely; high-risk actions (mass cross-channel posting, prod deploys with schema changes, financial moves) gate on explicit human approval through the admin-bot RPC pattern or 3-of-3 review. Every action it takes correctly is one less thing a human had to do. Mistakes surface via audit logs and severity-tagged admin pings; corrections become memory entries that prevent the same class of error in the future.
BitSafe's plan is to hire as much top talent as is ROI-positive and affordable. NanoClaw isn't a substitute for that; it's a force multiplier on it. Every hire we make operates at materially higher leverage because the system handles the part of the job that scales poorly with headcount.
The deeper bet is that "company with persistent agent memory + autonomous execution" is a different category of company than "company that uses AI tools." NanoClaw is the attempt to build the former at BitSafe — and to learn, in production, what the constraints and economics of that category actually are.
Every recurring or future-dated job lives in a single SQLite table on the host VM (store/messages.db), replicated to GCS via Litestream every ~1 second. Three schedule types: cron (e.g. 0 9 * * * for 9am daily), interval (milliseconds between runs), and once (one-shot, auto-deleted after firing).
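To make the three schedule types concrete, here is a minimal Python sketch of how a next fire time could be computed for each. It is not NanoClaw's scheduler: the function names are hypothetical, and the cron matcher is deliberately simplified (each field is either `*` or a single integer, with no ranges, lists, or steps).

```python
from datetime import datetime, timedelta
from typing import Optional


def cron_matches(expr: str, dt: datetime) -> bool:
    """Check a simplified 5-field cron expression against a datetime.

    Assumption: each field is '*' or one integer (no ranges, lists, steps).
    """
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 cron fields, got {len(fields)}")
    actual = [
        dt.minute,
        dt.hour,
        dt.day,
        dt.month,
        (dt.weekday() + 1) % 7,  # cron counts Sunday as 0; Python counts Monday as 0
    ]
    return all(f == "*" or int(f) == a for f, a in zip(fields, actual))


def next_run(schedule_type: str, value: str, now: datetime) -> Optional[datetime]:
    """Return the next fire time after `now`, or None if the task never runs again."""
    if schedule_type == "interval":
        # The value is the number of milliseconds between runs.
        return now + timedelta(milliseconds=int(value))
    if schedule_type == "once":
        # One-shot: nothing to reschedule; the row is deleted after firing.
        return None
    if schedule_type == "cron":
        candidate = now.replace(second=0, microsecond=0) + timedelta(minutes=1)
        for _ in range(366 * 24 * 60):  # scan at most about one year of minutes
            if cron_matches(value, candidate):
                return candidate
            candidate += timedelta(minutes=1)
        return None
    raise ValueError(f"unknown schedule type: {schedule_type}")
```

For example, `next_run("cron", "0 9 * * *", now)` returns 9:00 on the following morning when `now` is already past 9am. A production scheduler would more likely rely on a full cron parser than on a minute-by-minute scan.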
When a task fires, the host spawns a fresh Claude Code container, mounts the workspace, runs the prompt, captures output, and kills the container. crontab inside a container is useless — it dies on restart. Every job must be registered via the schedule_task MCP tool, which writes to the host DB and survives container churn, host reboots, and VM rebuilds.
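The spawn, run, capture, kill cycle can be illustrated with a short Python sketch built on the Docker CLI. The image name, the command run inside the container, and both function names are placeholders rather than NanoClaw's real values; only the `/workspace` mount point comes from the text.

```python
import subprocess
from typing import List


def build_run_command(image: str, workspace: str, agent_cmd: List[str]) -> List[str]:
    """Build a `docker run` argv for one ephemeral agent run (hypothetical helper)."""
    return [
        "docker", "run",
        "--rm",                            # remove the container as soon as it exits
        "-v", f"{workspace}:/workspace",   # mount the host workspace into the container
        image,
        *agent_cmd,                        # whatever command runs the prompt
    ]


def run_task(image: str, workspace: str, agent_cmd: List[str], timeout_s: int = 3600) -> str:
    """Run one task in a fresh container and return its captured stdout.

    Note: the timeout only kills the local docker client process. A real
    runner would also need to stop the container itself on timeout.
    """
    result = subprocess.run(
        build_run_command(image, workspace, agent_cmd),
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=True,  # raise if the container exits non-zero
    )
    return result.stdout
```

Because nothing inside the container outlives the call, any state the task needs to keep has to be written to the mounted workspace or to the host database.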
As of May 2026 there are ~80 active scheduled tasks: BD digests, knowledge compilers, doc cache syncs, DR drills, agent-credit watchdogs, design pipelines.
NanoClaw's "what should I work on" is not in code — it's in Notion. Three databases drive behavior:
The Skills database follows the same pattern — 74 skills defined as Notion pages, synced hourly to /workspace/skills/<name>/SKILL.md. On-disk is read-only cache. Edit in Notion; the next sync ships it globally to every future agent invocation without a deploy.
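As a rough illustration of the write side of that sync, the Python sketch below takes skill pages that have already been fetched (the Notion fetch is omitted) and writes them into the on-disk cache layout. The `sync_skills` name is hypothetical, and enforcing "read-only" through file permissions is an assumption; the real system may rely on convention or mount options instead.

```python
import stat
from pathlib import Path
from typing import Dict


def sync_skills(pages: Dict[str, str], root: Path) -> None:
    """Write each skill body to <root>/<name>/SKILL.md and mark it read-only.

    `pages` maps skill name to page content, assumed to be fetched elsewhere.
    """
    for name, body in pages.items():
        target = root / name / "SKILL.md"
        target.parent.mkdir(parents=True, exist_ok=True)
        if target.exists():
            # Restore write permission so the previous cached copy can be replaced.
            target.chmod(stat.S_IRUSR | stat.S_IWUSR)
        target.write_text(body, encoding="utf-8")
        # Assumption: read-only is enforced with permission bits (r--r--r--).
        target.chmod(stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
```

Running the sync twice with updated content overwrites the cached file, which mirrors the "edit in Notion, next sync ships it" flow.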
Each agent has a memory directory at /home/node/.claude/projects/-workspace-group/memory/ with one file per memory and a flat MEMORY.md index. Four typed memory categories:
The core discipline: if you didn't write it down, it doesn't exist. "Noted" without a file write = not remembered. MEMORY.md is loaded into every conversation context. Details are paged in on demand via memory-search (FTS5 full-text search across all memory files).
The index is intentionally capped at 200 lines. The model has a finite context window — keeping the loaded surface small leaves room for actual work.
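A minimal Python sketch of the two pieces described here: a check on the 200-line index cap and an FTS5 search over the individual memory files. It assumes memory files use a `.md` extension, that the Python build ships SQLite with FTS5 enabled, and that the table and function names (`memories`, `memory_search`) are free to choose; none of this is taken from the actual memory-search tool.

```python
import sqlite3
from pathlib import Path
from typing import List, Tuple

MAX_INDEX_LINES = 200  # the cap stated in the text


def index_within_cap(index_text: str) -> bool:
    """True if the MEMORY.md content fits the line cap."""
    return len(index_text.splitlines()) <= MAX_INDEX_LINES


def build_memory_index(memory_dir: Path) -> sqlite3.Connection:
    """Load every memory file except the index into an in-memory FTS5 table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE memories USING fts5(path, body)")
    for f in sorted(memory_dir.glob("*.md")):
        if f.name == "MEMORY.md":
            continue  # the index is always in context, so it is not searched
        conn.execute(
            "INSERT INTO memories (path, body) VALUES (?, ?)",
            (f.name, f.read_text(encoding="utf-8")),
        )
    return conn


def memory_search(conn: sqlite3.Connection, query: str, limit: int = 5) -> List[Tuple[str, str]]:
    """Return (file name, highlighted snippet) pairs, best match first."""
    rows = conn.execute(
        "SELECT path, snippet(memories, 1, '[', ']', '...', 8) "
        "FROM memories WHERE memories MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()
```

The split matters for context budget: the small index is paid for on every boot, while the full-text table is only consulted when a detail is actually needed.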
Every container boot reads, in order:
The anti-pattern we ripped out: stuffing the entire conversation into every prompt. Long sessions degrade as context fills — we call this "context rot." The fix: write intermediate results to files, reference them by path, let the agent re-read what it actually needs. Sub-agents get fresh context windows for research-heavy work and return condensed results.
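The "write it to a file, pass the path" pattern can be sketched in a few lines of Python. The `stash_result` helper and the scratch directory layout are hypothetical; the point is only that the prompt carries a short reference while the full content stays on disk.

```python
from pathlib import Path


def stash_result(scratch_dir: Path, name: str, content: str) -> str:
    """Persist an intermediate result and return a one-line reference to it."""
    scratch_dir.mkdir(parents=True, exist_ok=True)
    path = scratch_dir / f"{name}.md"
    path.write_text(content, encoding="utf-8")
    stripped = content.strip()
    first_line = stripped.splitlines()[0] if stripped else ""
    # Only this short reference goes into the prompt; the agent re-reads
    # the file by path when it needs the details.
    return f"[{name}] saved to {path} ({len(content)} chars): {first_line[:80]}"
```

A sub-agent that produced a long research report would return the reference string, and the main agent would open the file only if a later step depends on it.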
Build tasks dispatch sub-agents in isolated git worktrees so they can write to the same repo without colliding. Two coordination primitives keep parallel work safe:
Agent swarms use named sender identities (Researcher, Coder, Reviewer) that appear as distinct bot identities in Slack, making multi-step workflows readable. Sub-agents NEVER call send_message — only the main agent sends output to the user.
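The worktree isolation used for build sub-agents could look roughly like the Python sketch below, which builds the `git worktree` commands for one sub-agent. The branch naming scheme (`agent/<task_id>`), the directory layout, and the function names are assumptions for illustration.

```python
import subprocess
from pathlib import Path
from typing import List


def worktree_path(repo: Path, task_id: str) -> Path:
    """Place each worktree next to the main checkout (assumed layout)."""
    return repo.parent / f"{repo.name}-wt-{task_id}"


def worktree_add_command(repo: Path, task_id: str) -> List[str]:
    """argv that creates a new branch checked out in its own directory."""
    return [
        "git", "-C", str(repo),
        "worktree", "add",
        "-b", f"agent/{task_id}",          # new branch for this sub-agent only
        str(worktree_path(repo, task_id)),
    ]


def worktree_remove_command(repo: Path, task_id: str) -> List[str]:
    """argv that deletes the worktree directory once the sub-agent is done."""
    return ["git", "-C", str(repo), "worktree", "remove", str(worktree_path(repo, task_id))]


def create_worktree(repo: Path, task_id: str) -> Path:
    """Create the worktree and return its path; raises if git fails."""
    subprocess.run(worktree_add_command(repo, task_id), check=True)
    return worktree_path(repo, task_id)
```

Each sub-agent gets its own working directory and branch, so two agents editing the same file never share an index or a checkout; conflicts surface later at merge time instead of mid-edit.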
NanoClaw doesn't call APIs at agent runtime if it can help it. 24 data sources are mirrored to local SQLite with FTS5 indexes, updated by background crons:
search-all queries all 24 caches in parallel in ~400ms. Three caches (Slack, Fathom, Calendar) are now SQLCipher-encrypted at rest — Phase 2 shipped May 2026, Phase 3 (Notion, Salesforce) in queue.
Agents never say "I don't have access" without searching local caches first. The rule: search-all before any external API call. This dramatically reduces latency and API cost on information retrieval.
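The cache-first rule can be sketched in Python as a parallel fan-out over local FTS5 databases with an external lookup as the fallback. This is an illustration, not the real search-all: it assumes each cache is a SQLite file exposing an FTS5 table named `docs(title, body)`, and `external_lookup` stands in for whatever API call would otherwise be made.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

Row = Tuple[str, str]


def search_cache(db_path: str, query: str, limit: int) -> List[Row]:
    """Search one cache. A connection is opened per call so it stays on one thread."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute(
            "SELECT title, snippet(docs, 1, '[', ']', '...', 8) "
            "FROM docs WHERE docs MATCH ? ORDER BY rank LIMIT ?",
            (query, limit),
        ).fetchall()
    finally:
        conn.close()


def search_all(caches: Dict[str, str], query: str, limit: int = 5) -> Dict[str, List[Row]]:
    """Query every cache in parallel; `caches` maps source name to DB path."""
    with ThreadPoolExecutor(max_workers=len(caches) or 1) as pool:
        futures = {
            name: pool.submit(search_cache, path, query, limit)
            for name, path in caches.items()
        }
        return {name: fut.result() for name, fut in futures.items()}


def answer(
    caches: Dict[str, str],
    query: str,
    external_lookup: Callable[[str], List[Row]],
) -> Dict[str, List[Row]]:
    """Local caches first; call the external API only when nothing matched."""
    hits = {name: rows for name, rows in search_all(caches, query).items() if rows}
    if hits:
        return hits
    return {"external": external_lookup(query)}
```

The query string is passed straight to FTS5 `MATCH`, so a real tool would need to escape or validate user input that contains FTS5 operators.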
NanoClaw runs a three-environment topology: prod (nanoclaw-01, us-central1-c), dev (nanoclaw-staging, us-central1-a), and test (Litestream replica + Sunday DR drill).
The standard flow for a functional change: push a branch → CI runs (lint, typecheck, Vitest) → if green, staging-deploy job rebuilds the dev VM → auto-merge-after-staging-smoke job sleeps 30 min watching journalctl on nanoclaw-staging → if clean, promotes to prod main → prod cron picks up the restart within 5 min.
Hard exceptions that always require manual review: container/Dockerfile, src/db.ts schema migrations, scripts/setup-egress-firewall.sh, package.json major version bumps. The auto job refuses these; a human runs promote-to-prod.sh manually after review.
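A gate of this kind might be expressed as a small pure function that the auto-promote job calls before merging. The Python sketch below is a guess at the shape, not the actual job: it treats any change to the listed files as gated (the text only singles out schema migrations in src/db.ts), and it interprets "package.json major version bumps" as a major-version increase in any dependency.

```python
import re
from typing import Dict, List, Optional

# Paths named in the text; any change to them blocks auto-promotion (assumption).
PROTECTED_PATHS = {
    "container/Dockerfile",
    "src/db.ts",
    "scripts/setup-egress-firewall.sh",
}


def major(version: str) -> Optional[int]:
    """Extract the major version from a semver range such as '^4.1.0' or '~0.3.2'."""
    m = re.match(r"^[\^~>=<v\s]*(\d+)", version)
    return int(m.group(1)) if m else None


def major_bumps(old_deps: Dict[str, str], new_deps: Dict[str, str]) -> List[str]:
    """Names of dependencies whose major version increased."""
    bumped = []
    for name, new_version in new_deps.items():
        old_major = major(old_deps.get(name, ""))
        new_major = major(new_version)
        if old_major is not None and new_major is not None and new_major > old_major:
            bumped.append(name)
    return sorted(bumped)


def manual_review_reasons(
    changed_files: List[str],
    old_deps: Dict[str, str],
    new_deps: Dict[str, str],
) -> List[str]:
    """Empty list means auto-promotion may proceed; otherwise a human must review."""
    reasons = [
        f"protected path changed: {p}"
        for p in sorted(set(changed_files) & PROTECTED_PATHS)
    ]
    reasons += [f"major version bump: {n}" for n in major_bumps(old_deps, new_deps)]
    return reasons
```

Returning the reasons rather than a bare boolean lets the job explain its refusal to whoever then runs promote-to-prod.sh.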
Litestream replicates store/messages.db to GCS continuously (~1s RPO). The Sunday DR drill (run-litestream-drill.sh) is the standing health check for the test environment.
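The contents of run-litestream-drill.sh are not shown here, but one check a drill like this plausibly includes is verifying the restored copy. The Python sketch below assumes the database has already been restored to a local path by Litestream, and the table names passed in are placeholders.

```python
import sqlite3
from pathlib import Path
from typing import List


def verify_restored_db(path: str, required_tables: List[str]) -> List[str]:
    """Return a list of problems found in a restored SQLite file (empty means healthy)."""
    problems: List[str] = []
    # Open read-only so the check cannot modify the restored copy.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        status = conn.execute("PRAGMA integrity_check").fetchone()[0]
        if status != "ok":
            problems.append(f"integrity_check failed: {status}")
        existing = {
            row[0]
            for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
        }
        for table in required_tables:
            if table not in existing:
                problems.append(f"missing table: {table}")
    finally:
        conn.close()
    return problems
```

A drill would fail loudly on any non-empty result, since a replica that restores but is missing tables is no better than no replica.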
Six layers. One principle: write everything to durable storage; treat each agent invocation as fresh. Containers die — memory, tasks, caches, and skills persist. This means the system gets smarter over time without any individual agent needing to "remember" between sessions.
<aside> 📖
This is Part 2 of a two-part series. Read Part 1: Building a Company-Wide AI Assistant — Architecture, Security, and Self-Improvement
</aside>