Most teams using AI today are stitching it onto a workspace that wasn't built for it. Individual ChatGPT accounts. A few Notion agents. A scattered handful of automations. Each one helpful in isolation, none of them compounding. The team gets a little faster, then plateaus.
We took a different approach. We rebuilt the company itself so AI could use it.
Three layers do the work at BitSafe:
flowchart TB
NC["NanoClaw — the autonomous fleet<br>~80 scheduled tasks: monitoring, executing, surfacing"]
CL["Claude — the daily leverage layer<br>Notion AI inside the workspace · Claude apps outside it"]
NO["Notion — the system of record<br>Structured, queryable memory of how the company operates"]
NC -->|"writes results back into"| NO
CL -->|"operates on"| NO
style NO fill:#F4652F,color:#fff
This is Part 1 of a five-part series on how that stack came together at BitSafe. Aki has already written about NanoClaw: the harness, the security model, the local caches, the knowledge compiler. We won't re-cover that ground here. This series is about the part that makes the rest of it work: Notion. Specifically, what it took to turn Notion from a wiki into a substrate that AI agents can actually do work on top of.
It helps to be precise about what each layer is for, because the temptation when you have all three is to let them compete. They shouldn't.
Notion is structured. Pages have parents. Databases have schemas. Properties have types. Relations connect things. When NanoClaw asks "which Canton ecosystem apps have an open opportunity in Negotiation with a close date this quarter," it gets an answer because the company is shaped like a question that can be asked. None of that is automatic — somebody designed the schema. That somebody is the entire reason this works.
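To make "shaped like a question that can be asked" concrete, here is a minimal sketch of how that question might be expressed as a Notion API database query body. The property names (`Ecosystem`, `Stage`, `Close Date`) and their types are assumptions for illustration, not BitSafe's actual schema. The filter shape (a compound `and` of `select` and `date` conditions) follows the public Notion API.

```python
from datetime import date


def quarter_bounds(today: date) -> tuple[date, date]:
    """Return the first and last day of the calendar quarter containing `today`."""
    first_month = 3 * ((today.month - 1) // 3) + 1
    start = date(today.year, first_month, 1)
    # The day before the next quarter starts is the last day of this quarter.
    if first_month == 10:
        next_start = date(today.year + 1, 1, 1)
    else:
        next_start = date(today.year, first_month + 3, 1)
    end = date.fromordinal(next_start.toordinal() - 1)
    return start, end


def negotiation_query(today: date) -> dict:
    """Build a query body for a hypothetical Opportunities database.

    Property names and types below are assumed, not taken from a real schema.
    """
    start, end = quarter_bounds(today)
    return {
        "filter": {
            "and": [
                {"property": "Ecosystem", "select": {"equals": "Canton"}},
                {"property": "Stage", "select": {"equals": "Negotiation"}},
                {"property": "Close Date", "date": {"on_or_after": start.isoformat()}},
                {"property": "Close Date", "date": {"on_or_before": end.isoformat()}},
            ]
        }
    }
```

Every clause maps to a typed property that somebody had to define. If "Stage" were free text in a page body, there would be nothing for the filter to match against.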
Claude is unstructured leverage, and it shows up two ways at BitSafe. The first is inside Notion, where Notion AI is Claude under the hood: anyone in the workspace gets agentic Q&A, summarization, and writing assistance against the company's real state, without leaving the page they're on. The second is outside Notion, through the standalone Claude apps: Claude Code for engineering, plus Claude Chat, Cowork, and Design for the rest of the team. The pattern is the same in both modes. You drop into a Claude conversation and operate on real data: read a Notion doc, summarize three weeks of Slack, draft a partner brief, refactor a script. In Claude Code, the same pattern applies to a codebase: it edits real files, runs real commands, ships real diffs. The thing Claude doesn't do well is remember. Every session starts fresh, and the context it has is whatever Notion AI auto-retrieves or whatever a human pastes in. That's why this layer sits on top of Notion, not next to it.
NanoClaw is the persistent fleet. It boots ephemeral Claude Code containers on a schedule — daily BD digests, hourly knowledge compilers, customer-channel watchers, dev-pipeline auto-promotes — and writes the results back into Slack and Notion. It has continuous business context (24 indexed knowledge caches, including a full Notion mirror), graduated permissions to actually act, and a memory store that grows with use. It is, functionally, a chief-of-staff layer the whole company shares.
The thesis of this series is simple: the AI layer is only as good as the substrate underneath it. Skip the substrate and your agents hallucinate, your prompts get longer, your outputs get worse. Build the substrate and the AI layer compounds — every new skill works for everyone, every correction lands once and stays landed, every new data source plugs into the same retrieval pattern.
That's what Notion is for us. Not a wiki. Not a project tracker. The company's structured memory.
We didn't always think this way.
A year ago, our Notion looked like most companies' Notion. Each team had its own corner. Marketing had its projects-and-tasks setup. Sales had a sprawl of Google Docs and a separately managed Salesforce instance. Engineering had multiple disconnected databases. Everyone was productive individually. Nothing was queryable across teams.
We started building NanoClaw and immediately ran into a wall: the agents could read everything, but nothing they read was structured the same way twice. A "company" in Salesforce did not match a "company" in Notion. A "project" in marketing did not match a "project" in product. The agents were brilliant inside any one silo and useless across them.
You can solve that problem two ways. You can write more elaborate prompts and more retrieval logic to paper over the inconsistency. Or you can fix the inconsistency. The first approach scales linearly with how many edge cases you can think of. The second approach scales with how disciplined you are willing to be once.
We picked the second. The rest of this article — and most of this series — is about what that discipline actually looks like.
The phrase we keep coming back to is handbook-first. Every decision worth remembering becomes a document. Every recurring process becomes an SOP. Every meeting produces a structured record (decisions, owners, due dates), not a transcript anyone has to re-read.
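As a rough illustration of the difference between a structured record and a transcript, a meeting could be modeled along these lines. The class and field names are hypothetical, chosen to mirror the "decisions, owners, due dates" shape described above, and are not taken from the actual Meetings database.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ActionItem:
    description: str
    owner: str
    due: date


@dataclass
class MeetingRecord:
    title: str
    held_on: date
    decisions: list[str] = field(default_factory=list)
    action_items: list[ActionItem] = field(default_factory=list)

    def overdue(self, today: date) -> list[ActionItem]:
        """Action items whose due date has already passed."""
        return [item for item in self.action_items if item.due < today]


# Illustrative data only.
record = MeetingRecord(
    title="Partner sync",
    held_on=date(2025, 3, 3),
    decisions=["Run a two-week pilot"],
    action_items=[
        ActionItem("Send partner brief", owner="Dana", due=date(2025, 3, 10)),
        ActionItem("Update CRM entry", owner="Lee", due=date(2025, 3, 20)),
    ],
)
```

A question like "what is overdue, and who owns it" becomes a one-line call on a record shaped this way. Against a transcript, someone has to re-read the whole thing to answer it.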
Two principles shape what makes it onto the page.
Signal, not noise. A page that nobody reads is worse than no page, because it dilutes search. So every database has a Status (Drafting / In Review / Published / Archived) and a Verification property. Pages get archived aggressively. The bar to publish is low; to stay published, a page needs an owner who remains accountable for it.
The campsite principle. When we restructured, we didn't do a big-bang reorg. We don't migrate inactive content. The rule is: if you touch it, you clean it up to the new standard. Active workstreams move first; the long tail decays naturally. This is the only way we've found to do a real workspace transformation without a months-long migration project that nobody finishes.
Both principles share a common ancestor: the assumption that the workspace is a tool for the people working in it, not an archive for posterity. The agents are downstream consumers of that same tool. If the workspace is good for the team, it's good for the agents too. If it's bad for the team, no amount of clever prompting will save the agents.
The hardest part of running a company-wide Notion is the same as the hardest part of running any shared system: governance. Who can change what.
We use three tiers.
Everyday users. Most of the team. They read, write, capture, and update — inside the structures someone else built. They create Tasks, log meetings, draft Documents, capture Companies in the CRM. They're not expected to design schemas or move databases.
Champions. One per department. They own the surface area of their domain — the Documents their team produces, the dashboards, the SOPs. They can adjust views and templates inside their area. They cannot change global schemas. They are the layer that keeps each function coherent without bottlenecking on a single architect.
Architects. A small number of people. They own global schemas, top-level page hierarchies, and any change that touches more than one pillar. New properties on the Companies database, new Pillar databases, integration tokens, custom agent permissions — all gated here.
This sounds heavy. It isn't. The point is not to slow people down. It's to give the system a shape that the AI layer can rely on. When NanoClaw queries "all Companies with engagement status Live Partner," the query has to return the same answer for everyone, every time. That requires that engagement status exists as a property, has a fixed set of options, and means the same thing to everyone who fills it in. Architects exist to make that true.
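A small sketch of why the fixed option set matters. Only "Live Partner" comes from the text above. The other option names, the field names, and the helper functions are assumptions made up for the example.

```python
from enum import Enum


class EngagementStatus(Enum):
    # "Live Partner" is from the article; the other options are hypothetical.
    PROSPECT = "Prospect"
    LIVE_PARTNER = "Live Partner"
    INACTIVE = "Inactive"


def parse_status(raw: str) -> EngagementStatus:
    """Accept only values from the fixed option set; reject everything else."""
    try:
        return EngagementStatus(raw)
    except ValueError:
        allowed = ", ".join(s.value for s in EngagementStatus)
        raise ValueError(
            f"unknown engagement status {raw!r}; allowed: {allowed}"
        ) from None


def live_partners(companies: list[dict]) -> list[str]:
    """Names of companies whose status is exactly Live Partner, sorted."""
    return sorted(
        c["name"]
        for c in companies
        if parse_status(c["engagement_status"]) is EngagementStatus.LIVE_PARTNER
    )
```

With a fixed set, a variant like "live partner" or "Live-Partner" fails loudly at entry time. With free text, it would silently drop that company out of the query result.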
The same model is what lets us be aggressive about giving the AI layer write access. The CRM Capture Agent can create new Companies, log Opportunities, and append Updates. To be clear about the mechanics: Notion's permission model would let an agent rename properties or restructure a database if we granted those permissions — and on a few low-stakes databases we do. On the schemas the rest of the company depends on, we don't. Those are locked to the Architect tier by policy, not by capability. That trade-off — capability for predictability — is one we make explicitly per database, and it's what makes a write-capable agent safe to deploy at scale.
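One way to picture "by policy, not by capability" is as a per-database allowlist that is checked before an agent acts. This is a hedged sketch, not how the CRM Capture Agent is actually configured: the action names, the `Agent Scratchpad` database, the assumption that Updates is its own database, and the `is_allowed` helper are all invented for illustration.

```python
from enum import Enum


class Action(Enum):
    CREATE_ROW = "create_row"
    APPEND_UPDATE = "append_update"
    EDIT_SCHEMA = "edit_schema"


# Hypothetical policy for one agent, keyed by database name.
# Schema edits are granted only on a low-stakes database.
CRM_CAPTURE_POLICY: dict[str, frozenset[Action]] = {
    "Companies": frozenset({Action.CREATE_ROW}),
    "Opportunities": frozenset({Action.CREATE_ROW}),
    "Updates": frozenset({Action.APPEND_UPDATE}),
    "Agent Scratchpad": frozenset({Action.CREATE_ROW, Action.EDIT_SCHEMA}),
}


def is_allowed(database: str, action: Action) -> bool:
    """Default deny: anything not explicitly granted is refused."""
    return action in CRM_CAPTURE_POLICY.get(database, frozenset())
```

The design choice worth noting is the default-deny lookup. A database that nobody has made a decision about grants nothing, so every write permission is the result of an explicit per-database choice.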
Some of what the trifecta produces is hard to measure. Some of it is easy.
The easy part:
The harder-to-measure part: the system gets better the more you use it. Every recurring SOP we write becomes a skill the AI layer can run. Every correction we save becomes a memory entry an agent stops repeating. Every new database becomes another structured surface NanoClaw can query without a code change. The marginal cost of capability decreases over time.
That's the bet. A company that has run this trifecta for twelve months has meaningfully more leverage than one that has run it for a single month. It compounds.
We did not get here cleanly. Three things we'd have done sooner if we were starting over:
Part 2 of this series goes deep on the actual architecture — Pillars, Projects, Documents, Meetings, the supporting databases, and the dashboard layer that ties it all together.
<aside> 📚
Keep reading
Start with the hub: The Infrastructure Mindset, Turned Inward — How BitSafe Runs on AI
How BitSafe Runs on Notion — the brain:
Part 1: Notion as the Company OS (you're reading this) · Part 2: The Architecture · Part 3: Agents, Automations, and the AI Layer · Part 4: Replacing Salesforce with Notion · Part 5: The Agent Governance Model
The NanoClaw series — the reach:
Part 1: Building a Company-Wide AI Assistant · Part 2: The Architecture · Part 3: The Autonomous Engine · Part 4: The Substrate · Part 5: Working With NanoClaw · Companion: Cost Discipline
Standalone deep-dives:
Why Not Just Use the Claude App? · The Invisible Seam · Measuring an AI OS, Honestly
</aside>