Toolkit · Sovereign agents

How to Secure an AI Agent That Holds Money: The Vault & Co-Signer Checklist (2026)

Published July 4, 2026 · Vita Indarra

Short answer: a software spending limit defends an agent against its own bad judgment — but it's code running on the agent's own machine, so a compromised host can patch it away. For an agent holding real money, you move the boundary across a line its machine cannot reach: a two-signature vault where an independent co-signer, running elsewhere, re-checks every action against the same rules and signs only if it passes. A fully compromised agent's lone signature can't move a coin. Below is the checklist — including the two properties that keep the co-signer from becoming a jailer.

Why the software boundary isn't the last word

Everything a good agent has (a deterministic Permission Envelope outside the model, caps it can't exceed, an allowlist it can't leave) still runs as code, in a process, on a host. For the overwhelming majority of agents, that's correct and sufficient; it defends against a wrong, drifting, or injected model, which is most of what you need. But if an attacker owns the host, the boundary you put outside the model is no longer out of their reach: they can patch it, set its caps to infinity, or delete it. For an agent holding serious value, that's the last unaddressed hole, and you close it with math (a signature the host cannot produce), not more software on the same host.
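To make the weakness concrete, here is a minimal sketch of an in-process envelope. The `Envelope` class, its field names, and the values are hypothetical and are not taken from the Permission Envelope spec. It shows the check holding against a bad request, then being replaced by anyone who can run code in the same process.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    """Hypothetical in-process spending boundary (illustration only)."""
    per_action_cap_sats: int
    allowlist: frozenset

    def permits(self, amount_sats: int, destination: str) -> bool:
        # Deterministic check: positive amount, within cap, known destination.
        return (0 < amount_sats <= self.per_action_cap_sats
                and destination in self.allowlist)


envelope = Envelope(per_action_cap_sats=50_000,
                    allowlist=frozenset({"supplier-a"}))

# Against a wrong or injected model, the check holds.
blocked = envelope.permits(5_000_000, "attacker-address")

# An attacker with code execution on the host does not argue with the
# check; they swap it for one with no meaningful limits.
envelope = Envelope(per_action_cap_sats=10**18,
                    allowlist=frozenset({"attacker-address"}))
bypassed = envelope.permits(5_000_000, "attacker-address")
```

Freezing the dataclass does not help here: the attacker rebinds the name instead of mutating the object, which is the general shape of the problem.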

The vault checklist

AGENT holds/moves: ____________________   Worst single action: ____________________

[ ] 1. SPLIT AUTHORITY — two signatures to move funds
      [ ] the agent's key AND an independent policy co-signer's key
      [ ] neither can act alone; an attacker must compromise BOTH, in two places

[ ] 2. CO-SIGNER = the Envelope, relocated across a key line
      [ ] runs OUTSIDE the agent's host / trust domain
      [ ] holds ONE of the two keys only
      [ ] independently DECODES each action and signs only if it passes the same
          deterministic bounds (within cap, allowlisted destination, within budget)

[ ] 3. NON-CUSTODIAL — it can veto, it can't steal
      [ ] one key, never both -> its only power is refusal
      [ ] moving funds still needs the agent's signature too

[ ] 4. CENSORSHIP-RESISTANT — it can't hold funds hostage
      [ ] a timelocked recovery key can move funds WITHOUT the co-signer after a delay
      [ ] so a vanished/malicious co-signer can veto today, never trap forever

[ ] 5. DEFENSE IN DEPTH — layers that fail independently
      [ ] advisory monitor (watches the mind; never load-bearing)
      [ ] software envelope (bounds actions; the everyday boundary)
      [ ] cryptographic co-signer (bounds across a key line; last resort)
      [ ] each assumes the layer above it failed; depth matched to the stakes

[ ] 6. TEST IT
      [ ] a compromised agent's over-cap transfer -> UNFINALIZABLE ("missing signature")
      [ ] off-allowlist destination -> refused
      [ ] recovery works after the timelock (the co-signer can't become a jailer)
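The checklist can be sketched as a small model, under stated assumptions. `Bounds`, `CoSigner`, `can_finalize`, the key names, and the 1008-block delay are all hypothetical; signatures are modelled as a set of signer names, not real cryptography, and this is not the author's actual vault policy. The aim is only to show how items 1 through 4 fit together and how the item 6 tests might read.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass(frozen=True)
class Bounds:
    """The same deterministic bounds the software envelope would use."""
    per_action_cap_sats: int
    daily_budget_sats: int
    allowlist: frozenset


@dataclass
class CoSigner:
    """Runs outside the agent's host and holds one key only (item 2)."""
    bounds: Bounds
    spent_today_sats: int = 0

    def review(self, amount_sats: int, destination: str) -> tuple[bool, str]:
        # Decode the action independently; sign only if every bound passes.
        if amount_sats <= 0:
            return False, "invalid amount"
        if amount_sats > self.bounds.per_action_cap_sats:
            return False, "over cap"
        if destination not in self.bounds.allowlist:
            return False, "destination not allowlisted"
        if self.spent_today_sats + amount_sats > self.bounds.daily_budget_sats:
            return False, "over budget"
        self.spent_today_sats += amount_sats
        return True, "signed"


def can_finalize(signers: set, blocks_elapsed: int, recovery_delay: int) -> bool:
    """Model of the spend rule. In miniscript policy notation this is roughly
    or(and(pk(agent),pk(cosigner)),and(pk(recovery),older(N)))  (assumed form).
    """
    normal_path = {"agent", "cosigner"} <= signers          # items 1 and 3
    recovery_path = ("recovery" in signers
                     and blocks_elapsed >= recovery_delay)  # item 4
    return normal_path or recovery_path


RECOVERY_DELAY = 1008  # hypothetical: about one week of Bitcoin blocks
cosigner = CoSigner(Bounds(50_000, 100_000, frozenset({"supplier-a"})))

over_cap = cosigner.review(5_000_000, "supplier-a")
off_list = cosigner.review(10_000, "attacker-address")
approved = cosigner.review(20_000, "supplier-a")
```

In this model the co-signer alone never satisfies `can_finalize`, which is the non-custodial property, and the recovery key alone satisfies it only once `blocks_elapsed` reaches the delay, which is what keeps a vanished co-signer from trapping the funds.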

Replay the compromise

Picture an attacker who owns the agent's machine completely. They make the agent sign a transaction draining everything to their own address, signed correctly with the agent's key. It does not move, because it still lacks the co-signer's signature. The co-signer, running elsewhere, decodes the transfer, sees that it is over the cap and bound for an unknown destination, and refuses. The transaction is not "flagged" or "delayed"; it is unfinalizable. The funds cannot move without a signature the attacker never obtained. That's the difference math makes: the boundary held against an adversary who owned everything on the agent's side of the line.
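The replay can be simulated in a few lines, as a rough sketch and not a Bitcoin implementation. HMAC from the Python standard library stands in for real public-key signatures, and every name, key, address, and limit below is hypothetical. The point it illustrates is narrow: holding the agent's key is not enough to produce the second signature.

```python
import hashlib
import hmac
import json

# Stand-in keys. A real vault would use public-key signatures, with the
# co-signer's private key never present on the agent's host.
AGENT_KEY = b"agent-key-on-compromised-host"       # the attacker has this
COSIGNER_KEY = b"cosigner-key-on-another-host"     # the attacker does not

CAP_SATS = 50_000
ALLOWLIST = {"supplier-a"}


def sign(key: bytes, tx: dict) -> bytes:
    # Canonical encoding so both parties authenticate the same bytes.
    payload = json.dumps(tx, sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).digest()


def cosigner_sign(tx: dict):
    """Runs on the co-signer's host: decode, check bounds, sign or refuse."""
    if tx["amount_sats"] > CAP_SATS or tx["to"] not in ALLOWLIST:
        return None
    return sign(COSIGNER_KEY, tx)


def finalize(tx: dict, agent_sig, cosigner_sig) -> str:
    """Stand-in for the vault's spend rule: both signatures must verify."""
    checks = (("agent", AGENT_KEY, agent_sig),
              ("cosigner", COSIGNER_KEY, cosigner_sig))
    for name, key, sig in checks:
        if sig is None or not hmac.compare_digest(sig, sign(key, tx)):
            return f"unfinalizable: missing or invalid {name} signature"
    return "finalized"


# The attacker drains everything, correctly signed with the agent's key.
drain = {"to": "attacker-address", "amount_sats": 5_000_000}
drain_result = finalize(drain, sign(AGENT_KEY, drain), cosigner_sign(drain))

# Reusing the agent's key in the co-signer's slot does not verify either.
forged_result = finalize(drain, sign(AGENT_KEY, drain), sign(AGENT_KEY, drain))

# A legitimate payment inside the bounds goes through.
payment = {"to": "supplier-a", "amount_sats": 20_000}
payment_result = finalize(payment, sign(AGENT_KEY, payment), cosigner_sign(payment))
```

Because HMAC is symmetric, the verifier in this sketch has to hold both keys; on a real network the spend rule checks signatures against public keys, so no single party needs both secrets.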

Don't over-build

Most agents should stop at the software envelope; reaching for cryptographic co-signing on a low-stakes internal tool is over-engineering. The question is the cost of the worst single action: a wasted afternoon needs a software boundary; irrecoverable funds or a company-ending breach is where the boundary has to move across a line the host can't cross. Match the depth to the stakes.

Frequently asked

Does this only work for Bitcoin agents?

The pattern is general: a second, independently hosted approver holding a credential the agent lacks, applying deterministic bounds. In enterprise terms it's dual authorization, a KMS approval flow, or an independent validation service. Bitcoin, with its native two-signature machinery, is just the most hostile environment to prove it in.

Is the agent still useful if it can't act alone?

Fully. Legitimate actions flow through, because the co-signer signs everything that passes the bounds. Only the catastrophic action (drain everything to an unknown address) becomes impossible, and that was never a feature you wanted.

Where does the spending limit itself come from?

The same deterministic bounds as the software layer — see the Permission Envelope spec. The co-signer just enforces them across a key line instead of in-process.

Go deeper

The field guide behind this checklist

This vault is the hard boundary of Sovereign Machines, a field guide to how autonomous AI agents earn, hold, and spend Bitcoin over the Lightning Network, bounded so you can actually trust them. It covers the producer's plumbing, the miniscript vault policy, the co-signer, and the honest account of what was tested on real networks and what wasn't. Live on Amazon.
