Securing an AI agent that holds money
Toolkit · Sovereign agents
Published July 4, 2026 · Vita Indarra
Short answer: a software spending limit defends an agent against its own bad judgment — but it's code running on the agent's own machine, so a compromised host can patch it away. For an agent holding real money, you move the boundary across a line its machine cannot reach: a two-signature vault where an independent co-signer, running elsewhere, re-checks every action against the same rules and signs only if it passes. A fully compromised agent's lone signature can't move a coin. Below is the checklist — including the two properties that keep the co-signer from becoming a jailer.
Everything a good agent has (a deterministic Permission Envelope outside the model, caps it can't exceed, an allowlist it can't leave) still runs as code, in a process, on a host. For the overwhelming majority of agents, that's correct and sufficient: it defends against a wrong, drifting, or injected model, which is most of what you need. But if an attacker owns the host, the boundary you put outside the model is no longer outside the attacker's reach: they can patch it, set its caps to infinity, or delete it. For an agent holding serious value, that's the last unaddressed hole, and you close it with math, not more software.
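To make that weakness concrete, here is a minimal Python sketch of an in-process envelope and what host-level control does to it. The `Envelope` class, its field names, and the example amounts and destinations are hypothetical illustrations, not the Permission Envelope spec.

```python
from dataclasses import dataclass, field


@dataclass
class Envelope:
    """Hypothetical in-process boundary: a per-action cap plus an allowlist."""
    cap_sats: int
    allowlist: set[str] = field(default_factory=set)

    def allows(self, destination: str, amount_sats: int) -> bool:
        # Deterministic check that runs outside the model but inside the host.
        return destination in self.allowlist and amount_sats <= self.cap_sats


envelope = Envelope(cap_sats=50_000, allowlist={"supplier-invoice"})
drain = ("attacker-address", 10_000_000)

# Against a wrong, drifting, or injected model, the envelope holds.
blocked_before = envelope.allows(*drain) is False

# An attacker who owns the host can simply edit the boundary in place.
envelope.cap_sats = 10**18
envelope.allowlist.add("attacker-address")
allowed_after = envelope.allows(*drain)

print(blocked_before, allowed_after)
```

The check itself is sound; the problem is only where it lives. Anything that can write to the process can rewrite the rule.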
AGENT holds/moves: ____________________ Worst single action: ____________________
[ ] 1. SPLIT AUTHORITY: two signatures to move funds
    [ ] the agent's key AND an independent policy co-signer's key
    [ ] neither can act alone; an attacker must compromise BOTH, in two places
[ ] 2. CO-SIGNER = the Envelope, relocated across a key line
    [ ] runs OUTSIDE the agent's host / trust domain
    [ ] holds ONE of the two keys only
    [ ] independently DECODES each action and signs only if it passes the same
        deterministic bounds (within cap, allowlisted destination, within budget)
[ ] 3. NON-CUSTODIAL: it can veto, it can't steal
    [ ] one key, never both -> its only power is refusal
    [ ] moving funds still needs the agent's signature too
[ ] 4. CENSORSHIP-RESISTANT: it can't hold funds hostage
    [ ] a timelocked recovery key can move funds WITHOUT the co-signer after a delay
    [ ] so a vanished/malicious co-signer can veto today, never trap forever
[ ] 5. DEFENSE IN DEPTH: layers that fail independently
    [ ] advisory monitor (watches the mind; never load-bearing)
    [ ] software envelope (bounds actions; the everyday boundary)
    [ ] cryptographic co-signer (bounds across a key line; last resort)
    [ ] each assumes the layer above it failed; depth matched to the stakes
[ ] 6. TEST IT
    [ ] a compromised agent's over-cap transfer -> UNFINALIZABLE ("missing signature")
    [ ] off-allowlist destination -> refused
    [ ] recovery works after the timelock (the co-signer can't become a jailer)
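Items 1, 3, and 4 of the checklist describe which combinations of keys can move funds. The sketch below is a toy Python model of those spending paths, under stated assumptions: the key names, the `can_finalize` helper, and the 4320-block delay are hypothetical, and the policy string in the comment is only an illustration in Miniscript policy notation, not the vault policy from the book. Real enforcement happens in Bitcoin script, not in application code like this.

```python
# Hypothetical delay: 4320 blocks is roughly 30 days at ~10 minutes per block.
RECOVERY_DELAY_BLOCKS = 4320

# Illustrative policy only (an assumption, not the book's actual policy):
#   or(and(pk(agent),pk(cosigner)),and(pk(recovery),older(4320)))


def can_finalize(signers: set[str], utxo_age_blocks: int) -> bool:
    """Return True if this set of signatures satisfies one spending path."""
    # Path 1: everyday spending needs the agent AND the co-signer.
    cooperative = {"agent", "cosigner"} <= signers
    # Path 2: the recovery key alone works, but only after the relative timelock.
    recovery = "recovery" in signers and utxo_age_blocks >= RECOVERY_DELAY_BLOCKS
    return cooperative or recovery


checks = {
    "agent alone (compromised host)": can_finalize({"agent"}, 10),
    "co-signer alone (non-custodial)": can_finalize({"cosigner"}, 10),
    "agent + co-signer": can_finalize({"agent", "cosigner"}, 0),
    "recovery before the delay": can_finalize({"recovery"}, 100),
    "recovery after the delay": can_finalize({"recovery"}, RECOVERY_DELAY_BLOCKS),
}

for name, result in checks.items():
    print(f"{name}: {result}")
```

Only the third and fifth cases succeed: the co-signer can refuse but never spend alone, and its refusal stops mattering once the delay has passed.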
Picture an attacker who owns the agent's machine completely. They make the agent sign a transaction draining everything to their own address, and the signature is valid under the agent's key. The transaction still does not move, because it lacks the co-signer's signature. The co-signer, running elsewhere, decodes the transfer, sees that it is over the cap and headed to an unknown destination, and refuses to sign. The transaction is not "flagged" or "delayed"; it is unfinalizable. The funds cannot move without an approval the attacker never obtained. That's the difference math makes: the boundary held against an adversary who owned everything on the agent's side of the line.
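The same scenario can be sketched in Python. This simulates the decision logic only, with no real keys or signatures; `Transfer`, `CoSigner`, `finalize`, and all amounts and destination names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    destination: str
    amount_sats: int


@dataclass
class CoSigner:
    """Runs outside the agent's host and holds one of the two keys."""
    cap_sats: int
    allowlist: frozenset
    budget_sats: int
    spent_sats: int = 0

    def review(self, t: Transfer) -> tuple[bool, str]:
        # Decode the action and re-check it against the deterministic bounds.
        if t.destination not in self.allowlist:
            return False, "destination not allowlisted"
        if t.amount_sats > self.cap_sats:
            return False, "over per-action cap"
        if self.spent_sats + t.amount_sats > self.budget_sats:
            return False, "over budget"
        self.spent_sats += t.amount_sats
        return True, "signed"


def finalize(t: Transfer, agent_signed: bool, cosigner: CoSigner) -> str:
    # Both signatures are required; either one missing leaves the transfer stuck.
    if not agent_signed:
        return "unfinalizable: missing agent signature"
    ok, reason = cosigner.review(t)
    if not ok:
        return f"unfinalizable: missing co-signer signature ({reason})"
    return "finalized"


cosigner = CoSigner(cap_sats=50_000,
                    allowlist=frozenset({"supplier-invoice"}),
                    budget_sats=120_000)

# The attacker controls the agent, so the agent's signature is present.
drain_result = finalize(Transfer("attacker-address", 10_000_000), True, cosigner)
# A legitimate, in-bounds payment still goes through.
legit_result = finalize(Transfer("supplier-invoice", 40_000), True, cosigner)

print(drain_result)
print(legit_result)
```

The attacker never touches `CoSigner`, because in a real deployment it runs in a different trust domain; that separation is what the simulation stands in for.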
Most agents should stop at the software envelope; reaching for cryptographic co-signing on a low-stakes internal tool is over-engineering. The question is the cost of the worst single action: a wasted afternoon needs a software boundary; irrecoverable funds or a company-ending breach is where the boundary has to move across a line the host can't cross. Match the depth to the stakes.
The pattern is general: a second, independently hosted approver holding a credential the agent lacks, applying deterministic bounds. In enterprise terms it's dual authorization, a KMS approval flow, or an independent validation service. Bitcoin's native two-signature machinery is just the most hostile environment to prove it in.
Legitimate actions flow through in full: the co-signer signs everything that passes the bounds. Only the catastrophic action (drain everything to an unknown address) becomes impossible, which was never a feature you wanted.
The co-signer checks the same deterministic bounds as the software layer (see the Permission Envelope spec). It just enforces them across a key line instead of in-process.
Go deeper
This vault is the hard boundary of Sovereign Machines: how autonomous AI agents earn, hold, and spend Bitcoin over the Lightning Network, bounded so you can actually trust them. It covers the producer's plumbing, the miniscript vault policy, the co-signer, and an honest account of what was tested on real networks and what wasn't. Live on Amazon.