Is a software spending limit enough to secure an AI agent that holds money?

For most agents, yes — a deterministic software boundary (a Permission Envelope) defends against a wrong, drifting, or injected model. But that boundary is code running on the agent's own machine, so if an attacker fully compromises the host, they can bypass it. For an agent holding serious value, that's the last hole, and it's closed by moving the boundary across a trust line the host cannot cross — a second signature held elsewhere.

What is a co-signer for an AI agent, and how does it work?

A co-signer is the Envelope's logic relocated across a key line. Moving funds now requires two signatures: the agent's and that of an independent co-signer, which runs on a separate machine and holds only one of the two keys. When the agent proposes a transaction, the co-signer independently decodes it and signs only if it passes the same deterministic bounds. A fully compromised agent's lone signature is worthless, because the transaction is missing a required signature the attacker never obtained.

How do you keep the co-signer from becoming a single point of failure?

Two properties. Non-custodial: the co-signer holds only one of the two required keys, so it can veto a bad action but can never move funds alone — its only power is refusal. Censorship-resistant: a timelocked recovery path means that if the co-signer vanishes or turns malicious and refuses to sign, a separate recovery key can move the funds after a delay. Split authority for safety, a recovery path for liveness — so the co-signer can stop a compromised agent without becoming a party that can rob or imprison it.
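A minimal sketch of those two properties as spend paths, assuming a vault with an everyday path (agent plus co-signer) and a recovery path (recovery key alone, after a delay). The key labels, the `can_spend` helper, and the 1008-block delay are illustrative assumptions, not the article's actual policy. In Miniscript policy notation the same shape would look roughly like `or(and(pk(agent),pk(cosigner)),and(pk(recovery),older(1008)))`.

```python
from __future__ import annotations

# Hypothetical key labels and delay; none of these names come from the article.
AGENT, COSIGNER, RECOVERY = "agent", "cosigner", "recovery"
RECOVERY_DELAY_BLOCKS = 1008  # assumed: roughly one week of Bitcoin blocks


def can_spend(signers: set[str], blocks_since_deposit: int) -> bool:
    """Return True if this set of signatures satisfies either spend path."""
    # Path 1 (everyday): agent AND co-signer together, no waiting.
    if {AGENT, COSIGNER} <= signers:
        return True
    # Path 2 (recovery): the recovery key alone, only once the timelock has matured.
    if RECOVERY in signers and blocks_since_deposit >= RECOVERY_DELAY_BLOCKS:
        return True
    return False


print(can_spend({AGENT, COSIGNER}, 0))   # True: the normal path
print(can_spend({COSIGNER}, 10_000))     # False: the co-signer can never move funds alone
print(can_spend({RECOVERY}, 1007))       # False: the timelock has not matured
print(can_spend({RECOVERY}, 1008))       # True: a vanished co-signer cannot trap funds forever
```

The second line of output is the non-custodial property; the last two are the censorship-resistance property. Neither depends on the co-signer behaving well.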


How to Secure an AI Agent That Holds Money: The Vault & Co-Signer Checklist (2026)

Published July 4, 2026 · Vita Indarra

Short answer: a software spending limit defends an agent against its own bad judgment — but it's code running on the agent's own machine, so a compromised host can patch it away. For an agent holding real money, you move the boundary across a line its machine cannot reach: a two-signature vault where an independent co-signer, running elsewhere, re-checks every action against the same rules and signs only if it passes. A fully compromised agent's lone signature can't move a coin. Below is the checklist — including the two properties that keep the co-signer from becoming a jailer.

Why the software boundary isn't the last word

Everything a good agent has (a deterministic Permission Envelope outside the model, caps it can't exceed, an allowlist it can't leave) still runs as code, in a process, on a host. For the overwhelming majority of agents, that's correct and sufficient; it defends against a wrong, drifting, or injected model, which is most of what you need. But if an attacker owns the host, the boundary you put outside the model is no longer outside them: they can patch it, set its caps to infinity, or delete it. For an agent holding serious value, that's the last unaddressed hole, and you close it with cryptography, not with more software on the same host.

The vault checklist

AGENT holds/moves: ____________________   Worst single action: ____________________

[ ] 1. SPLIT AUTHORITY — two signatures to move funds
      [ ] the agent's key AND an independent policy co-signer's key
      [ ] neither can act alone; an attacker must compromise BOTH, in two places

[ ] 2. CO-SIGNER = the Envelope, relocated across a key line
      [ ] runs OUTSIDE the agent's host / trust domain
      [ ] holds ONE of the two keys only
      [ ] independently DECODES each action and signs only if it passes the same
          deterministic bounds (within cap, allowlisted destination, within budget)

[ ] 3. NON-CUSTODIAL — it can veto, it can't steal
      [ ] one key, never both -> its only power is refusal
      [ ] moving funds still needs the agent's signature too

[ ] 4. CENSORSHIP-RESISTANT — it can't hold funds hostage
      [ ] a timelocked recovery key can move funds WITHOUT the co-signer after a delay
      [ ] so a vanished/malicious co-signer can veto today, never trap forever

[ ] 5. DEFENSE IN DEPTH — layers that fail independently
      [ ] advisory monitor (watches the mind; never load-bearing)
      [ ] software envelope (bounds actions; the everyday boundary)
      [ ] cryptographic co-signer (bounds across a key line; last resort)
      [ ] each assumes the layer above it failed; depth matched to the stakes

[ ] 6. TEST IT
      [ ] a compromised agent's over-cap transfer -> UNFINALIZABLE ("missing signature")
      [ ] off-allowlist destination -> refused
      [ ] recovery works after the timelock (the co-signer can't become a jailer)
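Items 2 and 6 of the checklist can be sketched as a small policy check. This is an illustration under stated assumptions, not the article's implementation: the `Policy` and `CoSigner` names, the satoshi amounts, and the placeholder addresses are all hypothetical, and the sketch starts from an already-decoded destination and amount. A real co-signer would decode the raw transaction itself rather than trust the agent's description of it.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Deterministic bounds (hypothetical field names)."""
    per_tx_cap_sats: int
    budget_sats: int
    allowlist: frozenset[str]


@dataclass
class CoSigner:
    policy: Policy
    spent_sats: int = 0  # running total of everything approved so far

    def review(self, destination: str, amount_sats: int) -> tuple[bool, str]:
        """Return (approve, reason). Approval is the only case where it would sign."""
        if amount_sats <= 0:
            return False, "invalid amount"
        if destination not in self.policy.allowlist:
            return False, "destination not allowlisted"
        if amount_sats > self.policy.per_tx_cap_sats:
            return False, "over per-transaction cap"
        if self.spent_sats + amount_sats > self.policy.budget_sats:
            return False, "over budget"
        self.spent_sats += amount_sats  # count only approved spends against the budget
        return True, "ok"


cosigner = CoSigner(Policy(per_tx_cap_sats=50_000, budget_sats=120_000,
                           allowlist=frozenset({"bc1q-supplier-example"})))
print(cosigner.review("bc1q-supplier-example", 40_000))   # (True, 'ok')
print(cosigner.review("bc1q-attacker-example", 40_000))   # (False, 'destination not allowlisted')
print(cosigner.review("bc1q-supplier-example", 900_000))  # (False, 'over per-transaction cap')
```

Every branch is a plain comparison with no model in the loop, which is what makes the check deterministic and testable.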

Replay the compromise

Picture an attacker who owns the agent's machine completely. They make the agent sign a transaction draining everything to their own address, and the agent's signature on it is perfectly valid. It still does not move, because it lacks the co-signer's signature: the co-signer, running elsewhere, decodes the transfer, sees it is over the cap and bound for an unknown destination, and refuses. The transaction is not "flagged" or "delayed"; it is unfinalizable. The funds cannot move without an approval the attacker never obtained. That's the difference math makes: the boundary held against an adversary who owned everything on the agent's side of the line.
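The replay can be acted out in a toy model. Everything here is an assumption made for illustration: HMAC stands in for a real signature (it is symmetric, so this is not how Bitcoin signs or verifies), the `destination:amount` byte format is invented, and the cap, keys, and addresses are placeholders.

```python
import hashlib
import hmac
from typing import Optional

CAP_SATS = 50_000                      # assumed per-transaction cap
ALLOWLIST = {"bc1q-supplier-example"}  # assumed allowlisted destination


def toy_sign(key: bytes, tx: bytes) -> bytes:
    """Stand-in for a signature. HMAC is symmetric, so this is NOT a Bitcoin signature."""
    return hmac.new(key, tx, hashlib.sha256).digest()


def cosigner_review(cosigner_key: bytes, tx: bytes) -> Optional[bytes]:
    """Decode the proposed transfer independently; sign only if it is within bounds."""
    destination, amount = tx.decode().split(":")
    if destination in ALLOWLIST and 0 < int(amount) <= CAP_SATS:
        return toy_sign(cosigner_key, tx)
    return None  # refusal is the co-signer's only power


def finalizable(tx: bytes, agent_sig: Optional[bytes], cosigner_sig: Optional[bytes],
                agent_key: bytes, cosigner_key: bytes) -> bool:
    """2-of-2: both signatures must be present and valid for this exact transaction."""
    if agent_sig is None or cosigner_sig is None:
        return False
    return (hmac.compare_digest(agent_sig, toy_sign(agent_key, tx))
            and hmac.compare_digest(cosigner_sig, toy_sign(cosigner_key, tx)))


# The attacker owns the agent's host, so they hold agent_key but never cosigner_key.
agent_key, cosigner_key = b"agent-secret", b"cosigner-secret"

drain = b"bc1q-attacker-example:2000000"
drain_ok = finalizable(drain, toy_sign(agent_key, drain),
                       cosigner_review(cosigner_key, drain), agent_key, cosigner_key)

payment = b"bc1q-supplier-example:30000"
payment_ok = finalizable(payment, toy_sign(agent_key, payment),
                         cosigner_review(cosigner_key, payment), agent_key, cosigner_key)

print(drain_ok, payment_ok)  # False True
```

The drain carries a valid agent signature and still fails, while the in-bounds payment goes through untouched. Signing the drain a second time with the agent's key does not help either, because that signature does not verify against the co-signer's key.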

Don't over-build

Most agents should stop at the software envelope; reaching for cryptographic co-signing on a low-stakes internal tool is over-engineering. The question is the cost of the worst single action: a wasted afternoon needs a software boundary; irrecoverable funds or a company-ending breach is where the boundary has to move across a line the host can't cross. Match the depth to the stakes.

Frequently asked

Does this only work for Bitcoin agents?

No. The pattern is general: a second, independently hosted approver that holds a credential the agent lacks and applies deterministic bounds. In enterprise terms it's dual authorization, a KMS approval flow, or an independent validation service. Bitcoin, with its native two-signature machinery, is just the most hostile environment to prove it in.

Is the agent still useful if it can't act alone?

Yes. Legitimate actions flow through, because the co-signer signs everything that passes the bounds. Only the catastrophic action (drain everything to an unknown address) becomes impossible, and that was never a feature you wanted.

Where does the spending limit itself come from?

The same deterministic bounds as the software layer — see the Permission Envelope spec. The co-signer just enforces them across a key line instead of in-process.

Go deeper

The field guide behind this checklist

This vault is the hard boundary of Sovereign Machines, a field guide to how autonomous AI agents earn, hold, and spend Bitcoin over the Lightning Network, bounded so you can actually trust them. It covers the producer's plumbing, the Miniscript vault policy, the co-signer, and an honest account of what was tested on real networks and what wasn't. Live on Amazon.
