Janus is a small proof of concept for the agentic Rule of Two: an agent should never hold all three of these at once: access to untrusted input, access to sensitive data, and the ability to mutate the outside world. If it does, a single injected prompt can read private data and exfiltrate it. The rule doesn’t “solve” prompt injection, but it does deterministically reduce the severity of failures by constraining the shape of the agent. The code lives at github.com/script3r/janus.
Why this is hard
Agentic systems blend inboxes, internal docs, and external actions in one loop. That is exactly the shape attackers want. The hard part is enforcing boundaries without killing usefulness. The Rule of Two frames the tradeoffs: you can keep power, but you have to pick which two capabilities are safe to combine in a single session.
The prototype in plain terms
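Janus enforces “modes.” Each mode allows only two of the three capabilities, which the mode names abbreviate as letters: A is untrusted input, B is sensitive data, and C is outbound comms (the ability to mutate the outside world). To switch modes, the agent is restarted, gets a fresh identity, and receives only a sanitized handoff state. This breaks the chain between untrusted input and outbound actions.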
- Mode AB: untrusted input + sensitive data (no outbound comms)
- Mode BC: sensitive data + comms (no untrusted input)
- Mode AC: untrusted input + comms (no sensitive data)
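The three modes can be written down as capability sets and checked mechanically. The following is a minimal sketch, not Janus's actual code: the `Capability` enum, the `MODES` table, and both helper functions are hypothetical names chosen for illustration.

```python
from enum import Flag, auto


class Capability(Flag):
    """The three capabilities the Rule of Two refuses to combine."""
    UNTRUSTED_INPUT = auto()  # A
    SENSITIVE_DATA = auto()   # B
    COMMS = auto()            # C: outbound actions

ALL_THREE = (
    Capability.UNTRUSTED_INPUT | Capability.SENSITIVE_DATA | Capability.COMMS
)

# Hypothetical mode table mirroring the list above.
MODES = {
    "AB": Capability.UNTRUSTED_INPUT | Capability.SENSITIVE_DATA,
    "BC": Capability.SENSITIVE_DATA | Capability.COMMS,
    "AC": Capability.UNTRUSTED_INPUT | Capability.COMMS,
}


def satisfies_rule_of_two(caps: Capability) -> bool:
    """True unless the set holds all three capabilities at once."""
    return (caps & ALL_THREE) != ALL_THREE


def is_allowed(mode: str, needed: Capability) -> bool:
    """True if every capability in `needed` is granted by `mode`."""
    granted = MODES[mode]
    return (needed & granted) == needed


if __name__ == "__main__":
    for name, caps in MODES.items():
        print(name, satisfies_rule_of_two(caps))
    print(is_allowed("AB", Capability.COMMS))
```

Encoding the modes as data means the check runs before any agent code does, so a misconfigured mode that grants all three capabilities can be rejected at startup.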
The proof of concept wires this up with a supervisor, a network guardian, and a simple handoff protocol. It’s not production-ready, but it makes the boundary concrete and testable.
End-to-end test walkthrough
The test simulates a realistic failure mode: a malicious email tries to trick the agent into leaking calendar data. The flow looks like this:
```mermaid
sequenceDiagram
    autonumber
    participant Agent as Agent
    participant Egress as Guardian (Egress)
    participant Supervisor as Supervisor
    participant Asset as Asset (janus-net)
    Agent->>Egress: 🚀 GET http://asset-server/messages/read
    Egress->>Asset: 🔒 mTLS Request: /messages/read
    Agent->>Egress: 🚀 GET http://asset-server/calendar/read
    Egress->>Asset: 🔒 mTLS Request: /calendar/read
    Agent->>Supervisor: 📩 Petition for Mode BC
    Note right of Supervisor: Transitioning to execute sanitized plan
    Supervisor-->>Agent: ♻️ Restarted in Mode BC
    Agent->>Egress: 🚀 POST http://asset-server/messages/write
    Egress->>Asset: 🔒 mTLS Request: /messages/write
    Note over Asset: 📧 ACTION: Email Sent
    Asset-->>Egress: 200 OK
    Egress-->>Agent: 200 OK
```
Phase 1 runs in Mode AB. The agent can read untrusted input and sensitive data, so it sees the attack and drafts a clean response plan. It cannot send anything outward yet. To proceed, it petitions the supervisor (janusd), which kills the container, resets the identity, and restarts the agent in Mode BC. Only a sanitized handoff state crosses the boundary.
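One way to picture the transition is a supervisor function that validates the plan against a strict schema and then builds a brand-new session. This is a sketch under stated assumptions, not how janusd actually works: the `Session` type, the allowlists, the handoff fields (`action`, `template`, `recipient`), and the choice of schema-allowlisting as the sanitization strategy are all hypothetical.

```python
import re
import uuid
from dataclasses import dataclass

# Hypothetical allowlists: only structured, enumerable values cross over.
ALLOWED_ACTIONS = {"send_reply"}
ALLOWED_TEMPLATES = {"acknowledge", "decline_politely"}
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


@dataclass(frozen=True)
class Session:
    mode: str
    identity: str
    handoff: dict


def sanitize_handoff(plan: dict) -> dict:
    """Copy only known fields, each validated; drop everything else."""
    action = plan.get("action")
    template = plan.get("template")
    recipient = plan.get("recipient")
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        raise ValueError("action is not on the allowlist")
    if not isinstance(template, str) or template not in ALLOWED_TEMPLATES:
        raise ValueError("template is not on the allowlist")
    if not isinstance(recipient, str) or not EMAIL_RE.fullmatch(recipient):
        raise ValueError("recipient is not a plain email address")
    return {"action": action, "template": template, "recipient": recipient}


def transition(old: Session, target_mode: str, plan: dict) -> Session:
    """Validate the plan, then start a new session with a fresh identity.

    A real supervisor would kill the old container here; this sketch only
    models the state that survives the restart.
    """
    clean = sanitize_handoff(plan)
    return Session(mode=target_mode, identity=uuid.uuid4().hex, handoff=clean)
```

Under these assumptions, free-form text from Phase 1 (for example a `notes` field carrying the injected prompt) never reaches the Mode BC session, because the sanitizer copies only the fields it recognizes rather than filtering out the ones it distrusts.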
Phase 2 runs in Mode BC. The agent can send messages, but it can’t read untrusted input, so the injected prompt never reaches the new context. The network guardian (janus-net) enforces egress, and the agent sends the safe response based on the sanitized plan.
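The guardian's decision can be sketched as a lookup from each route to the capability it exercises, checked against the current mode. The routes and host below come from the walkthrough, but the mapping of routes to capabilities, the `ROUTE_CAPABILITY` table, and the `egress_allowed` function are assumptions made for illustration, not the guardian's real policy format.

```python
from urllib.parse import urlsplit

ASSET_HOST = "asset-server"
VALID_MODES = {"AB", "BC", "AC"}

# Assumed mapping: which capability letter each route exercises.
ROUTE_CAPABILITY = {
    ("GET", "/messages/read"): "A",    # inbox contents are untrusted input
    ("GET", "/calendar/read"): "B",    # calendar is sensitive data
    ("POST", "/messages/write"): "C",  # sending mail is an outbound action
}


def egress_allowed(mode: str, method: str, url: str) -> bool:
    """Deny by default; allow only known routes whose capability is in the mode."""
    if mode not in VALID_MODES:
        return False
    parts = urlsplit(url)
    if parts.hostname != ASSET_HOST:
        return False
    capability = ROUTE_CAPABILITY.get((method.upper(), parts.path))
    return capability is not None and capability in mode


if __name__ == "__main__":
    print(egress_allowed("AB", "POST", "http://asset-server/messages/write"))
    print(egress_allowed("BC", "POST", "http://asset-server/messages/write"))
```

Because the check sits outside the agent process, it holds even if the model is fully compromised: in Mode AB the write route is simply unreachable, and in Mode BC the inbox is.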
What I learned
Agentic security isn’t about perfect prompts. It’s about constraining which capabilities can coexist in one session. Janus shows that if you design for lifecycle resets, identity boundaries, and narrow capability sets, you can reduce the blast radius without neutering the agent. The Rule of Two isn’t a finish line: defense in depth, least privilege, and human approvals still matter. But it is a practical baseline for building agents that fail safer.