Inbox triage for chat agents: three rules before shipping

It is 11pm. The agency owner has 312 unread emails, a client asking why their campaign is paused, a supplier chasing a PO, and three cold pitches pretending to be replies. She has been told by four different vendors this quarter that an AI agent will handle her inbox. She tried one. It replied to a journalist with a pricing quote meant for a lead. She turned it off the next morning.

This post is the playbook we give every client before we connect a chat or email agent to a real inbox. Three rules. If your agent does not satisfy all three, keep it in draft mode and do not let it send.

Rule one: a scope the agent can prove it is inside

Most inbox-agent failures are scope failures. The agent is asked to "handle support email" and nobody has defined what that means. So it answers a legal question from opposing counsel. It quotes a refund policy that was updated last quarter. It agrees to a meeting on a date the founder is on a plane.

The fix is boring and it works: the agent must classify the message into one of a short, closed list of intents before it is allowed to draft a reply. Not "is this a support email?" — that is a tautology. The intents are the concrete things you are willing to have answered without a human in the loop.

For a typical SMB inbox we start with five: order_status, invoice_question, meeting_request, general_info, out_of_scope. Everything that does not fit one of the first four lands in out_of_scope and gets forwarded to a human. The agent never guesses. If confidence is below a threshold, the intent is out_of_scope by default.

INTENTS = {
    "order_status",
    "invoice_question",
    "meeting_request",
    "general_info",
    "out_of_scope",
}

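CONFIDENCE_THRESHOLD = 0.75  # below this, the message goes to a human

def classify(message: str, llm) -> tuple[str, float]:
    # llm.classify stands in for whatever classification call your model client exposes.
    result = llm.classify(
        message,
        labels=sorted(INTENTS),
        system="Pick exactly one label. If unsure, pick out_of_scope.",
    )
    # Closed set: any label outside INTENTS is treated as out_of_scope.
    intent = result.label if result.label in INTENTS else "out_of_scope"
    # Low confidence is a routing decision, not a drafting one.
    if result.confidence < CONFIDENCE_THRESHOLD:
        intent = "out_of_scope"
    return intent, result.confidence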

Two things matter here. First, the label set is closed — the agent cannot invent a sixth intent at 3am. Second, low confidence is not a drafting problem, it is a routing problem. You do not want a cautiously-worded reply to a message you did not understand. You want a human to see it.

Takeaway

A chat agent without a closed intent list is not triaging your inbox, it is gambling with your replies.

Why not one big prompt

You will be tempted to skip the classifier and just give the model a long system prompt that says "only answer questions about X, Y, Z, otherwise escalate." We have tried it. It drifts. The model helpfully answers a question that was 80% inside scope and 20% outside, and the 20% is the part that gets you sued. A separate classifier step is a decision boundary you can log, audit, and tune. A single prompt is vibes.
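To make "log, audit, and tune" concrete, here is a minimal sketch of one structured record per classification. It is an illustration under assumptions, not the pipeline we run: the `Decision` fields, `record_decision`, and the JSON-lines output are all hypothetical names, and the 0.75 cut-off simply mirrors the classifier snippet above.

```python
import json
import time
from dataclasses import asdict, dataclass

THRESHOLD = 0.75  # mirrors the cut-off in the classifier snippet


@dataclass(frozen=True)
class Decision:
    message_id: str
    raw_label: str       # what the model returned, even if invalid
    confidence: float
    routed_intent: str   # what the pipeline actually acted on
    timestamp: float


def record_decision(message_id, raw_label, confidence, intents,
                    threshold=THRESHOLD, now=None) -> Decision:
    # Keep the raw label and the routed intent side by side, so an audit
    # can see every case where the boundary overrode the model.
    in_scope = raw_label in intents and confidence >= threshold
    routed = raw_label if in_scope else "out_of_scope"
    stamp = time.time() if now is None else now
    return Decision(message_id, raw_label, confidence, routed, stamp)


def to_log_line(decision: Decision) -> str:
    # One JSON object per line is easy to grep and to load into a notebook.
    return json.dumps(asdict(decision), sort_keys=True)
```

Tuning the threshold then becomes a query over those lines: count how often `raw_label` and `routed_intent` disagree, and read the messages where they do.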

Rule two: an escalation path that is the default, not the fallback

The second rule reverses how most teams think about human-in-the-loop. The agent does not reply and then escalate if something goes wrong. The agent drafts, and a human approves, until the agent has earned the right to send on its own — per intent, not in aggregate.

We run every new agent in three phases:

Shadow. The agent drafts into a Slack channel. No reply is sent. A human reads the draft alongside the real message and either edits-and-sends from their normal mail client, or ignores the draft entirely. This runs for at least 100 messages per intent.
Suggest. The agent drafts directly into the mail client as an unsent draft. A human opens it, edits, sends. We track edit distance. When median edit distance for an intent drops below a threshold (we use 15% of characters), that intent graduates.
Send. For graduated intents only, the agent sends. Every send still posts to a review channel with a one-click "this was wrong" button that pages the on-call human and demotes the intent back to Suggest.

Two intents can be in different phases at the same time. order_status might be in Send because it is a templated lookup against your order table. meeting_request might still be in Suggest because calendars are hard and nobody has been fired over a late reply but plenty have been fired over a double-booked client lunch.
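The Suggest-to-Send step and the demotion button can be sketched as a small per-intent tracker. This is a rough illustration under stated assumptions: `GraduationTracker` and its method names are hypothetical, edit distance is plain Levenshtein divided by the longer of the two texts (one reasonable reading of "15% of characters"), the rolling window size is a placeholder, and every intent is assumed to have already finished Shadow.

```python
from collections import defaultdict, deque
from statistics import median

EDIT_THRESHOLD = 0.15  # "15% of characters" from the text
WINDOW = 50            # assumption: number of recent drafts considered


def levenshtein(a: str, b: str) -> int:
    # Classic dynamic-programming edit distance, keeping two rows in memory.
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def edit_ratio(draft: str, sent: str) -> float:
    # Share of characters the human changed, relative to the longer text.
    longest = max(len(draft), len(sent))
    return levenshtein(draft, sent) / longest if longest else 0.0


class GraduationTracker:
    def __init__(self, window: int = WINDOW, threshold: float = EDIT_THRESHOLD):
        self.window = window
        self.threshold = threshold
        self.phase = defaultdict(lambda: "suggest")  # tracked per intent
        self.ratios = defaultdict(lambda: deque(maxlen=window))

    def record_edit(self, intent: str, draft: str, sent: str) -> None:
        # Only Suggest-phase drafts count toward graduation.
        if self.phase[intent] != "suggest":
            return
        self.ratios[intent].append(edit_ratio(draft, sent))
        window_full = len(self.ratios[intent]) == self.window
        if window_full and median(self.ratios[intent]) < self.threshold:
            self.phase[intent] = "send"

    def mark_wrong(self, intent: str) -> None:
        # The "this was wrong" button: demote and start measuring again.
        self.phase[intent] = "suggest"
        self.ratios[intent].clear()
```

Because phase and history are keyed by intent, order_status can reach Send while meeting_request stays in Suggest, which is the point of measuring per intent.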

Warning

Do not graduate an intent based on how the agent feels. Graduate it based on measured edit distance on real messages over a real window. Vibes ship bugs to clients.

Rule three: a write-lock on everything that touches the outside world

The third rule is the one teams skip because it feels like paranoia until the day it is not. An agent that can read your inbox is a research tool. An agent that can write — send email, call an API, update a CRM, move money — is a liability. Treat those two capabilities as separate systems with separate credentials.

In practice this means the agent never holds the sending credential. A narrow service does. The agent calls that service with a structured request, and the service enforces the rules the agent cannot be trusted to enforce on itself: rate limits per recipient, domain allowlists, a hard cap on sends per hour, a block on anything that looks like a new external domain in the first 24 hours of a conversation.

def send_reply(draft: Draft, agent_token: str) -> SendResult:
    # The agent has agent_token. It does NOT have SMTP creds.
    # token_is_valid is a placeholder for however your gateway authenticates callers.
    if not token_is_valid(agent_token):
        return SendResult.rejected("invalid_agent_token")
    # Each check below is a rule the agent cannot be trusted to enforce on itself.
    if not allowlist.contains(draft.to_domain):
        return SendResult.rejected("domain_not_allowlisted")
    if rate_limiter.exceeded(draft.to_address):
        return SendResult.rejected("rate_limited")
    if draft.intent not in graduated_intents():
        return SendResult.rejected("intent_not_graduated")
    if contains_payment_instruction(draft.body):
        return SendResult.rejected("payment_language_blocked")
    return mail_gateway.send(draft)  # gateway holds the real creds

The pattern is simple: a credential proxy between the agent and the thing that can cause damage. The agent asks, the proxy decides. The agent never holds the key. Whether you write this as 80 lines of Python or adopt one of the emerging agent-gateway tools, the principle does not change.

The payment-language check in that snippet is not theoretical. Any agent replying to invoice questions will eventually be asked to confirm new bank details, and "confirm" is exactly the word you do not want it to say. Block the vocabulary at the gateway. Let a human handle it.
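A rough sketch of what `contains_payment_instruction` might look like, assuming a plain pattern list is enough for a first version. The patterns are illustrative, not a vetted vocabulary, and they deliberately err toward blocking: a false positive costs one human review, while a false negative can cost a payment.

```python
import re

# Illustrative vocabulary only; extend it from your own escalation queue.
PAYMENT_PATTERNS = [
    r"\bIBAN\b",
    r"\b(?:swift|bic)\s+code\b",
    r"\bbank\s+(?:details|account)\b",
    r"\baccount\s+number\b",
    r"\bsort\s+code\b",
    r"\brouting\s+number\b",
    r"\bwire\s+transfer\b",
    # A change or confirmation verb close to a payment noun.
    r"\b(?:confirm|updated?|new|changed?)\b.{0,40}\b(?:payment|bank|account)\b",
]

_PAYMENT_RE = re.compile(
    "|".join(f"(?:{pattern})" for pattern in PAYMENT_PATTERNS),
    re.IGNORECASE | re.DOTALL,
)


def contains_payment_instruction(body: str) -> bool:
    # True if the draft mentions anything resembling payment instructions.
    return _PAYMENT_RE.search(body) is not None
```

The last pattern will also catch harmless phrases such as "new account manager". At the gateway that is an acceptable trade: the draft goes to a human instead of out the door.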

What these three rules buy you

Put the three together — closed-set classifier, per-intent graduation, write-proxy with hard rules — and you end up with an agent that is useful on day one and boring by month three. Boring is the goal. A chat agent that replies to 60% of your inbox in a way nobody notices is worth ten agents that reply to 100% in a way somebody screenshots.

The OWASP working group on LLM security has been converging on similar ground in their Top 10 for LLM Applications. Prompt injection, excessive agency, and insecure output handling are all variations of the same failure mode: the agent was allowed to act before it was constrained. NIST AI 600-1, the Generative AI Profile for the AI Risk Management Framework, makes a related argument in more formal language: confinement before capability. The three rules above are one practical shape of those constraints for an inbox.

A five-minute audit you can run today

If you have a chat or email agent live right now, open three tabs.

Tab one: the agent's system prompt. Is there a closed list of intents it is allowed to handle, or is there a paragraph that starts "You are a helpful assistant that"? If the second, you have no scope.

Tab two: the last 50 messages the agent sent. For each one, ask whether a human read the draft before it went out. If the answer is "the agent has been running itself for weeks," you skipped the graduation phase.

Tab three: the code path that actually calls your mail provider. Is the sending credential in the same process as the LLM call? If yes, you have no write-lock. Any prompt injection that reaches the agent reaches your outbox.

When we built the triage agent for a Rotterdam logistics client, the thing we ran into was exactly rule three: the first version held its own SMTP credential and a single malformed forwarded email convinced it to reply to an entire mailing list. We rebuilt it behind a gateway the same week, and that gateway is now the template for every AI agent we ship. Three rules. In that order.

Frequently asked

How many intents should my classifier start with?

Four to six. Fewer and you will route too much to humans; more and the classifier gets unreliable. Add new intents only when a real pattern shows up in the escalation queue.

What edit-distance threshold means an intent is ready to send on its own?

We graduate an intent when the median character edit distance across its last 50 drafts falls below 15%. If humans barely touch the draft, the agent has earned send rights. Measure per intent, not in aggregate.

Can I use one big system prompt instead of a separate classifier?

You can, but it drifts under pressure. A separate classification step gives you a logged decision boundary you can audit and tune. A single prompt gives you vibes and a bad week.

What belongs behind the write-proxy besides sending email?

Anything that touches the outside world: CRM writes, calendar invites, payment confirmations, webhook calls. If a mistake there would embarrass you in front of a client, it belongs behind the proxy.

