It's a Tuesday morning and an operations lead at a 40-person logistics firm in Rotterdam is staring at three browser tabs: a Gmail inbox with 812 unread threads, a Notion wiki nobody has opened since Q3, and a pricing sheet in an Excel file called final_v7_USE_THIS_ONE.xlsx. Her CEO has just asked her, in the hallway, which AI agent they should build first.
She doesn't need a strategy deck. She needs a map.
After shipping fourteen agents into production over the last two years (for law firms, e-commerce operators, a veterinary chain, a couple of B2B SaaS companies), we've stopped pretending every agent is bespoke. They're not. Most of what a small business actually runs in production falls into one of four archetypes. Pick the wrong one first and you'll waste a quarter. Pick the right one and you'll buy yourself the equivalent of a hire.
Here they are, from the one most people picture first to the one that should ship last. The recommended build order comes after the four descriptions.
The responder
The responder reads an incoming message — email, WhatsApp, web form, sometimes a phone call transcribed in real time — and replies. That's it. It doesn't decide strategy. It doesn't write code. It drafts or sends a response that sounds like your business, using facts it can look up.
This is the archetype most people mean when they say "AI agent", and it's almost always one of the first two to build. Reasons:
- The input is bounded (one message).
- The output is bounded (one reply).
- The failure mode is visible (a bad draft, which a human catches before send).
- The ROI is countable (minutes per message × messages per day).
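The ROI line in that list is plain arithmetic, and it can be sketched in a few lines. The function name, the `share_handled` parameter, and the example numbers below are all hypothetical, chosen only to show the shape of the calculation; substitute your own measurements.

```python
def minutes_saved_per_day(minutes_per_message: float,
                          messages_per_day: int,
                          share_handled: float = 1.0) -> float:
    """Minutes per message x messages per day, scaled by the share the agent covers."""
    if not 0.0 <= share_handled <= 1.0:
        raise ValueError("share_handled must be between 0 and 1")
    return minutes_per_message * messages_per_day * share_handled


# Hypothetical inputs: 4 minutes per reply, 120 messages a day,
# and an agent that handles half of them.
saved = minutes_saved_per_day(4, 120, share_handled=0.5)
print(f"{saved:.0f} minutes/day, about {saved / 60:.1f} hours")
# prints "240 minutes/day, about 4.0 hours"
```

The `share_handled` factor is an assumption added here because a draft-only responder rarely covers every category on day one.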
A responder that drafts but doesn't send is a responder on training wheels. You run it for three weeks with a human clicking "approve", you measure the edit distance between each draft and the version that was actually sent, and when the average for a category drops below a threshold you chose in advance, you turn on autosend for that category. That's the whole playbook.
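One way to run that measurement is sketched below. It is a minimal illustration, not the tooling used in any particular deployment. It uses the standard library's `difflib.SequenceMatcher` similarity ratio as a stand-in for a true edit distance, and the `threshold` and `min_samples` defaults are assumptions you would tune.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from statistics import mean


def edit_fraction(draft: str, sent: str) -> float:
    """0.0 means the draft was sent unchanged; values near 1.0 mean little survived."""
    return 1.0 - SequenceMatcher(None, draft, sent).ratio()


def autosend_candidates(history, threshold=0.05, min_samples=20):
    """history: iterable of (category, draft, sent) tuples from the approval period.

    Returns the categories whose average edit fraction is below the threshold,
    ignoring categories with too few samples to judge.
    """
    by_category = defaultdict(list)
    for category, draft, sent in history:
        by_category[category].append(edit_fraction(draft, sent))
    return sorted(
        category
        for category, scores in by_category.items()
        if len(scores) >= min_samples and mean(scores) < threshold
    )


# Hypothetical approval log: order-status drafts go out untouched,
# complaint drafts get rewritten by the human every time.
history = [("order_status", "Your order ships Friday.", "Your order ships Friday.")] * 25
history += [("complaint", "Sorry about that.",
             "We apologise and will refund you in full today.")] * 25
print(autosend_candidates(history))  # ['order_status']
```

Averaging per category rather than across the whole inbox is what lets you switch on autosend for the boring categories while the sensitive ones stay draft-only.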
If your team is drowning in an inbox, build a responder first. Everything else is a distraction until that fire is out.
When not to pick this
If your inbound volume is low but every message is high-stakes (say, a two-partner law firm handling ten inquiries a day at €20k each), the responder is a waste. You'd be automating the part of the job the partners actually want to do. Skip to the retriever.
The triager
The triager doesn't reply. It reads, classifies, routes, and tags. It decides: is this a support ticket, a sales lead, a complaint, a recruiter pitch, or an invoice? It assigns it an owner. It adds a priority. It writes a one-line summary into the ticket.
This is the agent that saves the person who currently spends the first two hours of every morning sorting things. In our experience, that person exists at every company above 15 people, and they're usually the most overqualified person to be doing the job.
A triager is cheaper and faster to build than a responder because it produces structured output, not prose. It's also easier to evaluate: you have a labelled ground truth (the human's past sorting) and you can compute accuracy directly.
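def triage(message: str) -> dict:
    # build_prompt, llm, TriageSchema, CATEGORIES and route_table are defined elsewhere.
    prompt = build_prompt(message, categories=CATEGORIES)
    # Ask for structured output so the result can be validated, not parsed from prose.
    result = llm.complete(prompt, schema=TriageSchema)
    return {
        "category": result.category,
        "priority": result.priority,
        "owner": route_table[result.category],  # the category decides who owns it
        "summary": result.summary,  # one-line summary written into the ticket
    }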
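The evaluation mentioned above, comparing the triager's labels with the human's past sorting, can be sketched as follows. The function name and the sample labels are hypothetical. The per-category figure here is the share of each true category that the triager got right.

```python
from collections import Counter


def triage_accuracy(predicted, actual):
    """Overall and per-category accuracy against the human's past sorting."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must be the same length")
    if not actual:
        raise ValueError("need at least one labelled message")
    totals = Counter(actual)
    hits = Counter(a for p, a in zip(predicted, actual) if p == a)
    overall = sum(hits.values()) / len(actual)
    per_category = {cat: hits[cat] / n for cat, n in totals.items()}
    return overall, per_category


# Hypothetical sample: five messages the human already sorted.
human = ["support", "sales", "invoice", "invoice", "complaint"]
model = ["support", "sales", "complaint", "invoice", "complaint"]
overall, per_category = triage_accuracy(model, human)
print(overall)                  # 0.8
print(per_category["invoice"])  # 0.5
```

The per-category breakdown matters more than the headline number: a triager can look accurate overall while failing consistently on one category.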
Pair a triager with a responder and you have something close to an inbox operating system: messages arrive, get sorted, get drafted, and a human does the final click. Most of our email-agent deployments are this pair.
The retriever
The retriever is the agent people usually call "a RAG system" or "a chatbot over our docs". Someone asks a question, the agent looks across your wiki, your past tickets, your contracts, your product catalogue, and answers with citations.
It's the archetype that sounds the most exciting and delivers the least, early on. Not because the tech doesn't work — it does — but because most small businesses don't have the knowledge base to retrieve from. The Notion wiki is three years out of date. The SharePoint has four copies of the same policy. The Confluence was last loved by someone who left in 2022.
A retriever built on a messy knowledge base will confidently repeat whichever wrong document happens to rank highest. Fix the corpus before you build the agent.
If your knowledge is clean, the retriever can go in straight after the inbox agents; if it isn't, run a two-week documentation sprint first. It pays off for onboarding, customer support, and the "what's our policy on X" questions that eat a manager's Slack.
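A documentation sprint can start with a quick audit of the corpus. The sketch below is one hypothetical way to do it: it flags documents that are identical after whitespace and case normalisation, plus documents not edited within `max_age_days`. The document IDs, dates, and the one-year default are illustrative assumptions, and near-duplicates with small wording changes would need a fuzzier comparison.

```python
import hashlib
from collections import defaultdict
from datetime import date, timedelta


def audit_corpus(docs, today, max_age_days=365):
    """docs: iterable of (doc_id, text, last_edited) with last_edited a date.

    Returns (duplicates, stale): groups of doc IDs with identical normalised
    text, and doc IDs last edited more than max_age_days ago.
    """
    by_hash = defaultdict(list)
    stale = []
    for doc_id, text, last_edited in docs:
        normalised = " ".join(text.lower().split())
        digest = hashlib.sha256(normalised.encode("utf-8")).hexdigest()
        by_hash[digest].append(doc_id)
        if today - last_edited > timedelta(days=max_age_days):
            stale.append(doc_id)
    duplicates = [ids for ids in by_hash.values() if len(ids) > 1]
    return duplicates, stale


# Hypothetical three-document corpus.
docs = [
    ("policy_a", "Returns accepted within 30 days.", date(2025, 1, 10)),
    ("policy_b", "returns  accepted within 30 days.", date(2025, 2, 1)),
    ("old_wiki", "Warehouse opening hours", date(2022, 3, 5)),
]
duplicates, stale = audit_corpus(docs, today=date(2025, 6, 1))
print(duplicates)  # [['policy_a', 'policy_b']]
print(stale)       # ['old_wiki']
```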
The operator
The operator does things. It books meetings, updates the CRM, issues refunds, reschedules deliveries, posts to a Slack channel, opens a Jira ticket, runs a SQL query and emails the result. It has hands.
This is the archetype everyone wants to build first and should build last. Not because operators are exotic (individually they're the simplest to reason about) but because an operator acts on what the other three produce: a wrong classification, a wrong draft, or a wrong retrieved fact becomes a wrong action. It compounds every mistake made upstream. You want the responder, triager, and retriever stable before you hand any of them write access to your production systems.
The useful mental model: the operator is the arms and legs. The other three are the senses and the brain. Build the senses first.
Voice as a special case
Voice agents — the ones that answer your phone — are a hybrid of responder and operator, and they deserve their own treatment. The short version: the voice tier is ready for restaurant bookings, appointment scheduling, and tier-one triage. It is not ready to sell your €80k SaaS. Know where on that line your business sits before you commit.
The order to build them
For most companies between €500k and €50M in annual revenue, the honest sequence is:
- Triager on inbound email. One week. Cheapest, lowest-risk, immediate visibility.
- Responder over the top of the triager, draft-only. Two to four weeks. Switch categories to autosend as confidence grows.
- Retriever over your cleaned-up internal docs. Two weeks if the corpus is tidy, four to six if it needs a sprint first.
- Operator last, scoped to one workflow at a time. Start with something reversible (draft a CRM update) before something destructive (issue a refund).
This order also happens to match the risk curve. The triager can do little harm: a misrouted message costs time, not data. The responder can do harm, but a human is in the loop. The retriever can mislead, but only in answer to a direct question. The operator can quietly break a database at 3am.
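The "reversible before destructive" rule for operators can be made explicit in code. The sketch below is a hypothetical gate, not a description of any specific framework: actions marked reversible run immediately, and anything else waits in a queue until a human approves it. The class and action names are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Action:
    name: str
    run: Callable[[], str]
    reversible: bool


@dataclass
class Operator:
    approval_queue: list = field(default_factory=list)
    log: list = field(default_factory=list)

    def submit(self, action: Action) -> str:
        """Run reversible actions at once; park destructive ones for a human."""
        if action.reversible:
            self.log.append((action.name, action.run()))
            return "executed"
        self.approval_queue.append(action)
        return "queued"

    def approve(self, name: str) -> str:
        """A human releases one queued action by name."""
        for i, action in enumerate(self.approval_queue):
            if action.name == name:
                self.approval_queue.pop(i)
                self.log.append((action.name, action.run()))
                return "executed"
        raise KeyError(name)


op = Operator()
print(op.submit(Action("draft_crm_update", lambda: "draft saved", reversible=True)))  # executed
print(op.submit(Action("issue_refund", lambda: "refund issued", reversible=False)))   # queued
print(op.approve("issue_refund"))                                                     # executed
```

Keeping the reversibility flag on the action, not inside the agent's prompt, means the gate still holds when the model is wrong about what it is doing.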
A note on whose data trains what
The ambient reality right now is that many of the SaaS tools your team uses are quietly updating their terms so that they can train models on your content by default. This quarter, check the settings panel of every tool you pay for; where there is a training toggle, it is usually already on.
For a small business, the practical consequence is: the competitive edge of an agent isn't the model. It's the data you feed it that nobody else has — your past emails, your ticket history, your call transcripts, your invoice patterns. Keep that data somewhere you control, and when you build agents, prefer architectures where the proprietary context stays on your side of the wire.
What this looks like in practice
When we built the email-agent stack for a Dutch B2B distributor last autumn, the triager kept misclassifying supplier invoices as customer complaints, because both contained the word "urgent". We solved it by giving the triager a short tool call that checks whether the sender's domain appears in the accounts-payable list before it finalises the category: a five-line fix that lifted accuracy from 82% to 97% overnight. That's what an AI-agent project actually looks like: not a moonshot, just an honest loop of classify, measure, patch.
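A sender-domain check of that kind might look roughly like the sketch below. This is not the fix from that project. It is a simplified, deterministic post-check in place of a model tool call, and the domain list, category names, and function names are all hypothetical.

```python
# Hypothetical accounts-payable domains; in practice this would be loaded
# from the finance system, not hard-coded.
ACCOUNTS_PAYABLE_DOMAINS = {"acme-supplies.example", "north-freight.example"}


def sender_domain(address: str) -> str:
    """Return the lower-cased part after the last '@' of an email address."""
    return address.rsplit("@", 1)[-1].strip().lower()


def finalise_category(model_category: str, sender: str,
                      ap_domains=ACCOUNTS_PAYABLE_DOMAINS) -> str:
    """Override 'complaint' with 'invoice' when the sender is a known supplier."""
    if model_category == "complaint" and sender_domain(sender) in ap_domains:
        return "invoice"
    return model_category


print(finalise_category("complaint", "Billing@Acme-Supplies.example"))  # invoice
print(finalise_category("complaint", "jan@customer.example"))           # complaint
```

The override is kept deliberately narrow, touching only the one confusion that was actually observed, so it cannot silently reclassify anything else.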
Before you build anything: open a spreadsheet, list the ten most common inbound messages your team handled last week, and tag each one with triage, reply, lookup, or action. Whichever column is longest is the archetype to build first. That's your map.




