How to use AI to automate customer support (and when not to)

Most support teams get pitched the same dream: an AI that answers every customer question, eliminates the backlog, and lets your team focus on strategy. What they actually need is much smaller and much more useful.

The truth about AI in customer support is that it works best on a narrow slice of your tickets. Those are the ones that follow a predictable pattern—password resets, billing questions, refund status checks, shipping delays. The ones your team has answered 500 times. AI handles those well. Everything else either requires domain knowledge your model doesn't have, or judgment that no automation should make alone.

The data backs this up. Research from Intercom found that AI currently resolves 11–30% of support volume among teams that have adopted it[1]. That's not a failure. It's close to the realistic ceiling for what automation can handle today without creating more problems than it solves. The 70% or more that remains needs human judgment, product knowledge, or something closer to empathy than a language model can reliably deliver.

The question isn't "should we use AI for support?" It's "which 30% of our work can AI actually handle, and are we willing to optimize for that instead of pretending it handles everything?"

What AI does well in support

AI is very good at classification and retrieval. Show it a support ticket, and it can quickly identify which category it falls into: billing, technical, account access, refund, shipping. That alone saves time. Instead of a human reading the ticket and deciding where it goes, an AI can triage in seconds.
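As a rough illustration of that triage step, here is a minimal Python sketch. It assumes a hypothetical `model` callable that takes a prompt string and returns text; this stands in for whichever model API you use and is not a real library call. The `needs_human` fallback is also an assumption added for the example.

```python
from typing import Callable

# Categories from the paragraph above; rename to match your own queues.
CATEGORIES = ("billing", "technical", "account access", "refund", "shipping")


def build_prompt(ticket_text: str) -> str:
    """Ask for exactly one category label and nothing else."""
    options = ", ".join(CATEGORIES)
    return (
        f"Classify this support ticket as one of: {options}.\n"
        "Reply with the category name only.\n\n"
        f"Ticket: {ticket_text}"
    )


def triage(ticket_text: str, model: Callable[[str], str]) -> str:
    """Return a known category, or 'needs_human' if the reply is off-list."""
    reply = model(build_prompt(ticket_text)).strip().lower()
    return reply if reply in CATEGORIES else "needs_human"


# Stand-in model for demonstration only: it always answers "Refund".
def fake_model(prompt: str) -> str:
    return "Refund\n"


print(triage("Where is my refund for order 1234?", fake_model))  # refund
```

Checking the reply against a fixed list means an unexpected answer goes to a person instead of being routed on a guess.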

It's also good at providing answers to questions where the answer already exists. If you have a knowledge base, API documentation, or a set of FAQs, an AI can search that content, pull the relevant section, and draft a response. The human support agent then reviews and sends it, or a customer reads it directly in a self-service portal.
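To make the retrieval idea concrete, here is a deliberately simple sketch that matches a question to a knowledge base article by keyword overlap and returns nothing when the match is weak. Real systems usually use a search index or embeddings; the word-overlap scoring, the `min_overlap` threshold, and the sample articles are all assumptions made for illustration.

```python
import re
from typing import Dict, Optional

# Small stop-word list so common words don't count as matches.
STOPWORDS = {"a", "an", "and", "do", "how", "i", "is", "my", "of", "the", "to", "you", "your"}


def keywords(text: str) -> set:
    """Lowercase word tokens minus stop words."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def find_article(question: str, kb: Dict[str, str], min_overlap: int = 2) -> Optional[str]:
    """Return the title of the best-matching article, or None if nothing matches well."""
    q = keywords(question)
    best_title, best_score = None, 0
    for title, body in kb.items():
        score = len(q & keywords(title + " " + body))
        if score > best_score:
            best_title, best_score = title, score
    return best_title if best_score >= min_overlap else None


# Hypothetical knowledge base entries.
KB = {
    "Reset your password": "Go to Settings, click Reset password, and follow the email link.",
    "Refund status": "Refunds appear within five business days of approval.",
}

print(find_article("How do I reset my password?", KB))  # Reset your password
print(find_article("Do you ship to Mars?", KB))         # None
```

Returning `None` instead of the closest weak match is the important part: it gives the workflow a clean signal to hand the question to a person.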

Claude and ChatGPT both handle this reasonably well, but with different tradeoffs. Claude tends to be more literal and careful with instructions, which matters if you're giving it a knowledge base to search; it is less likely to extrapolate beyond what the source actually says. ChatGPT has broader general knowledge but is more prone to hallucinating details that sound right but aren't. For support work, that's a real liability: you'd rather have an AI that says "I don't know" than one that confidently gives the wrong answer. Pick Claude if your knowledge base is incomplete and you need strict adherence to what you've actually documented. Pick ChatGPT if you're doing lighter triage and classification where a bit of general reasoning is helpful.

What makes a difference in practice is workflow. A purpose-built support platform that integrates AI directly into the support interface is a different thing from a raw ChatGPT prompt. The integrated platforms can pull customer history, order details, and account status automatically. They can format the AI's response before it reaches the customer. They can log what happened and flag escalations for a human to handle. That structural layer is where most of the actual value lives.

Where AI fails in support

The moment a ticket requires context, judgment, or deep empathy, AI gets fragile. Context is the main one. An AI doesn't know that this particular customer has been with you for five years but had a bad experience last month. It doesn't know they're a high-value customer or that they're three days away from their refund window closing. It doesn't know what your internal policy actually is when a rule has edge cases.

Judgment failures are worse because they're invisible. An AI might suggest a refund to keep a customer happy when the customer's complaint is a known issue you're patching next week, or when they simply misunderstood the product. A human would know. An AI that sounds confident but is wrong damages trust faster than no response at all.

Empathy is harder to articulate but easier to feel. A customer who is frustrated doesn't want an efficient response. They want to feel heard. An AI-generated response, even a good one, often reads like an AI-generated response—just competent enough that the customer realizes it's not a person, which can make them angrier.

The honest tradeoff is this: AI can handle volume, but humans build loyalty. If your goal is to minimize tickets and cut costs on repetitive work, you want more AI. If your goal is retention and satisfaction, you want the opposite. Most businesses need both, which is why the 30% threshold exists.

The real decision: which tickets should AI handle?

Before implementing any AI support tool, map your tickets by type and volume. Which categories do you handle most? Which ones are repetitive?
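A quick way to do that mapping, assuming you can export one category label per ticket from your help desk, is a simple count. The sample data and category names below are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def volume_by_category(categories: Iterable[str]) -> List[Tuple[str, int, float]]:
    """Return (category, count, share of total) sorted by count, largest first."""
    counts = Counter(categories)
    total = sum(counts.values())
    return [(cat, n, n / total) for cat, n in counts.most_common()]


# Hypothetical export: one category label per closed ticket.
sample = ["password_reset"] * 5 + ["order_status"] * 3 + ["billing_dispute"] * 2

for cat, n, share in volume_by_category(sample):
    print(f"{cat:16} {n:3d} {share:.0%}")
```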

Password resets, "where is my order" inquiries, refund status checks, general pricing questions, subscription management—these are good candidates. They have one answer, the answer is usually in your system already, and there's low risk if the AI gets it slightly wrong because a human can correct it.

Technical troubleshooting for complex features, product feedback, complaints about service quality, billing disputes, refund requests that don't fit your standard policy—these are not good candidates. They need judgment or knowledge that varies by case.

The second filter is risk. If the AI gets a password reset wrong, the customer resubmits. If it denies a refund by accident, you've created a problem. The higher the cost of error, the more you want a human in the loop.
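The two filters can be combined into a small routing rule. This is a sketch under stated assumptions: the category and action names are illustrative, and which actions count as high cost is a decision for your team, not something the code can settle.

```python
# Categories listed above as good candidates (names are illustrative).
AUTOMATABLE = {"password_reset", "order_status", "refund_status", "pricing", "subscription"}

# Hypothetical actions where a mistake is costly for the customer or the business.
HIGH_COST_ACTIONS = {"deny_refund", "issue_refund", "cancel_subscription"}


def route(category: str, proposed_action: str) -> str:
    """Apply the category filter first, then the risk filter."""
    if category not in AUTOMATABLE:
        return "human"
    if proposed_action in HIGH_COST_ACTIONS:
        return "ai_draft_human_approves"
    return "ai"


print(route("password_reset", "send_reset_link"))  # ai
print(route("refund_status", "deny_refund"))       # ai_draft_human_approves
print(route("billing_dispute", "reply"))           # human
```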

BCG's research on AI implementation offers a useful framework here. The 10-20-70 rule suggests that successful AI deployment is 10% algorithms, 20% technology and data, and 70% people and processes[2]. That means the real work isn't picking a tool. It's redesigning your workflow so the AI handles what it can, but your team knows exactly when to step in.

How to actually implement this

Start with a narrow scope. Pick one category of ticket—let's say refund status requests. Set up a workflow where incoming tickets get routed to an AI that checks your system for the refund status and pulls the last three interactions with that customer. The AI drafts a response. A human reviews it and hits send, or the customer can read it in a self-service portal.
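The workflow above can be sketched roughly as follows. Everything here is hypothetical: the `Customer` and `Draft` shapes, the `drafter` callable that stands in for the model, and the in-memory data. The point is the sequence: gather context, draft, then hold for human review.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Customer:
    name: str
    refund_status: str
    interactions: List[str] = field(default_factory=list)


@dataclass
class Draft:
    customer: str
    text: str
    status: str = "pending_review"  # a human approves before anything is sent


def build_context(customer: Customer) -> Dict[str, object]:
    """Collect the refund status and the last three interactions."""
    return {
        "name": customer.name,
        "refund_status": customer.refund_status,
        "recent_interactions": customer.interactions[-3:],
    }


def draft_refund_reply(customer: Customer, drafter: Callable[[Dict[str, object]], str]) -> Draft:
    """Build context, let the drafter write a reply, and queue it for review."""
    return Draft(customer=customer.name, text=drafter(build_context(customer)))


# Stand-in for a model call: a fixed template.
def template_drafter(context: Dict[str, object]) -> str:
    return f"Hi {context['name']}, your refund is currently: {context['refund_status']}."


alice = Customer("Alice", "approved, not yet paid out", ["t1", "t2", "t3", "t4"])
draft = draft_refund_reply(alice, template_drafter)
print(draft.status, "|", draft.text)
```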

Run this for two weeks. Track how many tickets the AI handles end-to-end versus how many get flagged for human review. Measure whether the human review step actually changes the AI's response. If humans are changing it every time, the setup isn't working. If humans almost never change it, you've found a real automation opportunity.
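One way to measure this, assuming you log both the AI draft and the text that was finally sent, is an edit rate per category. The record shape below is hypothetical, and comparing whitespace-normalized text is a simplification; you may prefer to ignore small wording tweaks.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class ReviewedTicket:
    category: str
    ai_draft: str
    sent_text: str


def _normalize(text: str) -> str:
    """Collapse whitespace so formatting-only changes don't count as edits."""
    return " ".join(text.split())


def edit_rate_by_category(tickets: Iterable[ReviewedTicket]) -> Dict[str, float]:
    """Share of tickets per category where the human changed the AI draft."""
    totals = defaultdict(int)
    edited = defaultdict(int)
    for t in tickets:
        totals[t.category] += 1
        if _normalize(t.ai_draft) != _normalize(t.sent_text):
            edited[t.category] += 1
    return {cat: edited[cat] / totals[cat] for cat in totals}


log = [
    ReviewedTicket("refund_status", "Your refund was approved.", "Your refund was approved."),
    ReviewedTicket("refund_status", "Your refund was approved.", "Your refund  was approved. "),
    ReviewedTicket("refund_status", "Your refund was denied.", "Your refund is under review."),
    ReviewedTicket("refund_status", "It is on the way.", "It is on the way."),
]
print(edit_rate_by_category(log))  # {'refund_status': 0.25}
```

A rate near 1.0 suggests the setup isn't working for that category; a rate near 0 suggests a real automation opportunity.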

Then expand to another category. Don't try to automate everything at once. This disciplined approach, where you measure AI's impact on each ticket type before scaling, is where most of the value comes from. The tools themselves matter less than this rigor.

One more practical note: be transparent with customers about when they're talking to AI. A customer asking "why is this taking so long" and getting an automated response feels deceptive if they didn't know it was coming. A customer who sees a clearly labeled "AI response" and has an easy way to escalate to a human feels like they got quick help. The framing changes the experience.

When not to use AI for support

If your support volume is very low, automation won't pay for itself. A team answering 20 tickets a day probably shouldn't spend a week setting this up.
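A back-of-the-envelope break-even check makes the point. The inputs below are assumptions for illustration only: a 30% automatable share (the upper end of the range cited earlier), five minutes saved per automated ticket, and 40 hours of setup.

```python
def breakeven_days(
    tickets_per_day: float,
    automatable_share: float,
    minutes_saved_per_ticket: float,
    setup_hours: float,
) -> float:
    """Working days until the time saved equals the time spent on setup."""
    hours_saved_per_day = tickets_per_day * automatable_share * minutes_saved_per_ticket / 60
    if hours_saved_per_day <= 0:
        return float("inf")
    return setup_hours / hours_saved_per_day


# Illustrative inputs, not measurements.
print(breakeven_days(20, 0.30, 5, 40))
```

With these assumed numbers the setup takes roughly 80 working days to pay back, and that's before counting ongoing maintenance.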

If your customers expect personalization and relationship continuity, AI is the wrong move. A customer-success-heavy business needs humans who remember the customer's context. Similarly, if you're using AI as a way to avoid hiring support staff you actually need, you've picked the wrong problem to solve. AI is good at scaling what works. It's not a substitute for the people who figure out what working actually means.

If your knowledge base is sparse or out of date, an AI will just generate confident-sounding wrong answers faster. Fix the content first.

The teams getting real value from AI in support aren't the ones that wanted to eliminate support. They're the ones that wanted to eliminate the repetitive parts so their humans could focus on judgment calls and customer relationships. That's a specific decision to make, and it starts by admitting that 70% of your work isn't automatable yet, no matter what the vendor promised. The moment you accept that constraint, you stop chasing the wrong goal and start building something that actually works.