The previous three projects — inbox triage, SQL query, and RAG document — each solved one problem with one agent. This project asks a different question: what happens when one agent isn't enough?
Customer support is messy. Someone asks about their order status. Someone else wants to know the return policy. A third person is angry about a broken product. These are fundamentally different problems that require different knowledge, different tools, and different tones.
One monolithic agent trying to handle all of this becomes a confused generalist. It hallucinates order statuses because it doesn't have database access. It makes up policies because it doesn't have the docs. It responds to complaints with the same cheerful tone it uses for FAQs.
The answer is specialization.
The architecture
```
        Customer message
               │
               ▼
  ┌─────────────────────────┐
  │                         │
  │   ORCHESTRATOR AGENT    │
  │                         │
  │   Reads the message,    │
  │   understands intent,   │
  │   routes to the right   │
  │   specialist.           │
  │                         │
  └────┬──────┬────────┬────┘
       │      │        │
   ┌───┘      │        └──┐
   │          │           │
   ▼          ▼           ▼
┌─────┐    ┌─────┐   ┌─────────┐
│ORDER│    │ FAQ │   │COMPLAINT│
│AGENT│    │AGENT│   │  AGENT  │
└──┬──┘    └──┬──┘   └────┬────┘
   │          │           │
   ▼          ▼           ▼
  SQL        RAG       Empathy
 Query     Search     + Logging
 on DB     on Docs  + Escalation
```
The orchestrator doesn't answer anything itself. It's a router. It reads the customer's message, figures out what kind of problem it is, and hands it to the agent that's built for that exact job.
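The router shape can be sketched in a few lines of plain Java. All names here are illustrative, not the project's actual classes, and the small `Llm` interface stands in for Spring AI's `ChatClient`:

```java
import java.util.Map;

// Stand-in for the project's LLM client (Spring AI's ChatClient);
// a hypothetical interface just for this sketch.
interface Llm {
    String complete(String prompt);
}

// Every specialist exposes the same narrow surface to the orchestrator.
interface SupportAgent {
    String handle(String customerMessage);
}

enum Intent { ORDER, FAQ, COMPLAINT }

class Orchestrator {
    private final Llm llm;
    private final Map<Intent, SupportAgent> specialists;

    Orchestrator(Llm llm, Map<Intent, SupportAgent> specialists) {
        this.llm = llm;
        this.specialists = specialists;
    }

    // The orchestrator never answers the customer itself: classify, then delegate.
    String route(String message) {
        Intent intent = classify(message);
        return specialists.get(intent).handle(message);
    }

    // Assumes the model returns a clean one-word label for simplicity.
    private Intent classify(String message) {
        String label = llm.complete(
            "Classify the customer's intent as ORDER, FAQ, or COMPLAINT. "
            + "Reply with exactly one word.\n\nMessage: " + message);
        return Intent.valueOf(label.trim().toUpperCase());
    }
}
```

Because the specialists hide behind one interface, swapping an agent's internals (a different database, a different vector store) never touches the routing layer.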
The three specialists
Order Agent — has database access. When someone asks "where's my order?" it doesn't guess. It runs a real SQL query against the orders table, finds the actual status, and responds with facts. This is the SQL Query Agent pattern from the previous project, repurposed as a worker.
FAQ Agent — has document access. Company policies, return windows, shipping info, working hours — all embedded as vectors in pgvector. When someone asks "what's your return policy?" it retrieves the relevant chunks and answers from real documentation. This is the RAG pattern, repurposed as a worker.
Complaint Agent — has no tools. Its job is emotional, not informational. It acknowledges the frustration, apologizes sincerely, logs the complaint, and escalates to a human. The system prompt is tuned for empathy, not efficiency. Sometimes the right answer isn't an answer — it's being heard.
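A hedged sketch of the three workers, each with the same `handle(String)` surface but completely different internals. The in-memory map, list, and log below stand in for the real JDBC, pgvector, and ticketing dependencies, and every name is illustrative:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class OrderAgent {
    private final Map<Integer, String> orders; // stand-in for the SQL orders table

    OrderAgent(Map<Integer, String> orders) { this.orders = orders; }

    String handle(String msg) {
        // Real version: the LLM generates SQL, the agent executes it,
        // the LLM phrases the result. Here: a direct lookup by order number.
        var m = java.util.regex.Pattern.compile("#(\\d+)").matcher(msg);
        if (!m.find()) return "Could you share your order number?";
        String status = orders.get(Integer.parseInt(m.group(1)));
        return status != null ? "Your order is currently " + status + "."
                              : "I couldn't find that order.";
    }
}

class FaqAgent {
    private final List<String> chunks; // stand-in for pgvector similarity search

    FaqAgent(List<String> chunks) { this.chunks = chunks; }

    String handle(String msg) {
        // Naive "retrieval": the chunk sharing the most words with the question.
        String[] q = msg.toLowerCase().split("\\W+");
        return chunks.stream()
            .max(java.util.Comparator.comparingLong(
                c -> java.util.Arrays.stream(q)
                        .filter(c.toLowerCase()::contains).count()))
            .orElse("No documentation available.");
    }
}

class ComplaintAgent {
    final List<String> complaintLog = new ArrayList<>(); // stand-in for ticketing

    String handle(String msg) {
        complaintLog.add(msg); // real version fires the escalation hook here
        return "I'm really sorry about this experience. "
             + "I've escalated it to our support team.";
    }
}
```

Note the asymmetry: two agents are wrappers around data access, while the third is pure conversation plus a side effect. The shared surface is what lets the orchestrator treat them identically.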
How routing works
The orchestrator uses the LLM to classify intent. Not keywords — intent. "My package never arrived" and "order #4521 status" both route to the Order Agent, even though they share zero words. "This is unacceptable, I've been waiting three weeks" routes to Complaint, not Order — because the intent is frustration, not information.
```
"Where is my order #4521?"
  → Orchestrator: this is an order inquiry
  → Routes to: Order Agent
  → Order Agent: SELECT * FROM orders WHERE id = 4521
  → "Your order #4521 is currently in transit.
     Expected delivery: April 5th."

"What's your return policy?"
  → Orchestrator: this is a policy question
  → Routes to: FAQ Agent
  → FAQ Agent: retrieves return policy chunks
  → "You can return any item within 30 days
     of delivery for a full refund."

"I've been waiting 3 weeks and nobody responds"
  → Orchestrator: this is a complaint
  → Routes to: Complaint Agent
  → "I'm really sorry about this experience.
     Three weeks without a response is not
     acceptable. I've escalated this to our
     support team — someone will reach out
     to you within 24 hours."
```
Why this matters
Single-agent systems hit a ceiling fast. The moment you need database access AND document retrieval AND emotional intelligence in the same conversation, one agent with one system prompt can't do it well.
Multi-agent systems solve this by letting each agent be excellent at one thing. The orchestrator is excellent at routing. The order agent is excellent at SQL. The FAQ agent is excellent at retrieval. The complaint agent is excellent at empathy.
This is how real production AI systems work. Not one giant model doing everything — a team of focused agents, each with their own tools, their own context, their own personality.
Architecture decisions
Why an orchestrator instead of a classifier?
A traditional classifier maps input to a fixed set of categories. An LLM orchestrator understands nuance — it can handle "I ordered the wrong size, can I exchange it?" which is both an order question AND a policy question. The orchestrator can route to FAQ first (exchange policy), then to Order (initiate the exchange). Sequential routing, not just classification.
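Sequential routing can be sketched as classifying into an ordered list of steps instead of a single label, with each specialist seeing what the previous one found. This is a minimal illustration, with the LLM classifier stubbed as a function and hypothetical names throughout:

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class SequentialRouter {
    private final Function<String, List<String>> classify;       // LLM call, stubbed
    private final Map<String, Function<String, String>> agents;  // name -> specialist

    SequentialRouter(Function<String, List<String>> classify,
                     Map<String, Function<String, String>> agents) {
        this.classify = classify;
        this.agents = agents;
    }

    // "I ordered the wrong size, can I exchange it?" -> [FAQ, ORDER]:
    // the FAQ agent's answer becomes context for the Order agent.
    String route(String message) {
        String context = message;
        for (String step : classify.apply(message)) {
            context = context + "\n---\n" + agents.get(step).apply(context);
        }
        return context;
    }
}
```

Threading the accumulated context through each hop is what makes this routing rather than classification: the second agent acts on the first agent's answer, not just the original message.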
Why separate system prompts per agent?
The complaint agent needs to be warm and apologetic. The order agent needs to be precise and factual. These are contradictory personalities. One system prompt can't be both empathetic and clinical. Separation lets each agent have the exact tone its job requires.
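To make the contrast concrete, here is what per-agent system prompts might look like. The wording below is illustrative, not the project's actual prompts:

```java
import java.util.Map;

class AgentPrompts {
    // One system prompt per specialist: these personalities contradict
    // each other and cannot coexist in a single prompt.
    static final Map<String, String> SYSTEM = Map.of(
        "order",
        "You are a precise order-status assistant. Answer only from query "
        + "results. Never guess a status. Be brief and factual.",
        "faq",
        "You answer policy questions strictly from the retrieved documents. "
        + "If the documents don't cover it, say so plainly.",
        "complaint",
        "You are warm and apologetic. Acknowledge the customer's frustration "
        + "before anything else. Never quote policy at an angry customer. "
        + "Always confirm the issue has been escalated to a human.");
}
```

Each prompt actively forbids the other agents' behavior ("never guess", "never quote policy"), which is exactly the kind of constraint a single merged prompt would have to contradict itself to express.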
Why reuse the patterns from previous projects?
The Order Agent is literally the SQL Query Agent with a customer support system prompt. The FAQ Agent is literally the RAG Document Agent pointed at support docs. Building agents as composable patterns — not monolithic applications — means every new system comes together faster, because you're assembling, not rebuilding.
What this demonstrates
- Multi-agent orchestration — routing, not just responding
- Agent specialization — each agent has its own tools, context, and personality
- Pattern composition — SQL agent + RAG agent + empathy agent, assembled into a system
- Intent-based routing — LLM understands what the customer needs, not just what they said
- The ceiling of single-agent systems — and how to break through it
Tech stack
| Layer | Technology |
|---|---|
| Language | Java 17 |
| Framework | Spring Boot 3.3.4 |
| AI integration | Spring AI 1.0 |
| LLM | Llama 3.3 via Groq |
| Embeddings | nomic-embed-text via Ollama |
| Vector store | pgvector |
| Database | PostgreSQL |
| Build | Maven |
What's next
- Conversation memory — maintain context across multiple messages in a session
- Agent handoff protocol — let agents transfer context to each other mid-conversation
- Feedback loop — track which routes were correct, improve orchestrator over time
- Human escalation API — complaint agent triggers real support ticket creation
- Load balancing — route to different LLM providers based on latency and cost