Inbox Triage Agent

java · spring ai · groq · llama 3.2

An agentic AI system built in Java that autonomously monitors an inbox, reads unstructured messages, and classifies each as HIGH, MEDIUM, or LOW priority — using LLM reasoning with zero hardcoded rules.

Built to understand one foundational question: how does an LLM interact with the real world?


The problem

Traditional inbox automation is brittle. Keyword matching, rule engines, "if subject contains URGENT, mark HIGH." It breaks constantly and misses context entirely.

This agent reads actual content and understands intent — the same way a human would. A message about a production outage is HIGH not because it contains a keyword, but because the LLM understands what a production outage means.


How it works

The agent runs a ReAct loop — think, act, observe, repeat — using three tools as its only interface to the filesystem.

User: "Triage my inbox"
         │
         ▼
    LLM (Llama 3.2 via Groq)
         │
         ▼  ReAct loop
  ┌──────────────────────────┐
  │  listFiles()             │  → scans inbox, returns paths
  │  readFile(path)          │  → reads raw content
  │  markFile(path, priority)│  → renames with label
  └──────────────────────────┘
         │
         ▼
    Final summary with reasoning per file

The LLM decides which tools to call, in what order, and what priority to assign — based purely on understanding the content. No orchestration logic written by hand.
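The think-act-observe cycle above can be sketched in plain Java. This is a stdlib-only simulation, not the project's actual code: `ScriptedModel` stands in for the Groq-hosted LLM and simply replays a fixed plan, so the shape of the ReAct loop is visible without any Spring AI dependency (all class and method names here are hypothetical).

```java
import java.util.*;
import java.util.function.Function;

public class ReactLoop {
    // One "decision" per turn: either call a tool or emit the final answer.
    record Step(String tool, String arg, String finalAnswer) {
        boolean done() { return finalAnswer != null; }
    }

    // Stand-in for the LLM: replays a scripted plan instead of reasoning.
    static class ScriptedModel {
        private final Deque<Step> plan;
        ScriptedModel(List<Step> steps) { this.plan = new ArrayDeque<>(steps); }
        Step think(String observation) { return plan.pop(); } // a real model reasons over the observation
    }

    static String run(ScriptedModel model, Map<String, Function<String, String>> tools) {
        String observation = "user: Triage my inbox";
        while (true) {
            Step step = model.think(observation);                      // THINK
            if (step.done()) return step.finalAnswer();
            String result = tools.get(step.tool()).apply(step.arg());  // ACT
            observation = step.tool() + " -> " + result;               // OBSERVE, then repeat
        }
    }

    public static void main(String[] args) {
        Map<String, Function<String, String>> tools = Map.of(
            "listFiles", path -> "[alert1.txt]",
            "readFile",  path -> "Production server down",
            "markFile",  path -> "renamed to HIGH_alert1.txt");
        ScriptedModel model = new ScriptedModel(List.of(
            new Step("listFiles", "inbox", null),
            new Step("readFile", "alert1.txt", null),
            new Step("markFile", "alert1.txt:HIGH", null),
            new Step(null, null, "Triaged 1 file: alert1.txt -> HIGH")));
        System.out.println(run(model, tools));
    }
}
```

The loop itself contains no triage logic; which tool runs next is entirely the model's decision, which is what the "no orchestration logic written by hand" claim means in practice.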


Architecture decisions

Why Spring AI over LangChain4j?

Spring AI 1.0 has first-class tool-calling support via the @Tool annotation and fits naturally into existing Spring Boot patterns. LangChain4j adds complexity where none is needed for clean, single-purpose tool calling.
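The appeal of annotation-driven tool calling is that plain methods become tools with no wiring code. The stdlib-only imitation below shows that pattern's mechanics; the annotation and registry here are hypothetical stand-ins written for illustration, not Spring AI's actual classes.

```java
import java.lang.annotation.*;
import java.lang.reflect.Method;
import java.util.*;

public class ToolRegistry {
    // Hypothetical stand-in for Spring AI's @Tool annotation.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public @interface Tool { String description(); }

    // Discover every @Tool-annotated method on a bean, keyed by method name,
    // so the LLM-facing layer can invoke tools by name.
    public static Map<String, Method> discover(Object bean) {
        Map<String, Method> tools = new HashMap<>();
        for (Method m : bean.getClass().getDeclaredMethods()) {
            if (m.isAnnotationPresent(Tool.class)) tools.put(m.getName(), m);
        }
        return tools;
    }

    // Example bean: one annotated method becomes one callable tool.
    static class Inbox {
        @Tool(description = "List files in the inbox")
        public List<String> listFiles() { return List.of("alert1.txt"); }
    }

    public static void main(String[] args) throws Exception {
        Map<String, Method> tools = discover(new Inbox());
        System.out.println(tools.get("listFiles").invoke(new Inbox()));
    }
}
```

In the real project, Spring AI performs this discovery itself and also serializes each tool's name, description, and parameter schema into the model's context, which is how the LLM knows what it can call.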

Why Groq over OpenAI?

Groq's LPU hardware runs Llama 3.2 at roughly 800 tokens/second, versus roughly 50 on typical GPU serving. For an agent processing many files in sequence, inference speed directly impacts throughput. At Groq's current free tier, the cost is zero.

Why three separate tools instead of one?

Single responsibility. listFiles has no business reading content. readFile has no business making priority decisions. Each tool is independently testable, and the LLM composes them as needed. If reading fails, listing still works.
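The three tools map onto three small filesystem methods. Below is a plain-Java sketch of what each might look like; in the real project these methods would carry Spring AI's @Tool annotation, and the exact bodies here are illustrative assumptions, not the project's source.

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.List;
import java.util.stream.Stream;

public class InboxTools {
    private final Path inbox;

    public InboxTools(Path inbox) { this.inbox = inbox; }

    // Tool 1: list file names only -- never reads content.
    public List<String> listFiles() throws IOException {
        try (Stream<Path> files = Files.list(inbox)) {
            return files.filter(Files::isRegularFile)
                        .map(p -> p.getFileName().toString())
                        .sorted()
                        .toList();
        }
    }

    // Tool 2: return raw content -- never makes a priority decision.
    public String readFile(String name) throws IOException {
        return Files.readString(inbox.resolve(name));
    }

    // Tool 3: record the LLM's decision by renaming,
    // e.g. alert1.txt -> HIGH_alert1.txt.
    public String markFile(String name, String priority) throws IOException {
        Path source = inbox.resolve(name);
        Path target = inbox.resolve(priority + "_" + name);
        Files.move(source, target);
        return target.getFileName().toString();
    }
}
```

Because each method does exactly one thing, a failure in `readFile` (an unreadable file, say) leaves `listFiles` and `markFile` fully usable, and each can be unit-tested against a temp directory with no LLM involved.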

The security boundary

Tools are the only way the LLM can touch the filesystem. It cannot list, read, or rename anything outside explicitly exposed @Tool methods. The LLM has no direct filesystem access — Java code does, and Java code is the gatekeeper.
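Because the Java code is the gatekeeper, each tool can validate its arguments before touching the disk. A hypothetical sketch of such a guard, rejecting any LLM-supplied name that escapes the inbox root (this check is an assumption about how the boundary could be enforced, not a quote from the project):

```java
import java.nio.file.Path;

public class PathGuard {
    private final Path root;

    public PathGuard(Path root) { this.root = root.toAbsolutePath().normalize(); }

    // Resolve an LLM-supplied name against the inbox root and refuse
    // anything (e.g. "../secrets.txt") that normalizes outside it.
    public Path resolveInside(String name) {
        Path candidate = root.resolve(name).normalize();
        if (!candidate.startsWith(root)) {
            throw new IllegalArgumentException("outside inbox: " + name);
        }
        return candidate;
    }
}
```

The point of the pattern: even if a prompt injection convinces the model to request `/etc/passwd`, the request arrives as a plain method argument that deterministic Java code is free to reject.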


What this demonstrates

Tool calling pattern — how LLMs interact with real systems through structured function calls, not freeform text.

ReAct loop — the think-act-observe cycle that makes an LLM an agent rather than a chatbot. The model reasons about what to do, does it, sees the result, and reasons again.

Tool design — single responsibility, clear descriptions, controlled scope. Good tools are the difference between an agent that works and one that hallucinates its way through a task.

Agentic security model — LLM capabilities are bounded by the tools you expose. Nothing more, nothing less. The model cannot do what the tools don't allow.


Stack

Layer           Technology
Language        Java 17
Framework       Spring Boot 3.3.4
AI integration  Spring AI 1.0
LLM             Llama 3.2 via Groq API
Build           Maven

Sample output

AGENT: I've triaged all 3 files in your inbox:

HIGH_alert1.txt
→ Production server down with payment failures.
  Requires immediate attention.

MEDIUM_task1.txt
→ Wiki update request. Important but not time-critical.
  Can be handled today or tomorrow.

LOW_info1.txt
→ Schedule change notification.
  Informational only, no action required.

What's next

  • Add email ingestion — fetch unread emails and triage them the same way
  • Add a notification tool — automatically send a Slack message for HIGH priority items
  • Move the tools to a separate MCP server — so any agent can reuse the same inbox tools
  • Add human-in-the-loop review — HIGH priority items require confirmation before marking
© 2026