In the rapidly evolving world of AI engineering, "Multi-Agent Systems" (MAS) have become the buzzword of the day. OpenAI and LangChain encourage us to spin up a "Sales Agent", a "Support Agent", a "Payment Agent", and a "Logistics Agent", all talking to each other to serve a single customer.
But from my experience building Sakura, a complex e-commerce orchestration bot, I've learned a contrarian truth: The more you try to avoid mimicking human task execution, the more you over-engineer.
Despite how complex Sakura may sound, I use only one agent to handle the entire lifecycle:
- Customer Inquiry
- Order Acceptance
- Payment Processing & Verification
- Delivery Fulfillment & Driver Communication
Here is why a single, well-architected agent is often better than a naive swarm, and how I implemented it in Sakura.
The Problem with "Agentic Swarms"
The premise of MAS is specialization. In theory, a "Payment Agent" is an expert at payments. In practice, blindly splitting these flows introduces massive friction:
- Context Loss: Passing state between agents is messy. "Support" needs to know what "Sales" just promised.
- Latency & Cost: Every handoff is likely an LLM call. A simple order flow could spiral into 10+ internal messages.
- Clunky Logic: Debugging a conversation where Agent A misunderstood Agent B is a nightmare.
The Sakura Approach: Vertical Integration
Instead of distinct agents, I use a Single State Graph with distinct modes. The agent changes its "hat" based on the state of the conversation, but it's the same brain, same memory, and same persona.
1. Centralized State Management
In Sakura, I use a strict state machine (graph.py) where the agent's "memory" is a unified dictionary (collected_data).
# backend/app/customer/graph.py
def collect_data(state_config, user_input, collected_data, user_intent):
"""
One function to rule them all.
Whether collecting 'size' (Sales) or 'receipt_image' (Payment),
it goes into the same context.
"""
collect_specs = state_config["collect"]
for collect_spec in collect_specs:
field = collect_spec["field"]
# ... validation logic ...
collected_data[field] = value
return collected_data
This ensures that when the "Payment" stage needs to know the "Agreed Price" from the "Sales" stage, it's just a key lookup (collected_data['agreed_price']), not an inter-agent query.
2. Service-Oriented Tools, Not Agents
When a task is "heavy", I don't spawn an agent; I call a Service.
For example, generating a payment link via Paystack is a deterministic operation. It doesn't need an LLM to "think" about how to create a link. It just needs a function.
# backend/app/customer/graph.py
if trigger_action == "create_payment_link":
payment_result = _generate_payment_link_sync(collected_data)
if payment_result:
collected_data["payment_link"] = payment_result["url"]
The agent decides when to call this (based on the state machine), but the execution is code. Using an agent here would be wasteful.
3. Verification as a Logic Layer
Verifying a payment receipt is complex. It involves Vision AI (OCR) and API checks. A MAS approach might have a "Verifier Agent" look at the image and report back.
In Sakura, it's just another state transition:
# backend/app/customer/graph.py
if next_config.get("trigger_action") == "verify_receipt_image":
# 1. Vision Service extracts text
receipt_data = extract_receipt_data_sync(user_image)
# 2. Logic compares with Paystack
result = verify_receipt_against_paystack(receipt_data, collected_data)
if result.verified:
final_state = "payment_confirmed"
else:
final_state = "payment_mismatch"
The "Verify" logic acts as a filter. The main agent enters the verify_receipt state, runs the logic, and exits into payment_confirmed or payment_mismatch. It's seamless to the user.
4. Direct Communication for Logistics
Communicating with a driver seems like a job for a separate agent. And technically, the driver is a separate user. But the Customer's Agent handles the coordination directly via a shared queue.
# backend/app/customer/graph.py
if trigger_action == "queue_pickup_request":
# The Customer Agent directly pushes a job to the Driver's queue
# No need for a "Logistics Agent" to act as a middleman
driver_chat.add_pickup_notification(
location,
request_id,
order_details
)
The Customer Agent doesn't talk to a "Logistics Agent" to ask it to talk to the driver. It just posts the request. The Driver (human or bot) sees it, responds, and the system updates the state.
When SHOULD You Use Multi-Agent?
I am not against MAS entirely. You should use them when:
- Security Boundaries: If "Sales" and "Refunds" have strictly different database access levels.
- Asynchronous, Long-Running Tasks: If "Researching a product" takes 5 minutes, offload it so the main agent remains responsive.
- Adversarial Flows: If you need one agent to critique another (e.g., Code Reviewer vs Coder).
Conclusion
For 90% of transactional bots—even complex ones like Sakura that handle sales, payments, and dispatch—a single agent with a robust State Machine and Tool/Service Layer is superior. It's faster, cheaper, and easier to reason about.
Don't over-engineer. Build a better brain, not more bodies.
