Orchestration Patterns: Sequential, Parallel, Evaluate Loops (Orchestrating Multi-Agent Systems Like Programming a Workflow)

Learn three basic agent orchestration patterns: the sequential pipeline, parallel map-reduce, and evaluate loops. Practical architecture with GoClaw, CrewAI, and LangGraph.

When you move from a single agent to multi-agent, the hard part is not writing more code for the second agent. It is answering the question: "Which agent runs first, which runs next, and when do we loop back to fix errors?" This article digs into three basic orchestration patterns (Sequential, Parallel, and Evaluate Loops), how to implement them in practice on GoClaw, CrewAI, and LangGraph, and why they address the "mode collapse" of a lone LLM.

The Problem

Make a single LLM do everything (planning, writing code, and reviewing) and you will see context pollution: the original prompt gets "diluted" by the interleaved execution steps, so the agent forgets the original goal or skips important constraints. According to production benchmarks, a single agent fails up to 90% of the time on complex multi-step tasks. The cause is not a dumb model: the Transformer's attention mechanism favors the most recent tokens, so early instructions fade because of attention dilution rather than context window limits.

The old approach, a single 8K-token system prompt listing every role, creates a "jagged abstraction": the agent has to be architect and bricklayer at the same time, and the cognitive modes conflict. The result is a hallucination cascade: one error at step 3 propagates through the next 20 steps and makes the final output meaningless.

The Core Idea

Orchestration patterns are control-flow primitives for agent systems: they define ordering, dependencies, and the mechanism for iterative refinement. There are three basic patterns:

Sequential (Pipeline)

This pattern builds a directed pipeline in which agent N receives only the transformed output of agent N-1. State is refined step by step through specialized stages: Extract → Transform → Load.

In GoClaw, you express this through blocked_by on the task board:

# .go/AGENTS.md
tasks:
  - id: crawl_data
    agent: web_scraper
    action: "fetch pricing from 3 competitors"
    
  - id: analyze_trends
    agent: data_analyst
    action: "calculate moving average"
    blocked_by: [crawl_data]  # Explicit sequential dependency
    
  - id: generate_report
    agent: writer
    action: "draft executive summary"
    blocked_by: [analyze_trends]
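
The run order implied by blocked_by can be sketched as a topological sort. This is only an illustration of the dependency idea using Python's standard library, not GoClaw's actual scheduler; the `blocked_by` dictionary and `execution_order` helper are hypothetical names that mirror the task board above.

```python
from graphlib import TopologicalSorter

# Hypothetical in-memory mirror of the task board: task id -> ids it is blocked by.
blocked_by = {
    "crawl_data": [],
    "analyze_trends": ["crawl_data"],
    "generate_report": ["analyze_trends"],
}


def execution_order(deps: dict[str, list[str]]) -> list[str]:
    """Return one valid run order; graphlib raises CycleError on circular dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(blocked_by)
print(order)
```

A task only becomes runnable once everything in its blocked_by list has finished, which is what makes the chain strictly sequential.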

CrewAI offers a similar abstraction through Process.sequential:

from crewai import Crew, Process

# scraper, analyst, writer and the three tasks are assumed to be defined elsewhere
crew = Crew(
    agents=[scraper, analyst, writer],
    tasks=[task_crawl, task_analyze, task_write],
    process=Process.sequential,  # Each task must finish before the next one starts
    memory=True  # Shared context between steps
)

Key detail: Sequential is not just "run in order"; it is strict dependency ordering. A later agent cannot "peek" at the raw data from the first agent; it only sees the filtered output. This creates a "cognitive firewall": the analyst is not distracted by raw HTML noise from the scraper and only sees the extracted JSON.
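
The "cognitive firewall" can be sketched in plain Python, independent of any framework. The stage functions below are hypothetical stand-ins for agents (no LLM calls), and the sample HTML is made up for illustration. The point is that `run_pipeline` replaces the state at each step instead of merging it, so a later stage never sees raw upstream data.

```python
import re
from typing import Callable

Stage = Callable[[dict], dict]


def scrape(_: dict) -> dict:
    # Stand-in for a scraper agent: returns noisy raw HTML.
    return {"html": "<div class='p'>Acme: $10</div><div class='p'>Globex: $14</div>"}


def extract(prev: dict) -> dict:
    # Stand-in for an extraction agent: turns raw HTML into structured prices.
    pairs = re.findall(r">(\w+): \$(\d+)<", prev["html"])
    return {"prices": {name: int(price) for name, price in pairs}}


def analyze(prev: dict) -> dict:
    # Only receives {"prices": ...}; the raw HTML is no longer in scope.
    prices = prev["prices"]
    return {"average": sum(prices.values()) / len(prices)}


def run_pipeline(stages: list[Stage]) -> dict:
    state: dict = {}
    for stage in stages:
        state = stage(state)  # replace, not merge: earlier outputs are dropped
    return state


result = run_pipeline([scrape, extract, analyze])
print(result)
```

Replacing the state is a deliberate choice here: merging would keep the HTML around and let the analyst stage read it, which defeats the firewall.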

Parallel (Fan-out/Fan-in)

When subtasks do not depend on each other (researching documents A, B, and C in parallel), this pattern maps the input out to several specialist agents at once, then reduces their results through an aggregator.

LangGraph implements this pattern by sending state to several nodes in parallel:

from langgraph.graph import StateGraph, START

# State (the graph's state schema) and the four agent callables are assumed to be defined elsewhere.
# If parallel branches write to the same state key, that key needs a reducer in State.
builder = StateGraph(State)
builder.add_node("search_arxiv", arxiv_agent)
builder.add_node("search_semantic", semantic_agent)
builder.add_node("search_pubmed", pubmed_agent)
builder.add_node("synthesize", aggregator_agent)

# Fan-out: all three search agents run in parallel from the START entry node
builder.add_edge(START, "search_arxiv")
builder.add_edge(START, "search_semantic")
builder.add_edge(START, "search_pubmed")

# Fan-in: the aggregator waits for all three to finish before it runs
builder.add_edge(["search_arxiv", "search_semantic", "search_pubmed"], "synthesize")

Benchmarks show that parallel fan-out cuts latency by 40-60% compared with a sequential pipeline on document processing workloads, with lower accuracy variance thanks to voting aggregation (arXiv:2603.22651).

Critical detail: Parallel requires non-intersecting subtasks. If agents A and B both edit the same file, you need a merge strategy (diff3, or LLM-as-merger) to avoid conflicts.
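
Outside of any framework, fan-out/fan-in with a disjointness check can be sketched with asyncio. The `search` coroutine is a hypothetical stand-in for a specialist agent, and `merge_disjoint` is an assumed helper that refuses to merge when two branches wrote the same key, which is the point where a real system would hand off to a merge strategy.

```python
import asyncio


async def search(source: str, query: str) -> dict[str, str]:
    # Stand-in for a specialist agent; a real one would await an LLM or HTTP call.
    await asyncio.sleep(0.01)
    return {source: f"results for {query!r} from {source}"}


def merge_disjoint(parts: list[dict[str, str]]) -> dict[str, str]:
    # Fan-in: combine branch outputs, failing loudly if two branches overlap.
    merged: dict[str, str] = {}
    for part in parts:
        overlap = merged.keys() & part.keys()
        if overlap:
            raise ValueError(f"subtasks wrote the same keys: {sorted(overlap)}")
        merged.update(part)
    return merged


async def fan_out_fan_in(query: str, sources: list[str]) -> dict[str, str]:
    # Fan-out: start every branch at once and wait for all of them.
    parts = await asyncio.gather(*(search(s, query) for s in sources))
    return merge_disjoint(list(parts))


merged = asyncio.run(fan_out_fan_in("agent orchestration", ["arxiv", "semantic", "pubmed"]))
print(sorted(merged))
```

Raising on overlap is the conservative choice for a sketch; swapping the `raise` for a diff3 or LLM-based merge is where the conflict strategy would plug in.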

Evaluate Loops (Iterative Refinement)

This pattern applies an actor-critic architecture: a Generator agent produces a draft → a Critic agent scores it and gives feedback → the Generator revises. The loop stops when a quality threshold is met or max iterations is reached.

Implementation in LangGraph with conditional edges:

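from langgraph.graph import StateGraph, END

def quality_gate(state):
    # Stop when the critic passes the draft or the iteration cap is reached
    if state["grade"] == "pass" or state["iterations"] >= 3:
        return END
    return "generator"  # Loop back for a rewrite

# State, writer_agent, and reviewer_agent are assumed to be defined elsewhere;
# writer_agent is expected to increment state["iterations"] on every pass.
graph = StateGraph(State)
graph.add_node("generator", writer_agent)
graph.add_node("critic", reviewer_agent)
graph.set_entry_point("generator")
graph.add_edge("generator", "critic")
graph.add_conditional_edges("critic", quality_gate)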

In GoClaw, you can emulate the loop through HEARTBEAT.md or a custom hook that checks diff_size after each edit, but evaluate loops work best on LangGraph/CrewAI thanks to their explicit state machine.

Aha moment: Evaluate loops address the fact that LLMs are poor at self-evaluation. When the same instance both writes and reviews, it suffers from confirmation bias and favors the pattern it just generated. Splitting the work across two agents (or two different system prompts) gives the critique "fresh eyes".
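
The generator/critic split can be sketched without a framework. Both functions below are hypothetical stand-ins (the critic is a deterministic rule rather than an LLM call), and `evaluate_loop` is an assumed name; the sketch only shows the control flow: generate, critique, feed the critique back, and stop on a pass or at the iteration cap.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Review:
    passed: bool
    feedback: str


def generate(task: str, feedback: Optional[str]) -> str:
    # Stand-in for the generator agent: applies the critic's feedback if present.
    draft = f"Draft for: {task}"
    return draft if feedback is None else f"{draft} [revised: {feedback}]"


def critique(draft: str) -> Review:
    # Stand-in for the critic agent: a separate, deterministic judgement.
    if "revised" in draft:
        return Review(True, "")
    return Review(False, "add sources")


def evaluate_loop(task: str, max_iterations: int = 3) -> tuple[str, int]:
    """Return the last draft and the number of iterations used."""
    feedback: Optional[str] = None
    draft = ""
    for iteration in range(1, max_iterations + 1):
        draft = generate(task, feedback)
        review = critique(draft)
        if review.passed:
            return draft, iteration
        feedback = review.feedback
    return draft, max_iterations  # cap reached without a pass


final_draft, used = evaluate_loop("pricing report")
print(used, final_draft)
```

Keeping `generate` and `critique` as separate functions with separate inputs mirrors the "fresh eyes" idea: the critic sees only the draft, not the generator's reasoning.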

Why It Works

These three patterns work because they force a separation of cognitive modes: generator (creative) apart from critic (critical), or data extraction (binary, right-or-wrong) apart from synthesis (creative).

Sequential prevents mode collapse: when one LLM tries to plan, code, and test at once, its attention is scattered. A pipeline forces each agent to focus on a single mode, much like an assembly line splits work into stations.

Parallel exploits embarrassingly parallel tasks (researching several independent sources). The deeper benefit is context sharding: each agent loads only the tool schemas relevant to its subdomain (e.g., agent Y only has the search_medical tool and is not distracted by 40 other tools), which addresses the "kitchen sink" problem (tool selection accuracy is reported to drop from 40% to 0% with 50+ tools).
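
Context sharding can be sketched as a registry lookup that hands each agent only its own slice of tools. Apart from search_medical, every tool name and the `TOOL_REGISTRY` / `tools_for` names are hypothetical and exist only for illustration.

```python
# Hypothetical registry: tool name -> subdomain it belongs to.
TOOL_REGISTRY = {
    "search_medical": "medical",
    "lookup_drug_interactions": "medical",
    "search_case_law": "legal",
    "query_ledger": "finance",
}


def tools_for(domain: str) -> list[str]:
    """Return only the tool names a specialist in this domain should be given."""
    return sorted(name for name, d in TOOL_REGISTRY.items() if d == domain)


medical_tools = tools_for("medical")
print(medical_tools)
```

The specialist's prompt then carries two tool schemas instead of the whole registry, which is the reduction the "kitchen sink" argument relies on.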

Evaluate Loops apply a gradient-descent-like principle to reasoning: each iteration is an adjustment step driven by a feedback signal. But unlike an open-ended code loop, an LLM loop needs a hard termination gate (the critic is a deterministic rule or a separate LLM at low temperature).

Clear trade-offs:

Sequential is easy to debug (linear trace), but latency adds up (O(N) calls) and error propagation is severe (garbage in, garbage out).
Parallel is fast but produces merge conflicts when specialists disagree (consensus logic is needed).
Evaluate loops improve quality but risk infinite oscillation (generator and critic pass drafts back and forth without progress) or evaluator bias (the critic misjudges and steers the generator into a local minimum).
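
One way to guard against the oscillation risk above is to track drafts and scores across iterations. The sketch below is an assumption about how such a guard could look, not a feature of any framework named here; `refine`, `step`, and `min_gain` are hypothetical names, and `step` stands in for one generator-plus-critic round that returns a new draft and its score.

```python
from typing import Callable


def refine(
    step: Callable[[str], tuple[str, float]],
    draft: str,
    max_iterations: int = 5,
    min_gain: float = 0.01,
) -> tuple[str, str]:
    """Return (best_draft, stop_reason)."""
    seen = {draft}
    best, best_score = draft, float("-inf")
    for _ in range(max_iterations):
        draft, score = step(draft)
        if draft in seen:
            return best, "oscillation"   # we have been at this draft before
        if score < best_score + min_gain:
            return best, "no_progress"   # the critic's score stopped improving
        best, best_score = draft, score
        seen.add(draft)
    return best, "max_iterations"


def flip_flop(draft: str) -> tuple[str, float]:
    # Pathological round: generator and critic bounce between two drafts.
    return ("B" if draft == "A" else "A", 0.5)


best, reason = refine(flip_flop, "A")
print(best, reason)
```

Returning a stop reason alongside the draft makes it easy to route "oscillation" and "no_progress" outcomes to a human reviewer instead of silently accepting them.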

Practical Implications

Production benchmarks:

Google Research (2025): Multi-agent coordination improves results by +81% on parallelizable tasks, but adds up to 70% overhead on inherently sequential tasks (because coordination cost does not shorten the critical path).
AIMultiple: Sequential orchestration consumes at least 2x the tokens and latency of GroupChat (parallel) modes for the same task.
arXiv:2603.22651: Parallel fan-out with merge reduces end-to-end latency by 40-60% for financial analysis (multi-source fusion).
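
The critical-path reasoning behind these figures can be worked through with simple arithmetic. All timings below are made-up illustrative values, not measurements from the cited benchmarks, and the function names are hypothetical.

```python
def sequential_latency(stage_seconds: list[float]) -> float:
    # Stages run one after another: latencies add up.
    return sum(stage_seconds)


def parallel_latency(branch_seconds: list[float], merge_seconds: float) -> float:
    # Independent branches run together: the slowest branch plus the merge step.
    return max(branch_seconds) + merge_seconds


def chained_multi_agent_latency(stage_seconds: list[float], handoff_seconds: float) -> float:
    # Dependent stages split across agents: same critical path plus handoff cost.
    return sum(stage_seconds) + handoff_seconds * (len(stage_seconds) - 1)


branches = [4.0, 5.0, 6.0]                           # illustrative seconds per subtask
seq = sequential_latency(branches)                   # 15.0
par = parallel_latency(branches, merge_seconds=1.5)  # 7.5
saving = 1 - par / seq                               # 0.5
chained = chained_multi_agent_latency(branches, handoff_seconds=0.5)  # 16.0
print(seq, par, saving, chained)
```

With independent subtasks the wall-clock time is bounded by the slowest branch, while for a dependent chain adding agents only adds handoff cost on top of the same critical path.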

When to use which pattern?

Sequential: ETL pipelines, code review (read → analyze → rewrite), compliance checks (step N needs the result of step N-1 to validate).
Parallel: Multi-source research (arxiv + web + internal docs), data enrichment (adding metadata in parallel), A/B testing several prompts at once.
Evaluate Loops: Content generation (blog post, report), code generation with test verification, mathematical reasoning (verify → correct).

Who is using them?

LangGraph: Complex pipelines with loops (ReAct, Reflexion) and conditional branching.
CrewAI: Sequential process for sales automation (research prospect → draft email → personalize).
GoClaw: Task board with blocked_by for DevOps automation (deploy staging → test → deploy prod).

Limitations:

Sequential: No recovery mid-pipeline; an error at step 3 means restarting from the beginning or implementing checkpointing by hand.
Parallel: Cost grows linearly with the number of agents (3 agents in parallel = 3x API cost).
Evaluate Loops: No convergence guarantee; you need a hard max_iterations guardrail and human-in-the-loop for the final round.
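
The manual checkpointing mentioned for sequential pipelines can be sketched with a JSON file that stores each completed stage's output. This is a minimal illustration assuming stage outputs are JSON-serializable; `run_with_checkpoints` and the stage functions are hypothetical names, not part of GoClaw, CrewAI, or LangGraph.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable

Stage = tuple[str, Callable[[dict], dict]]


def run_with_checkpoints(stages: list[Stage], path: Path) -> dict:
    # Load outputs of stages that already completed in an earlier run.
    done = json.loads(path.read_text()) if path.exists() else {}
    state: dict = {}
    for name, stage in stages:
        if name in done:
            state = done[name]  # reuse the saved output, skip the call
            continue
        state = stage(state)
        done[name] = state
        path.write_text(json.dumps(done))  # persist after every completed stage
    return state


calls: list[str] = []

def crawl(_: dict) -> dict:
    calls.append("crawl")
    return {"rows": [1, 2, 3]}

def analyze(prev: dict) -> dict:
    calls.append("analyze")
    return {"mean": sum(prev["rows"]) / len(prev["rows"])}

def flaky_report(prev: dict) -> dict:
    raise RuntimeError("model timeout")  # simulated failure at step 3

def report(prev: dict) -> dict:
    calls.append("report")
    return {"summary": f"mean={prev['mean']}"}


with tempfile.TemporaryDirectory() as tmp:
    ckpt = Path(tmp) / "pipeline.json"
    try:
        run_with_checkpoints([("crawl", crawl), ("analyze", analyze), ("report", flaky_report)], ckpt)
    except RuntimeError:
        pass  # first run dies at the report stage
    final = run_with_checkpoints([("crawl", crawl), ("analyze", analyze), ("report", report)], ckpt)

print(calls, final)
```

On the second run the crawl and analyze stages are read from the checkpoint rather than re-executed, so only the failed stage pays for a new call.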

Going Deeper

Official documentation:

Microsoft Azure AI Agent Design Patterns: Compares sequential, parallel, and hierarchical orchestration.
CrewAI Process Documentation: Implementing sequential and parallel processes in CrewAI.
LangGraph StateGraph: Graph structure with conditional edges for evaluate loops.

Related reading on TroiSinh:

Paper: "Benchmarking Multi-Agent LLM Architectures for Financial Analysis" (arXiv:2603.22651). Quantitative comparison of sequential and parallel.
Paper: "Real-time Agent Orchestration for Efficient Deep Research" (arXiv:2510.05145). Optimization for evaluate loops with adaptive termination.
Blog: Avi Chawla, 7 Patterns in Multi-Agent Systems. Visual breakdown of common patterns.
