Agent Communication: Permission links, direction control, concurrency. A communication map for multi-agent systems

Understand Permission links (capability-based), Direction control (4 patterns), and Concurrency models (immutable logs) to design a multi-agent system that does not ...

When you move from a single agent to multiple agents, you run into a "control loss cascade": Agent A calls Tool B, Tool B triggers Agent C, and you lose control of the execution flow entirely. This article digs into the three pillars of agent communication: Permission links (capability-based access), Direction control (four control-flow patterns), and Concurrency (concurrent work coordinated through immutable logs). These are field-tested architectures drawn from GoClaw, CrewAI, and Google A2A.

The problem

Multi-agent systems break a core assumption of traditional distributed systems. In Kubernetes or Raft, each node is treated as deterministic: input X always produces output Y. LLM agents, by contrast, are non-deterministic state machines: the same prompt can give different results on two runs. That undermines classical synchronization mechanisms.

You also face the "kitchen sink" problem: when an agent has 5 tools, it picks the right tool roughly 40% of the time; with 50 tools, accuracy falls to nearly 0% unless tools are revealed through progressive disclosure. And when Agent A delegates to Agent B, and B in turn triggers tool C, the user loses control completely. This is the "control loss cascade", and it happens because there is no explicit mechanism for tracking who is allowed to do what.

The core idea

The solution rests on three pillars: Permission links (capability-based authorization), Direction control (four communication patterns), and Concurrency models (immutable logs instead of locks).

Permission Links: The capability is the address

Instead of asking "Who are you?" and then looking up permissions in a database (ACL, Access Control Lists), we use capability-based security: the URL itself is the permission. The token is embedded directly in the resource address.

A practical example from GoClaw:

file://workspace/report.md?cap=HMAC123&op=append&exp=1700

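This URL does more than point at a file; it is the permission. It grants append access (op=append) until the expiry value exp=1700, authenticated by the token HMAC123. This is a permission link: an agent can act only if it holds the link, and as long as the token covers the resource, the operation, and the expiry, the agent cannot forge a link or escalate its rights.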
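As a minimal sketch of how such a link could be issued and checked, the Python below signs the resource path, operation, and expiry with HMAC-SHA256. This is an assumption for illustration, not GoClaw's actual implementation: SECRET, issue_link, check_link, and the `path|op|exp` message layout are all hypothetical.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlsplit

SECRET = b"demo-secret"  # hypothetical signing key, held only by the issuer


def _signature(path: str, op: str, exp: int) -> str:
    # The MAC covers resource, operation, and expiry together.
    message = f"{path}|{op}|{exp}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def issue_link(path: str, op: str, exp: int) -> str:
    """Build a permission link such as file://workspace/report.md?cap=...&op=append&exp=1700."""
    query = urlencode({"cap": _signature(path, op, exp), "op": op, "exp": exp})
    return f"file://{path}?{query}"


def check_link(link: str, wanted_op: str, now: int) -> bool:
    """Accept the link only if the MAC matches, the op matches, and it has not expired."""
    parts = urlsplit(link)
    params = parse_qs(parts.query)
    try:
        cap = params["cap"][0]
        op = params["op"][0]
        exp = int(params["exp"][0])
    except (KeyError, ValueError):
        return False
    # For file://workspace/report.md, urlsplit puts "workspace" in netloc.
    expected = _signature(parts.netloc + parts.path, op, exp)
    mac_ok = hmac.compare_digest(cap.encode(), expected.encode())
    return mac_ok and op == wanted_op and now <= exp


link = issue_link("workspace/report.md", "append", 1700)
print(check_link(link, "append", now=1699))
print(check_link(link.replace("op=append", "op=delete"), "delete", now=1699))
```

Because the MAC covers op and exp, editing either value in the URL invalidates the link, which is what blocks escalation in this sketch.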

Links run in three directions:

Outbound: the agent calls an external tool (unidirectional delegation)
Inbound: the agent receives commands from a coordinator (sequential control)
Bidirectional: peer-to-peer collaboration, as in the Google A2A protocol

Direction Control: Four control-flow patterns

1. Sequential (LangGraph style)
Continuation-passing: Agent N receives the output of Agent N-1, transforms it, and passes it on. This suits ETL pipelines or reasoning chains where each step depends entirely on the one before.

2. Delegative (CrewAI role-based team)
Fork/join pattern: a coordinator agent assigns subtasks to worker agents (for example: Researcher → Writer → Editor), each agent runs isolated in its own context, and an aggregator then merges the results. It resembles MapReduce, but with LLMs.

3. Collaborative (GoClaw agent teams)
Actor-model message passing: agents exchange work directly through a shared task board. In GoClaw, for example, an agent can leave a file /workspace/blocked_by/review.md containing "Waiting for security check", and another agent picks it up and continues. This is a stateful handoff, not fire-and-forget.

4. Emergent (Blackboard/Stigmergy)
Agents do not communicate directly; they communicate through a shared environment: append-only file logs. Fast.io uses this pattern. Agents only write new files and never overwrite, which produces an event-sourcing log that any agent can replay.
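The first three patterns can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the API of LangGraph, CrewAI, or GoClaw: an agent is modeled as a plain function from text to text, and run_sequential, run_delegative, and TaskBoard are hypothetical names. The fourth pattern is file-based and is not shown here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional

Agent = Callable[[str], str]  # stand-in for an LLM call: text in, text out


def run_sequential(agents: list[Agent], task: str) -> str:
    """Sequential: each agent receives the previous agent's output."""
    result = task
    for agent in agents:
        result = agent(result)
    return result


def run_delegative(workers: list[Agent], subtasks: list[str],
                   merge: Callable[[list[str]], str]) -> str:
    """Delegative: fork one worker per subtask, then join and merge in order."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(w, s) for w, s in zip(workers, subtasks)]
        return merge([f.result() for f in futures])


class TaskBoard:
    """Collaborative: a stateful handoff through a shared board."""

    def __init__(self) -> None:
        self.blocked_by: dict[str, str] = {}

    def block(self, task: str, reason: str) -> None:
        # An agent parks a task and records what it is waiting for.
        self.blocked_by[task] = reason

    def pick_up(self, task: str) -> Optional[str]:
        # Another agent claims the task and learns why it was parked.
        return self.blocked_by.pop(task, None)


def shout(text: str) -> str:
    return text.upper()


def exclaim(text: str) -> str:
    return text + "!"


print(run_sequential([shout, exclaim], "draft"))
print(run_delegative([shout, exclaim], ["a", "b"], " ".join))

board = TaskBoard()
board.block("review", "Waiting for security check")
print(board.pick_up("review"))
```

The design difference is where the control flow lives: in the caller's loop (sequential), in the coordinator's fork and join (delegative), or in shared state that any agent may claim (collaborative).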

Concurrency: Immutable logs instead of locks

Because LLMs are non-deterministic, read-write locks do not protect you: agent A reads, agent B edits, and agent A hallucinates because the data it holds is stale. Instead, use the Blackboard pattern with append-only logs:

/workspace/
  ├── events/
  │   ├── 001_user_request.md
  │   ├── 002_research_complete.md
  │   └── 003_draft_ready.md

Each agent only appends new events and never modifies an old file. Other agents "observe" by reading every event from 001 to N and reconstructing the state. This is event sourcing: concurrency without locks, which fits the stochastic nature of LLMs.
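A minimal sketch of the layout above, assuming a local directory and zero-padded three-digit prefixes. The helper names append_event and replay are hypothetical. Note one limit of the sketch: opening with mode "x" guarantees that no existing file is overwritten, but two concurrent writers using different event names could still claim the same number, so a real system would need a proper sequencer.

```python
import tempfile
from pathlib import Path


def append_event(events_dir: Path, name: str, body: str) -> Path:
    """Write the next numbered event file without touching existing ones."""
    events_dir.mkdir(parents=True, exist_ok=True)
    # Assumes every *.md file in the directory starts with a numeric prefix.
    last = max(
        (int(p.name.split("_", 1)[0]) for p in events_dir.glob("*.md")),
        default=0,
    )
    path = events_dir / f"{last + 1:03d}_{name}.md"
    # Mode "x" raises FileExistsError instead of overwriting.
    with open(path, "x", encoding="utf-8") as fh:
        fh.write(body)
    return path


def replay(events_dir: Path) -> list[tuple[str, str]]:
    """Read all events in sequence order so a reader can rebuild state."""
    return [
        (p.name, p.read_text(encoding="utf-8"))
        for p in sorted(events_dir.glob("*.md"))
    ]


with tempfile.TemporaryDirectory() as tmp:
    events = Path(tmp) / "events"
    append_event(events, "user_request", "Write a report")
    append_event(events, "research_complete", "Three sources found")
    append_event(events, "draft_ready", "Draft v1")
    for filename, text in replay(events):
        print(filename, "->", text)
```

Sorting by filename only matches sequence order while the prefixes stay zero-padded to the same width, which is why the sketch fixes them at three digits.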

ACP (Agent Control Protocol) from the Linux Foundation likewise combines RESTful state transfer with admission control gates: agents do not access resources directly, but go through a protocol layer that controls access.
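To make the idea of an admission control gate concrete, here is a small sketch in which every action must pass through a gate before it reaches a resource. It is not the ACP API: Action, AdmissionGate, the policy format, and the audit list are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Action:
    agent: str
    op: str
    resource: str


@dataclass
class AdmissionGate:
    # Hypothetical policy: agent name -> set of (op, resource) pairs it may perform.
    policy: dict[str, set[tuple[str, str]]]
    audit: list[tuple[Action, bool]] = field(default_factory=list)

    def admit(self, action: Action) -> bool:
        """Decide whether the action may proceed, and record the decision."""
        allowed = (action.op, action.resource) in self.policy.get(action.agent, set())
        self.audit.append((action, allowed))
        return allowed


gate = AdmissionGate({"writer": {("append", "report.md")}})
print(gate.admit(Action("writer", "append", "report.md")))
print(gate.admit(Action("writer", "delete", "report.md")))
```

Keeping the decision and the audit record in one place is the point of the gate: agents never touch the resource without leaving a trace of what was requested and whether it was allowed.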

Why it works

Capability-based > ACL because it avoids the "confused deputy attack". Under an ACL, once an agent is authenticated it can "accidentally" exercise admin rights when its context is polluted. Capability URLs bind a permission to a specific resource: the agent cannot delete a database "by mistake", because the capability points only at one specific file with append-only rights.

Immutable logs > Read-Write Locks because an LLM has no notion of a "transaction". If agent B edits X while agent A is reasoning about X, A will hallucinate, because A's context window still holds the old "snapshot". Append-only logs turn every change into a new fact: agent A sees event 003 (draft_ready) and adjusts, instead of trying to "lock" the resource.
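One way to picture how an agent "adjusts" is a cursor over the log: before acting, the agent asks for events it has not yet seen and folds them into its context. The sketch below is an assumption for illustration; LogReader is a hypothetical name, and a plain Python list stands in for the event files.

```python
class LogReader:
    """Tracks how far one agent has read in a shared append-only log."""

    def __init__(self, log: list[str]) -> None:
        self._log = log    # shared list standing in for the event files
        self._cursor = 0   # number of events this agent has already seen

    def new_events(self) -> list[str]:
        """Return events appended since the last call and advance the cursor."""
        fresh = self._log[self._cursor:]
        self._cursor = len(self._log)
        return fresh


log = ["001_user_request", "002_research_complete"]
agent_a = LogReader(log)

print(agent_a.new_events())     # agent A builds its initial snapshot
log.append("003_draft_ready")   # agent B appends a new fact
print(agent_a.new_events())     # agent A refreshes before acting
```

Nothing is locked here: agent B is never blocked, and agent A's stale snapshot is corrected by reading forward rather than by preventing the write.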

Trade-off: you give up traditional consensus (Raft, 2PC). LLM think time is measured in seconds to minutes, which makes multi-round consensus protocols too slow to be practical. The alternative is admission control plus a blackboard, accepting eventual consistency in place of strong consistency.

Practical implications

Metric	Single Agent	Multi-Agent (with Communication Patterns)
Tool selection accuracy (50 tools)	~0%	~40% with progressive disclosure
Network traffic	Serializes the full payload every time	Passes only pointers (Fast.io benchmarks)
Debug complexity	Linear	2-3x higher (needs cross-agent tracing)
Throughput	Sequential	30-40% higher on parallelizable tasks

Who uses it: AWS (Strands SDK, Bedrock AgentCore), Google (A2A protocol), CrewAI (role-based delegation), GoClaw (shared task board, blocked_by handoff), Fast.io (file-based coordination).

Limitations: when NOT to use it

A single agent is enough: if the task needs only 1-2 tool calls, adding multiple agents only increases latency and cost.
Byzantine fault tolerance: if an agent can be compromised (a malicious agent), these patterns do not guarantee consensus.
Agent alignment: there is no standardized solution yet for ensuring that a sub-agent preserves the parent's intent across multiple hops.

Cost: multi-agent setups are noticeably more expensive (2-3x token cost) and harder to debug, because you must trace across several agent contexts. Use them only when a single agent genuinely fails from "cognitive overload", that is, when planning, execution, and critique need to be split into specialized agents.

Going deeper

Official documentation:

ACP Specification: Agent Control Protocol: Admission Control for Agent Actions (Linux Foundation)
Google A2A: Agent-to-Agent protocol with Agent Cards
CrewAI Documentation: Role-based team orchestration

Related TroiSinh articles:

Hooks & Quality Control: quality control when agents communicate
Advanced Architecture: scaling to 1000+ agents

Further reading:

Paper: "Beyond Context Sharing: A Unified Agent Communication Protocol" (arXiv:2602.15055v1). Proposes ACP as the "TCP/IP of agents"
Paper: "A Communication-Centric Survey of LLM-Based Multi-Agent Systems" (arXiv:2502.14321v2). A full taxonomy of communication topologies
