
By Dr. Magesh Kasthuri, Distinguished Member and Chief Architect, Wipro Limited, and Dharanidharan Murugesan, Technical Lead, Wipro Limited
In the landscape of Generative AI and Agentic AI solutions, efficient and robust multi-agent communication protocols are essential. Multi-agent systems (MAS) consist of multiple interacting intelligent agents that work collaboratively to solve complex problems. Two prominent protocols in this domain are the Model Context Protocol (MCP), which standardizes how context (data and tools) is supplied to LLMs, and the Agent-to-Agent (A2A) protocol (recently announced by Google as an open protocol). This article examines these protocols, explains how multi-agent communication can be implemented, and presents industrial examples and use cases.
Multi-Agent Solution Communications
Communication in multi-agent systems can be achieved through various methods, each tailored to the specific needs and constraints of the application. Common techniques include message passing, blackboard systems, and publish-subscribe models.
Message Passing
Message passing is a fundamental method where agents exchange information through direct messages. This approach is beneficial for real-time applications requiring immediate responses.
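As a minimal sketch of direct message passing, the Python snippet below gives each agent a private inbox and lets one agent address another by reference. The `Message` and `Agent` classes and the agent names are illustrative assumptions, not part of any specific framework.

```python
from dataclasses import dataclass
from queue import Queue


@dataclass
class Message:
    sender: str
    recipient: str
    body: str


class Agent:
    """An agent with a private inbox; all names here are illustrative."""

    def __init__(self, name):
        self.name = name
        self.inbox = Queue()

    def send(self, other, body):
        # Direct delivery: the sender addresses one specific recipient.
        other.inbox.put(Message(self.name, other.name, body))

    def receive(self):
        # Messages come out in the order they were sent (FIFO).
        return self.inbox.get_nowait()


planner = Agent("planner")
worker = Agent("worker")
planner.send(worker, "inspect batch 7")
request = worker.receive()
worker.send(planner, f"done: {request.body}")
reply = planner.receive()
```

The sender must know its recipient, which keeps latency low but couples the two agents more tightly than the shared or topic-based approaches described next.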
Blackboard Systems
In blackboard systems, a shared data repository (the blackboard) is used for communication. Agents post and retrieve information from the blackboard, which serves as a centralized coordination hub.
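The idea can be sketched in a few lines of Python. Here the `Blackboard` class, the two agent functions, and the temperature threshold are assumptions made up for illustration; the point is only that the agents never call each other and coordinate solely through the shared store.

```python
from typing import Any


class Blackboard:
    """Shared store that agents read from and write to (illustrative only)."""

    def __init__(self):
        self._entries = {}

    def post(self, key: str, value: Any) -> None:
        self._entries[key] = value

    def read(self, key: str) -> Any:
        # Returns None when nothing has been posted under this key.
        return self._entries.get(key)


def sensor_agent(board: Blackboard) -> None:
    # Posts a raw reading; it does not know who will consume it.
    board.post("temperature_c", 81.5)


def diagnosis_agent(board: Blackboard) -> None:
    # Reacts to whatever is on the board, then posts its own conclusion.
    reading = board.read("temperature_c")
    if reading is not None:
        board.post("status", "overheating" if reading > 75 else "normal")


board = Blackboard()
for agent in (sensor_agent, diagnosis_agent):
    agent(board)
```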
Publish-Subscribe Models
The publish-subscribe model decouples the producer and consumer of information. Agents publish messages to specific topics, and other agents subscribe to these topics to receive relevant updates.
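A minimal in-process sketch of this decoupling is shown below. The `Broker` class and the topic names are hypothetical; a production system would typically rely on a dedicated message broker rather than a dictionary of callbacks.

```python
from collections import defaultdict


class Broker:
    """Routes payloads from publishers to subscribers by topic (illustrative)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, payload):
        # The publisher never sees who receives the payload.
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)


broker = Broker()
pricing_log = []
audit_log = []
broker.subscribe("stock.low", pricing_log.append)
broker.subscribe("stock.low", audit_log.append)

delivered = broker.publish("stock.low", "SKU-42 below threshold")
ignored = broker.publish("stock.high", "SKU-7 overstocked")
```

Publishing to a topic with no subscribers simply delivers to nobody, which is what lets producers and consumers be added or removed independently.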
Human-Agent Collaboration Layer
Mixed-Initiative Interaction
The concept of mixed-initiative systems, where both humans and agents can initiate actions, is well established. The Joint-Initiative Supervised Autonomy (JISA) framework lets agents ask for human help when they are unsure. It also lets human supervisors step in based on what they observe.
Example:
Customer Support Chatbots: Modern chatbots can handle routine inquiries autonomously but will escalate complex issues to human agents when necessary. For example, if a chatbot can’t resolve a billing issue, it can transfer the conversation to a human representative.
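One way to picture the two initiative paths is a small routing function, sketched below in Python. The confidence threshold, the `Answer` type, and the `route` function are assumptions for illustration and are not taken from the JISA framework itself.

```python
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.7  # assumed cut-off, chosen only for this example


@dataclass
class Answer:
    text: str
    confidence: float


def route(answer: Answer, human_override: bool = False) -> str:
    """Return who handles the request: 'agent' or 'human'."""
    if human_override:
        # Supervisor-initiated: the human steps in regardless of confidence.
        return "human"
    if answer.confidence < CONFIDENCE_THRESHOLD:
        # Agent-initiated: the agent asks for help when it is unsure.
        return "human"
    return "agent"


routine = route(Answer("Your plan renews on the 1st.", 0.93))
billing = route(Answer("The charge may be a duplicate.", 0.41))
overridden = route(Answer("Your plan renews on the 1st.", 0.93), human_override=True)
```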
Adjustable Autonomy
Adjustable autonomy refers to systems where the level of autonomy can be dynamically modified based on context or user preference, allowing human operators to adjust the autonomy levels of agents in real-time to suit varying operational demands.
Example:
Autonomous Vehicles: Self-driving cars can operate independently but allow human drivers to take control when needed. For instance, a driver can switch to manual mode in challenging driving conditions like heavy rain or snow.
Explainability Interfaces (XAI)
Explainable AI aims to make AI decision-making processes transparent and understandable to humans. This transparency increases end-user trust and, in turn, adoption.
Example:
Financial Trading Platforms: AI-driven trading systems provide explanations for their investment decisions, helping traders understand the rationale behind buying or selling stocks. This transparency builds trust and helps traders make informed decisions.
Trust Calibration Protocols
Building and maintaining trust between humans and agents is crucial. Research in human-agent teaming highlights the importance of agents being observable, predictable, and directable to foster trust. Agents should proactively communicate their status and intentions, enabling humans to understand and predict agent behaviour.
Example:
Collaborative Robots (Cobots): In manufacturing, cobots work alongside human workers and continuously communicate their status and intentions. This helps workers predict the cobots’ actions and fosters a reliable working relationship.
Adaptive Collaboration Protocols
Adaptive collaboration involves agents learning and adapting to human partners over time. This includes understanding human preferences, communication styles, and decision-making patterns to improve coordination and efficiency in joint tasks. Such adaptability is essential for effective human-agent collaboration.
Example:
Personal Assistants (e.g., Siri, Alexa): These AI assistants learn user preferences over time, adapting their responses and suggestions to better suit individual needs. For example, they might learn your preferred music genres and suggest songs accordingly.
Error Recovery and Repair Dialogue
Effective human-agent systems incorporate mechanisms for error detection and recovery. This includes agents recognizing when errors occur, initiating repair dialogues, and allowing for undo or rollback actions to correct mistakes, thereby enhancing system robustness and user trust.
Example:
Online Shopping Platforms: If a checkout error occurs, the system can initiate a repair dialogue. For instance, if there’s a payment failure, it might prompt, “There was an issue with your payment. Would you like to try a different method or review your details?” This helps users resolve issues and continue their purchase.
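The detect, roll back, and prompt sequence can be sketched as follows. The `Cart` class, its fields, and the `gateway_ok` flag (standing in for a real payment gateway response) are hypothetical names used only for this example.

```python
class Cart:
    """Toy checkout state with an undo stack (all names are illustrative)."""

    def __init__(self):
        self.payment_method = "card"
        self.paid = False
        self._history = []

    def _snapshot(self):
        # Remember the state so a failed step can be rolled back.
        self._history.append((self.payment_method, self.paid))

    def undo(self):
        if self._history:
            self.payment_method, self.paid = self._history.pop()

    def pay(self, method, gateway_ok):
        self._snapshot()
        self.payment_method = method
        if not gateway_ok:
            # Error detected: restore the previous state, then open a repair dialogue.
            self.undo()
            return ("There was an issue with your payment. Would you like to "
                    "try a different method or review your details?")
        self.paid = True
        return "Payment received."


cart = Cart()
prompt = cart.pay("wallet", gateway_ok=False)
state_after_failure = (cart.payment_method, cart.paid)
confirmation = cart.pay("wallet", gateway_ok=True)
```

After the failed attempt the cart is back in its earlier state, so the user can retry without inheriting a half-applied change.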
Model Context Protocol (MCP)
The Model Context Protocol (MCP) is a versatile framework designed to facilitate interaction among agents operating in different contexts. MCP provides a structured approach to handle information exchange, ensuring that agents maintain their operational autonomy while collaborating seamlessly.
Core Components of MCP
MCP is built on a context-centric architecture that defines clear roles for how context (data and tools) is shared with AI models. It is sometimes called the "USB-C of AI applications".
Key components include:
Context Objects: These are structured units of data, such as files or database entries, that carry the information models need to reason accurately. They are enriched with metadata like type, version, and identifiers, enabling consistent access and updates during AI sessions.
Context Manager (Host): The host orchestrates the MCP session. It spawns agents, manages lifecycle events, aggregates context from multiple sources, and enforces policies like access control and isolation. It ensures models receive only the permitted and relevant context.
Agents (Clients): Each agent interacts with one context server and acts as the interface between the AI model and external data or tools. They operate in isolation and access only the context allowed by the host, ensuring secure and efficient context usage.
Model Interfaces: These define how the model uses the shared context—through prompts, tools, or resources. MCP provides a consistent way for tools and data to be invoked, allowing agents to dynamically retrieve or operate on context during execution.
Synchronization Mechanisms: MCP supports both push and pull modes. Push allows servers to send updates automatically when data changes, while pull lets clients request data on demand. Together, they maintain up-to-date context across sessions.
Figure: Core concepts of Model Context Protocol
Working Principle of MCP
Session Initialization: Communication begins with a handshake where the client and server agree on protocol version and capabilities. This sets up a shared, stateful session for efficient data exchange.
Pull Model: Clients fetch data as needed using JSON-RPC calls. This model is used when updates are infrequent or predictable, offering simplicity and control.
Push Model: Servers proactively notify clients of context changes. Subscriptions reduce the need for polling and allow real-time synchronization of rapidly changing context.
Versioning: Resources are tracked using version identifiers or timestamps. This ensures clients operate on the latest data and can detect and resolve stale or conflicting states.
Example: An agent connected to a Git server subscribes to a repository. When a new commit is pushed, the server notifies the agent, which then updates its context with the latest code.
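The handshake, pull, push, and versioning steps above can be sketched with plain JSON-RPC 2.0 messages and an in-memory stand-in for the server. This is a simplified illustration, not an MCP SDK: `ToyServer`, `ToyClient`, the repository URI, and the `version` field in the read result are assumptions of the sketch. The method names follow the MCP specification as published, but should be checked against the protocol version in use, and parts of the real lifecycle (such as the client's follow-up `notifications/initialized` message) are omitted.

```python
import json

PROTOCOL_VERSION = "2024-11-05"       # example version string; both sides must agree
REPO_URI = "git://example-repo/main"  # hypothetical resource URI


class ToyServer:
    """In-memory stand-in for an MCP server; not a real implementation."""

    def __init__(self):
        self.version = 1
        self.content = "commit a1"
        self.subscribers = []  # callbacks standing in for open sessions

    def handle(self, request, notify=None):
        method, params = request["method"], request.get("params", {})
        if method == "initialize":
            # Handshake: report the protocol version and capabilities.
            result = {"protocolVersion": PROTOCOL_VERSION,
                      "capabilities": {"resources": {"subscribe": True}}}
        elif method == "resources/read":
            # Pull: return current content; "version" is this sketch's addition.
            result = {"contents": [{"uri": params["uri"], "text": self.content}],
                      "version": self.version}
        elif method == "resources/subscribe":
            self.subscribers.append(notify)
            result = {}
        else:
            return {"jsonrpc": "2.0", "id": request["id"],
                    "error": {"code": -32601, "message": "Method not found"}}
        return {"jsonrpc": "2.0", "id": request["id"], "result": result}

    def push_commit(self, content):
        # Push: bump the version and notify every subscriber of the change.
        self.content, self.version = content, self.version + 1
        note = {"jsonrpc": "2.0", "method": "notifications/resources/updated",
                "params": {"uri": REPO_URI}}
        for callback in self.subscribers:
            callback(note)


class ToyClient:
    def __init__(self, server):
        self.server = server
        self.next_id = 0
        self.cached_version = 0
        self.context = None

    def call(self, method, params):
        self.next_id += 1
        request = {"jsonrpc": "2.0", "id": self.next_id,
                   "method": method, "params": params}
        wire = json.loads(json.dumps(request))  # round-trip to mimic a wire format
        return self.server.handle(wire, notify=self.on_notification)["result"]

    def pull(self):
        result = self.call("resources/read", {"uri": REPO_URI})
        if result["version"] > self.cached_version:  # ignore stale data
            self.cached_version = result["version"]
            self.context = result["contents"][0]["text"]

    def on_notification(self, note):
        # A change notification triggers a fresh pull of the resource.
        if note["method"] == "notifications/resources/updated":
            self.pull()


server = ToyServer()
client = ToyClient(server)
handshake = client.call("initialize", {
    "protocolVersion": PROTOCOL_VERSION,
    "capabilities": {},
    "clientInfo": {"name": "toy-client", "version": "0.1"},
})
client.call("resources/subscribe", {"uri": REPO_URI})
client.pull()
context_before = client.context
server.push_commit("commit b2")
context_after = client.context
```

In this sketch the notification carries only the resource URI, and the client fetches the new content itself. That keeps notifications small and lets the version check decide whether the cached context actually needs replacing.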
MCP vs Traditional Message Passing
Aspect | MCP | Traditional Message Passing |
Context-Awareness | Keeps track of context over time, enabling more informed interactions. | Treats each message independently without memory of past exchanges. |
Standard Interfaces | Unifies how models access data and tools across services, minimizing custom integration. | Needs tailored APIs and data handling for each use case. |
Stateful vs Stateless | Maintains active sessions that preserve context and reduce redundancy. | Reinitializes context for every request, leading to inefficiencies. |
Modularity | Allows new context sources to be integrated easily without modifying the agent logic. | Often hardcodes integration logic, reducing flexibility. |
Efficiency | Uses subscriptions and context caching to minimize unnecessary data transfer. | Typically lacks optimization for dynamic, changing data. |
Security & Privacy in MCP
Encrypted Transport: All MCP communication should use secure channels like TLS to prevent data interception or tampering. Encryption ensures context remains confidential in transit.
Access Control: The host mediates every context interaction, requiring explicit user consent and policy checks. Only authorized agents can fetch or manipulate data.
Trust Management: Clients connect only to verified and approved servers. Identifiers, authentication tokens, and signed metadata help prevent unauthorized access.
Context Isolation: Each server is sandboxed; it only sees the part of the conversation relevant to its resources. This prevents cross-server data leakage and maintains compartmentalization.
Mitigation Strategies: Data validation, rate limiting, and sandboxing of tool invocations help mitigate risks. Logging and audit trails offer traceability for any misuse or breach.
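As a rough sketch of host-mediated access control with an audit trail, the snippet below checks every fetch against a per-agent policy and logs the outcome. The `Host` class, the prefix-based policy format, the URIs, and the sample values are all invented for illustration and are not defined by MCP.

```python
from datetime import datetime, timezone


class Host:
    """Mediates every context fetch and records it (illustrative only)."""

    def __init__(self, policy):
        self.policy = policy  # agent name -> set of allowed URI prefixes
        self.audit_log = []

    def fetch(self, agent, uri, store):
        allowed = any(uri.startswith(prefix)
                      for prefix in self.policy.get(agent, ()))
        # Log every attempt, allowed or not, for later traceability.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "agent": agent,
            "uri": uri,
            "allowed": allowed,
        })
        if not allowed:
            raise PermissionError(f"{agent} may not read {uri}")
        return store[uri]


# Made-up sample data and policy.
store = {
    "records://patient/17/labs": "HbA1c 6.1%",
    "billing://patient/17/invoice": "USD 240",
}
host = Host({"diagnostics-agent": {"records://"}})

labs = host.fetch("diagnostics-agent", "records://patient/17/labs", store)
try:
    host.fetch("diagnostics-agent", "billing://patient/17/invoice", store)
    denied = False
except PermissionError:
    denied = True
```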
The reference architecture below shows how multi-agent communication can be implemented in a healthcare solution for patient diagnostic management, using Azure-native services and an Agentic AI engine.
Figure: Reference architecture for healthcare agent communication using MCP on the Azure platform
Features of MCP
- Context Management: MCP excels in managing different contexts, allowing agents to operate efficiently within their specialized domains.
- Interoperability: It ensures interoperability among diverse agents by standardizing communication protocols and data formats.
- Scalability: MCP supports the scalability of systems by enabling the integration of new agents without disrupting the existing setup.
- Fault Tolerance: The protocol incorporates mechanisms for fault detection and recovery, enhancing the reliability of multi-agent systems.
Agent-to-Agent (A2A) Protocol
The Agent-to-Agent (A2A) protocol is another critical communication framework, emphasizing direct interaction between agents. The A2A protocol facilitates real-time communication and decision-making, making it suitable for dynamic and rapidly changing environments.
The diagram below shows a reference architecture for a retail supply chain use case built on the Azure platform, using the A2A protocol for agentic integration. The example uses LangGraph, a library built on top of LangChain (an open-source framework for developing LLM applications), to build the multi-agent application. Integrating LangGraph with the Azure platform offers a practical approach to managing retail supply chains: with multi-agent solutions, businesses can improve efficiency, responsiveness, and customer satisfaction.
Figure: Reference architecture of the A2A protocol for multi-agent integration in a retail supply chain use case on the Azure platform
Features of A2A Protocol
- Direct Communication: A2A enables direct message exchanges between agents, promoting quick and efficient interactions.
- Real-Time Processing: The protocol supports real-time data processing, crucial for time-sensitive applications.
- Flexibility: A2A offers flexibility in communication patterns, allowing agents to adapt to various interaction scenarios.
- Simplicity: The straightforward design of the A2A protocol reduces the complexity involved in establishing inter-agent communications.
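The direct, broker-free style of interaction described above can be sketched with two in-process agents from the retail supply chain scenario. This does not reproduce the A2A wire format: the `AgentCard` fields, the task dictionary layout, the agent classes, and the reorder point are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentCard:
    """Self-description an agent advertises; fields are illustrative."""
    name: str
    skills: tuple


class InventoryAgent:
    card = AgentCard(name="inventory", skills=("check_stock",))

    def __init__(self):
        self._stock = {"SKU-42": 3}  # made-up stock level

    def handle(self, task):
        if task["skill"] not in self.card.skills:
            return {"task_id": task["task_id"], "state": "failed", "output": None}
        units = self._stock.get(task["input"], 0)
        return {"task_id": task["task_id"], "state": "completed", "output": units}


class ReplenishmentAgent:
    REORDER_POINT = 5  # assumed threshold

    def __init__(self, peer):
        self.peer = peer

    def decide(self, sku):
        task = {"task_id": "t-1", "skill": "check_stock", "input": sku}
        result = self.peer.handle(task)  # direct call to the peer, no broker
        if result["state"] != "completed":
            return "escalate"
        return "reorder" if result["output"] < self.REORDER_POINT else "hold"


inventory = InventoryAgent()
decision = ReplenishmentAgent(inventory).decide("SKU-42")
unsupported = inventory.handle({"task_id": "t-2", "skill": "forecast", "input": "SKU-42"})
```

The requesting agent can inspect the peer's advertised skills before sending work, and the peer reports an explicit task state so the caller can react to failures immediately.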
Industrial Examples and Use Cases
Manufacturing Automation
In manufacturing automation, multi-agent systems are employed to coordinate the activities of robotic arms, conveyor belts, and quality inspection units. MCP ensures that each component operates in its context while sharing critical operational data, leading to optimized production lines.
Smart Grids
Smart grids use MAS to manage the distribution and consumption of electricity. Agents representing different grid components, such as substations and consumers, communicate via the A2A protocol to balance supply and demand dynamically, ensuring efficient energy use.
Healthcare
In healthcare, multi-agent systems facilitate patient monitoring, diagnostics, and treatment planning. Agents representing medical devices, healthcare providers, and patients interact through MCP, enabling real-time data exchange and collaborative decision-making.
Comparison of MCP and A2A Protocol
Feature | MCP | A2A Protocol |
Context Management | Excellent | Limited |
Interoperability | High | Moderate |
Scalability | High | Moderate |
Fault Tolerance | Strong | Basic |
Direct Communication | No | Yes |
Real-Time Processing | Moderate | High |
Flexibility | Moderate | High |
Simplicity | Moderate | High |
Future Trends in Multi-Agent Solutions
The future of multi-agent solutions and communication interfaces is poised for significant advancements. Several trends are expected to shape this domain:
Enhanced Interoperability
Future protocols will likely focus on improving interoperability among heterogeneous agents, facilitating seamless integration and collaboration.
AI-Driven Coordination
The incorporation of AI techniques in coordination algorithms will enable more intelligent and autonomous agent interactions, enhancing decision-making processes.
Scalability and Robustness
Advancements in scalable and robust communication frameworks will support the deployment of larger and more complex multi-agent systems across various industries.
Standardization
The development of standardized protocols and interfaces will simplify the deployment and maintenance of multi-agent systems, promoting wider adoption.
Conclusion
Multi-agent communication protocols like MCP and A2A are essential for the efficient operation of Generative AI and Agentic AI solutions. As industries continue to adopt multi-agent systems, the need for robust, scalable, and interoperable communication protocols will grow. By understanding the features and applications of these protocols, stakeholders can better leverage the capabilities of multi-agent systems to address complex challenges and drive innovation.