Agent2Agent (A2A) Protocol Deep Dive: Interoperability for Agents

In short

The A2A protocol enables interoperability between AI agents via a shared JSON-RPC based standard.
Agents discover each other through a standardized `/.well-known/agent-card.json` descriptor listing capabilities, skills, and security.
Agents communicate in one of two ways: synchronous request/response with polling, or real-time streaming with SSE.
Authentication is flexible with multiple schemes (API Key, OAuth2, etc.).

With many users and companies building agents, a new challenge arises: interoperability. How can agents communicate with each other when they are built on different platforms and frameworks? Google announced the Agent2Agent (A2A) protocol a few months ago to tackle this issue. In this article, I explain why a shared protocol matters and how the A2A protocol works.

To learn about creating agent architectures, read my previous article here.

The interoperability problem without a shared agent protocol

Different frameworks are being used to build AI agents, which makes it difficult for these agents to communicate with each other by default. Consider the early days of the internet: there were many different networks that could not communicate with each other. The solution was a shared protocol (TCP/IP) that allowed different networks to connect and communicate. The same principle applies to AI agents. Without a shared protocol, agents built on different frameworks cannot easily communicate, leading to fragmentation and inefficiency. This makes adoption harder and limits the potential of multi-agent systems.

Why did function calling take off with MCP?

The rise of the Model Context Protocol (MCP) feels like it happened overnight. Anthropic released the protocol in 2024, at a time when most AI labs already supported function calling. MCP standardized the communication between language models and tools, so models could call functions in a uniform way and integrate more easily with external tools and services. The timing was perfect: within a few weeks of the launch, the number of MCP servers exploded, and many developers started to build tools and agents on top of MCP. This is a perfect example of how a protocol can lead to faster adoption and ecosystem growth. More about the MCP protocol can be found here.

The MCP protocol is not complex. It is a simple set of rules that defines how messages should be structured and exchanged between language models and tools. There is no magic behind it. It uses a widely adopted format (JSON-RPC) that is easy to understand and implement. Many frameworks created a wrapper around MCP to make it even easier for developers to integrate it into their applications.

What is the Agent2Agent (A2A) protocol?

When building a multi-agent system scoped to a single application, a protocol like A2A is not needed. It would be overkill. But when building such a system within an organization, for example, or when making an agent publicly available to other agents, a protocol is needed, especially when the agents are developed by different departments, companies or individuals. Compare it with websites: every website is built by different people and companies, but they can all link to each other because they follow the same protocol (HTTP/HTTPS). The A2A protocol is a simple set of rules that defines how agents should communicate and discover each other. The goal of A2A is to enable interoperability between agents built on different frameworks and platforms. Let’s imagine a use case of agents within a company, where different departments have their own agents that should work together.

The sales agent detects a surge in orders for SKU-X. It asks the marketing agent to slow down ads, the recruitment agent to open temporary roles to handle the surge, and the IT agent to scale the infrastructure and prepare devices. The recruitment agent in turn asks the IT agent to pre-provision accounts for the new hires.

sequenceDiagram
    autonumber
    participant Sales as SalesAgent
    participant Mkt as MarketingAgent
    participant Rec as RecruitmentAgent
    participant IT as ITAgent

    Sales->>Mkt: Slow SKU-X ads ~30% in NL/DE (24h)
    Mkt-->>Sales: Done. Watching and adjusting.

    Sales->>Rec: Open 10 temp roles. Fast-track hiring.
    Rec-->>Sales: Reqs posted. Screening underway.
    Rec->>Sales: We can handle it with 7 senior associates instead of 10.
    Sales-->>Rec: Good. Proceed with 7 seniors.
    Rec-->>Sales: Adjusted plan confirmed.

    Sales->>IT: Scale checkout & WMS to 3×. Prep 10 seats/devices.
    IT-->>Sales: Scaling in progress. Onboarding kits queued.

    Rec->>IT: 5 candidates cleared. Please pre-provision accounts.
    IT-->>Rec: Accounts created. Device pickup scheduled.

    IT-->>Sales: Infra scaled. Performance on target.
    Rec-->>Sales: 7 hires start Friday.

    Sales->>Mkt: Lift throttle gradually.
    Mkt-->>Sales: Throttle removed. Monitoring continues.

Discovering agents via Agent Cards

In order for agents to communicate with each other, they must be aware of each other’s existence and capabilities. This is where agent discovery comes into play. With A2A, each agent exposes an endpoint `/.well-known/agent-card.json` that describes the agent and its capabilities. The URL of this endpoint can be stored in a registry or perhaps discovered via search engines. Within an organization, a private registry can be used.

{
  "name": "Goodbye Agent",
  "description": "A simple agent that says goodbye.",
  "protocolVersion": "0.3.0",
  "version": "0.1.0",
  "url": "http://localhost:4002/",
  "capabilities": {
    "streaming": false,
    "pushNotifications": false,
    "stateTransitionHistory": false
  },
  "defaultInputModes": [
    "text/plain"
  ],
  "defaultOutputModes": [ 
    "text/plain"
  ],
  "skills": [
    {
      "id": "farewell",
      "name": "Farewell",
      "description": "Say goodbye to users",
      "tags": [
        "goodbye",
        "farewell"
      ]
    }
  ],
  "securitySchemes": {
    "apiKey": {
      "type": "apiKey",
      "in": "header",
      "name": "X-API-Key"
    }
  },
  "security": [
    {
      "apiKey": []
    }
  ]
}
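
Before sending anything, a client could load such a card and check what the agent offers. The following is a minimal Python sketch under a few assumptions: the helper names `agent_card_url` and `summarize_card` are made up for this illustration, the card is passed in as an already-downloaded JSON string, and only fields that appear in the example card above are read.

```python
import json
from urllib.parse import urljoin

WELL_KNOWN_PATH = "/.well-known/agent-card.json"


def agent_card_url(base_url: str) -> str:
    """Return the well-known agent card URL for an agent's base URL."""
    return urljoin(base_url, WELL_KNOWN_PATH)


def summarize_card(card_json: str) -> dict:
    """Extract the fields a client would likely check before calling the agent."""
    card = json.loads(card_json)
    capabilities = card.get("capabilities", {})
    return {
        "name": card["name"],
        "endpoint": card["url"],
        "streaming": bool(capabilities.get("streaming", False)),
        "skills": [skill["id"] for skill in card.get("skills", [])],
        "auth_schemes": sorted(card.get("securitySchemes", {})),
    }


# Hypothetical usage with a trimmed copy of the card above.
example_card = """
{
  "name": "Goodbye Agent",
  "url": "http://localhost:4002/",
  "capabilities": {"streaming": false},
  "skills": [{"id": "farewell", "name": "Farewell"}],
  "securitySchemes": {"apiKey": {"type": "apiKey", "in": "header", "name": "X-API-Key"}}
}
"""
summary = summarize_card(example_card)
print(agent_card_url("http://localhost:4002/"))
print(summary)
```

Because `streaming` is `false` for this agent, a client reading the summary would fall back to request/response with polling.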

Choosing sync or async messaging between agents

After discovering each other, agents can start communicating. There are two main ways for agents to communicate: synchronous polling (request/response) or real-time streaming with Server-Sent Events (SSE). Let’s start with polling.

Polling with Request/Response

sequenceDiagram
    autonumber
    participant C as Client
    participant A as Agent

    C->>A: message/send {text: "Do X"}
    A-->>C: Task created {taskId, status=submitted}

    rect rgba(240,240,255,0.5)
    Note over C,A: Client polls for task updates
    C->>A: tasks/get {taskId}
    A-->>C: status=submitted

    C->>A: tasks/get {taskId}
    A-->>C: status=working
    A->>A: Performing work

    C->>A: tasks/get {taskId}
    A-->>C: status=working
    A->>A: Still processing
    end

    C->>A: tasks/get {taskId}
    A-->>C: status=completed (final=true)
    Note over C,A: Terminal state reached - stop polling

This is based on the request/response pattern. The client sends a request to the agent and waits for a response. If the agent needs more time to complete the task, it returns a taskId that the client can use to poll for updates. A task can be in one of the following states:

Status	Description
submitted	The task has been created or queued but has not started execution yet.
working	The agent is actively executing the task.
input-required	The task is paused, waiting for user or external input to continue.
auth-required	The task is waiting for authentication or authorization before proceeding.
completed	The task finished successfully.
failed	The task encountered an unrecoverable error during execution.
canceled	The task was explicitly canceled by the user, system, or another agent.
rejected	The task was not accepted or could not start due to invalid parameters or permissions.
unknown	The current state of the task cannot be determined.

Of course, an agent does not have to use every state. However, it’s important that the client can handle all of them.
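
The polling loop from the diagram can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference client: `poll_task` and the `get_state` callable are hypothetical names, `get_state` stands in for one `tasks/get` round trip, and the split into terminal and interrupted states is my reading of the table above.

```python
import time
from typing import Callable

# Assumption: these four states end a task for good.
TERMINAL_STATES = {"completed", "failed", "canceled", "rejected"}
# Assumption: in these states the task waits for the client, so polling alone will not help.
INTERRUPTED_STATES = {"input-required", "auth-required"}


def poll_task(get_state: Callable[[], str], interval: float = 1.0,
              max_polls: int = 60) -> str:
    """Call get_state() until the task stops progressing on its own.

    get_state is a hypothetical callable that performs one tasks/get
    request and returns the task's state string.
    """
    for _ in range(max_polls):
        state = get_state()
        if state in TERMINAL_STATES or state in INTERRUPTED_STATES:
            return state
        # submitted, working and unknown: wait and ask again
        time.sleep(interval)
    raise TimeoutError(f"task still not finished after {max_polls} polls")


# Simulated agent: returns the states from the diagram, one per poll.
states = iter(["submitted", "working", "working", "completed"])
final_state = poll_task(lambda: next(states), interval=0.0)
print(final_state)  # completed
```

Passing the request function in as a callable keeps the loop independent of any HTTP library, and the `max_polls` limit prevents a client from polling forever when an agent never reaches a final state.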

Let’s look at an example of the request/response pattern. Below is an example of a client sending a message to an agent.

HTTP POST http://localhost:4002 
Content-Type: application/json

{
  "jsonrpc": "2.0",
  "method": "message/send",
  "params": {
    "message": {
      "messageId": "68e13899-72e7-4b64-9114-9854f3470555",
      "role": "user",
      "parts": [
        {
          "kind": "text",
          "text": "See you soon!"
        }
      ],
      "kind": "message"
    }
  },
  "id": 1
}

As you can see, the A2A protocol is based on JSON-RPC. More about JSON-RPC can be found in a previous article here.
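
A request like the one above is easy to build programmatically. The sketch below mirrors the example field by field; the function name `build_message_send` and the module-level id counter are assumptions made for this illustration, and actually sending the bytes over HTTP is left out.

```python
import json
import uuid
from itertools import count

_request_ids = count(1)  # JSON-RPC ids only need to be unique per client


def build_message_send(text: str) -> dict:
    """Build a JSON-RPC 2.0 message/send request carrying one text part."""
    return {
        "jsonrpc": "2.0",
        "method": "message/send",
        "params": {
            "message": {
                "messageId": str(uuid.uuid4()),  # fresh id for every message
                "role": "user",
                "parts": [{"kind": "text", "text": text}],
                "kind": "message",
            }
        },
        "id": next(_request_ids),
    }


request = build_message_send("See you soon!")
body = json.dumps(request).encode("utf-8")  # bytes to send as the POST body
print(json.dumps(request, indent=2))
```

The JSON-RPC `id` ties a response to its request, while `messageId` identifies the message itself, which is why the two are generated separately.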

The `method` property defines the action that should be performed. In this case, the client wants to send a message to the agent. A few other methods that can be used:

message/send
message/stream
tasks/get
tasks/cancel
…

The agent processes the request and returns a response. Again, the agent decides how long processing takes and whether to return a Task or a Message object.

When a Task object is returned, the client can poll for updates using the taskId. The status indicates the current state of the task.

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "kind": "task",
    "id": "e8d72fd6-9094-4ec5-bf50-161a0568596b",
    "contextId": "03fadb83-b896-4ba6-8ef1-05035c24265b",
    "status": {
      "state": "working",
      "timestamp": "2025-10-06T17:37:46.398Z"
    },
    "history": [
      {
        "messageId": "68e13899-72e7-4b64-9114-9854f3470555",
        "role": "user",
        "parts": [
          {
            "kind": "text",
            "text": "See you soon!"
          }
        ],
        "kind": "message"
      }
    ]
  }
}

When returning a message object, the client can process the message directly without polling.

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "kind": "message",
    "messageId": "b82c5e91-5358-44cc-964e-b66cb8f93779",
    "role": "agent",
    "parts": [
      {
        "kind": "text",
        "text": "Goodbye! I'm Agent 2. See you later!"
      }
    ],
    "contextId": "45966128-8866-4d24-823e-9648f15affe5"
  }
}
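
Since the agent decides which of the two objects it returns, the client has to branch on the `kind` field of the result. A minimal sketch, assuming the response shapes shown above; the function name `interpret_response` and the `("poll", ...)` / `("reply", ...)` return convention are made up for this illustration.

```python
def interpret_response(response: dict) -> tuple[str, str]:
    """Classify a message/send response.

    Returns ("poll", task_id) for a Task, ("reply", text) for a Message,
    and raises on a JSON-RPC error or an unexpected result kind.
    """
    if "error" in response:
        err = response["error"]
        raise RuntimeError(f"JSON-RPC error {err.get('code')}: {err.get('message')}")

    result = response["result"]
    kind = result.get("kind")
    if kind == "task":
        return ("poll", result["id"])
    if kind == "message":
        # Concatenate all text parts; other part kinds are ignored here.
        text = "".join(p["text"] for p in result["parts"] if p.get("kind") == "text")
        return ("reply", text)
    raise ValueError(f"unexpected result kind: {kind!r}")


# Abbreviated versions of the two responses above.
task_response = {
    "jsonrpc": "2.0", "id": 1,
    "result": {"kind": "task", "id": "e8d72fd6-9094-4ec5-bf50-161a0568596b",
               "status": {"state": "working"}},
}
message_response = {
    "jsonrpc": "2.0", "id": 1,
    "result": {"kind": "message", "role": "agent",
               "parts": [{"kind": "text", "text": "Goodbye! I'm Agent 2. See you later!"}]},
}
print(interpret_response(task_response))
print(interpret_response(message_response))
```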

Real-time communication with SSE (Server-Sent Events)

sequenceDiagram
    autonumber
    participant C as Client
    participant A as Agent

    C->>A: message/stream {text: "Do X"}
    A-->>C: ack {taskId, status=submitted}
    Note over C,A: SSE stream established. Agent will push updates.

    rect rgba(240,240,255,0.5)
    Note over C,A: Agent pushes progress via SSE
    A-->>C: status-update state=working
    A-->>C: artifact-update chunk=1
    A-->>C: artifact-update chunk=2
    A-->>C: status-update state=working
    end

    A-->>C: status-update state=completed final=true
    Note over C,A: Terminal state reached. Agent closes stream and client stops listening.

The A2A protocol supports SSE for real-time communication between agents. This allows agents to receive updates and notifications without having to constantly poll for changes. Let’s look at an example of how it works.

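The client sends a `message/stream` request to the agent and keeps the HTTP connection open to listen for events. The request body is still JSON, so its content type is `application/json`; the `Accept` header signals that the client expects an event stream, and the agent answers with the content type `text/event-stream`.

HTTP POST http://localhost:4002
Content-Type: application/json
Accept: text/event-stream
connection: keep-alive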

{
  "jsonrpc": "2.0",
  "method": "message/stream",
  "params": {
    "message": {
      "messageId": "87b75def-a784-4027-9181-cf322c6fee0b",
      "role": "user",
      "parts": [
        {
          "kind": "text",
          "text": "Hi there!"
        }
      ],
      "kind": "message"
    }
  },
  "id": 1
}

Once the client is connected, the agent can send events to the client as they occur. Usually, this means that the client requests a task and the agent sends updates as the task progresses.

{
  "kind": "task",
  "id": "08b1bdcb-90ac-4d45-80bb-498f14d581d1",
  "contextId": "c136eceb-3a39-4f28-88bc-58f5090a7d0a",
  "status": {
    "state": "working",
    "timestamp": "2025-10-04T11:18:03.794Z"
  }
}
---
{
  "taskId": "67c0503f-e8b7-4ef6-b633-c881deecd565",
  "kind": "message",
  "messageId": "d51328ce-a066-4489-bc2f-2029638c5fa9",
  "role": "agent",
  "parts": [
    {
      "kind": "text",
      "text": "Hello! I'm Agent 1. Nice to meet you!"
    }
  ],
  "contextId": "00d503fb-71ce-4e89-86d6-7ea0cc6ece4a"
}
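
On the wire, each of these objects arrives as one Server-Sent Event: one or more `data:` lines followed by a blank line. The sketch below shows how a client might turn such a stream into Python dictionaries. It is a simplified parser, not a full SSE implementation; the name `iter_sse_json` is hypothetical, and the payloads are assumed to be the bare objects shown above. If an agent wraps them in a JSON-RPC response, the object would sit under a `result` key instead.

```python
import json
from typing import Iterable, Iterator


def iter_sse_json(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of each Server-Sent Event in a line stream.

    Follows the basic SSE framing: "data:" lines accumulate, a blank line
    ends the event, and lines starting with ":" are comments.
    """
    data_lines: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data_lines:
                yield json.loads("\n".join(data_lines))
                data_lines = []
        elif line.startswith(":"):
            continue  # comment, often used as a keep-alive
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the value.
            data_lines.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event:, id:, retry:) are ignored in this sketch.
    if data_lines:
        # Lenient choice: also emit an event the stream did not terminate.
        yield json.loads("\n".join(data_lines))


# Simulated stream with two events and one keep-alive comment.
sample = [
    'data: {"kind": "task", "status": {"state": "working"}}\n',
    "\n",
    ": keep-alive\n",
    'data: {"kind": "message", "role": "agent"}\n',
    "\n",
]
events = list(iter_sse_json(sample))
print([event["kind"] for event in events])  # ['task', 'message']
```

Because the function takes any iterable of lines, it can be fed from a list in a test or from the line iterator of an open HTTP response.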

Agent security and authentication

As shown in the agent card, each agent declares its own security schemes, which tell clients how to authenticate to the agent. The scheme types follow the OpenAPI security scheme definitions:

http	For HTTP authentication schemes such as Basic, Bearer (JWT) and other standardized HTTP auth methods. Use the `scheme` and `bearerFormat` fields to describe specifics.
apiKey	For API keys sent in headers, query parameters or cookies (e.g., `X-API-Key` header or a session cookie).
oauth2	For OAuth 2.0 flows (authorization code, implicit, client credentials, password) used to obtain access tokens.
openIdConnect	For OpenID Connect discovery and authentication metadata endpoints (OIDC discovery URL).
mutualTLS	For mutual TLS authentication where both client and server authenticate each other using X.509 certificates.

Let’s look at an example of how to authenticate to an agent using an API key.

HTTP POST http://localhost:4002
Content-Type: application/json
connection: keep-alive
X-API-Key: secret

The agent will validate the API key and allow or deny access.
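
A client does not have to hard-code the header name, because the agent card already states it. The following sketch derives the header from the `securitySchemes` and `security` entries of a card like the one shown earlier. The function name `api_key_headers` is hypothetical, and only a header-based `apiKey` scheme is handled; the other scheme types would need their own logic.

```python
def api_key_headers(card: dict, api_key: str) -> dict:
    """Return the HTTP headers for an agent that requires a header API key.

    Walks the card's "security" requirements and returns headers for the
    first referenced scheme that is an apiKey sent in a header.
    """
    schemes = card.get("securitySchemes", {})
    for requirement in card.get("security", []):
        for scheme_name in requirement:
            scheme = schemes.get(scheme_name, {})
            if scheme.get("type") == "apiKey" and scheme.get("in") == "header":
                return {scheme["name"]: api_key}
    raise ValueError("agent card does not declare a header-based apiKey scheme")


# The security part of the example agent card.
card = {
    "securitySchemes": {
        "apiKey": {"type": "apiKey", "in": "header", "name": "X-API-Key"}
    },
    "security": [{"apiKey": []}],
}
headers = api_key_headers(card, "secret")
print(headers)  # {'X-API-Key': 'secret'}
```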

Final thoughts

The MCP protocol was adopted quickly because it solved a real problem: standardizing the communication between language models and tools. The A2A protocol tries to do the same for communication between agents. With more multi-agent systems being built, a protocol is needed to enable interoperability between agents. A2A can play an important role. However, the success of A2A depends on adoption. The more agents that support A2A, the more valuable it becomes. The community embraced MCP quickly with the rise of function calling. Let’s see if A2A can do the same for agents.