ACP: The Agent Connectivity Standard for the AI Era

ACP (Agent Client Protocol) is an open protocol that standardizes the communication between IDEs/UIs and AI Agents, solving the N×M problem of agent-tool integration.

2026-04-25 · 西山懒懒翁

1. Let's Start with a Real Scenario

Suppose you want to build your own AI product — a customized code assistant, an internal knowledge-base Agent, a Coding Copilot that fits your team workflow, or simply a personal assistant that auto-writes your daily report, runs scripts, and manages projects on your behalf.

You will quickly face a real question: should you implement a Code Agent from scratch?

Building one yourself means handling prompt engineering, tool invocation, context management, sandbox execution, and permission systems — all of which products like Claude Code, Codex, and Gemini CLI have already polished countless times.
These leading Agents are extremely capable at code understanding and execution, so reinventing the wheel will almost certainly underperform direct reuse.

The more reasonable approach is to treat mature Agents like Claude Code, Codex, or Gemini CLI as the "engine," and only build the upper-layer interaction shell and business orchestration yourself.

But the problem follows immediately:

Claude Code has its own CLI and interaction model
Codex has its own SDK and invocation protocol
Gemini CLI is yet another set of conventions
To run them inside your self-built web UI / internal IDE plugin / automated workflow, every integration needs a custom adapter.

On the other hand, these Agent vendors also struggle: every new IDE, every new platform means doing the front-end integration over again.

This is the classic N × M problem:

Agent A ─┐                  ┌─ Your Product X
Agent B ─┼─ N × M adapters ─┼─ Custom IDE Y
Agent C ─┘                  └─ Automation Platform Z

The usual software engineering answer is to extract a standard protocol in the middle, which compresses N × M adapters into N + M implementations:

Agent A ─┐                  ┌─ Your Product X
Agent B ─┼────── ACP ───────┼─ Custom IDE Y
Agent C ─┘                  └─ Automation Platform Z

ACP is that layer. It lets you build only the "shell," and reuse the strongest Agent "kernels" in the ecosystem — this is its most pragmatic value at this point in time.
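The arithmetic behind that compression is easy to check. Here is a minimal sketch; the function names are illustrative and not part of ACP:

```python
def adapters_without_protocol(n_agents: int, m_clients: int) -> int:
    """Every agent needs one custom adapter per client it wants to run in."""
    return n_agents * m_clients


def adapters_with_protocol(n_agents: int, m_clients: int) -> int:
    """Each agent and each client implements the shared protocol exactly once."""
    return n_agents + m_clients


# Compare the integration effort for a few ecosystem sizes.
for n, m in [(3, 3), (5, 10), (10, 20)]:
    print(f"{n} agents x {m} clients: "
          f"{adapters_without_protocol(n, m)} adapters vs "
          f"{adapters_with_protocol(n, m)} protocol implementations")
```

The gap widens as the ecosystem grows: with 3 agents and 3 clients it is 9 versus 6, while with 10 agents and 20 clients it is 200 versus 30.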

Another practical benefit that is easy to miss: ACP lets you use Claude Code / Codex "without going through the API."
Building your own Agent on top of a model API means every thought, every tool_call, and every long-context lookback is billed by the token; run a few heavy jobs and your monthly bill easily reaches three figures in USD. Products like Claude Code and Codex, however, come with a subscription quota (for example Claude Pro or Max, or ChatGPT Plus). When you "borrow" them into your own shell via ACP, you are drawing from a fixed monthly subscription quota rather than a pay-per-use API.
For individuals and small teams, this price difference alone is almost enough reason to choose ACP.

2. What is ACP

In one sentence: ACP (Agent Client Protocol) is an open protocol that standardizes the communication between IDE/UI and AI Agent.

A few key facts:

Full name: Agent Client Protocol
Official site: https://agentclientprotocol.com/
Led by: the Zed editor team
Positioning: extract a standard interface between "Agent capability" and "product form"
Key property: bidirectional — not only can the UI call the Agent, the Agent can also request UI capabilities back (reading files, requesting authorization, streaming its thought process, etc.)

Build Intuition with an Analogy

If you are familiar with any of the following protocols, ACP has a direct counterpart:

Protocol	Problem it solves	Analogy
HTTP	Browser ↔ Web service	Any browser accesses any website
LSP	Editor ↔ Language server	Any editor supports any language
MCP	LLM ↔ Tools	Any model calls any tool
ACP	IDE/UI ↔ Agent	Any IDE drives any Agent

If you come from a back-end background, think of it as "HTTP for Agents"; if you have written editor plugins, think of it as "LSP for Agents" — the structure is nearly identical.

The Most Common Confusion: ACP vs MCP

This is the question I'm asked the most, so I'll put it right at the front:

	MCP	ACP
Connects	LLM ↔ Tool	Client (IDE) ↔ Agent
Perspective	For the model	For the UI
Typical scenario	Claude calls GitHub API	Zed drives Claude Code to edit code
In one sentence	"Give the model hands"	"Give the Agent a face"

They are not substitutes; they are complements. A Coding Agent internally uses MCP to call tools, and externally uses ACP to talk to the IDE — there is no conflict at all.

3. What the ACP Protocol Specifies

At this point you may wonder: what does the ACP protocol actually specify? Is it complicated?

My answer: the ACP spec is not large. You can hold the whole thing in your head along three dimensions: roles, message model, and transport.

3.1 Two Roles

The ACP world has only two roles:

Role	Responsibility	Analogy
Agent	The one that actually does the work: receives prompts, manages sessions, pushes progress, and requests user authorization or IDE capabilities when needed	Backend service
Client	The one facing the user: sends prompts, renders the intermediate process pushed by the Agent, responds to the Agent's reverse requests	Frontend / Browser

When building an upper-layer product, you either implement the Client (reusing mature Agents like Claude Code / Codex), or you implement the Agent (exposing your own capabilities to ACP hosts like Zed). You rarely do both at once.

3.2 Bidirectional Message Model

What makes ACP most unlike a traditional HTTP API is that its message flow is bidirectional:

Traditional HTTP:
  Client ───── request ─────▶ Server
  Client ◀──── response ───── Server
ACP:
  Client ───── prompt ─────────▶ Agent        ← you ask the Agent to do something
  Client ◀──── read file / request auth ── Agent   ← Agent asks you for help
  Client ◀──── streaming thought / diff ── Agent   ← Agent keeps pushing the process to you
  Client ───── cancel ─────────▶ Agent        ← you can interrupt it anytime

In other words, the Agent is not a passive "request/response" API — it is a participant that can speak up. This is what makes ACP a good fit for the full picture of interactions like "AI pair programming": the thinking process must be visible, dangerous actions must require authorization, and the user must be able to interrupt mid-run.
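On the wire, these flows map onto JSON-RPC messages (the SDK section below mentions that JSON-RPC is what the connection abstractions hide). The sketch below shows the three JSON-RPC 2.0 message shapes and how a peer can tell them apart. The method names come from the flow in this article; the params payloads are simplified assumptions, not the official schema.

```python
import json
from typing import Any


def make_request(msg_id: int, method: str, params: dict[str, Any]) -> dict[str, Any]:
    """A request carries an id, so the peer must answer it."""
    return {"jsonrpc": "2.0", "id": msg_id, "method": method, "params": params}


def make_notification(method: str, params: dict[str, Any]) -> dict[str, Any]:
    """A notification has no id: fire-and-forget, used for streaming and cancel."""
    return {"jsonrpc": "2.0", "method": method, "params": params}


def make_response(msg_id: int, result: dict[str, Any]) -> dict[str, Any]:
    """A response echoes the id of the request it answers."""
    return {"jsonrpc": "2.0", "id": msg_id, "result": result}


def classify(message: dict[str, Any]) -> str:
    """Decide what kind of JSON-RPC message this is from its fields alone."""
    if "method" in message:
        return "request" if "id" in message else "notification"
    if "result" in message or "error" in message:
        return "response"
    return "invalid"


# Both sides can originate requests, which is what makes the flow bidirectional.
# Payload fields below are illustrative placeholders.
client_to_agent = make_request(1, "session/prompt", {"text": "change foo to bar"})
agent_to_client = make_request(7, "fs/read_text_file", {"path": "src/main.py"})
progress = make_notification("session/update", {"kind": "thought", "text": "reading files"})
cancel = make_notification("session/cancel", {})

for msg in (client_to_agent, agent_to_client, progress, cancel):
    print(classify(msg), json.dumps(msg))
```

The point of the sketch is that "who may ask whom" is not baked into the message format: either peer can send a request, and streaming progress is just a notification that expects no reply.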

3.3 What a Complete Interaction Looks Like

Let us string the above together with a complete flow of "user asks the Agent to modify code":

Client (your product)                              Agent (Claude Code, etc.)
    │                                               │
    │────────── initialize ───────────────────────▶│   negotiate protocol version & capabilities
    │◀───────── initialize response ───────────────│
    │                                               │
    │────────── session/new ──────────────────────▶│   open a session
    │◀───────── session_id ────────────────────────│
    │                                               │
    │────────── session/prompt ───────────────────▶│   user: "help me change foo to bar"
    │                                               │
    │◀───────── fs/read_text_file ─────────────────│   Agent: let me read the relevant files first
    │────────── file content ─────────────────────▶│
    │                                               │
    │◀───────── session/update (thought) ──────────│   stream the thinking process
    │◀───────── session/update (tool_call) ────────│   stream the tool calls
    │                                               │
    │◀───────── session/request_permission ────────│   "I want to edit this file, ok?"
    │────────── allow ────────────────────────────▶│
    │                                               │
    │◀───────── session/update (diff) ─────────────│   stream the diff
    │◀───────── session/prompt response ───────────│   task done
    │                                               │
    │────────── session/cancel (optional) ────────▶│   user cancels mid-run

Four key takeaways:

Bidirectional: the Agent is not just passively answering — it actively requests capabilities from the IDE / product
Streaming: session/update streams all the way through, so the user can see "what the AI is thinking"
Permission control: dangerous actions like file edits and shell commands ultimately require the user's approval; the Agent cannot overstep
Interruptible: the user can cancel at any time — the protocol supports it natively

In other words: if your product needs "AI thinking while working, asking permission before doing something dangerous, and being able to be stopped mid-task," ACP is almost a ready-made protocol template.
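To make the sequence concrete, here is a toy in-process simulation of the flow above. It is only a sketch: ToyClient and ToyAgent are hypothetical classes, no model or real transport is involved, and the return values "done" and "denied" are placeholders rather than protocol constants. It shows the Agent calling back into the Client for a file read, streaming updates, and asking permission before the edit.

```python
from dataclasses import dataclass, field


@dataclass
class ToyClient:
    """Stands in for your product: serves files, approves actions, renders updates."""
    files: dict[str, str]
    auto_allow: bool = True
    trace: list[str] = field(default_factory=list)

    def read_text_file(self, path: str) -> str:
        self.trace.append("fs/read_text_file")
        return self.files[path]

    def request_permission(self, description: str) -> bool:
        self.trace.append("session/request_permission")
        return self.auto_allow

    def on_update(self, kind: str, payload: str) -> None:
        self.trace.append(f"session/update ({kind})")


@dataclass
class ToyAgent:
    """Stands in for the Agent; its only skill is renaming foo to bar."""
    client: ToyClient
    sessions: int = 0

    def new_session(self) -> str:
        self.sessions += 1
        return f"sess-{self.sessions}"

    def prompt(self, session_id: str, path: str) -> str:
        source = self.client.read_text_file(path)           # reverse request
        self.client.on_update("thought", "rename foo to bar")
        self.client.on_update("tool_call", f"edit {path}")
        if not self.client.request_permission(f"edit {path}"):
            return "denied"                                  # no diff without approval
        self.client.on_update("diff", source.replace("foo", "bar"))
        return "done"


client = ToyClient(files={"main.py": "foo = 1\n"})
agent = ToyAgent(client)
result = agent.prompt(agent.new_session(), "main.py")
print(result)
print("\n".join(client.trace))
```

Setting auto_allow=False makes the run stop at the permission step, so no diff update is ever produced.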

3.4 Transport: Protocol and Transport Decoupled

ACP cleanly separates "what to say" (semantics) from "how to ship it" (transport). The same protocol can run on three transports:

Transport	Characteristics	Typical scenario
stdio	Subprocess + standard I/O, zero dependencies	Zed default: locally launches the Agent as a subprocess
Streamable HTTP	POST requests + SSE for reverse stream	Remote deployment; most friendly to corporate networks / firewalls
WebSocket	Full-duplex long connection	Remote deployment, low latency

For someone building a product, the choice is straightforward:

Local Agent + desktop product → stdio first; zero deployment, zero network setup
Remote Agent + web product / multi-device access → Streamable HTTP; SSE gets through corporate proxies and firewalls most easily
Long-connection needed / latency sensitive → WebSocket

Switching transport does not require changing business code — only the connection-init step. This is also one of the things that makes ACP more worthwhile than "rolling your own HTTP API."
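One way to picture this decoupling is a small transport interface that business code depends on. The sketch below is an assumption-laden illustration, not the SDK's API: Transport, InMemoryTransport, and send_prompt are hypothetical names, the one-JSON-document-per-line framing is what I assume a stdio-style transport uses, and the params shape is simplified.

```python
import json
from abc import ABC, abstractmethod
from collections import deque
from typing import Any, Optional


class Transport(ABC):
    """The only thing business code knows about: send one message, receive one."""

    @abstractmethod
    def send(self, message: dict[str, Any]) -> None: ...

    @abstractmethod
    def receive(self) -> Optional[dict[str, Any]]: ...


def encode_line(message: dict[str, Any]) -> bytes:
    """Serialize to a single line; json.dumps escapes newlines inside strings."""
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


def decode_line(line: bytes) -> dict[str, Any]:
    return json.loads(line.decode("utf-8"))


class InMemoryTransport(Transport):
    """Loopback transport for tests; a stdio or HTTP variant would replace only this class."""

    def __init__(self) -> None:
        self._wire = deque()

    def send(self, message: dict[str, Any]) -> None:
        self._wire.append(encode_line(message))

    def receive(self) -> Optional[dict[str, Any]]:
        return decode_line(self._wire.popleft()) if self._wire else None


def send_prompt(transport: Transport, session_id: str, text: str) -> None:
    """Business code: it never learns which transport it is talking through."""
    transport.send({
        "jsonrpc": "2.0",
        "id": 1,
        "method": "session/prompt",
        "params": {"sessionId": session_id, "prompt": [{"type": "text", "text": text}]},
    })


transport = InMemoryTransport()
send_prompt(transport, "sess-1", "change foo to bar")
print(transport.receive())
```

Swapping in another transport means writing another Transport subclass and changing the line that constructs it; send_prompt stays untouched.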

3.5 Beyond the Spec: Extension Mechanism

ACP reserves _-prefixed custom methods, so you can add private capabilities without breaking standard compatibility (for example, if your product has a "telemetry reporting" private protocol that you want to ship over the same connection). See the official extension docs. This lets ACP both "be a standard" and "be extensible."
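A rough sketch of how a peer might route such methods alongside standard ones is shown below. The Dispatcher class and the method name _example/telemetry are hypothetical; the only facts taken as given are the underscore prefix convention and the JSON-RPC 2.0 "Method not found" error code (-32601).

```python
from typing import Any, Callable

Handler = Callable[[dict[str, Any]], Any]
METHOD_NOT_FOUND = -32601  # JSON-RPC 2.0 "Method not found"


class Dispatcher:
    """Routes incoming requests; custom methods are kept apart from standard ones."""

    def __init__(self) -> None:
        self._standard: dict[str, Handler] = {}
        self._extensions: dict[str, Handler] = {}

    def register(self, method: str, handler: Handler) -> None:
        # The underscore prefix marks a private extension method.
        table = self._extensions if method.startswith("_") else self._standard
        table[method] = handler

    def dispatch(self, request: dict[str, Any]) -> dict[str, Any]:
        method = request["method"]
        handler = self._standard.get(method) or self._extensions.get(method)
        if handler is None:
            # A peer that does not know an extension answers with a standard
            # error instead of breaking the connection.
            return {
                "jsonrpc": "2.0",
                "id": request.get("id"),
                "error": {"code": METHOD_NOT_FOUND, "message": f"Method not found: {method}"},
            }
        return {"jsonrpc": "2.0", "id": request.get("id"),
                "result": handler(request.get("params", {}))}


dispatcher = Dispatcher()
dispatcher.register("_example/telemetry",
                    lambda params: {"accepted": len(params.get("events", []))})

print(dispatcher.dispatch({"jsonrpc": "2.0", "id": 1, "method": "_example/telemetry",
                           "params": {"events": ["open", "save"]}}))
print(dispatcher.dispatch({"jsonrpc": "2.0", "id": 2, "method": "_example/unknown"}))
```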

4. Ecosystem Status

Existing Implementations

Zed editor: native ACP support, the first major host
Claude Code Adapter: wraps Claude Code as an ACP Agent so it can run inside Zed
Gemini CLI Adapter: same as above
More adapters are emerging

A good sign for the ecosystem: the same Agent is starting to run in multiple IDEs, and the same IDE is starting to accept multiple Agents — N × M is becoming N + M.

Go SDK: eino-contrib/acp

The official implementations are mainly TypeScript and Rust. Because ByteDance internally uses a Go tech stack with the Eino Agent framework and needs to expose it as an ACP interface, we built a Go SDK: github.com/cloudwego/eino-extensions/acp

What it provides:

Bidirectional RPC abstraction: conn.ClientConnection / conn.AgentConnection hide JSON-RPC details
All three transports supported: stdio / Streamable HTTP / WebSocket, with HTTP & WS based on CloudWeGo Hertz
Remote server: server.ACPServer supports both HTTP and WS upgrade on the same route
Protocol extension: native support for _-prefixed custom methods
Code generation: types and methods are auto-generated from the official schema.json, so the SDK keeps up with protocol upgrades

Full example code is in the repository's examples/ directory, covering both Agent and Client sides.

5. Everyone Deserves Their Own AI Shell

In closing, I want to bring the perspective back to the individual.

In the AI era, everyone deserves to build their own "AI shell" — maybe a command-line tool, a local client, a web app, or even a script behind an IM bot. The form does not matter; what matters is taming AI into a shape that fits your own workflow and continuously multiplies your leverage.

In the past this was a "nice thought, but skip it" idea: front-end / back-end / deployment / permission setup — by the time you finished all that, you were already exhausted. But now Cursor / Claude Code / Codex can take an idea all the way to a runnable prototype, so the door to building your own shell is finally open.

ACP solves exactly this problem.

It lets you directly "borrow" the already-polished Agents like Claude Code, Codex, and Gemini CLI as replaceable kernels, while you only need to do the two things you understand best and are most worth doing yourself:

Workflow orchestration: how to string AI capabilities into your own workflow
Skills that fit your workflow: those "personal capabilities" that only you care about, only you know how to do right

This is a very favorable division of labor: the hardest, most resource-hungry "intelligence kernel" goes to the strongest products; the workflow and Skills that understand you best and cannot be replicated by anyone else — those stay with you.

And because it goes through a protocol, the Agent you run today is Claude Code; tomorrow when a stronger Agent appears, you just swap one ACP connection and it slots in seamlessly — your shell is durable capital, the engine is a replaceable component.

6. A Few Ideas to Start From: Some Personal Best Practices

6.1 Code Review Scenario

When you use an Agent to do Code Review, you will mostly hit two pitfalls:

Unstable results: LLM output is inherently random. Run the same diff twice and you may get "there is a problem" once and "no problem" the next time.
Too many false positives: the model tends to mix "style suggestions," "readability rants" and "real bugs" into the same Issue List, forcing you to fish for signal in noise.

So in the shell, I split this into a pipeline:

Reviewer: read the diff end to end, produce a coarse-grained Issue List (noise is allowed)
Double Checker: feed the Issue List back one by one for a second pass of validation; keep only items that "really might be a bug"
Cross Reviewer (another Agent): hand the double-checked result to a different vendor's Agent to Review once more — use the perspective difference of different models to filter out remaining false positives, so what actually goes out are high-confidence issues both sides agree on
Fixer: generate fix patches for the confirmed Issues
Loop (repeatable for N rounds): after merging the patch back into the diff, start a new Session from step 1; repeat N rounds as needed, until the Reviewer no longer reports new issues

With this split, randomness is smoothed by multi-round voting, false positives are filtered by double-Agent cross review, and Review and Fix are no longer the same Agent talking to itself. Under ACP, this whole pipeline is "open a few ACP connections + a piece of orchestration code."
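The orchestration part can be sketched in a few dozen lines. In the sketch below each stage is a plain callable; in a real shell each one would wrap an ACP session with some Agent. All names (Issue, review_round, review_loop) are hypothetical, and the loop stops when no confirmed issue survives a round, which is a simplification of "until the Reviewer no longer reports new issues."

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Issue:
    location: str
    message: str


Reviewer = Callable[[str], list[Issue]]      # diff -> coarse issue list (noise allowed)
Checker = Callable[[str, Issue], bool]       # (diff, issue) -> is this a real bug?
Fixer = Callable[[str, list[Issue]], str]    # (diff, confirmed issues) -> patched diff


def review_round(diff: str, reviewer: Reviewer,
                 double_checker: Checker, cross_reviewer: Checker) -> list[Issue]:
    """One pass: coarse review, per-issue validation, then a second opinion."""
    coarse = reviewer(diff)
    checked = [issue for issue in coarse if double_checker(diff, issue)]
    return [issue for issue in checked if cross_reviewer(diff, issue)]


def review_loop(diff: str, reviewer: Reviewer, double_checker: Checker,
                cross_reviewer: Checker, fixer: Fixer,
                max_rounds: int = 3) -> tuple[str, list[list[Issue]]]:
    """Repeat review and fix until a round confirms nothing, or max_rounds is hit."""
    history: list[list[Issue]] = []
    for _ in range(max_rounds):
        confirmed = review_round(diff, reviewer, double_checker, cross_reviewer)
        if not confirmed:
            break
        history.append(confirmed)
        diff = fixer(diff, confirmed)
    return diff, history


# Stub stages so the sketch runs without any Agent attached.
def stub_reviewer(diff: str) -> list[Issue]:
    issues = []
    for number, line in enumerate(diff.splitlines(), 1):
        if "== None" in line:
            issues.append(Issue(f"line {number}", "bug: compare with 'is None'"))
        if "TODO" in line:
            issues.append(Issue(f"line {number}", "style: leftover TODO"))
    return issues


patched, history = review_loop(
    "if x == None:\n    pass  # TODO\n",
    reviewer=stub_reviewer,
    double_checker=lambda diff, issue: issue.message.startswith("bug"),
    cross_reviewer=lambda diff, issue: True,
    fixer=lambda diff, issues: diff.replace("== None", "is None"),
)
print(patched)
print(history)
```

Because every stage is just a callable, pointing the Reviewer at one Agent and the Cross Reviewer at another is a matter of passing different functions.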

For example, my usual mix: the first two stages, Reviewer and Double Checker, are heavy on volume but relatively simple (filtering obvious noise), so I usually let ChatGPT (Codex) handle them, since it is fast and cheap. For the third stage, Cross Reviewer, I use Claude Code: slower and more expensive, but a much better fit for high-confidence judgment.

This is the real power of "the engine is a replaceable component" under ACP: engines are not just swappable, they can be mixed and matched within a single pipeline. Claude Code today, Codex tomorrow, even different Agents at different stages of the same task; the workflow itself does not need to change.

6.2 Scheduled Tasks: Make the Agent a "Colleague Always on Shift"

As a heavy Claude Code user, I keep up with every update. But reading the official Change Log is rarely enough: many change descriptions are vague, and the only way to learn how a new feature is actually implemented is to dig into the source. Claude Code is closed source, but its compiled bundle cli.js is shipped, so decompiling it back into readable source is a feasible route.

Previously this was: new release → manually download → manually split → manually paste into the model for decompilation → manually cross-check with the Change Log. Each step is easy, but together it is a "worth doing, but too lazy to do" task. So I automated the whole thing into the shell:

Listen: a scheduled shell script polls GitHub every day for Claude Code's latest release version and Change Log
Fetch source: when a new version is detected, auto-fetch the latest cli.js and split it by function/module boundaries into chunks that fit the context
Decompile: feed the chunks to the Agent and let it reconstruct the compressed bundle into readable source
Locate new feature: let the Agent combine the Change Log with the decompilation result to find the implementation location and key code snippets for each new feature
Output: write the analysis as a Feishu document and push the link to my IM

Once configured, the flow is "release goes out → Feishu link arrives." I just open the link and read the conclusion. Same as 6.1, each Agent node in this chain is selected independently on "effect / cost" — the upfront decompilation is heavy but mechanical, so ChatGPT is a good fit; the high-level "what does this mean" step is small but high-stakes, where Claude Code is more reliable.
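The two mechanical pieces of this chain, detecting a new version and cutting the bundle into context-sized chunks, can be sketched as follows. This is not my actual script: the function names are illustrative, version tags are assumed to be plain dotted numbers, and the splitter cuts on line boundaries of an already pretty-printed bundle, a simpler stand-in for splitting on function or module boundaries.

```python
def parse_version(tag: str) -> tuple[int, ...]:
    """Turn 'v1.10.0' or '1.10.0' into (1, 10, 0); assumes purely numeric parts."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


def is_newer(latest: str, last_seen: str) -> bool:
    """Compare numerically, so 1.10.0 counts as newer than 1.9.3."""
    return parse_version(latest) > parse_version(last_seen)


def split_into_chunks(source: str, max_chars: int) -> list[str]:
    """Group whole lines into chunks of at most max_chars characters.

    A single line longer than max_chars becomes its own oversized chunk,
    so nothing is dropped or cut mid-line.
    """
    chunks: list[str] = []
    current: list[str] = []
    size = 0
    for line in source.splitlines(keepends=True):
        if current and size + len(line) > max_chars:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        chunks.append("".join(current))
    return chunks


if is_newer("v1.10.0", "v1.9.3"):
    bundle = "function a() {}\nfunction b() {}\nfunction c() {}\n"
    for index, chunk in enumerate(split_into_chunks(bundle, max_chars=32)):
        print(index, repr(chunk))
```

Joining the chunks back together reproduces the input exactly, which makes it easy to verify that nothing was lost before handing them to the Agent.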

This is what I mean by "make the Agent a colleague who is always on shift": you are no longer "opening an AI product when you have time to ask" — instead, you hand the things that are "worth knowing but you never have time to follow up on" to scheduled tasks in the shell; by the time you come back, the conclusion is already waiting for you.

To summarize:

The two examples above look stylistically different — one is interactive Code Review, the other is a background scheduled task — but they are really the same thing: based on ACP, orchestrate multiple Agents and let them run the "worth doing but too tedious to do by hand" work for you.

Both of them hit the two key points of personal productivity amplification:

Hand "long tasks" over to the Agent: a one-or-two-minute thing is fine for a human, but once a task takes 5-6 sequential steps to produce a result, that is exactly when an Agent should step in. Once the pipeline is built, every later run costs almost nothing extra.
Strip "repetitive operations" off yourself: daily Reviews, per-release decompilation, weekly report writing — these accumulate to take up half of your energy; once you offload them to the shell, you can really spend your time on the things only you can do

What ACP does in all of this is turn "orchestrating multiple Agents" from an engineering problem into a piece of orchestration code. Without ACP, you would need to write a custom adapter for every Agent vendor; with ACP, different Agents are unified behind a single interface, and composing them becomes natural.

Implementing one ACP Client in your shell lets you drop in any CLI-style Agent — Claude Code, Codex, Gemini CLI, and others — like swapping an engine, with no per-vendor adapter required.
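In practice, "swapping an engine" can come down to a configuration change. The sketch below assumes each Agent is started as a subprocess by some command line; the agent names and commands (fast-agent-acp, careful-agent-acp) are made-up placeholders, not real binaries or flags.

```python
# Hypothetical launch commands; replace with whatever starts your Agents in ACP mode.
AGENTS: dict[str, tuple[str, ...]] = {
    "fast-agent": ("fast-agent-acp",),
    "careful-agent": ("careful-agent-acp", "--stdio"),
}

# Which Agent serves which pipeline stage.
STAGES: dict[str, str] = {
    "reviewer": "fast-agent",
    "double_checker": "fast-agent",
    "cross_reviewer": "careful-agent",
    "fixer": "careful-agent",
}


def command_for(stage: str,
                stages: dict[str, str] = STAGES,
                agents: dict[str, tuple[str, ...]] = AGENTS) -> tuple[str, ...]:
    """Resolve a pipeline stage to the command that launches its Agent."""
    return agents[stages[stage]]


def swap_agent(stages: dict[str, str], stage: str, agent_name: str,
               agents: dict[str, tuple[str, ...]] = AGENTS) -> dict[str, str]:
    """Return a new stage mapping with one stage reassigned; the input is not mutated."""
    if stage not in stages:
        raise KeyError(f"unknown stage: {stage}")
    if agent_name not in agents:
        raise KeyError(f"unknown agent: {agent_name}")
    return {**stages, stage: agent_name}


print(command_for("reviewer"))
upgraded = swap_agent(STAGES, "reviewer", "careful-agent")
print(command_for("reviewer", upgraded))
```

The orchestration code only ever asks for "the Agent for this stage," so replacing or mixing engines never touches the workflow itself.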

7. Conclusion

The value of ACP, at its core, is compressing the N × M adaptation problem in the Agent ecosystem into N + M: turning strong Agents into replaceable "engines," so that everyone can confidently build up their own "shells." What is truly worth accumulating was never the tool itself, but the workflow and skills that you have personally forged, time and time again, around it.

