What makes Claude Code so damn good (and how to recreate that magic in your agent)
引言
Introduction
Claude Code is the most delightful AI agent/workflow I have used so far. Not only does it make targeted edits or vibe coding throwaway tools less annoying, using Claude Code makes me happy. It has enough autonomy to do interesting things, while not inducing a jarring loss of control like some other tools do. Of course most of the heavy lifting is done by the new Claude 4 model (especially interleaved thinking). But I find Claude Code objectively less annoying to use compared to Cursor, or Github Copilot agents even with the same underlying model! What makes it so damn good?
Claude Code 是我迄今为止使用过的最令人愉悦的 AI Agent/工作流。它不仅让针对性编辑或即兴编码的一次性工具变得不那么烦人,使用 Claude Code 本身就让我感到快乐。它拥有足够的自主性来完成有趣的事情,同时又不会像其他一些工具那样引发失控的不适感。当然,大部分繁重工作是由新的 Claude 4 模型(尤其是交错思考能力)完成的。但我发现,即使使用相同的底层模型,Claude Code 客观上比 Cursor 或 Github Copilot agents 更好用!它为何如此出色?
如何构建类 Claude Code 的 Agent:快速总结
How to build a Claude Code like agent: TL;DR
If there is one thing to take away from this, it is this - Keep Things Simple, Dummy. LLMs are terrible enough to debug and evaluate. Any additional complexity you introduce (multi-agents, agent handoffs or complex RAG search algorithms) only makes debugging 10x harder.
Main takeaways:
- Control Loop: Keep one main loop (with max one branch) and one message history; Use a smaller model for all sorts of things
- Prompts: Use claude.md pattern for user preferences; Use special XML Tags, Markdown, and lots of examples
- Tools: LLM search >>> RAG based search; Design good tools (High vs Low level); Let your agent manage its own todo list
- Steerability: Control tone and style; “PLEASE THIS IS IMPORTANT” is still state of the art; Write the algorithm with heuristics and examples
如果只记住一件事,那就是——保持简单,傻瓜。LLM 已经够难调试和评估的了。你引入的任何额外复杂性(多 Agent、Agent 交接或复杂的 RAG 搜索算法)只会让调试难度增加 10 倍。
主要要点:
- 控制循环:保持一个主循环(最多一个分支)和一个消息历史;在各种场景下使用较小的模型
- 提示词:使用 claude.md 模式管理用户偏好;使用特殊 XML 标签、Markdown 和大量示例
- 工具:LLM 搜索远胜于基于 RAG 的搜索;设计好的工具(高级 vs 低级);让你的 Agent 管理自己的待办列表
- 可引导性:控制语气和风格;“PLEASE THIS IS IMPORTANT” 仍然是当前最佳实践;用启发式和示例编写算法
1. 控制循环设计
1. Control Loop Design
1.1 Keep One Main Loop
Despite multi agent systems being all the rage, Claude Code has just one main thread. It uses a few different types of prompts periodically to summarize the git history, to clobber up the message history into one message or to come up with some fun UX elements. But apart from that, it maintains a flat list of messages. An interesting way it handles hierarchical tasks is by spawning itself as a sub-agent without the ability to spawn more sub-agents. There is a maximum of one branch, the result of which is added to the main message history as a “tool response”.
Debuggability >>> complicated hand-tuned multi-agent lang-chain-graph-node mishmash.
1.1 保持一个主循环
尽管多 Agent 系统正风靡一时,Claude Code 却只有一个主线程。它会定期使用几种不同类型的提示词来总结 git 历史、将消息历史合并为一条消息,或生成一些有趣的 UX 元素。但除此之外,它维护的是一个扁平的消息列表。它处理层次化任务的一个有趣方式是将自己生成为子 Agent,但子 Agent 没有能力再生成更多子 Agent。最多只有一个分支,其结果作为”工具响应”添加到主消息历史中。
可调试性 >>> 复杂的手动调优多 Agent lang-chain-graph-node 大杂烩。
1.2 Use a Smaller model for everything
Over 50% of all important LLM calls made by CC are to claude-3-5-haiku. It is used to read large files, parse web pages, process git history and summarize long conversations. It is also used to come up with the one-word processing label - literally for every key stroke! The smaller models are 70-80% cheaper than the standard ones (Sonnet 4, GPT-4.1). Use them liberally!
1.2 在所有场景使用较小的模型
Claude Code 发出的所有重要 LLM 调用中,超过 50% 都是发给 claude-3-5-haiku 的。它被用来读取大文件、解析网页、处理 git 历史和总结长对话。它还被用来生成单字处理标签——字面上是每个按键!较小的模型比标准模型(Sonnet 4、GPT-4.1)便宜 70-80%。尽情使用它们!
2. 提示词设计
2. Prompts
Claude Code has extremely elaborate prompts filled with heuristics, examples and IMPORTANT (tch-tch) reminders. The system prompt is ~2800 tokens long, with the Tools taking up a whopping 9400 tokens. The user prompt always contains the claude.md file, which can typically be another 1000-2000 tokens.
2.1 Use claude.md for collaborating on user context and preferences
One of the major patterns most coding agent creators have settled on is the context file (aka Cursor Rules / claude.md / agent.md). The difference in Claude Code’s performance with and without claude.md is night and day. It is a great way for the developers to impart context that cannot be inferred from the codebase and to codify all strict preferences.
Claude Code 拥有极其精细的提示词,充满了启发式规则、示例和 IMPORTANT(唉)提醒。系统提示词长约 2800 tokens,而工具部分占据了惊人的 9400 tokens。用户提示词总是包含 claude.md 文件,通常还有额外的 1000-2000 tokens。
2.1 使用 claude.md 协作管理用户上下文和偏好
大多数编码 Agent 创作者已经确定的主要模式之一是上下文文件(又名 Cursor Rules / claude.md / agent.md)。Claude Code 在使用和不使用 claude.md 时的表现判若云泥。这是开发者传递无法从代码库推断的上下文并将所有严格偏好规范化的绝佳方式。
2.2 Special XML Tags, Markdown, and lots of examples
It is fairly established that XML tags and Markdown are two ways to structure a prompt. CC uses both, extensively. Here are a few notable XML tags in Claude Code:
<system-reminder>: This is used at the end of many prompt sections to remind the LLM of thing it presumably otherwise forgets.<good-example>,<bad-example>: These are used to codify heuristics. They can be especially useful when there is a fork in the road with multiple seemingly reasonable paths/tool_calls the model can choose.
2.2 特殊 XML 标签、Markdown 和大量示例
XML 标签和 Markdown 是结构化提示词的两种既定方式。CC 广泛使用两者。以下是 Claude Code 中一些值得注意的 XML 标签:
<system-reminder>:用于许多提示词部分的末尾,提醒 LLM 记住它否则可能会忘记的事情。<good-example>,<bad-example>:用于规范化启发式规则。当存在多个看似合理的路径/工具调用供模型选择时,它们特别有用。
3. 工具设计
3. Tools
3.1 LLM search >>> RAG based search
One significant way in which CC deviates from other popular coding agents is in its rejection of RAG. Claude Code searches your code base just as you would, with really complex ripgrep, jq and find commands. Since the LLM understands code really well, it can use sophisticated regex to find pretty much any codeblock it deems relevant. Sometimes it ends up reading whole files with a smaller model.
RAG sounds like a good idea in theory, but it introduces new (and more importantly, hidden) failure modes. What is the similarity function to use? What reranker? How do you chunk the code? What do you do with large JSON or log files? With LLM Search, it just looks at 10 lines of the json file to understand its structure. If it wants, it looks at 10 more lines - just like you would.
3.1 LLM 搜索远胜于基于 RAG 的搜索
CC 与其他流行编码 Agent 的一个显著不同之处在于它拒绝了 RAG。Claude Code 搜索你的代码库的方式与你一样,使用非常复杂的 ripgrep、jq 和 find 命令。由于 LLM 非常理解代码,它可以使用复杂的正则表达式来找到几乎所有它认为相关的代码块。有时它最终会用较小的模型读取整个文件。
RAG 在理论上听起来是个好主意,但它引入了新的(更重要的是,隐藏的)失败模式。使用什么相似度函数?什么重排序器?如何分块代码?如何处理大型 JSON 或日志文件?使用 LLM 搜索,它只需查看 JSON 文件的 10 行来理解其结构。如果它愿意,它可以再看 10 行——就像你会做的那样。
3.2 How to design good tools? (Low level vs High level tools)
This question keeps anyone who is building an LLM agent up at night. Should you give the model generic tasks (like meaningful actions) or should it be low level (like type and click and bash)? The answer is that it depends (and you should use both).
Claude Code has low level (Bash, Read, Write), medium level (Edit, Grep, Glob) and high level tools (Task, WebFetch, exit_plan_mode). CC can use bash, so why give a separate Grep tool? The real trade-off here is in how often you expect your agent to use the tool vs accuracy of the agent in using the tool.
3.2 如何设计好的工具?(低级 vs 高级工具)
这个问题让每个构建 LLM Agent 的人夜不能寐。你应该给模型通用任务(如有意义的操作)还是低级任务(如输入、点击和 bash)?答案是视情况而定(你应该两者都用)。
Claude Code 有低级(Bash、Read、Write)、中级(Edit、Grep、Glob)和高级工具(Task、WebFetch、exit_plan_mode)。CC 可以使用 bash,那为什么还要提供单独的 Grep 工具?真正的权衡在于你期望 Agent 使用该工具的频率与 Agent 使用该工具的准确性之间的平衡。
3.3 Let the agent manage a todo list
There are many reasons why this is a good idea. Context rot is a common problem in long-running LLM agents. They enthusiastically start out tackling a difficult problem, but over time lose their way and devolve into garbage. CC uses an explicit todo list, but one that the model maintains. This keeps the LLM on track (it has been heavily prompted to refer to the todo list frequently), while at the same time giving the model the flexibility to course correct mid-way in an implementation.
3.3 让 Agent 管理待办列表
这是一个好主意有很多原因。上下文腐烂是长期运行的 LLM Agent 的常见问题。它们热情地开始解决难题,但随着时间的推移会迷失方向并退化为垃圾。CC 使用显式的待办列表,但由模型维护。这让 LLM 保持在正轨上(它已被大量提示经常参考待办列表),同时给予模型在实现过程中途纠正方向的灵活性。
4. 可引导性
4. Steerability
4.1 Tone and Style
CC explicitly attempts to control the aesthetic behavior of the agent. There are sections in the system prompt around tone, style and proactiveness - full of instructions and examples. This is why Claude Code “feels” tasteful in its comments and eagerness.
Some examples of tone and style:
- IMPORTANT: You should NOT answer with unnecessary preamble or postamble (such as explaining your code or summarizing your action), unless the user asks you to.
- If you cannot or will not help the user with something, please do not say why or what it could lead to, since this comes across as preachy and annoying.
- Only use emojis if the user explicitly requests it. Avoid using emojis in all communication unless asked.
4.1 语气和风格
CC 明确尝试控制 Agent 的审美行为。系统提示词中有关于语气、风格和主动性的部分——充满了指示和示例。这就是为什么 Claude Code 在评论和热情方面”感觉”很有品味。
语气和风格的一些示例:
- 重要:你不应该用不必要的开场白或结束语来回答(比如解释你的代码或总结你的行动),除非用户要求你这样做。
- 如果你不能或不愿帮助用户做某事,请不要说明原因或可能导致什么,因为这会显得说教和烦人。
- 仅在用户明确要求时使用表情符号。除非被要求,否则在所有交流中避免使用表情符号。
4.2 “THIS IS IMPORTANT” is still State of the Art
Unfortunately CC is no better when it comes to asking the model to not do something. IMPORTANT, VERY IMPORTANT, NEVER and ALWAYS seem to be the best way to steer the model away from landmines. Some examples:
- IMPORTANT: DO NOT ADD ANY COMMENTS unless asked
- VERY IMPORTANT: You MUST avoid using search commands like
findandgrep. Instead use Grep, Glob, or Task to search. - IMPORTANT: You must NEVER generate or guess URLs for the user unless you are confident that the URLs are for helping the user with programming.
4.2 “THIS IS IMPORTANT” 仍然是当前最佳实践
不幸的是,在要求模型不做某事方面,CC 也好不到哪去。IMPORTANT、VERY IMPORTANT、NEVER 和 ALWAYS 似乎是将模型引导远离雷区的最佳方式。一些示例:
- 重要:除非被要求,否则不要添加任何注释
- 非常重要:你必须避免使用像
find和grep这样的搜索命令。而是使用 Grep、Glob 或 Task 来搜索。 - 重要:你绝不能为用户生成或猜测 URL,除非你有信心这些 URL 是为了帮助用户进行编程。
4.3 Write the Algorithm (with heuristics and examples)
It is extremely important to identify the most important task the LLM needs to perform and write out the algorithm for it. Try to role-play as the LLM and work through examples, identify all the decision points and write them explicitly. It helps if this is in the form of a flow-chart. This helps structure the decision making and aids the LLM in following instructions.
4.3 编写算法(带启发式和示例)
识别 LLM 需要执行的最重要任务并为其编写算法是极其重要的。尝试扮演 LLM 的角色并通过示例进行推导,识别所有决策点并明确写出它们。如果以流程图的形式呈现会有所帮助。这有助于结构化决策过程并帮助 LLM 遵循指示。
结语
Conclusion
The main takeaway, again, is to keep things simple. Extreme scaffolding frameworks will hurt more than help you. Claude Code really made me believe that an “agent” can be simple and yet extremely powerful. We’ve incorporated a bunch of these lessons into MinusX, and are continuing to incorporate more.
If you’re interested in Claude-Codifying your own LLM agent, I’d love to chat - ping me on twitter!
再次强调,主要收获是保持简单。极端的脚手架框架弊大于利。Claude Code 真的让我相信,一个”Agent”可以既简单又极其强大。我们已经将这些经验教训中的许多融入了 MinusX,并将继续融入更多。
如果你有兴趣将你自己的 LLM Agent 打造成像 Claude Code 一样,我很乐意交流——在 Twitter 上联系我!
原文来源:MinusX - What makes Claude Code so damn good 作者:Nuvan D. (MinusX Team) 翻译整理:AI Links