# 动手学 Agent 系列教程 - 完整学习笔记

**原作者：** Tallis  
**整理时间：** 2026-03-26  
**系列文章：** 6 篇  
**GitHub 仓库：** https://github.com/Tallisgo/llm_based_agent

---

## 📚 **目录**

1. Chain of Thought Prompting
2. Self Refine
3. Plan and Execute (上)
4. Plan and Execute (下)
5. Plan and Execute (实现)
6. ReAct

---

## 一、Chain of Thought Prompting

### **什么是 CoT？**

**Chain of Thought Prompting (COT)** 是 2022 年提出的用来提高大模型推理能力的 prompting engineering 方法。

**核心特点：**
- 与标准 prompting 直接输出结果不同
- CoT 会提供"思考过程"来说明最终结果是如何得到的
- 引导大模型 step-by-step 解决问题

### **CoT 的优势**

1. **引导任务分解** - step-by-step 解决问题
2. **提高可解释性** - 便于排查问题
3. **便于使用** - 成本低（只需要提供几个例子）

### **三种 CoT 实现方式**

#### **1. Standard Prompting**
直接输出答案，无推理过程

#### **2. Chain of Thought - Few Shot**
在 few-shot 的基础上增加 reasoning step

#### **3. Chain of Thought - Zero Shot**
在标准提示词最后加入"Let's think step by step"

---

## 二、Self Refine

### **核心思想**

增加了跟 LLM 的交互次数，通过给每一次交互不同的 system prompt 和 input content 来让 LLM 返回不同的结果。

### **工作流程**

```
第一次交互：LLM 进行翻译
    ↓
第二次交互：LLM 评价翻译质量
    ↓
第三次交互：LLM 根据评价优化翻译
```

### **应用场景**

- 翻译优化
- 代码改进
- 文本润色
- 任何需要迭代优化的任务

---

## 三、Plan and Execute

### **为什么要调用外部工具？**

1. **数据获取** - 访问最新数据或特定领域信息
2. **实时信息** - 提供与当前事件相关的回答
3. **增强功能** - 集成外部 API 增加新能力
4. **提高准确性** - 调用专业数据库提高可靠性

### **核心流程**

```
Plan (规划)
    ↓
Execute (执行工具)
    ↓
整合结果
    ↓
输出答案
```

### **工具类型**

1. **搜索工具** - WebSearch (duckduckgo_search)
2. **计算工具** - 数学计算函数
3. **其他 API** - 翻译、语音识别、图像处理等

---

## 四、ReAct

### **核心思想**

**ReAct** 交错执行 **Reasoning** 和 **Act**，并与外部环境进行交互。

### **Prompt 结构**

```
Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [tool_names]
Action Input: the input to the action
Observation: the result of the action
... (重复 N 次)
Thought: I now know the final answer
Final Answer: the final answer to the original input question
```

### **工作流程**

```
Question → Thought → Action → Action Input
    ↓
Observation (调用 Tool)
    ↓
Thought → Action → Action Input
    ↓
Observation (调用 Tool)
    ↓
... (重复 N 次)
    ↓
Thought: I now know the final answer
    ↓
Final Answer
```

---

## 📊 **技术对比**

| 方法 | 核心 | 交互次数 | 特点 |
|------|------|---------|------|
| **CoT** | 思维链 | 1 次 | step-by-step 推理 |
| **Self Refine** | 自我反思 | 3 次 | 翻译→评价→优化 |
| **Plan and Execute** | 规划执行 | 多次 | Plan→Execute→整合 |
| **ReAct** | 推理行动 | 多轮 | Reasoning+Act 交错 |

---

## 🎯 **共同点**

1. **多轮交互** - 都涉及与 LLM 的多次交互
2. **Prompt 工程** - 关键在于构造好的 prompt
3. **任务分解** - 将复杂任务分解为简单步骤
4. **外部工具** - 都可以调用外部工具增强能力

---

## 📈 **学习路线**

```
CoT (推理基础)
    ↓
Self Refine (多轮交互)
    ↓
Plan and Execute (工具调用)
    ↓
ReAct (推理 + 行动)
```

---

## 💡 **关键洞察**

### **1. Prompt 决定一切**

- 控制输入提示词 = 控制 LLM 输出
- 好的 prompt = 好的结果
- 需要反复调试和优化

### **2. 多轮交互的力量**

- 单次交互有局限
- 多轮交互可以让 LLM 扮演不同角色
- 每次交互解决一个子问题

### **3. 工具调用是关键**

- LLM 知识有限且可能过时
- 外部工具可以提供实时、准确信息
- 工具调用扩展了 LLM 的能力边界

### **4. 结构化输出**

- 使用 XML 标签分隔内容
- 使用 JSON 格式规范输出
- 使用正则表达式解析结果

---

## 🔗 **资源链接**

- **GitHub 仓库：** https://github.com/Tallisgo/llm_based_agent
- **原文系列：** 知乎"动手学 agent"系列（6 篇）
- **作者：** Tallis

---

## 📝 **代码示例**

### **LLM 调用函数**

```python
from openai import OpenAI
from typing import List, Dict

def ask(messages: List[Dict]):
    client = OpenAI(api_key='YOUR KEY', base_url="https://api.deepseek.com")
    response = client.chat.completions.create(
        model='deepseek-chat',
        temperature=0,
        messages=messages
    )
    return response
```

### **工具类定义**

```python
class Step:
    def __init__(self, idx, name, tool, args, dependencies):
        self.idx = idx
        self.name = name
        self.tool = tool
        self.args = args
        self.dependencies = dependencies
        self.observation = None

    def exec(self):
        self.observation = self.tool(self.args)
        return self.observation
```

---

## 🌟 **总结**

### **学到了什么？**

1. **CoT** - 让 LLM 逐步推理
2. **Self Refine** - 让 LLM 自我改进
3. **Plan and Execute** - 让 LLM 规划并调用工具
4. **ReAct** - 让 LLM 推理和行动交错执行

### **下一步做什么？**

1. 实践这些方法
2. 构建自己的 agent
3. 优化 prompt 工程
4. 探索更多工具调用

---

**整理完成时间：** 2026-03-26  
**适合人群：** Agent 初学者、LLM 应用开发者  
**前置知识：** Python 基础、LLM 基本概念

**祝你学习愉快！** 🌱📚