
A Practical Guide to Prompt Engineering: 5 Techniques to Substantially Improve LLM Output Quality

news 2025/9/3 8:31:21

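Summary: Using hands-on examples with the OpenAI API, this article explains how systematic prompt design improves the output of large language models (LLMs). It covers core techniques including few-shot learning, structured instructions, and chain-of-thought reasoning, with complete code examples.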

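I. Case Background: Cleaning Customer Support Chat Data

Task goal:

Clean customer support chat logs. Requirements:

1️⃣ Remove personal information (names, email addresses, order numbers)

2️⃣ Replace swear words with "😤😤"

3️⃣ Standardize the date format (YYYY-mm-dd)

4️⃣ Extended requirement: classify conversation sentiment (positive/negative)

5️⃣ Output format: structured JSON

Sample raw data: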

[support_tom] 2023-07-24T10:02:23+00:00 : What can I help you with?

[johndoe] 2023-07-24T10:03:15+00:00 : I CAN'T CONNECT TO MY BLASTED ACCOUNT
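The raw line layout above (speaker in brackets, ISO 8601 timestamp, then the message) can also be parsed locally, which gives a deterministic reference for checking the date format the model returns. The following is a minimal sketch, not part of the article's pipeline; the names `LINE_RE` and `parse_chat_line` are hypothetical and the pattern assumes every line follows the sample layout.

```python
import re
from datetime import datetime

# Assumed layout: "[speaker] <ISO 8601 timestamp> : <message>"
LINE_RE = re.compile(
    r"^\[(?P<speaker>[^\]]+)\]\s+(?P<timestamp>\S+)\s+:\s+(?P<message>.*)$"
)


def parse_chat_line(line: str) -> dict:
    """Split one raw chat line into speaker, date (YYYY-mm-dd), and message."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"Unrecognized chat line: {line!r}")
    timestamp = datetime.fromisoformat(match["timestamp"])
    return {
        "speaker": match["speaker"],
        "date": timestamp.date().isoformat(),
        "message": match["message"],
    }
```

Comparing the `date` field from this parser with the date in the model's output is one cheap way to spot formatting regressions.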

II. The 5 Prompt Engineering Techniques in Detail

Technique 1️⃣: Few-Shot Prompting

Core idea: provide input/output examples so the model can infer the task

instruction_prompt = """

Remove personally identifiable information, only show the date, and replace swear words with "😤😤"

Example Input:

[johndoe] 2023-07-24T10:03:15+00:00 : I CAN'T CONNECT TO MY BLASTED ACCOUNT

Example Output:

[Customer] 2023-07-24 : I CAN'T CONNECT TO MY 😤😤 ACCOUNT

"""

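Improvement: the model learns both the swear-word replacement and the date formatting

Technique 2️⃣: Delimiters

Problem solved: keeps instructions and data from being confused in long prompts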
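When more than one example is needed, the few-shot prompt can be assembled from a list of input/output pairs instead of being edited by hand. This is a small sketch under that assumption; `build_few_shot_prompt` is a hypothetical helper, and the "Example Input:" / "Example Output:" labels simply mirror the prompt above.

```python
def build_few_shot_prompt(instruction: str, examples: list[tuple[str, str]]) -> str:
    """Append labeled input/output example pairs to a task instruction."""
    parts = [instruction.strip()]
    for raw, cleaned in examples:
        parts.append(f"Example Input:\n{raw}\n\nExample Output:\n{cleaned}")
    return "\n\n".join(parts)


instruction_prompt = build_few_shot_prompt(
    'Remove personally identifiable information, only show the date, '
    'and replace swear words with "😤😤"',
    [
        (
            "[johndoe] 2023-07-24T10:03:15+00:00 : I CAN'T CONNECT TO MY BLASTED ACCOUNT",
            "[Customer] 2023-07-24 : I CAN'T CONNECT TO MY 😤😤 ACCOUNT",
        ),
    ],
)
```

Keeping the examples as data makes it straightforward to add a second or third pair when the model misses an edge case.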

content = f">>>>>\n{raw_text}\n<<<<<"

instruction_prompt = """

Sanitize text in >>>>>CONTENT<<<<<

by:

1. Replacing PII with ****

2. ...

"""

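Key effect: clearly separates the data from the instructions so the model does not misread one as the other

Technique 3️⃣: Numbered Steps

Key idea: break a complex task into atomic operations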
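Delimiters only help if the data cannot contain them. The sketch below wraps the raw text in the same five-character markers and first blanks out any look-alike runs inside the data; `wrap_content` is a hypothetical helper, and replacing the runs with a space is one possible policy rather than the article's own.

```python
import re

OPEN, CLOSE = ">>>>>", "<<<<<"


def wrap_content(raw_text: str) -> str:
    """Wrap chat data in delimiters after neutralizing delimiter look-alikes."""
    # Replace runs of 5+ '>' or '<' with a space so the data cannot
    # open or close the delimited block by itself.
    cleaned = re.sub(r">{5,}|<{5,}", " ", raw_text)
    return f"{OPEN}\n{cleaned}\n{CLOSE}"
```

Substituting a space rather than deleting the run avoids accidentally joining shorter runs into a new delimiter.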

instruction_prompt = """

Perform steps sequentially:

1. Replace names in [] with [Agent]/[Customer]

2. Convert datetime to YYYY-mm-dd

3. Map swear words to 😤😤

4. ...

"""

Technique 4️⃣: Chain-of-Thought Prompting

Advanced technique: guide the model to show its reasoning process

instruction_prompt += """

Classify sentiment by:

1. Check for 😤😤 or angry words

2. Assess customer tone

3. If step 1 OR step 2 finds anger → label 🔥 (negative)

4. Else → label ✅ (positive)

Let's think step by step

"""

Resulting output:

- 😤😤 detected? Yes

- Aggravated tone? Yes

- Sentiment: 🔥
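Because a chain-of-thought reply mixes reasoning with the final answer, downstream code usually needs to pull out just the label. The sketch below takes the last "Sentiment:" line from the reply. It assumes that 🔥 means negative and ✅ means positive, which matches the example output but is not stated explicitly in the prompt; `extract_sentiment` is a hypothetical helper.

```python
import re
from typing import Optional

# Assumed mapping between the emoji labels and the JSON sentiment keys.
LABELS = {"🔥": "negative", "✅": "positive"}


def extract_sentiment(reasoning: str) -> Optional[str]:
    """Return the sentiment from the last 'Sentiment: <emoji>' line, or None."""
    matches = re.findall(r"Sentiment:\s*(🔥|✅)", reasoning)
    return LABELS[matches[-1]] if matches else None
```

Returning `None` instead of guessing lets the caller decide whether to retry or flag the conversation for review.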

Technique 5️⃣: Structured Output

Production practice: enforce a JSON format so the output can feed downstream systems

instruction_prompt = """

Output as valid JSON:

{ "negative": [{ "date": "2023-07-24", "conversation": ["A:...", "C:..."] }],

...

}

"""

Final output:

{
  "positive": [
    {
      "date": "2023-08-13",
      "conversation": [
        "A: Good morning! How may I assist?",
        "C: My app keeps crashing..."
      ]
    }
  ],
  "negative": [...]
}
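Asking for JSON does not guarantee that the reply is valid JSON, so it is worth validating before handing it to a downstream system. The following sketch parses the reply and checks the shape shown above (sentiment keys holding chats with a `date` string and a `conversation` list); `parse_sanitized_output` is a hypothetical helper and the checks are deliberately minimal.

```python
import json


def parse_sanitized_output(raw: str) -> dict:
    """Parse the model reply and verify the expected sentiment/chat structure."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(data, dict):
        raise ValueError("Top-level value must be a JSON object")
    for label in ("positive", "negative"):
        for chat in data.get(label, []):
            if not isinstance(chat, dict) or not isinstance(chat.get("date"), str):
                raise ValueError(f"Missing 'date' in a {label} chat")
            if not isinstance(chat.get("conversation"), list):
                raise ValueError(f"Missing 'conversation' in a {label} chat")
    return data
```

A failed check is a natural trigger for retrying the request or logging the raw reply for inspection.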

III. Key Parameter Configuration

# settings.toml
[general]
model = "gpt-4"        # prefer GPT-4
temperature = 0        # minimize randomness for more deterministic output
seed = 12345           # reproducible results (supported by some models)

# Role prompt to control style
role_prompt = "You are a precise data processor. Never invent information."
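A configuration like this can be read with `tomllib`, which is in the standard library from Python 3.11 (the `tomli` package offers the same API on older versions). The sketch below parses an inline copy of the settings so it runs on its own; `load_general_settings` is a hypothetical helper, and the layout with every key under `[general]` follows the snippet above.

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport with the same API

SETTINGS_TEXT = """
[general]
model = "gpt-4"
temperature = 0
seed = 12345
role_prompt = "You are a precise data processor. Never invent information."
"""


def load_general_settings(text: str) -> dict:
    """Parse TOML text and return the [general] table as a dict."""
    return tomllib.loads(text)["general"]


settings = load_general_settings(SETTINGS_TEXT)
```

In the application itself the file would be opened in binary mode and passed to `tomllib.load`, for example `with open("settings.toml", "rb") as f: tomllib.load(f)`.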

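IV. Pitfalls to Avoid

1. Data leakage risk: avoid using real production data in examples
2. Model update impact: re-validate prompts regularly (OpenAI continually updates its models)
3. Cost control: GPT-4 costs more than GPT-3.5, so weigh the cost against the accuracy you need
4. Not fully deterministic: even with temperature=0, outputs may still vary slightly

Best practice: keep prompts under version control (as with the TOML file design in the sample code)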
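Both the model-update and non-determinism pitfalls can be monitored with a simple repeat-and-compare check. The sketch below calls a completion function several times on the same input and counts the distinct replies; `check_stability` is a hypothetical helper, and `complete` stands in for any function that takes the chat text and returns the model's reply as a string.

```python
from collections import Counter
from typing import Callable


def check_stability(complete: Callable[[str], str], content: str, runs: int = 3) -> Counter:
    """Call the model `runs` times on the same input and count distinct outputs."""
    return Counter(complete(content) for _ in range(runs))
```

If the returned counter has more than one key, the outputs varied between runs; running the same check after a model update shows whether a versioned prompt still behaves as before.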

V. Complete Toolchain

.
├── app.py              # processing pipeline
├── settings.toml       # prompt configuration hub
├── chats.txt           # raw data
└── sanitized-chats.json # output

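Core function:

import openai

# MODEL, role_prompt, and instruction_prompt are expected to be loaded from settings.toml.
def get_chat_completion(content):
    # System role prompt first, then the delimited chat data, then the instructions.
    messages = [
        {"role": "system", "content": role_prompt},
        {"role": "user", "content": f">>>>>{content}<<<<<"},
        {"role": "user", "content": instruction_prompt},
    ]
    # Legacy interface (openai<1.0); openai>=1.0 uses client.chat.completions.create instead.
    return openai.ChatCompletion.create(
        model=MODEL, messages=messages, temperature=0
    )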
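To connect the core function to the files in the toolchain, the pipeline can be sketched as read, complete, parse, write. The version below takes the completion step as a parameter so it can be exercised without an API key; `run_pipeline` is a hypothetical name, and `complete` is assumed to return the reply text as a string.

```python
import json
from pathlib import Path
from typing import Callable


def run_pipeline(input_path: str, output_path: str, complete: Callable[[str], str]) -> dict:
    """Read raw chats, send them through `complete`, and save the parsed JSON."""
    raw_text = Path(input_path).read_text(encoding="utf-8")
    result = json.loads(complete(raw_text))  # fails loudly if the reply is not JSON
    Path(output_path).write_text(
        json.dumps(result, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return result
```

With the legacy interface used in this article, `complete` could wrap `get_chat_completion` and return `response["choices"][0]["message"]["content"]`; `ensure_ascii=False` keeps the emoji readable in the saved file.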

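Conclusion: With systematic prompt engineering, developers can raise LLM output accuracy by up to 300%, by the author's estimate. Core methodology:

🔹 Few-shot examples > abstract descriptions

🔹 Atomic steps > complex instructions

🔹 Structured output > natural language

🔹 Explicit reasoning > black-box operation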