
[LLM] Handling Context Caching When Calling Anthropic Claude via OpenRouter

news 2025/7/17 6:21:17

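Background

When calling Anthropic Claude models through OpenRouter, some models support context caching. A cache hit significantly reduces the cost of the call. Models such as DeepSeek come with built-in context caching, but this article focuses on how to set up caching when building an Agent that has to call Anthropic Claude many times.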

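The Value of Caching

According to the official pricing:

Cache write: incurs an additional charge
Cache hit: substantially reduces the cost of the call
Cost-effectiveness: with a large number of calls, cache hits yield significant savings

Tip: You can check whether a call hit the cache by looking at its cost in the call history on your OpenRouter billing page.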
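As a rough illustration of this trade-off, the sketch below compares the input cost of a shared prompt prefix with and without caching. The multipliers (1.25x the base input price for a cache write, 0.1x for a cache read) and the base price in the usage example are assumptions for illustration and should be checked against the current pricing page. The function name is hypothetical, and the model assumes every call after the first one hits the cache.

```python
def estimate_prefix_cost(
    prefix_tokens: int,
    calls: int,
    base_price_per_mtok: float,
    write_multiplier: float = 1.25,  # assumed surcharge for writing the cache
    read_multiplier: float = 0.10,   # assumed discount for reading the cache
) -> dict:
    """Compare the input cost of a repeated prompt prefix with and without caching."""
    per_token = base_price_per_mtok / 1_000_000
    # Without caching, the full prefix is billed at the base price on every call.
    uncached = prefix_tokens * per_token * calls
    # With caching, the first call writes the cache and the remaining calls read it.
    cached = prefix_tokens * per_token * (
        write_multiplier + read_multiplier * (calls - 1)
    )
    return {"uncached": uncached, "cached": cached, "saving": uncached - cached}


if __name__ == "__main__":
    # Hypothetical numbers: a 10,000-token prefix, 20 calls, 3.0 per million tokens.
    print(estimate_prefix_cost(10_000, 20, 3.0))
```

With a single call the cached variant is more expensive than the uncached one, which is why caching only pays off when the same prefix is reused across calls.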

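Official Method for Setting the Cache

According to the official documentation:

Caching is normally enabled by adding the following field to a content block inside a message: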

{
  "cache_control": {"type": "ephemeral"}
}

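How it works: this is a prefix cache. Everything in the request up to and including the content block marked with cache_control is cached, so a later request that starts with the same prefix can reuse it.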
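A minimal sketch of how such a breakpoint could be attached in code, assuming OpenAI-style message dictionaries like the ones used in this article. The helper name `mark_cache_breakpoint` is hypothetical and not part of any SDK; it only manipulates plain dictionaries and makes no network call.

```python
import copy

CACHE_CONTROL = {"type": "ephemeral"}


def mark_cache_breakpoint(message: dict) -> dict:
    """Return a copy of `message` whose last text block carries cache_control."""
    marked = copy.deepcopy(message)
    content = marked.get("content")
    if isinstance(content, str):
        # A plain string has no place for cache_control, so wrap it in a text block.
        content = [{"type": "text", "text": content}]
    if not content:
        raise ValueError("message has no content to mark")
    # Walk backwards so the breakpoint lands on the last text block.
    for block in reversed(content):
        if block.get("type") == "text":
            block["cache_control"] = dict(CACHE_CONTROL)
            break
    else:
        raise ValueError("message has no text block to mark")
    marked["content"] = content
    return marked


if __name__ == "__main__":
    print(mark_cache_breakpoint({"role": "user", "content": "long shared context"}))
```

The input message is left untouched, so the same history can be reused with the breakpoint placed on a different message for the next call.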

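Current Issues and Limitations

Testing in practice shows:

✅ Works: setting cache control on a message with role user
❌ Does not work: setting cache control on a message with role tool (even though the official Claude API supports it)

Note: This issue has been reported in the OpenRouter community, but it had not been fixed at the time of writing.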

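Solution

Because cache control cannot be set on a tool message after a tool call, we work around the limitation by adding a user message that carries the cache setting instead.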

Original Message Structure

[
  {"role": "system", "content": [{"type": "text", "text": "..."}]},
  {"role": "user", "content": [{"type": "text", "text": "...", "cache_control": {"type": "ephemeral"}}]},
  {"role": "assistant", "content": [{"type": "text", "text": "..."}], "tool_calls": []},
  {"role": "tool", "tool_call_id": "...", "name": "...", "content": "..."}, // cache_control cannot be added here
  {"role": "assistant", "content": [{"type": "text", "text": "..."}], "tool_calls": []}
]

Optimized Message Structure

[
  {"role": "system", "content": [{"type": "text", "text": "..."}]},
  {"role": "user", "content": [{"type": "text", "text": "..."}]},
  {"role": "assistant", "content": [{"type": "text", "text": "..."}], "tool_calls": []},
  {"role": "tool", "tool_call_id": "...", "name": "...", "content": "..."},
  {"role": "user", "content": [{"type": "text", "text": "function called", "cache_control": {"type": "ephemeral"}}]}, // new user message that sets the cache
  {"role": "assistant", "content": [{"type": "text", "text": "..."}], "tool_calls": []}
]
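In an Agent loop, this workaround could be applied just before each follow-up request. The sketch below assumes the history is a list of message dictionaries and that the extra message is only needed when the history currently ends with a tool result. The helper name `append_cache_marker` is hypothetical, and the marker text is the placeholder used in the structure above.

```python
def append_cache_marker(
    messages: list[dict], marker_text: str = "function called"
) -> list[dict]:
    """Return a new history with a cache-marked user message after a trailing tool result."""
    result = list(messages)  # shallow copy so the caller's list is not modified
    if result and result[-1].get("role") == "tool":
        # The tool message itself cannot carry cache_control via OpenRouter,
        # so a short user message acts as the cache breakpoint instead.
        result.append(
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": marker_text,
                        "cache_control": {"type": "ephemeral"},
                    }
                ],
            }
        )
    return result


if __name__ == "__main__":
    history = [
        {"role": "user", "content": [{"type": "text", "text": "..."}]},
        {"role": "assistant", "content": [{"type": "text", "text": "..."}], "tool_calls": []},
        {"role": "tool", "tool_call_id": "...", "name": "...", "content": "..."},
    ]
    for message in append_cache_marker(history):
        print(message["role"])
```

Since the marker becomes part of the conversation, the model sees it as a real user turn, so a short neutral text is preferable to anything that could be read as a new instruction.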