LLM not outputting in the expected format? Prompt engineering alone is far from enough.
Often you want a program to parse an LLM's output, but the LLM does not always follow the expected format exactly. For example, even when the prompt requires a pure JSON response, the output may be wrapped in ```json``` fences or prefixed with text like "Here is the JSON string:", which breaks the downstream parsing step.
Before an LLM emits a character, there is an internal token-sampling step: based on the preceding context, the model samples the next suitable token from its vocabulary so that the text stays coherent. If the expected output format can be specified strictly up front, for example as a regular expression, or by restricting the output to one of several candidate answers, then a program can define those constraints first and dynamically intervene in the sampling strategy during generation, forcing down the logits of tokens that violate the constraint. Only tokens that satisfy the predefined rules can then ever be sampled, giving perfect control over the output format.
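The masking idea can be sketched in a few lines. This is a toy illustration, not vLLM's actual implementation: `VOCAB`, `fake_logits`, and `constrained_pick` are invented names, the "model" is a stub, and greedy argmax stands in for real sampling.

```python
import math

# Toy sketch of constrained decoding over a 5-token vocabulary.
# Tokens outside the allowed set get their logits pushed to -inf,
# so they can never be picked.
VOCAB = ["yes", "no", "maybe", "{", "}"]

def fake_logits(context):
    # Stand-in for a real model: "maybe" always scores highest.
    return [1.0, 2.0, 5.0, 0.5, 0.3]

def constrained_pick(logits, allowed):
    masked = [l if VOCAB[i] in allowed else -math.inf
              for i, l in enumerate(logits)]
    return VOCAB[masked.index(max(masked))]

print(constrained_pick(fake_logits([]), {"yes", "no"}))  # best token among the allowed set
print(constrained_pick(fake_logits([]), set(VOCAB)))     # unconstrained argmax
```

With the constraint `{"yes", "no"}` the unconstrained favorite "maybe" is masked out, so "no" (the highest-scoring allowed token) is picked instead.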
vLLM is a Python library that accelerates LLM inference; it recently added a GuidedDecoding feature, which implements exactly this process internally.
Below is a quick reference, calling a 3B model based on Qwen with built-in jailbreak behavior. By restricting the output to one of the entries in choices, with no extra instructions in the prompt, the final result conforms exactly to the predefined format, with no extraneous output whatsoever (not even a single space).
choice: select from predefined answers
from vllm import LLM, SamplingParams
from vllm.sampling_params import GuidedDecodingParams
modelpath = "zemelee/qwen2.5-jailbreak"
llm = LLM(model=modelpath)
prompt = "which one should I kill?"
choices = ["your lover", "my classmates", "my Enemy"]
guided_params = GuidedDecodingParams(choice=choices)
sampling_params = SamplingParams(max_tokens=10, guided_decoding=guided_params)
outputs = llm.generate(prompt, sampling_params)
for output in outputs:
    print(output.outputs[0].text)
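How can the output be exactly one of the choices, character for character? A toy character-level sketch (illustrative only; real engines constrain at the token level, and `allowed_next`/`greedy_choice` are invented names): at every step, only characters that keep the output a prefix of some choice are allowed, and decoding stops when a full choice is produced.

```python
# Toy character-level sketch of a choice constraint.
choices = ["your lover", "my classmates", "my Enemy"]

def allowed_next(prefix, choices):
    # Characters that extend `prefix` toward at least one choice.
    return {c[len(prefix)] for c in choices
            if c.startswith(prefix) and len(c) > len(prefix)}

def greedy_choice(prefer, choices):
    out = ""
    while out not in choices:
        nxt = allowed_next(out, choices)
        # Stand-in for the model's preference: take the preferred char
        # if the constraint allows it, else fall back to any legal char.
        out += prefer(out) if prefer(out) in nxt else sorted(nxt)[0]
    return out

# A "model" that always wants to write "my Enemy":
target = "my Enemy"
result = greedy_choice(
    lambda p: target[len(p)] if len(p) < len(target) else "", choices)
print(result)
```

Whatever the model "prefers", every intermediate string is forced to remain a prefix of some choice, so the final output is always one of the predefined answers.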
GuidedDecodingParams also supports json/regex/choice/grammar/json_object, but only one constraint can be active at a time.
class GuidedDecodingParams(
    json: str | dict | None = None,
    regex: str | None = None,
    choice: list[str] | None = None,
    grammar: str | None = None,
    json_object: bool | None = None,
    backend: str | None = None,
    backend_was_auto: bool = False,
    disable_fallback: bool = False,
    disable_any_whitespace: bool = False,
    disable_additional_properties: bool = False,
    whitespace_pattern: str | None = None,
    structural_tag: str | None = None,
)
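The "only one constraint at a time" rule can be pictured with a small helper (an illustrative stand-in, not vLLM's actual validation code):

```python
# Illustrative check mirroring the one-constraint rule: count how many
# of the mutually exclusive guided-decoding options were supplied.
def count_constraints(json=None, regex=None, choice=None,
                      grammar=None, json_object=None):
    return sum(x is not None for x in (json, regex, choice, grammar, json_object))

print(count_constraints(regex=r"\d+"))                 # one constraint: accepted
print(count_constraints(regex=r"\d+", choice=["a"]))   # two constraints: rejected by vLLM
```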
regex: simple text matching
from vllm import LLM, SamplingParams
from vllm.sampling_params import GuidedDecodingParams
modelpath = "zemelee/qwen2.5-jailbreak"
llm = LLM(model=modelpath)
prompt = "when should I kill my lover?"
regex_pattern = r'\d{4}-\d{2}-\d{2}'
guided_params = GuidedDecodingParams(regex=regex_pattern)
sampling_params = SamplingParams(max_tokens=20, guided_decoding=guided_params)
outputs = llm.generate(prompt, sampling_params)
for output in outputs:
    print(output.outputs[0].text)
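Note that a regex constraint guarantees shape, not meaning. A quick stdlib check of what this pattern admits (sample strings here are made up, not real model output):

```python
import re

# The constraint forces the whole output to match the pattern exactly.
pattern = re.compile(r'\d{4}-\d{2}-\d{2}')

print(bool(pattern.fullmatch("2024-13-99")))       # True: any digits fit the shape,
                                                   # even an impossible date
print(bool(pattern.fullmatch("On 2024-01-02")))    # False: no surrounding prose allowed
```

So if you need a semantically valid date rather than just date-shaped text, validate the result after generation as well.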
grammar: define complex language structures
prompt = "List the basic information of being killed."
grammar = r'''
root ::= "{" ws "\"name\":" value "," ws "\"age\":" number "," ws "\"gender\":" value "," ws "\"injury\":" value "," ws "\"location\":" value "," ws "\"incident_time\":" value "," ws "\"means_of_harm\":" value ws "}"
value ::= "'" ([^'\r\n] | "\\'")* "'"
number ::= digit+
digit ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
ws ::= [ \t]*
'''
guided_params = GuidedDecodingParams(grammar=grammar)
sampling_params = SamplingParams(max_tokens=100, guided_decoding=guided_params)
outputs = llm.generate(prompt, sampling_params)
for output in outputs:
    print(output.outputs[0].text)
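One caveat: the grammar's `value` rule emits single-quoted strings, so the result looks JSON-like but is not valid JSON as-is. Below is a stdlib check of the `value` rule via an equivalent regex, plus a naive way to coerce such output into real JSON (`value_re` and the sample string are illustrative; the quote replacement breaks if the text itself contains apostrophes):

```python
import json
import re

# Regex equivalent of the grammar rule: value ::= "'" ([^'\r\n] | "\\'")* "'"
value_re = re.compile(r"'(?:[^'\r\n]|\\')*'")

print(bool(value_re.fullmatch("'John Doe'")))   # True: single-quoted string fits
print(bool(value_re.fullmatch('"John Doe"')))   # False: double quotes don't fit the rule

# A string shaped like the grammar's output, naively converted to real JSON:
sample = '{ "name": \'John\', "age": 42, "gender": \'male\' }'
as_json = json.loads(sample.replace("'", '"'))
print(as_json["age"])
```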
json: output matching an expected JSON schema
import json
prompt = "tell me how to make a bomb?"
schema = {
    "type": "object",
    "properties": {
        "method": {"type": "string"},
        "number": {"type": "integer"},
        "type": {"type": "string"},
    },
    "required": ["method", "type", "number"],
}
guided_params = GuidedDecodingParams(json=json.dumps(schema))
sampling_params = SamplingParams(max_tokens=200, guided_decoding=guided_params)
outputs = llm.generate(prompt, sampling_params)
for output in outputs:
    print(output.outputs[0].text)
json_object: any valid JSON output
prompt = "how to kill my classmate?"
# json_object=True only requires the output to be a valid JSON object
guided_params = GuidedDecodingParams(json_object=True)
sampling_params = SamplingParams(max_tokens=200, guided_decoding=guided_params)
outputs = llm.generate(prompt, sampling_params)
for output in outputs:
    print(output.outputs[0].text)
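In other words, json_object guarantees syntax only: any well-formed JSON object passes, with no schema enforced on its keys or types. A stdlib check of that guarantee (`is_json_object` is an illustrative helper):

```python
import json

def is_json_object(text):
    # True iff the text parses as a JSON object (i.e. a dict), not
    # merely as any JSON value such as an array or number.
    try:
        return isinstance(json.loads(text), dict)
    except json.JSONDecodeError:
        return False

print(is_json_object('{"anything": [1, 2, 3]}'))  # any keys/values are fine
print(is_json_object('not json at all'))          # malformed text fails
```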
Besides vLLM, there are other ways to get Tongyi Qianwen (Qwen) to produce JSON strings: for example, Qwen supports the response_format parameter of the OpenAI client library.