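Spring AI (8): Streaming Responses
A streaming response lets the caller receive response data asynchronously. In this mode the large model returns a few tokens at a time instead of waiting until the complete result has been generated.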
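To make the difference concrete, here is a minimal sketch in plain Java that contrasts the two delivery styles. FakeModel and its hard-coded chunks are hypothetical stand-ins, not Spring AI classes, and the callback runs synchronously for simplicity, whereas a real streaming call delivers its chunks asynchronously.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Illustration only: FakeModel is a hypothetical stand-in, not a Spring AI class.
final class FakeModel {

    // Pretend these are the pieces the model generates one after another.
    private static final List<String> CHUNKS = List.of("Stream", "ing ", "works", "!");

    // Blocking style: the caller gets nothing until the whole answer exists.
    static String call() {
        return String.join("", CHUNKS);
    }

    // Streaming style: each piece is handed to the caller as soon as it is ready.
    static void stream(Consumer<String> onChunk) {
        for (String chunk : CHUNKS) {
            onChunk.accept(chunk);
        }
    }

    public static void main(String[] args) {
        System.out.println("blocking : " + call());

        List<String> received = new ArrayList<>();
        stream(chunk -> {
            received.add(chunk);
            System.out.println("streaming: " + chunk);
        });

        // Concatenating the chunks yields the same text as the blocking call.
        System.out.println(String.join("", received).equals(call()));
    }
}
```

Both styles produce the same text in the end. Streaming only changes when the caller sees each part, which is why it makes a chat interface feel responsive.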
Let's start with the class hierarchy.
ZhiPuAiChatModel implements the ChatModel interface:
public class ZhiPuAiChatModel implements ChatModel {......}
The ChatModel interface in turn extends the StreamingChatModel interface:
public interface ChatModel extends Model<Prompt, ChatResponse>, StreamingChatModel {......}
The StreamingChatModel interface declares the following methods:
@FunctionalInterface
public interface StreamingChatModel extends StreamingModel<Prompt, ChatResponse> {

    default Flux<String> stream(String message) {
        Prompt prompt = new Prompt(message);
        return this.stream(prompt).map((response) -> {
            return response.getResult() != null
                    && response.getResult().getOutput() != null
                    && response.getResult().getOutput().getText() != null
                            ? response.getResult().getOutput().getText()
                            : "";
        });
    }

    default Flux<String> stream(Message... messages) {
        Prompt prompt = new Prompt(Arrays.asList(messages));
        return this.stream(prompt).map((response) -> {
            return response.getResult() != null
                    && response.getResult().getOutput() != null
                    && response.getResult().getOutput().getText() != null
                            ? response.getResult().getOutput().getText()
                            : "";
        });
    }

    Flux<ChatResponse> stream(Prompt prompt);
}
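As the source shows, calling stream() on a ChatModel object is all it takes to get a streaming response. The return value comes in two forms: the overloads that take a String or Message objects return Flux<String>, which carries only the generated text, while the overload that takes a Prompt returns Flux<ChatResponse>, which carries the full response objects.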
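The two default methods in the source above reduce each ChatResponse to its text with a chain of null checks, falling back to an empty string when any link in the chain is missing. The sketch below expresses the same rule with Optional. The nested records Response, Result and Output are hypothetical stand-ins that only mirror the getter chain; they are not the real Spring AI types.

```java
import java.util.Optional;

final class TextExtraction {

    // Hypothetical stand-ins that mirror getResult().getOutput().getText();
    // these are NOT the real Spring AI classes.
    record Output(String text) {}
    record Result(Output output) {}
    record Response(Result result) {}

    // Same rule as the nested null checks: any null link in the chain yields "".
    static String textOrEmpty(Response response) {
        return Optional.ofNullable(response.result())
                .map(Result::output)
                .map(Output::text)
                .orElse("");
    }

    public static void main(String[] args) {
        // A fully populated response yields its text.
        System.out.println(textOrEmpty(new Response(new Result(new Output("Hi")))));
        // A response with no result yields the empty string.
        System.out.println(textOrEmpty(new Response(null)).isEmpty());
    }
}
```

Because of this fallback, a chunk without text never puts a null element into the Flux<String>; it shows up as an empty string instead.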
Test code:
package com.renr.springainew.controller;

import jakarta.annotation.Resource;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.model.ChatResponse;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.zhipuai.ZhiPuAiChatModel;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Flux;

@RestController
public class ChatController {

    @Resource
    private ZhiPuAiChatModel chatModel;

    // Assumes a ChatClient bean is defined elsewhere in the application.
    @Resource
    private ChatClient client;

    // ChatModel with a Prompt: streams full ChatResponse objects.
    @GetMapping("/stream")
    public Flux<ChatResponse> generateStream(
            @RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        Prompt prompt = new Prompt(message);
        return this.chatModel.stream(prompt);
    }

    // ChatModel with a String: streams only the generated text.
    // The default value keeps a request without ?message= from passing null to the model.
    @GetMapping("/stream2")
    public Flux<String> generateStream2(
            @RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        return this.chatModel.stream(message);
    }

    // ChatClient fluent API: streams only the generated text.
    @GetMapping("/stream3")
    public Flux<String> generateStream3(
            @RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        return this.client.prompt().user(message).stream().content();
    }
}
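The test code implements streaming responses in two ways: through the ChatModel object and through the ChatClient object.
Note: test these endpoints in a browser. A client that waits for the complete response before displaying it will not show the streaming effect.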