Microsoft Azure Speech Recognition (ASR) Example Demo

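Object Storage Service (OSS) corresponds to Azure Blob Storage

Automatic Speech Recognition (ASR) corresponds to Azure Speech-to-Text

Text-to-Speech synthesis (TTS) corresponds to Azure Text-to-Speech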

This demo accepts either an uploaded .mp3 file or the OSS address of an audio file and returns the text recognized in the audio.
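As a rough illustration of the "OSS address" variant, the sketch below builds the HTTP request a client could send to the demo. The path /asr/recognize and the parameter name url come from the controller in this post; the class name RecognizeRequests, the base URL, and the audio address are assumptions made only for this example. The parameter is passed in the query string so that it reaches a @RequestParam regardless of how the body is parsed.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

/** Hypothetical client-side helper; not part of the original demo. */
final class RecognizeRequests {

    private RecognizeRequests() {
    }

    /**
     * Builds a POST request that asks the demo to transcribe a remote audio file.
     *
     * @param baseUrl  where the demo application listens (assumption, e.g. http://localhost:8080)
     * @param audioUrl readable address of the audio file (for example an OSS object URL)
     */
    static HttpRequest byUrl(String baseUrl, String audioUrl) {
        // The address is itself a URL, so it must be percent-encoded before
        // it is embedded in the query string.
        String encoded = URLEncoder.encode(audioUrl, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(baseUrl + "/asr/recognize?url=" + encoded))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }
}
```

The request can then be sent with java.net.http.HttpClient, for example HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()), and the response body is the recognized text or an error message.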

Dependencies

<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-webflux</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <!-- Microsoft ASR -->
    <dependency>
        <groupId>com.microsoft.cognitiveservices.speech</groupId>
        <artifactId>client-sdk</artifactId>
        <version>1.43.0</version>
    </dependency>
    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <optional>true</optional>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
        <scope>test</scope>
    </dependency>
    <dependency>
        <groupId>io.projectreactor</groupId>
        <artifactId>reactor-test</artifactId>
        <scope>test</scope>
    </dependency>
</dependencies>

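Code

Configure the key and endpoint in application.properties or the YAML configuration file. The controller reads them from the properties azure.speech.key and azure.speech.endpoint.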
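To make the expected configuration concrete, here is a minimal sketch that parses properties-file text and fails fast when either setting is missing. The property names azure.speech.key and azure.speech.endpoint are the ones the controller injects; the class SpeechSettings and the sample values used with it are hypothetical and exist only for this illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.Properties;

/** Hypothetical holder for the two settings the controller needs. */
final class SpeechSettings {

    private final String key;
    private final URI endpoint;

    private SpeechSettings(String key, URI endpoint) {
        this.key = key;
        this.endpoint = endpoint;
    }

    String key() {
        return key;
    }

    URI endpoint() {
        return endpoint;
    }

    /** Parses application.properties-style text and validates both settings. */
    static SpeechSettings parse(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String key = require(props, "azure.speech.key");
        URI endpoint = URI.create(require(props, "azure.speech.endpoint"));
        if (!endpoint.isAbsolute()) {
            throw new IllegalArgumentException("azure.speech.endpoint must be an absolute URI");
        }
        return new SpeechSettings(key, endpoint);
    }

    /** Returns the trimmed property value or throws if it is absent or blank. */
    private static String require(Properties props, String name) {
        String value = props.getProperty(name);
        if (value == null || value.isBlank()) {
            throw new IllegalArgumentException("Missing property: " + name);
        }
        return value.trim();
    }
}
```

For instance, SpeechSettings.parse("azure.speech.key=demo-key\nazure.speech.endpoint=https://example.invalid/\n") succeeds with placeholder values, while text that omits either line is rejected with an IllegalArgumentException.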

package com.example.microsoftasr.controller;

import com.microsoft.cognitiveservices.speech.*;
import com.microsoft.cognitiveservices.speech.audio.AudioConfig;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.web.bind.annotation.*;
import org.springframework.web.multipart.MultipartFile;

import java.io.File;
import java.net.URI;
import java.nio.file.Files;

@RestController
@RequestMapping("/asr")
public class TestController {

    @Value("${azure.speech.key}")
    private String speechKey;

    @Value("${azure.speech.endpoint}")
    private String speechEndpoint;

    @GetMapping("/hello")
    public String test() {
        return "Hello World";
    }

    @PostMapping("/recognize")
    public String recognize(@RequestParam(value = "file", required = false) MultipartFile file,
                            @RequestParam(value = "url", required = false) String ossUrl) {
        if ((file == null || file.isEmpty()) && (ossUrl == null || ossUrl.isBlank())) {
            return "No audio file or audio URL was provided";
        }

        File tempInput = null;
        File tempWav = null;
        try {
            // 1. Save the original audio to a temporary file
            if (file != null && !file.isEmpty()) {
                String suffix = getSuffix(file.getOriginalFilename());
                tempInput = File.createTempFile("audio-input-", "." + suffix);
                file.transferTo(tempInput);
            } else {
                String suffix = getSuffix(ossUrl);
                tempInput = File.createTempFile("audio-input-", "." + suffix);
                try (var in = new java.net.URL(ossUrl).openStream()) {
                    Files.copy(in, tempInput.toPath(), java.nio.file.StandardCopyOption.REPLACE_EXISTING);
                }
            }

            // 2. Convert to WAV (16 kHz, mono)
            tempWav = File.createTempFile("audio-output-", ".wav");
            if (!getSuffix(tempInput.getName()).equalsIgnoreCase("wav")) {
                // The ffmpeg path below is specific to the author's machine; adjust it for your environment.
                ProcessBuilder pb = new ProcessBuilder(
                        "F:\\ffmpeg-7.1.1-full_build\\ffmpeg-7.1.1-full_build\\bin\\ffmpeg.exe", "-y",
                        "-i", tempInput.getAbsolutePath(),
                        "-ar", "16000",
                        "-ac", "1",
                        tempWav.getAbsolutePath());
                Process process = pb.inheritIO().start();
                int exitCode = process.waitFor();
                if (exitCode != 0) return "ffmpeg conversion failed, exitCode=" + exitCode;
            } else {
                Files.copy(tempInput.toPath(), tempWav.toPath(), java.nio.file.StandardCopyOption.REPLACE_EXISTING);
            }

            // 3. Call Microsoft ASR to recognize the audio
            SpeechConfig speechConfig = SpeechConfig.fromEndpoint(new URI(speechEndpoint), speechKey);
            speechConfig.setSpeechRecognitionLanguage("zh-CN");
            try (AudioConfig audioConfig = AudioConfig.fromWavFileInput(tempWav.getAbsolutePath());
                 SpeechRecognizer recognizer = new SpeechRecognizer(speechConfig, audioConfig)) {
                // recognizeOnceAsync() returns after the first recognized utterance,
                // so a long recording is only partially transcribed.
                SpeechRecognitionResult result = recognizer.recognizeOnceAsync().get();
                if (result.getReason() == ResultReason.RecognizedSpeech) {
                    return result.getText();
                } else {
                    return "Recognition failed: " + result.getReason();
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
            return "Recognition error: " + e.getMessage();
        } finally {
            // Always remove the temporary files, whether recognition succeeded or not
            try {
                if (tempInput != null) Files.deleteIfExists(tempInput.toPath());
                if (tempWav != null) Files.deleteIfExists(tempWav.toPath());
            } catch (Exception ex) {
                ex.printStackTrace();
            }
        }
    }

    // Returns the text after the last '.', or "tmp" when there is no '.' at all
    private String getSuffix(String filenameOrUrl) {
        if (filenameOrUrl == null || !filenameOrUrl.contains(".")) return "tmp";
        return filenameOrUrl.substring(filenameOrUrl.lastIndexOf('.') + 1);
    }
}
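Two details of the controller are worth a closer look: getSuffix takes everything after the last dot, so a signed URL with a query string yields an odd suffix, and the ffmpeg executable path is hard-coded. The following is a minimal sketch of how both could be factored out, not the author's implementation. The class AudioPrep and both method names are hypothetical; the ffmpeg arguments (-y, -i, -ar 16000, -ac 1) are the ones used in the controller.

```java
import java.nio.file.Path;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper class; the names here are illustrative only. */
final class AudioPrep {

    private AudioPrep() {
    }

    /**
     * Returns the lower-cased extension of a file name or URL, ignoring any
     * query string or fragment, or "tmp" when no extension is present.
     * Assumption: '?' and '#' never appear inside the file name itself.
     */
    static String suffixOf(String filenameOrUrl) {
        if (filenameOrUrl == null) {
            return "tmp";
        }
        String path = filenameOrUrl;
        int query = path.indexOf('?');
        if (query >= 0) {
            path = path.substring(0, query);
        }
        int fragment = path.indexOf('#');
        if (fragment >= 0) {
            path = path.substring(0, fragment);
        }
        // Only look at the last path segment so dots in the host name are ignored.
        String name = path.substring(path.lastIndexOf('/') + 1);
        int dot = name.lastIndexOf('.');
        if (dot < 0 || dot == name.length() - 1) {
            return "tmp";
        }
        return name.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    /**
     * Builds the ffmpeg argument list for converting any input to 16 kHz mono WAV.
     *
     * @param ffmpegExecutable path or command name of ffmpeg, supplied by the caller
     */
    static List<String> ffmpegCommand(String ffmpegExecutable, Path input, Path output) {
        return List.of(
                ffmpegExecutable, "-y",   // overwrite the output file if it exists
                "-i", input.toString(),
                "-ar", "16000",           // sample rate: 16 kHz
                "-ac", "1",               // channels: mono
                output.toString());
    }
}
```

The resulting list can be handed to new ProcessBuilder(command).inheritIO().start(). The executable location could then come from configuration rather than from source code, for example through a property of your own choosing injected with @Value.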
