Voice Input and Speech Recognition with UniApp on HarmonyOS #ThirdPartyFrameworks #Uniapp

ds 2025/7/16 4:05:36

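While I was building a cross-platform app recently, a client asked for voice input to improve the user experience. After some research and experimentation, I got voice input and speech recognition working in a UniApp project. This post walks through the process and the approach, in the hope that it helps developers with similar requirements.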

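Why voice input?

As mobile devices have become ubiquitous, voice has become an efficient way to interact with them. Compared with typing, voice input has several advantages:

Convenient: no keyboard tapping, which suits one-handed use or use while walking
Efficient: speaking is usually faster than typing by hand
Better experience: it helps specific groups such as older users and visually impaired users
Hands-free: it works while driving, doing housework, or in other situations where typing is not possible

In commercial applications, voice input lowers the barrier to use, which can help conversion and user retention.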

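Choosing a technical approach

There are three main ways to implement speech recognition in UniApp:

Native plugins: call each platform's native speech recognition capability
Cloud services: integrate a third-party speech recognition API (such as Baidu or iFlytek)
Web API: use the Web Speech API on the H5 platform

After comparing and testing them, I settled on a hybrid approach:

On the App platform, use a native plugin for the best experience
In WeChat Mini Programs, use WeChat's own speech recognition capability
On H5, try the Web Speech API first and fall back to a cloud service API when it is not supported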
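
The hybrid approach above boils down to a small decision rule. Here is a minimal sketch of it, not the code I shipped: `pickSpeechBackend` and the backend labels are hypothetical names, the platform strings mirror UniApp's conditional-compilation identifiers, and sending unknown platforms to the cloud API is my own assumption.

```javascript
// Hypothetical helper: map a platform to the speech backend described above.
// `platform` mirrors UniApp's compile targets ('APP-PLUS', 'MP-WEIXIN', 'H5').
// `hasWebSpeech` says whether the browser exposes the Web Speech API (H5 only).
function pickSpeechBackend(platform, hasWebSpeech = false) {
  switch (platform) {
    case 'APP-PLUS':
      return 'native-plugin'; // native plugin for the best experience
    case 'MP-WEIXIN':
      return 'wechat'; // WeChat's own capability
    case 'H5':
      // Prefer the Web Speech API, fall back to a cloud service
      return hasWebSpeech ? 'web-speech' : 'cloud-api';
    default:
      // Assumption: any other platform goes through the cloud service
      return 'cloud-api';
  }
}
```

In a real project the branching is done at build time with `#ifdef` blocks, as the following sections show; the function form is only meant to make the rule easy to read.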

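Implementation steps

1. App implementation (native plugin)

First, install a speech recognition plugin. I chose speech-baidu, a fairly mature plugin from the plugin marketplace that wraps the Baidu speech recognition SDK for UniApp.

After installing the plugin, configure it in manifest.json:

"app-plus": {
  "plugins": {
    "speech": {
      "baidu": {
        "appid": "your Baidu speech recognition AppID",
        "apikey": "your API Key",
        "secretkey": "your Secret Key"
      }
    }
  },
  "distribute": {
    "android": {
      "permissions": [
        "<uses-permission android:name=\"android.permission.RECORD_AUDIO\"/>",
        "<uses-permission android:name=\"android.permission.INTERNET\"/>"
      ]
    }
  }
}

Next, create the speech recognition component: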

<template>
  <view class="voice-input-container">
    <view
      class="voice-btn"
      :class="{ 'recording': isRecording }"
      @touchstart="startRecord"
      @touchend="stopRecord"
      @touchcancel="cancelRecord"
    >
      <image :src="isRecording ? '/static/mic-active.png' : '/static/mic.png'" mode="aspectFit"></image>
      <text>{{ isRecording ? 'Release to finish' : 'Hold to talk' }}</text>
    </view>
    <view v-if="isRecording" class="recording-tip">
      <text>Listening...</text>
      <view class="wave-container">
        <view
          v-for="(item, index) in waveItems"
          :key="index"
          class="wave-item"
          :style="{ height: item + 'rpx' }"
        ></view>
      </view>
    </view>
  </view>
</template>

<script>
// #ifdef APP-PLUS
const speechPlugin = uni.requireNativePlugin('speech-baidu');
// #endif

export default {
  name: 'VoiceInput',
  props: {
    lang: {
      type: String,
      default: 'zh' // zh: Chinese, en: English
    },
    maxDuration: {
      type: Number,
      default: 60 // maximum recording time, in seconds
    }
  },
  data() {
    return {
      isRecording: false,
      timer: null,
      waveAnimTimer: null,
      waveItems: [10, 15, 20, 25, 30, 25, 20, 15, 10]
    };
  },
  methods: {
    startRecord() {
      if (this.isRecording) return;
      // Request recording permission
      uni.authorize({
        scope: 'scope.record',
        success: () => {
          this.isRecording = true;
          this.startWaveAnimation();
          // #ifdef APP-PLUS
          speechPlugin.start({
            vadEos: 3000, // silence timeout
            language: this.lang === 'zh' ? 'zh-cn' : 'en-us'
          }, (res) => {
            if (res.errorCode === 0) {
              // Recognition result
              this.$emit('result', res.result);
            } else {
              uni.showToast({ title: `Recognition failed: ${res.errorCode}`, icon: 'none' });
            }
            // Recognition is over, so the max-duration timer is no longer needed
            clearTimeout(this.timer);
            this.isRecording = false;
            this.stopWaveAnimation();
          });
          // #endif
          // Enforce the maximum recording time
          this.timer = setTimeout(() => {
            if (this.isRecording) {
              this.stopRecord();
            }
          }, this.maxDuration * 1000);
        },
        fail: () => {
          uni.showToast({ title: 'Please grant recording permission', icon: 'none' });
        }
      });
    },
    stopRecord() {
      if (!this.isRecording) return;
      // #ifdef APP-PLUS
      speechPlugin.stop();
      // #endif
      clearTimeout(this.timer);
      this.isRecording = false;
      this.stopWaveAnimation();
    },
    cancelRecord() {
      if (!this.isRecording) return;
      // #ifdef APP-PLUS
      speechPlugin.cancel();
      // #endif
      clearTimeout(this.timer);
      this.isRecording = false;
      this.stopWaveAnimation();
    },
    // Waveform animation
    startWaveAnimation() {
      this.waveAnimTimer = setInterval(() => {
        this.waveItems = this.waveItems.map(() => Math.floor(Math.random() * 40) + 10);
      }, 200);
    },
    stopWaveAnimation() {
      clearInterval(this.waveAnimTimer);
      this.waveItems = [10, 15, 20, 25, 30, 25, 20, 15, 10];
    }
  },
  beforeDestroy() {
    this.cancelRecord();
  }
}
</script>

<style scoped>
.voice-input-container {
  width: 100%;
}
.voice-btn {
  width: 200rpx;
  height: 200rpx;
  border-radius: 100rpx;
  background-color: #f5f5f5;
  display: flex;
  flex-direction: column;
  align-items: center;
  justify-content: center;
  margin: 0 auto;
}
.voice-btn.recording {
  background-color: #e1f5fe;
  box-shadow: 0 0 20rpx rgba(0, 120, 255, 0.5);
}
.voice-btn image {
  width: 80rpx;
  height: 80rpx;
  margin-bottom: 10rpx;
}
.recording-tip {
  margin-top: 30rpx;
  text-align: center;
}
.wave-container {
  display: flex;
  justify-content: center;
  align-items: flex-end;
  height: 80rpx;
  margin-top: 20rpx;
}
.wave-item {
  width: 8rpx;
  background-color: #1890ff;
  margin: 0 5rpx;
  border-radius: 4rpx;
  transition: height 0.2s;
}
</style>

2. WeChat Mini Program implementation

WeChat Mini Programs provide native speech capabilities that are easy to use:

// Code for the Mini Program environment
startRecord() {
  // #ifdef MP-WEIXIN
  this.isRecording = true;
  this.startWaveAnimation();
  const recorderManager = wx.getRecorderManager();
  recorderManager.onStart(() => {
    console.log('Recording started');
  });
  recorderManager.onStop((res) => {
    this.isRecording = false;
    this.stopWaveAnimation();
    // Send the recording to the WeChat backend for recognition.
    // Note: a Mini Program cannot list api.weixin.qq.com as a request domain,
    // so in production this upload is normally proxied through your own backend.
    wx.showLoading({ title: 'Recognizing...' });
    const { tempFilePath } = res;
    wx.uploadFile({
      url: 'https://api.weixin.qq.com/cgi-bin/media/voice/translatecontent',
      filePath: tempFilePath,
      name: 'media',
      formData: {
        access_token: this.accessToken,
        format: 'mp3',
        voice_id: Date.now(),
        lfrom: this.lang === 'zh' ? 'zh_CN' : 'en_US',
        lto: 'zh_CN'
      },
      success: (uploadRes) => {
        wx.hideLoading();
        const data = JSON.parse(uploadRes.data);
        if (data.errcode === 0) {
          this.$emit('result', data.result);
        } else {
          uni.showToast({ title: `Recognition failed: ${data.errmsg}`, icon: 'none' });
        }
      },
      fail: () => {
        wx.hideLoading();
        uni.showToast({ title: 'Speech recognition failed', icon: 'none' });
      }
    });
  });
  recorderManager.start({
    duration: this.maxDuration * 1000,
    sampleRate: 16000,
    numberOfChannels: 1,
    encodeBitRate: 48000,
    format: 'mp3'
  });
  // #endif
},
stopRecord() {
  // #ifdef MP-WEIXIN
  wx.getRecorderManager().stop();
  // #endif
  // ...same code as the App version...
}

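Note that speech recognition in a WeChat Mini Program requires an access_token, which is usually obtained on the backend and exposed to the client through an API.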
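
Because the access_token expires, the backend usually caches it and refreshes it shortly before expiry. Below is a minimal sketch of such a cache for a Node.js backend. Everything in it is an assumption for illustration: `createTokenCache` is a hypothetical name, and `fetchToken` is a function you supply that resolves to `{ access_token, expires_in }` with `expires_in` in seconds.

```javascript
// Hypothetical backend helper: cache an access token until shortly before it expires.
// fetchToken: async () => ({ access_token: string, expires_in: number }) (seconds)
// now: clock function, injectable so the cache can be tested without waiting
function createTokenCache(fetchToken, { safetyMarginMs = 60000, now = Date.now } = {}) {
  let token = null;
  let expiresAt = 0;
  let pending = null; // shared promise so concurrent callers trigger one refresh

  return async function getToken() {
    // Serve the cached token while it is still comfortably valid
    if (token && now() < expiresAt) return token;
    if (!pending) {
      pending = fetchToken()
        .then((res) => {
          token = res.access_token;
          // Refresh a little early to avoid using a token right at its expiry
          expiresAt = now() + res.expires_in * 1000 - safetyMarginMs;
          return token;
        })
        .finally(() => {
          pending = null;
        });
    }
    return pending;
  };
}
```

The client then asks your backend for recognition results (or for a token) instead of holding the app secret itself.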

3. H5 implementation

On H5, we can use the Web Speech API for speech recognition and fall back to a cloud service API when the browser does not support it:

startRecord() {
  // #ifdef H5
  this.isRecording = true;
  this.startWaveAnimation();
  // Check whether the browser supports Speech Recognition
  if ('webkitSpeechRecognition' in window || 'SpeechRecognition' in window) {
    const SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
    this.recognition = new SpeechRecognition();
    this.recognition.lang = this.lang === 'zh' ? 'zh-CN' : 'en-US';
    this.recognition.continuous = false;
    this.recognition.interimResults = false;
    this.recognition.onresult = (event) => {
      const result = event.results[0][0].transcript;
      this.$emit('result', result);
    };
    this.recognition.onerror = (event) => {
      uni.showToast({ title: `Recognition error: ${event.error}`, icon: 'none' });
    };
    this.recognition.onend = () => {
      this.isRecording = false;
      this.stopWaveAnimation();
    };
    this.recognition.start();
  } else {
    // Web Speech API is not supported; call the cloud service API
    this.useCloudSpeechAPI();
  }
  // #endif
  // Enforce the maximum recording time
  this.timer = setTimeout(() => {
    if (this.isRecording) {
      this.stopRecord();
    }
  }, this.maxDuration * 1000);
},
stopRecord() {
  // #ifdef H5
  if (this.recognition) {
    this.recognition.stop();
  }
  // #endif
  // ...same code as the App version...
},
useCloudSpeechAPI() {
  // Fallback: call a backend endpoint for speech recognition
  uni.chooseFile({
    count: 1,
    type: 'file',
    extension: ['.mp3', '.wav'],
    success: (res) => {
      const tempFilePath = res.tempFilePaths[0];
      // Upload the audio file to the backend for recognition
      uni.uploadFile({
        url: this.apiBaseUrl + '/speech/recognize',
        filePath: tempFilePath,
        name: 'audio',
        formData: { lang: this.lang },
        success: (uploadRes) => {
          const data = JSON.parse(uploadRes.data);
          if (data.code === 0) {
            this.$emit('result', data.result);
          } else {
            uni.showToast({ title: `Recognition failed: ${data.msg}`, icon: 'none' });
          }
        },
        complete: () => {
          this.isRecording = false;
          this.stopWaveAnimation();
        }
      });
    },
    fail: () => {
      // The user cancelled or the picker failed; reset the recording state
      this.isRecording = false;
      this.stopWaveAnimation();
    }
  });
}
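
The onerror handler above shows the raw error code to the user. The Web Speech API defines a small set of values for that `error` field, so a lookup table can turn them into friendlier messages. This is only a sketch: `describeRecognitionError` and the message wording are my own, and the table covers the codes I expect to matter most, not every possible one.

```javascript
// Messages for common SpeechRecognitionErrorEvent.error codes.
// A Map avoids accidental hits on inherited object properties.
const RECOGNITION_ERROR_MESSAGES = new Map([
  ['no-speech', 'No speech was detected. Please try again.'],
  ['audio-capture', 'No microphone was found or it could not be used.'],
  ['not-allowed', 'Microphone permission was denied.'],
  ['service-not-allowed', 'Speech recognition is not allowed in this browser.'],
  ['network', 'A network error interrupted recognition.'],
  ['language-not-supported', 'The selected language is not supported.'],
  ['aborted', 'Recognition was cancelled.']
]);

// Hypothetical helper: fall back to the raw code for anything not in the table.
function describeRecognitionError(code) {
  return RECOGNITION_ERROR_MESSAGES.get(code) || `Recognition error: ${code}`;
}
```

Inside the component, the toast title would then become `describeRecognitionError(event.error)`.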

4. A unified interface

To make the feature easy to call, I wrapped everything in a single API:

// In utils/speech.js
const Speech = {
  // Start speech recognition
  startRecognize(options) {
    const { lang = 'zh', success, fail, complete } = options;
    // #ifdef APP-PLUS
    const speechPlugin = uni.requireNativePlugin('speech-baidu');
    speechPlugin.start({
      vadEos: 3000,
      language: lang === 'zh' ? 'zh-cn' : 'en-us'
    }, (res) => {
      if (res.errorCode === 0) {
        success && success(res.result);
      } else {
        fail && fail(res);
      }
      complete && complete();
    });
    return {
      stop: () => speechPlugin.stop(),
      cancel: () => speechPlugin.cancel()
    };
    // #endif
    // #ifdef MP-WEIXIN
    // WeChat Mini Program implementation
    // ...
    // #endif
    // #ifdef H5
    // H5 implementation
    // ...
    // #endif
  }
};

export default Speech;
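
If you prefer async/await over callbacks, the callback-style startRecognize above can be wrapped in a Promise. A small sketch follows; `recognizeOnce` is a hypothetical name, and it only assumes an object with a `startRecognize({ lang, success, fail })` method shaped like the Speech module.

```javascript
// Hypothetical wrapper: resolve with the recognized text, reject on failure.
// `speech` is any object exposing startRecognize({ lang, success, fail }).
function recognizeOnce(speech, { lang = 'zh' } = {}) {
  return new Promise((resolve, reject) => {
    speech.startRecognize({
      lang,
      success: (text) => resolve(text),
      fail: (err) => {
        // Keep the original failure payload available on the Error object
        const error = err instanceof Error ? err : new Error('Recognition failed');
        if (error !== err) error.detail = err;
        reject(error);
      }
    });
  });
}
```

A caller could then write `const text = await recognizeOnce(Speech, { lang: 'en' })` inside a try/catch block.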

A practical example: voice input in a chat app

Now let's look at a real-world scenario, adding voice input to a chat application:

<template>
  <view class="chat-input-container">
    <view class="chat-tools">
      <image :src="isVoiceMode ? '/static/keyboard.png' : '/static/mic.png'" @tap="toggleInputMode"></image>
      <image src="/static/emoji.png" @tap="showEmojiPicker"></image>
    </view>
    <view v-if="!isVoiceMode" class="text-input">
      <textarea
        v-model="message"
        auto-height
        placeholder="Type a message..."
        :focus="textFocus"
        @focus="onFocus"
        @blur="onBlur"
      ></textarea>
    </view>
    <view v-else class="voice-input">
      <voice-input @result="onVoiceResult"></voice-input>
    </view>
    <button class="send-btn" :disabled="!message.trim()" @tap="sendMessage">Send</button>
  </view>
</template>

<script>
import VoiceInput from '@/components/voice-input/voice-input.vue';

export default {
  components: {
    VoiceInput
  },
  data() {
    return {
      message: '',
      isVoiceMode: false,
      textFocus: false
    };
  },
  methods: {
    toggleInputMode() {
      this.isVoiceMode = !this.isVoiceMode;
      if (!this.isVoiceMode) {
        this.$nextTick(() => {
          this.textFocus = true;
        });
      }
    },
    onVoiceResult(result) {
      this.message = result;
      this.isVoiceMode = false;
    },
    sendMessage() {
      if (!this.message.trim()) return;
      this.$emit('send', this.message);
      this.message = '';
    },
    onFocus() {
      this.textFocus = true;
    },
    onBlur() {
      this.textFocus = false;
    },
    showEmojiPicker() {
      // Show the emoji picker
    }
  }
};
</script>

<style>
.chat-input-container {
  display: flex;
  align-items: center;
  padding: 20rpx;
  border-top: 1rpx solid #eee;
  background-color: #fff;
}
.chat-tools {
  display: flex;
  margin-right: 20rpx;
}
.chat-tools image {
  width: 60rpx;
  height: 60rpx;
  margin-right: 20rpx;
}
.text-input {
  flex: 1;
  background-color: #f5f5f5;
  border-radius: 10rpx;
  padding: 10rpx 20rpx;
}
.text-input textarea {
  width: 100%;
  min-height: 60rpx;
  max-height: 240rpx;
}
.voice-input {
  flex: 1;
  display: flex;
  justify-content: center;
}
.send-btn {
  width: 140rpx;
  height: 80rpx;
  line-height: 80rpx;
  font-size: 28rpx;
  margin-left: 20rpx;
  padding: 0;
  background-color: #1890ff;
  color: #fff;
}
.send-btn[disabled] {
  background-color: #ccc;
}
</style>

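Performance optimization and caveats

In real-world development I ran into several issues that deserve special attention:

1. Permission handling

Speech recognition needs microphone permission, and each platform handles permissions differently:

// Request recording permission in a uniform way
requestAudioPermission() {
  return new Promise((resolve, reject) => {
    // #ifdef APP-PLUS
    const permissions = ['android.permission.RECORD_AUDIO'];
    plus.android.requestPermissions(
      permissions,
      (e) => {
        if (e.granted.length === permissions.length) {
          resolve();
        } else {
          reject(new Error('Recording permission was not granted'));
        }
      },
      (e) => reject(e)
    );
    // #endif
    // #ifdef MP-WEIXIN || MP-BAIDU
    uni.authorize({
      scope: 'scope.record',
      success: () => resolve(),
      fail: (err) => reject(err)
    });
    // #endif
    // #ifdef H5
    if (navigator.mediaDevices && navigator.mediaDevices.getUserMedia) {
      navigator.mediaDevices.getUserMedia({ audio: true })
        .then((stream) => {
          // This call only checks permission, so release the microphone right away
          stream.getTracks().forEach((track) => track.stop());
          resolve();
        })
        .catch((err) => reject(err));
    } else {
      reject(new Error('This browser does not support recording'));
    }
    // #endif
  });
}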

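2. Data usage control

Speech recognition uploads audio data, which consumes mobile data when the user is on a cellular network:

// Check the network type and warn the user
checkNetwork() {
  uni.getNetworkType({
    success: (res) => {
      // Treat every cellular type as a mobile network, not only 2g and 3g
      const cellularTypes = ['2g', '3g', '4g', '5g'];
      if (cellularTypes.includes(res.networkType)) {
        uni.showModal({
          title: 'Data usage warning',
          content: 'You are on a mobile network. Speech recognition may use a noticeable amount of data. Continue?',
          success: (modalRes) => {
            if (modalRes.confirm) {
              this.startSpeechRecognition();
            }
          }
        });
      } else {
        this.startSpeechRecognition();
      }
    }
  });
}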
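
To put a number on the data cost, the upload size can be estimated from the encoder bitrate and the recording length. Here is a rough sketch using the `encodeBitRate: 48000` and the 60-second limit from the recorder settings earlier in this post. `estimateAudioBytes` is a hypothetical helper, and real files will differ slightly because of container overhead.

```javascript
// Estimate the size of an encoded recording.
// bitrateBps: encoder bitrate in bits per second; seconds: recording length.
function estimateAudioBytes(bitrateBps, seconds) {
  return Math.ceil((bitrateBps * seconds) / 8); // 8 bits per byte
}

// Worst case for the settings used in this post: 48 kbps for 60 seconds
const maxRecordingBytes = estimateAudioBytes(48000, 60); // 360000 bytes
console.log(`${(maxRecordingBytes / 1024).toFixed(1)} KiB`); // prints "351.6 KiB"
```

So a full-length recording is roughly a third of a megabyte per request, which is a useful figure to show in the warning dialog.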

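3. Performance optimization

Long recognition sessions increase memory and battery consumption, so they need to be kept in check:

// Enforce a maximum recording time and stop automatically
setupMaxDuration() {
  if (this.timer) {
    clearTimeout(this.timer);
  }
  this.timer = setTimeout(() => {
    if (this.isRecording) {
      uni.showToast({ title: 'Recording too long; stopped automatically', icon: 'none' });
      this.stopRecord();
    }
  }, this.maxDuration * 1000);
},

// Stop automatically when idle
setupVAD() {
  // Watch for silence: if the user stops speaking for 3 seconds, end the recording
  let lastAudioLevel = 0;
  let silenceCounter = 0;
  this.vadTimer = setInterval(() => {
    // Current volume; getAudioLevel() is assumed to be provided by your recorder
    const currentLevel = this.getAudioLevel();
    if (Math.abs(currentLevel - lastAudioLevel) < 0.05) {
      silenceCounter++;
      if (silenceCounter >= 30) { // 3 seconds (30 * 100 ms)
        clearInterval(this.vadTimer); // stop polling so stopRecord() runs only once
        this.stopRecord();
      }
    } else {
      silenceCounter = 0;
    }
    lastAudioLevel = currentLevel;
  }, 100);
}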
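
The polling loop above treats an unchanging volume level as silence. A common alternative is to compare each sample against a fixed threshold instead. The sketch below shows that threshold variant, which is a different technique from the loop above. `createSilenceDetector`, its option names, and the 0-to-1 level scale are assumptions, and feeding it real levels still depends on what your recorder exposes.

```javascript
// Hypothetical threshold-based silence detector.
// threshold: levels below this count as quiet (assumes a 0..1 scale)
// silenceMs: how long the quiet must last; intervalMs: time between samples
function createSilenceDetector({ threshold = 0.05, silenceMs = 3000, intervalMs = 100 } = {}) {
  const samplesNeeded = Math.ceil(silenceMs / intervalMs);
  let quietSamples = 0;

  return {
    // Feed one level sample; returns true once silence has lasted silenceMs.
    push(level) {
      quietSamples = level < threshold ? quietSamples + 1 : 0;
      return quietSamples >= samplesNeeded;
    },
    // Clear the counter, for example when a new recording starts.
    reset() {
      quietSamples = 0;
    }
  };
}
```

Inside the interval callback you would call `detector.push(level)` with the current level and stop the recording when it returns true. Keeping the detector free of timers makes it easy to test in isolation.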

Going further: text-to-speech (TTS)

Besides speech recognition, speech synthesis (text-to-speech) is also a useful feature. It converts text into spoken audio:

// Text-to-speech
textToSpeech(text, options = {}) {
  const { lang = 'zh', speed = 5, volume = 5 } = options;
  // #ifdef APP-PLUS
  const speechPlugin = uni.requireNativePlugin('speech-baidu');
  return new Promise((resolve, reject) => {
    speechPlugin.textToSpeech({
      text,
      language: lang === 'zh' ? 'zh-cn' : 'en-us',
      speed,
      volume
    }, (res) => {
      if (res.errorCode === 0) {
        resolve(res);
      } else {
        reject(new Error(`Speech synthesis failed: ${res.errorCode}`));
      }
    });
  });
  // #endif
  // #ifdef H5
  return new Promise((resolve, reject) => {
    if ('speechSynthesis' in window) {
      const speech = new SpeechSynthesisUtterance();
      speech.text = text;
      speech.lang = lang === 'zh' ? 'zh-CN' : 'en-US';
      // Web Speech uses 1 as the normal rate, so map the default speed of 5 to 1
      speech.rate = Math.max(0.1, speed / 5);
      // Web Speech volume ranges from 0 to 1
      speech.volume = Math.min(1, volume / 10);
      speech.onend = () => resolve();
      speech.onerror = (err) => reject(err);
      window.speechSynthesis.speak(speech);
    } else {
      reject(new Error('This browser does not support speech synthesis'));
    }
  });
  // #endif
}

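Pitfalls and solutions

Here are some common problems I ran into during development, along with their fixes:

Baidu speech plugin fails to initialize: check the API key configuration and the network environment, especially HTTPS restrictions
Recording does not work on H5: most browsers only allow microphone access over HTTPS
Inaccurate recognition results: try adjusting recording parameters such as sample rate and channel count, or use a more specialized noise suppression algorithm
WeChat Mini Program calls fail: check that the access_token is valid, and keep its expiry time in mind
Large differences in experience across devices: optimize for low-end devices, for example by reducing animations or lowering the sample rate

Our solution is to run compatibility checks and adjust parameters automatically according to device performance:

// Detect device performance and adjust parameters
detectDevicePerformance() {
  // Read the system info once instead of calling getSystemInfoSync() three times
  const { platform, brand = '', model = '' } = uni.getSystemInfoSync();
  // Optimize for low-end Android devices
  if (platform === 'android') {
    // Model-specific tuning
    if (brand.toLowerCase() === 'samsung' && model.includes('SM-J')) {
      return {
        sampleRate: 8000,
        quality: 'low',
        useVAD: false // disable voice activity detection to reduce CPU usage
      };
    }
  }
  // Default configuration
  return {
    sampleRate: 16000,
    quality: 'high',
    useVAD: true
  };
}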
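
As the list of special-cased devices grows, nested if statements get hard to maintain. One option is a rule table that is scanned in order. This is only a sketch of that idea: the names `DEVICE_RULES` and `pickRecordingProfile` are hypothetical, and the single rule simply restates the Samsung SM-J case from the function above.

```javascript
const DEFAULT_PROFILE = { sampleRate: 16000, quality: 'high', useVAD: true };
const LOW_END_PROFILE = { sampleRate: 8000, quality: 'low', useVAD: false };

// Hypothetical rule table: the first rule whose `match` returns true wins.
const DEVICE_RULES = [
  {
    match: (d) => d.platform === 'android' && d.brand === 'samsung' && d.model.includes('SM-J'),
    profile: LOW_END_PROFILE
  }
];

// `device` is shaped like the result of uni.getSystemInfoSync().
function pickRecordingProfile(device = {}, rules = DEVICE_RULES) {
  // Normalize missing fields so that rules can compare strings safely
  const info = {
    platform: String(device.platform || ''),
    brand: String(device.brand || '').toLowerCase(),
    model: String(device.model || '')
  };
  const rule = rules.find((r) => r.match(info));
  // Return a copy so callers cannot mutate the shared profile objects
  return { ...(rule ? rule.profile : DEFAULT_PROFILE) };
}
```

Adding support for another device then means appending one entry to the table, without touching the lookup logic.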