Converting between speech and text locally
Text to speech: ChatTTS
https://github.com/2noise/ChatTTS
- The code below is all that is needed; it writes a WAV audio file containing the synthesized speech
import ChatTTS
import torch
import torchaudio

chat = ChatTTS.Chat()
chat.load(compile=True)  # Set to True for better performance

texts = ["今年上半年,各地因地制宜,加快建设各具特色的产业集群。"]

# Fix the random seed so the voice timbre stays stable between runs
torch.manual_seed(120)

wavs = chat.infer(texts)

for i in range(len(wavs)):
    # Depending on the torchaudio version, either the first call
    # (2-D tensor) or the second call (tensor as returned) works.
    try:
        torchaudio.save(f"basic_output{i}.wav", torch.from_numpy(wavs[i]).unsqueeze(0), 24000, format="wav")
    except Exception:
        torchaudio.save(f"basic_output{i}.wav", torch.from_numpy(wavs[i]), 24000, format="wav")
- soundfile must be installed to handle the audio format
pip install soundfile
RuntimeError: narrow(): length must be non-negative.
- This error is caused by the installed transformers version; running the command below fixes it
pip install transformers==4.53.2
https://github.com/2noise/ChatTTS/issues/955
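To confirm the pin took effect before rerunning ChatTTS, the installed version can be compared with the pinned one. This is a minimal sketch using only the standard library; the helper names `installed_version` and `matches_pin` are made up for illustration, and the pinned value is simply the one from the command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

PINNED_TRANSFORMERS = "4.53.2"  # the version pinned in the pip command above


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def matches_pin(package: str, pinned: str) -> bool:
    """True only if `package` is installed at exactly the pinned version."""
    return installed_version(package) == pinned


if __name__ == "__main__":
    found = installed_version("transformers")
    if found is None:
        print("transformers is not installed")
    elif matches_pin("transformers", PINNED_TRANSFORMERS):
        print(f"transformers {found} matches the pin")
    else:
        print(f"transformers {found} found, expected {PINNED_TRANSFORMERS}")
```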
Speech to text: Whisper
https://github.com/openai/whisper
- Take the WAV file produced above and transcribe it back to text with whisper; the official example code works as is
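The round trip can be sketched roughly as follows, modeled on the Python usage in the whisper README. It assumes whisper was installed with `pip install -U openai-whisper`, that ffmpeg is available on the PATH, and that the ChatTTS step wrote `basic_output0.wav` into the current directory. The `base` model size and the helper name `transcribe_wav` are choices made for this example, not something the notes prescribe.

```python
from pathlib import Path
from typing import Optional


def transcribe_wav(model, wav_path, language: Optional[str] = None) -> str:
    """Transcribe one audio file with a whisper-style model.

    `model` is any object exposing transcribe(path, ...) that returns a
    dict with a "text" key, as whisper models do. language=None lets
    whisper detect the language on its own.
    """
    result = model.transcribe(str(wav_path), language=language)
    return result["text"].strip()


if __name__ == "__main__":
    wav = Path("basic_output0.wav")  # file name used in the ChatTTS step
    if wav.exists():
        import whisper  # imported late so the helper works without it

        model = whisper.load_model("base")  # assumed size; larger models are slower
        print(transcribe_wav(model, wav, language="zh"))
    else:
        print(f"{wav} not found; run the ChatTTS step first")
```

Comparing the printed text with the original input sentence gives a quick sanity check of both steps.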