
Super Simple Translation Model Deployment

The Helsinki-NLP/opus-mt-{en}-{zh} family of translation models covers translation for more than 200 languages; Helsinki-NLP/opus-mt-en-zh is the English-Chinese model in that family. A project of mine needed it, so I set it up locally and am recording the process here to make things easier for whoever comes next.

1. Basic hardware environment

  • CPU: an Intel(R) Core(TM) i5-3470 CPU @ 3.20GHz from N years ago, 32 GB RAM
  • GPU: an NVIDIA GeForce GTX 1080 Ti from N years ago, 11 GB VRAM

2. Basic software environment

  • OS: Ubuntu 20.04 LTS. I deliberately downgraded to 20.04 to match the aging hardware; newer releases ran into all kinds of software compatibility problems. Once there's money, everything gets replaced!!!
  • CUDA: cuda_12.0.0_525.60.13_linux.run. The card could support 12.2 or even 12.4, but to be safe I went with 12.0.
  • cuDNN: libcudnn8_8.8.0.121-1+cuda12.0_amd64.deb, matching the CUDA version
  • NCCL: libnccl2_2.19.3-1+cuda12.0_amd64.deb, matching the CUDA version; only needed for multi-GPU setups
  • miniconda: Miniconda3-py312_24.9.2-0-Linux-x86_64.sh
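
Before going any further, it can help to confirm that the driver and CUDA toolkit listed above are actually visible. A minimal check from Python, assuming nvidia-smi and nvcc are already on the PATH (this helper is not part of the original write-up):

# check_gpu_env.py -- sanity-check the NVIDIA driver and CUDA toolkit install
import subprocess

# nvidia-smi reports the driver version and the GPUs it can see (the GTX 1080 Ti here)
subprocess.run(["nvidia-smi"], check=True)

# nvcc --version reports the installed CUDA toolkit version (expected: 12.0)
subprocess.run(["nvcc", "--version"], check=True)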

3. Clone the fish-speech code and install local dependency packages

git clone https://gitclone.com/github.com/fishaudio/fish-speech.git
sudo apt-get install ffmpeg libsm6 libxext6 portaudio19-dev -y

4. Create a virtual environment

conda create -n huggingface python==3.10 -y
conda activate huggingface

5. Install the base packages with conda

conda install -c pytorch -c nvidia -c conda-forge pytorch torchvision pytorch-cuda=11.8
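
After this step it is worth checking that PyTorch can actually see the GPU. A minimal check, assuming the huggingface environment is active (not part of the original write-up):

import torch

print(torch.__version__)            # installed PyTorch version
print(torch.version.cuda)           # CUDA runtime bundled with this build (11.8 here)
print(torch.cuda.is_available())    # True if the driver and GPU are usable
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g. "NVIDIA GeForce GTX 1080 Ti"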

6. Install the Hugging Face components (the transformers package)

pip install transformers -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install -U huggingface_hub -i https://pypi.tuna.tsinghua.edu.cn/simple

Set an environment variable to speed up downloads:

export HF_ENDPOINT=https://hf-mirror.com
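
If you prefer not to touch the shell environment, the same mirror can be selected from inside Python, as long as the variable is set before any Hugging Face code is imported. A small sketch (the snapshot_download call to pre-fetch the model is optional and not part of the original write-up):

import os

# Must run before transformers/huggingface_hub are imported so that
# downloads are routed through the hf-mirror.com mirror.
os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"

from huggingface_hub import snapshot_download

# Optionally pre-download the model files into the local cache.
snapshot_download("Helsinki-NLP/opus-mt-en-zh")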

7. Run it as a Python script

# Load model directly
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-zh")
model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-en-zh")

def translate(text):
    inputs = tokenizer(text, return_tensors="pt", padding=True)
    translated = model.generate(**inputs)
    return [tokenizer.decode(t, skip_special_tokens=True) for t in translated]

print(tokenizer.supported_language_codes)

text = ">>cmn_Hans<< Due to a bug fix in https://github.com/huggingface/transformers/pull/28687 transcription using a multilingual Whisper will default to language detection followed by transcription instead of translation to English.This might be a breaking change for your use case. If you want to instead always translate your audio to English, make sure to pass `language='en'`. The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results."
translated_text = translate(text)
print(translated_text)

The first run will fail because two dependency packages are missing; installing them fixes it:

pip install sentencepiece sacremoses -i https://pypi.tuna.tsinghua.edu.cn/simple
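
The script above runs inference on the CPU. Since the machine has a GTX 1080 Ti, a natural variation is to move the model and the tokenized inputs onto the GPU. A minimal sketch of that change, reusing the same model and tokenizer as above (the GPU variant is not part of the original write-up):

import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-zh")
model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-en-zh").to(device)

def translate(text):
    # Move the tokenized inputs to the same device as the model.
    inputs = tokenizer(text, return_tensors="pt", padding=True).to(device)
    with torch.no_grad():
        translated = model.generate(**inputs)
    return [tokenizer.decode(t, skip_special_tokens=True) for t in translated]

print(translate("Hello, world!"))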

8. Run it as a FastAPI service

# Install the fastapi and uvicorn packages
pip install fastapi uvicorn -i https://pypi.tuna.tsinghua.edu.cn/simple

The service script (saved as fastapi_app.py, to match the uvicorn command below) is as follows:

# Load model directly
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

app = FastAPI()

tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-zh")
model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-en-zh")

def translate(text):
    inputs = tokenizer(text, return_tensors="pt", padding=True)
    translated = model.generate(**inputs)
    return [tokenizer.decode(t, skip_special_tokens=True) for t in translated]

class TextRequest(BaseModel):
    text: str

@app.post("/predict")
async def predict(request: TextRequest):
    # preprocess and predict
    translated_text = translate(request.text)
    # return the result
    return {"text": request.text, "predictions": translated_text}

Run the service:

uvicorn fastapi_app:app --host 0.0.0.0 --port 8000
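
Once the service is running, the /predict endpoint defined above can be exercised from any HTTP client. A small test sketch using the requests library (an extra dependency, installable with pip install requests; not part of the original write-up):

import requests

resp = requests.post(
    "http://localhost:8000/predict",
    json={"text": ">>cmn_Hans<< The weather is nice today."},
)
print(resp.json())  # {"text": "...", "predictions": ["..."]}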
