
Loading a LoRA adapter with vLLM

ops 2025/8/20 15:41:31

Download the Hugging Face model

Install the package

pip install huggingface_hub  -i https://pypi.tuna.tsinghua.edu.cn/simple

Download

from huggingface_hub import snapshot_download
sql_lora_path = snapshot_download(repo_id="Djs07/qwen2.5-1.5b-lora")

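The downloaded files are stored under the ~/.cache/huggingface/hub/ directory, and snapshot_download returns the path of the downloaded snapshot.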
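The location of the adapter inside that cache can be sketched as below. This is a minimal illustration that assumes the usual hub cache naming (models--<owner>--<name>, with one subdirectory per revision under snapshots/); the helper name cache_repo_dir is hypothetical and is not part of huggingface_hub.

```python
from pathlib import Path


def cache_repo_dir(repo_id: str, cache_root: str = "~/.cache/huggingface/hub") -> Path:
    """Return the directory where the hub cache is expected to keep a model repo."""
    # Assumption: the cache replaces "/" in the repo id with "--"
    # and prefixes the folder name with "models--".
    folder = "models--" + repo_id.replace("/", "--")
    return Path(cache_root).expanduser() / folder


if __name__ == "__main__":
    repo_dir = cache_repo_dir("Djs07/qwen2.5-1.5b-lora")
    print(repo_dir)
    # Each downloaded revision sits in its own subdirectory of snapshots/.
    print(repo_dir / "snapshots")
```

If the cache has been relocated (for example through the HF_HOME environment variable), pass the actual cache root as the second argument, or simply use the path that snapshot_download returned.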

Start the service

Copy the LoRA model directory into the current directory first, then run:

vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B \
  --enable-lora \
  --lora-modules Qwen-Lora=models--Djs07--qwen2.5-1.5b-lora/snapshots/8d7d20b1cbb95e7de29abe404e900c106fa8c8cb/
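The same launch command can be assembled from Python, which helps when the snapshot path comes from a variable rather than being typed by hand. This is a sketch only: build_serve_command is a hypothetical helper, it uses just the flags shown in the command above, and it assumes that --lora-modules accepts one or more name=path pairs.

```python
import shlex


def build_serve_command(base_model: str, lora_modules: dict[str, str]) -> list[str]:
    """Build the argument list for `vllm serve` with LoRA adapters enabled."""
    cmd = ["vllm", "serve", base_model, "--enable-lora", "--lora-modules"]
    # Each adapter is passed as name=path; the name is what clients
    # later send in the "model" field of a request.
    cmd += [f"{name}={path}" for name, path in lora_modules.items()]
    return cmd


if __name__ == "__main__":
    cmd = build_serve_command(
        "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
        {"Qwen-Lora": "models--Djs07--qwen2.5-1.5b-lora/snapshots/8d7d20b1cbb95e7de29abe404e900c106fa8c8cb/"},
    )
    # Print a shell-safe version of the command instead of running it.
    print(shlex.join(cmd))
```

The resulting list can be handed to subprocess.run(cmd) to start the server from a script.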

Test

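Set the "model" field to the adapter name configured above (Qwen-Lora). Adjust the host and port to match your server; vllm serve listens on port 8000 unless --port is given.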

curl http://172.17.0.3:10000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen-Lora",
    "prompt": "San Francisco is a",
    "max_tokens": 7,
    "temperature": 0
  }'
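The same request can be issued from Python using only the standard library. This is a rough equivalent of the curl call above, not an official client: the function names build_completion_request and complete are hypothetical, and the response parsing assumes the OpenAI-style completions format with a "choices" list.

```python
import json
import urllib.request


def build_completion_request(base_url, model, prompt, max_tokens=7, temperature=0):
    """Build a POST request for the /v1/completions endpoint."""
    payload = {
        "model": model,  # the LoRA name given to --lora-modules
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(base_url, model, prompt):
    """Send the request and return the generated text (assumed response shape)."""
    req = build_completion_request(base_url, model, prompt)
    with urllib.request.urlopen(req, timeout=30) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]
```

With the server running, complete("http://172.17.0.3:10000", "Qwen-Lora", "San Francisco is a") sends the same payload as the curl command.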
