
Shanghai AI Laboratory Open-Sources Intern-S1-mini, a Lightweight Multimodal Reasoning Model Built on the Same Technology as Intern-S1


Introduction

We introduce Intern-S1-mini, a lightweight open-source multimodal reasoning model built on the same technology as Intern-S1. The model pairs an 8B-parameter dense language model (Qwen3) with a 0.3B-parameter vision encoder (InternViT), and is continually pre-trained on 5 trillion tokens of multimodal data, including more than 2.5 trillion tokens from scientific domains. As a result, it retains strong general capabilities while excelling at specialized scientific tasks such as interpreting chemical structures, understanding protein sequences, and planning compound synthesis routes, making it a capable research assistant for real-world scientific applications.

Features

  • Strong performance on language and vision reasoning benchmarks, especially scientific tasks.

  • Continually pre-trained on a 5-trillion-token dataset, more than 50% of which is specialized scientific data, giving the model deep domain knowledge.

  • A dynamic tokenizer enables native parsing of molecular formulas and protein sequences (see the sketch after this list).
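
As a rough illustration of the last point, the snippet below is a hedged sketch (not from the original article; the aspirin SMILES string is just an arbitrary example input) that inspects how the model's tokenizer segments a molecular formula:

from transformers import AutoTokenizer

# Hedged sketch: see how the Intern-S1-mini tokenizer segments a SMILES string.
# The aspirin SMILES below is an arbitrary example input.
tokenizer = AutoTokenizer.from_pretrained("internlm/Intern-S1-mini", trust_remote_code=True)
smiles = "CC(=O)OC1=CC=CC=C1C(=O)O"  # aspirin
print(tokenizer.tokenize(smiles))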

Performance

We evaluated Intern-S1-mini on a wide range of benchmarks, covering both general and scientific datasets. The table below compares its performance with recent vision-language models and large language models.

| Category | Benchmark | Intern-S1-mini | Qwen3-8B | GLM-4.1V | MiMo-VL-7B-RL-2508 |
| --- | --- | --- | --- | --- | --- |
| General | MMLU-Pro | 74.78 | 73.7 | 57.1 | 73.93 |
| General | MMMU | 72.33 | N/A | 69.9 | 70.4 |
| General | MMStar | 65.2 | N/A | 71.5 | 72.9 |
| General | GPQA | 65.15 | 62 | 50.32 | 60.35 |
| General | AIME2024 | 84.58 | 76 | 36.2 | 72.6 |
| General | AIME2025 | 80 | 67.3 | 32 | 64.4 |
| General | MathVision | 51.41 | N/A | 53.9 | 54.5 |
| General | MathVista | 70.3 | N/A | 80.7 | 79.4 |
| General | IFEval | 81.15 | 85 | 71.53 | 71.4 |
| Scientific | SFE | 35.84 | N/A | 43.2 | 43.9 |
| Scientific | Physics | 28.76 | N/A | 28.3 | 28.2 |
| Scientific | SmolInstruct | 32.2 | 17.6 | 18.1 | 16.11 |
| Scientific | ChemBench | 76.47 | 61.1 | 56.2 | 66.78 |
| Scientific | MatBench | 61.55 | 45.24 | 54.3 | 46.9 |
| Scientific | MicroVQA | 56.62 | N/A | 50.2 | 50.96 |
| Scientific | ProteinLMBench | 58.47 | 59.1 | 58.3 | 59.8 |
| Scientific | MSEarthMCQ | 58.12 | N/A | 50.3 | 47.3 |
| Scientific | XLRS-Bench | 51.63 | N/A | 49.8 | 12.29 |

We use OpenCompass and VLMEvalKit to evaluate all models.

Quick Start

Sampling Parameters

We recommend the following hyperparameters for better results:

top_p = 1.0
top_k = 50
min_p = 0.0
temperature = 0.8
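
As a minimal sketch, these values map directly onto Transformers generation arguments; model and inputs here are assumed to be prepared as in the Transformers examples below:

# Minimal sketch: apply the recommended sampling hyperparameters.
# `model` and `inputs` are assumed to come from the Transformers examples below.
generate_ids = model.generate(
    **inputs,
    max_new_tokens=32768,
    do_sample=True,   # sampling must be enabled for top_p/top_k/temperature to take effect
    top_p=1.0,
    top_k=50,
    min_p=0.0,
    temperature=0.8,
)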

Transformers

The demo code below shows how to generate content from text and multimodal inputs.

Please use transformers>=4.55.2 to ensure the model works correctly.

Text input
from transformers import AutoProcessor, AutoModelForCausalLM
import torch

model_name = "internlm/Intern-S1-mini"
processor = AutoProcessor.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto", trust_remote_code=True)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "tell me about an interesting physical phenomenon."},
        ],
    }
]

inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device, dtype=torch.bfloat16)

generate_ids = model.generate(**inputs, max_new_tokens=32768)
decoded_output = processor.decode(generate_ids[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(decoded_output)
Image input
from transformers import AutoProcessor, AutoModelForCausalLM
import torch

model_name = "internlm/Intern-S1-mini"
processor = AutoProcessor.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto", trust_remote_code=True)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "http://images.cocodataset.org/val2017/000000039769.jpg"},
            {"type": "text", "text": "Please describe the image explicitly."},
        ],
    }
]

inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device, dtype=torch.bfloat16)

generate_ids = model.generate(**inputs, max_new_tokens=32768)
decoded_output = processor.decode(generate_ids[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(decoded_output)
Video input

Make sure the decord video decoding library is installed via pip install decord. To avoid running out of memory, install flash_attention and use at least 2 GPUs. (A loading sketch with FlashAttention-2 follows the example below.)

from transformers import AutoProcessor, AutoModelForCausalLM
import torch

model_name = "internlm/Intern-S1-mini"
processor = AutoProcessor.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto", trust_remote_code=True)

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "video",
                "url": "https://huggingface.co/datasets/hf-internal-testing/fixtures_videos/resolve/main/tennis.mp4",
            },
            {"type": "text", "text": "What type of shot is the man performing?"},
        ],
    }
]

inputs = processor.apply_chat_template(
    messages,
    return_tensors="pt",
    add_generation_prompt=True,
    video_load_backend="decord",
    tokenize=True,
    return_dict=True,
).to(model.device, dtype=torch.float16)

generate_ids = model.generate(**inputs, max_new_tokens=32768)
decoded_output = processor.decode(generate_ids[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(decoded_output)
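
One way to follow the flash_attention advice above is sketched below; attn_implementation="flash_attention_2" is the standard Transformers switch, and this assumes the flash-attn package is installed (pip install flash-attn --no-build-isolation) and that the model's remote code honors the flag:

# Hedged sketch: load the model with FlashAttention-2 to reduce memory use.
# Assumes flash-attn is installed and supported by this model.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    torch_dtype="auto",
    trust_remote_code=True,
    attn_implementation="flash_attention_2",
)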

Serving

The minimum hardware requirements for deploying the Intern-S1 series models are:

| Model | A100 (GPUs) | H800 (GPUs) | H100 (GPUs) | H200 (GPUs) |
| --- | --- | --- | --- | --- |
| internlm/Intern-S1-mini | 1 | 1 | 1 | 1 |
| internlm/Intern-S1-mini-FP8 | - | 1 | 1 | 1 |

You can create an OpenAI-compatible server using one of the following LLM inference frameworks:

lmdeploy(>=0.9.2)
lmdeploy serve api_server internlm/Intern-S1-mini --reasoning-parser intern-s1 --tool-call-parser intern-s1
vllm
vllm serve internlm/Intern-S1-mini --trust-remote-code
sglang
python3 -m sglang.launch_server \
  --model-path internlm/Intern-S1-mini \
  --trust-remote-code \
  --grammar-backend none
ollama for local deployment:
# install ollama
curl -fsSL https://ollama.com/install.sh | sh
# fetch model
ollama pull internlm/interns1-mini
# run model
ollama run internlm/interns1-mini
# then use openai client to call on http://localhost:11434/v1
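
For example, here is a minimal sketch of calling the local ollama endpoint mentioned in the comment above (the api_key value is a placeholder that ollama does not check, and the model name matches the tag pulled above):

from openai import OpenAI

# Minimal sketch: query the local ollama server via its OpenAI-compatible API.
client = OpenAI(api_key="ollama", base_url="http://localhost:11434/v1")
response = client.chat.completions.create(
    model="internlm/interns1-mini",
    messages=[{"role": "user", "content": "Explain superconductivity in one sentence."}],
)
print(response.choices[0].message.content)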

Advanced Usage

Tool Calling

Tool calling is a powerful capability supported by many modern large language models (LLMs), letting them extend their abilities by invoking external tools or APIs. With it, a model can fetch real-time information, run code, or call functions in other applications.

A notable advantage for developers is that a growing number of open-source LLMs are compatible with the OpenAI API standard, so you can implement tool calling on these models with the same syntax used for the OpenAI library. The code in this tutorial is therefore general-purpose: it works not only with OpenAI models but with any model that follows the same interface standard.

Below, a concrete code example (based on the lmdeploy api server) demonstrates how to use tool calling to fetch the latest weather forecast.

from openai import OpenAI
import json


def get_current_temperature(location: str, unit: str = "celsius"):
    """Get current temperature at a location.

    Args:
        location: The location to get the temperature for, in the format "City, State, Country".
        unit: The unit to return the temperature in. Defaults to "celsius". (choices: ["celsius", "fahrenheit"])

    Returns:
        the temperature, the location, and the unit in a dict
    """
    return {
        "temperature": 26.1,
        "location": location,
        "unit": unit,
    }


def get_temperature_date(location: str, date: str, unit: str = "celsius"):
    """Get temperature at a location and date.

    Args:
        location: The location to get the temperature for, in the format "City, State, Country".
        date: The date to get the temperature for, in the format "Year-Month-Day".
        unit: The unit to return the temperature in. Defaults to "celsius". (choices: ["celsius", "fahrenheit"])

    Returns:
        the temperature, the location, the date and the unit in a dict
    """
    return {
        "temperature": 25.9,
        "location": location,
        "date": date,
        "unit": unit,
    }


def get_function_by_name(name):
    if name == "get_current_temperature":
        return get_current_temperature
    if name == "get_temperature_date":
        return get_temperature_date


tools = [{
    'type': 'function',
    'function': {
        'name': 'get_current_temperature',
        'description': 'Get current temperature at a location.',
        'parameters': {
            'type': 'object',
            'properties': {
                'location': {
                    'type': 'string',
                    'description': "The location to get the temperature for, in the format 'City, State, Country'."
                },
                'unit': {
                    'type': 'string',
                    'enum': ['celsius', 'fahrenheit'],
                    'description': "The unit to return the temperature in. Defaults to 'celsius'."
                }
            },
            'required': ['location']
        }
    }
}, {
    'type': 'function',
    'function': {
        'name': 'get_temperature_date',
        'description': 'Get temperature at a location and date.',
        'parameters': {
            'type': 'object',
            'properties': {
                'location': {
                    'type': 'string',
                    'description': "The location to get the temperature for, in the format 'City, State, Country'."
                },
                'date': {
                    'type': 'string',
                    'description': "The date to get the temperature for, in the format 'Year-Month-Day'."
                },
                'unit': {
                    'type': 'string',
                    'enum': ['celsius', 'fahrenheit'],
                    'description': "The unit to return the temperature in. Defaults to 'celsius'."
                }
            },
            'required': ['location', 'date']
        }
    }
}]

messages = [
    {'role': 'user', 'content': "Today is 2024-11-14, What's the temperature in San Francisco now? How about tomorrow?"}
]

openai_api_key = "EMPTY"
openai_api_base = "http://0.0.0.0:23333/v1"
client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)
model_name = client.models.list().data[0].id

response = client.chat.completions.create(
    model=model_name,
    messages=messages,
    max_tokens=32768,
    temperature=0.8,
    top_p=0.8,
    stream=False,
    extra_body=dict(spaces_between_special_tokens=False, enable_thinking=False),
    tools=tools,
)
print(response.choices[0].message)
messages.append(response.choices[0].message)

# Execute each requested tool call locally and feed the results back to the model.
for tool_call in response.choices[0].message.tool_calls:
    tool_call_args = json.loads(tool_call.function.arguments)
    tool_call_result = get_function_by_name(tool_call.function.name)(**tool_call_args)
    tool_call_result = json.dumps(tool_call_result, ensure_ascii=False)
    messages.append({
        'role': 'tool',
        'name': tool_call.function.name,
        'content': tool_call_result,
        'tool_call_id': tool_call.id,
    })

response = client.chat.completions.create(
    model=model_name,
    messages=messages,
    temperature=0.8,
    top_p=0.8,
    stream=False,
    extra_body=dict(spaces_between_special_tokens=False, enable_thinking=False),
    tools=tools,
)
print(response.choices[0].message.content)

Switching Between Thinking and Non-Thinking Modes

Intern-S1-mini enables thinking mode by default to strengthen its reasoning and produce higher-quality responses. To disable it, set enable_thinking=False in tokenizer.apply_chat_template.

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False  # disable thinking mode
)

When serving Intern-S1-mini with LMDeploy, you can control thinking mode dynamically by adjusting the enable_thinking parameter in each request.

from openai import OpenAI
import json

messages = [
    {'role': 'user', 'content': 'who are you'},
    {'role': 'assistant', 'content': 'I am an AI'},
    {'role': 'user', 'content': 'AGI is?'},
]

openai_api_key = "EMPTY"
openai_api_base = "http://0.0.0.0:23333/v1"
client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)
model_name = client.models.list().data[0].id

response = client.chat.completions.create(
    model=model_name,
    messages=messages,
    temperature=0.8,
    top_p=0.8,
    max_tokens=2048,
    extra_body={
        "enable_thinking": False,
    },
)
print(json.dumps(response.model_dump(), indent=2, ensure_ascii=False))

For vllm and sglang users, configure it as follows:

extra_body={
    "chat_template_kwargs": {"enable_thinking": False}
}
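
Put together, a complete request might look like the following sketch, assuming a vLLM server on its default port 8000 (adjust base_url for your deployment):

from openai import OpenAI

# Hedged sketch: a full request with thinking mode disabled against a
# vLLM/SGLang OpenAI-compatible server; port 8000 is the vLLM default.
client = OpenAI(api_key="EMPTY", base_url="http://localhost:8000/v1")
response = client.chat.completions.create(
    model="internlm/Intern-S1-mini",
    messages=[{"role": "user", "content": "hello"}],
    extra_body={"chat_template_kwargs": {"enable_thinking": False}},
)
print(response.choices[0].message.content)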
