
Shanghai AI Laboratory Open-Sources Intern-S1-mini, a Lightweight Multimodal Reasoning Model Built on the Same Technology as Intern-S1


Introduction

We introduce Intern-S1-mini, a lightweight open-source multimodal reasoning model built on the same technology as Intern-S1. The model pairs an 8B-parameter dense language model (Qwen3) with a 0.3B-parameter vision encoder (InternViT), and is continually pre-trained on 5 trillion tokens of multimodal data, including more than 2.5 trillion tokens from scientific domains. As a result, it retains strong general capabilities while excelling at specialized scientific tasks such as interpreting chemical structures, understanding protein sequences, and planning compound synthesis routes, making it a capable research assistant for real-world scientific applications.

Features

  • Strong performance on language and vision reasoning benchmarks, especially scientific tasks.

  • Continually pre-trained on a 5-trillion-token dataset, more than 50% of which is specialized scientific data, giving the model deep domain knowledge.

  • A dynamic tokenizer enables native parsing of molecular formulas and protein sequences (see the sketch after this list).
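
As a rough illustration of the last point, the snippet below is a hedged sketch (not from the original article; the aspirin SMILES string is just an arbitrary example input) that inspects how the model's tokenizer segments a molecular formula:

from transformers import AutoTokenizer

# Hedged sketch: see how the Intern-S1-mini tokenizer segments a SMILES string.
# The aspirin SMILES below is an arbitrary example input.
tokenizer = AutoTokenizer.from_pretrained("internlm/Intern-S1-mini", trust_remote_code=True)
smiles = "CC(=O)OC1=CC=CC=C1C(=O)O"  # aspirin
print(tokenizer.tokenize(smiles))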

Performance

We evaluated Intern-S1-mini on a wide range of benchmarks, covering both general and scientific datasets. The table below compares its performance with recent vision-language models and large language models.

| Category | Benchmark | Intern-S1-mini | Qwen3-8B | GLM-4.1V | MiMo-VL-7B-RL-2508 |
| --- | --- | --- | --- | --- | --- |
| General | MMLU-Pro | 74.78 | 73.7 | 57.1 | 73.93 |
| General | MMMU | 72.33 | N/A | 69.9 | 70.4 |
| General | MMStar | 65.2 | N/A | 71.5 | 72.9 |
| General | GPQA | 65.15 | 62 | 50.32 | 60.35 |
| General | AIME2024 | 84.58 | 76 | 36.2 | 72.6 |
| General | AIME2025 | 80 | 67.3 | 32 | 64.4 |
| General | MathVision | 51.41 | N/A | 53.9 | 54.5 |
| General | MathVista | 70.3 | N/A | 80.7 | 79.4 |
| General | IFEval | 81.15 | 85 | 71.53 | 71.4 |
| Scientific | SFE | 35.84 | N/A | 43.2 | 43.9 |
| Scientific | Physics | 28.76 | N/A | 28.3 | 28.2 |
| Scientific | SmolInstruct | 32.2 | 17.6 | 18.1 | 16.11 |
| Scientific | ChemBench | 76.47 | 61.1 | 56.2 | 66.78 |
| Scientific | MatBench | 61.55 | 45.24 | 54.3 | 46.9 |
| Scientific | MicroVQA | 56.62 | N/A | 50.2 | 50.96 |
| Scientific | ProteinLMBench | 58.47 | 59.1 | 58.3 | 59.8 |
| Scientific | MSEarthMCQ | 58.12 | N/A | 50.3 | 47.3 |
| Scientific | XLRS-Bench | 51.63 | N/A | 49.8 | 12.29 |

We use OpenCompass and VLMEvalKit to evaluate all models.

Quick Start

Sampling Parameters

We recommend the following hyperparameters for better results:

top_p = 1.0
top_k = 50
min_p = 0.0
temperature = 0.8
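
As a minimal sketch, these values map directly onto Transformers generation arguments; model and inputs here are assumed to be prepared as in the Transformers examples below:

# Minimal sketch: apply the recommended sampling hyperparameters.
# `model` and `inputs` are assumed to come from the Transformers examples below.
generate_ids = model.generate(
    **inputs,
    max_new_tokens=32768,
    do_sample=True,   # sampling must be enabled for top_p/top_k/temperature to take effect
    top_p=1.0,
    top_k=50,
    min_p=0.0,
    temperature=0.8,
)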

Transformers

The demo code below shows how to generate content from text and multimodal inputs.

Please use transformers>=4.55.2 to ensure the model works correctly.

Text input
from transformers import AutoProcessor, AutoModelForCausalLM
import torch

model_name = "internlm/Intern-S1-mini"
processor = AutoProcessor.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto", trust_remote_code=True)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "tell me about an interesting physical phenomenon."},
        ],
    }
]

inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device, dtype=torch.bfloat16)

generate_ids = model.generate(**inputs, max_new_tokens=32768)
decoded_output = processor.decode(generate_ids[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(decoded_output)
Image input
from transformers import AutoProcessor, AutoModelForCausalLM
import torch

model_name = "internlm/Intern-S1-mini"
processor = AutoProcessor.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto", trust_remote_code=True)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "http://images.cocodataset.org/val2017/000000039769.jpg"},
            {"type": "text", "text": "Please describe the image explicitly."},
        ],
    }
]

inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device, dtype=torch.bfloat16)

generate_ids = model.generate(**inputs, max_new_tokens=32768)
decoded_output = processor.decode(generate_ids[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(decoded_output)
Video input

Make sure the decord video decoding library is installed via pip install decord. To avoid running out of memory, install flash_attention and use at least 2 GPUs. (A loading sketch with FlashAttention-2 follows the example below.)

from transformers import AutoProcessor, AutoModelForCausalLM
import torch

model_name = "internlm/Intern-S1-mini"
processor = AutoProcessor.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto", trust_remote_code=True)

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "video",
                "url": "https://huggingface.co/datasets/hf-internal-testing/fixtures_videos/resolve/main/tennis.mp4",
            },
            {"type": "text", "text": "What type of shot is the man performing?"},
        ],
    }
]

inputs = processor.apply_chat_template(
    messages,
    return_tensors="pt",
    add_generation_prompt=True,
    video_load_backend="decord",
    tokenize=True,
    return_dict=True,
).to(model.device, dtype=torch.float16)

generate_ids = model.generate(**inputs, max_new_tokens=32768)
decoded_output = processor.decode(generate_ids[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(decoded_output)
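
One way to follow the flash_attention advice above is sketched below; attn_implementation="flash_attention_2" is the standard Transformers switch, and this assumes the flash-attn package is installed (pip install flash-attn --no-build-isolation) and that the model's remote code honors the flag:

# Hedged sketch: load the model with FlashAttention-2 to reduce memory use.
# Assumes flash-attn is installed and supported by this model.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    torch_dtype="auto",
    trust_remote_code=True,
    attn_implementation="flash_attention_2",
)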

Serving

The minimum hardware requirements for deploying the Intern-S1 series models are:

| Model | A100 (GPUs) | H800 (GPUs) | H100 (GPUs) | H200 (GPUs) |
| --- | --- | --- | --- | --- |
| internlm/Intern-S1-mini | 1 | 1 | 1 | 1 |
| internlm/Intern-S1-mini-FP8 | - | 1 | 1 | 1 |

You can create an OpenAI-compatible server using one of the following LLM inference frameworks:

lmdeploy(>=0.9.2)
lmdeploy serve api_server internlm/Intern-S1-mini --reasoning-parser intern-s1 --tool-call-parser intern-s1
vllm
vllm serve internlm/Intern-S1-mini --trust-remote-code
sglang
python3 -m sglang.launch_server \
  --model-path internlm/Intern-S1-mini \
  --trust-remote-code \
  --grammar-backend none
ollama for local deployment:
# install ollama
curl -fsSL https://ollama.com/install.sh | sh
# fetch model
ollama pull internlm/interns1-mini
# run model
ollama run internlm/interns1-mini
# then use openai client to call on http://localhost:11434/v1
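
For example, here is a minimal sketch of calling the local ollama endpoint mentioned in the comment above (the api_key value is a placeholder that ollama does not check, and the model name matches the tag pulled above):

from openai import OpenAI

# Minimal sketch: query the local ollama server via its OpenAI-compatible API.
client = OpenAI(api_key="ollama", base_url="http://localhost:11434/v1")
response = client.chat.completions.create(
    model="internlm/interns1-mini",
    messages=[{"role": "user", "content": "Explain superconductivity in one sentence."}],
)
print(response.choices[0].message.content)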

Advanced Usage

Tool Calling

Tool calling is a powerful capability supported by many modern large language models (LLMs), letting them extend their abilities by invoking external tools or APIs. With it, a model can fetch real-time information, run code, or call functions in other applications.

A notable advantage for developers is that a growing number of open-source LLMs are compatible with the OpenAI API standard, so you can implement tool calling on these models with the same syntax used for the OpenAI library. The code in this tutorial is therefore general-purpose: it works not only with OpenAI models but with any model that follows the same interface standard.

Below, a concrete code example (based on the lmdeploy api server) demonstrates how to use tool calling to fetch the latest weather forecast.

from openai import OpenAI
import json


def get_current_temperature(location: str, unit: str = "celsius"):
    """Get current temperature at a location.

    Args:
        location: The location to get the temperature for, in the format "City, State, Country".
        unit: The unit to return the temperature in. Defaults to "celsius". (choices: ["celsius", "fahrenheit"])

    Returns:
        the temperature, the location, and the unit in a dict
    """
    return {
        "temperature": 26.1,
        "location": location,
        "unit": unit,
    }


def get_temperature_date(location: str, date: str, unit: str = "celsius"):
    """Get temperature at a location and date.

    Args:
        location: The location to get the temperature for, in the format "City, State, Country".
        date: The date to get the temperature for, in the format "Year-Month-Day".
        unit: The unit to return the temperature in. Defaults to "celsius". (choices: ["celsius", "fahrenheit"])

    Returns:
        the temperature, the location, the date and the unit in a dict
    """
    return {
        "temperature": 25.9,
        "location": location,
        "date": date,
        "unit": unit,
    }


def get_function_by_name(name):
    if name == "get_current_temperature":
        return get_current_temperature
    if name == "get_temperature_date":
        return get_temperature_date


tools = [{
    'type': 'function',
    'function': {
        'name': 'get_current_temperature',
        'description': 'Get current temperature at a location.',
        'parameters': {
            'type': 'object',
            'properties': {
                'location': {
                    'type': 'string',
                    'description': "The location to get the temperature for, in the format 'City, State, Country'."
                },
                'unit': {
                    'type': 'string',
                    'enum': ['celsius', 'fahrenheit'],
                    'description': "The unit to return the temperature in. Defaults to 'celsius'."
                }
            },
            'required': ['location']
        }
    }
}, {
    'type': 'function',
    'function': {
        'name': 'get_temperature_date',
        'description': 'Get temperature at a location and date.',
        'parameters': {
            'type': 'object',
            'properties': {
                'location': {
                    'type': 'string',
                    'description': "The location to get the temperature for, in the format 'City, State, Country'."
                },
                'date': {
                    'type': 'string',
                    'description': "The date to get the temperature for, in the format 'Year-Month-Day'."
                },
                'unit': {
                    'type': 'string',
                    'enum': ['celsius', 'fahrenheit'],
                    'description': "The unit to return the temperature in. Defaults to 'celsius'."
                }
            },
            'required': ['location', 'date']
        }
    }
}]

messages = [
    {'role': 'user', 'content': "Today is 2024-11-14, What's the temperature in San Francisco now? How about tomorrow?"}
]

openai_api_key = "EMPTY"
openai_api_base = "http://0.0.0.0:23333/v1"
client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)
model_name = client.models.list().data[0].id

response = client.chat.completions.create(
    model=model_name,
    messages=messages,
    max_tokens=32768,
    temperature=0.8,
    top_p=0.8,
    stream=False,
    extra_body=dict(spaces_between_special_tokens=False, enable_thinking=False),
    tools=tools,
)
print(response.choices[0].message)
messages.append(response.choices[0].message)

# Execute each requested tool call locally and feed the results back to the model.
for tool_call in response.choices[0].message.tool_calls:
    tool_call_args = json.loads(tool_call.function.arguments)
    tool_call_result = get_function_by_name(tool_call.function.name)(**tool_call_args)
    tool_call_result = json.dumps(tool_call_result, ensure_ascii=False)
    messages.append({
        'role': 'tool',
        'name': tool_call.function.name,
        'content': tool_call_result,
        'tool_call_id': tool_call.id,
    })

response = client.chat.completions.create(
    model=model_name,
    messages=messages,
    temperature=0.8,
    top_p=0.8,
    stream=False,
    extra_body=dict(spaces_between_special_tokens=False, enable_thinking=False),
    tools=tools,
)
print(response.choices[0].message.content)

Switching Between Thinking and Non-Thinking Modes

Intern-S1-mini enables thinking mode by default to strengthen its reasoning and produce higher-quality responses. To disable it, set enable_thinking=False in tokenizer.apply_chat_template.

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False  # disable thinking mode
)

When serving Intern-S1-mini with LMDeploy, you can control thinking mode dynamically by adjusting the enable_thinking parameter in each request.

from openai import OpenAI
import json

messages = [
    {'role': 'user', 'content': 'who are you'},
    {'role': 'assistant', 'content': 'I am an AI'},
    {'role': 'user', 'content': 'AGI is?'},
]

openai_api_key = "EMPTY"
openai_api_base = "http://0.0.0.0:23333/v1"
client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)
model_name = client.models.list().data[0].id

response = client.chat.completions.create(
    model=model_name,
    messages=messages,
    temperature=0.8,
    top_p=0.8,
    max_tokens=2048,
    extra_body={
        "enable_thinking": False,
    },
)
print(json.dumps(response.model_dump(), indent=2, ensure_ascii=False))

For vllm and sglang users, configure it as follows:

extra_body={
    "chat_template_kwargs": {"enable_thinking": False}
}
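
Put together, a complete request might look like the following sketch, assuming a vLLM server on its default port 8000 (adjust base_url for your deployment):

from openai import OpenAI

# Hedged sketch: a full request with thinking mode disabled against a
# vLLM/SGLang OpenAI-compatible server; port 8000 is the vLLM default.
client = OpenAI(api_key="EMPTY", base_url="http://localhost:8000/v1")
response = client.chat.completions.create(
    model="internlm/Intern-S1-mini",
    messages=[{"role": "user", "content": "hello"}],
    extra_body={"chat_template_kwargs": {"enable_thinking": False}},
)
print(response.choices[0].message.content)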
