
Manus AI Principles in Depth, Part 3: Tools

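Preface

The previous two articles covered Manus's Prompt (Manus AI Principles in Depth, Part 1: Prompt) and its Modules & Agent Loop (Manus AI Principles in Depth, Part 2: Modules & Agent Loop). This article covers the remaining piece: Tools.

Tools are the concrete execution units through which Manus interacts with the external environment. They cover file, terminal, browser, and network operations, and support multi-language and multi-scenario needs. Let's go straight to the analysis.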

Message Tools

Tools for handling user messages.

message_notify_user

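A tool that sends the user a message that requires no reply.

Used to acknowledge receipt of messages, provide progress updates, report task completion, or explain changes in approach.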

  {"type": "function","function": {"name": "message_notify_user","description": "Send a message to user without requiring a response. Use for acknowledging receipt of messages, providing progress updates, reporting task completion, or explaining changes in approach.","parameters": {"type": "object","properties": {"text": {"type": "string","description": "Message text to display to user"},"attachments": {"anyOf": [{"type": "string"},{"items": {"type": "string"}, "type": "array"}],"description": "(Optional) List of attachments to show to user, can be file paths or URLs"}},"required": ["text"]}}}

message_ask_user

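A tool that asks the user a question and waits for the reply.

Used to request clarification, ask for confirmation, or gather additional information.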

  {"type": "function","function": {"name": "message_ask_user","description": "Ask user a question and wait for response. Use for requesting clarification, asking for confirmation, or gathering additional information.","parameters": {"type": "object","properties": {"text": {"type": "string","description": "Question text to present to user"},"attachments": {"anyOf": [{"type": "string"},{"items": {"type": "string"}, "type": "array"}],"description": "(Optional) List of question-related files or reference materials"},"suggest_user_takeover": {"type": "string","enum": ["none", "browser"],"description": "(Optional) Suggested operation for user takeover"}},"required": ["text"]}}}
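
Both message schemas declare their mandatory arguments in "required" and restrict some values with "enum". As a rough illustration of how a caller could check a tool call against such a schema before dispatching it, here is a minimal sketch. The check_arguments helper and the abridged schema copy are hypothetical and written for this article; they are not part of Manus.

```python
import json

# Abridged copy of the message_ask_user schema shown above.
MESSAGE_ASK_USER = json.loads("""
{"type": "function", "function": {"name": "message_ask_user",
  "parameters": {"type": "object",
    "properties": {
      "text": {"type": "string"},
      "suggest_user_takeover": {"type": "string", "enum": ["none", "browser"]}
    },
    "required": ["text"]}}}
""")


def check_arguments(tool: dict, arguments: dict) -> list:
    """Return a list of problems; an empty list means the call looks valid."""
    params = tool["function"]["parameters"]
    properties = params.get("properties", {})
    problems = []
    # Every name listed under "required" must be present.
    for name in params.get("required", []):
        if name not in arguments:
            problems.append(f"missing required argument: {name}")
    # Supplied arguments must be declared and respect any "enum" restriction.
    for name, value in arguments.items():
        spec = properties.get(name)
        if spec is None:
            problems.append(f"unknown argument: {name}")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"invalid value for {name}: {value!r}")
    return problems
```

This sketch only checks presence and enum membership; a full JSON Schema validator would also check types and the anyOf used by "attachments".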

File Processing Tools

Tools for working with files.

file_read

A tool that reads file content.

Used to check file contents, analyze logs, or read configuration files.

  {"type": "function","function": {"name": "file_read","description": "Read file content. Use for checking file contents, analyzing logs, or reading configuration files.","parameters": {"type": "object","properties": {"file": {"type": "string","description": "Absolute path of the file to read"},"start_line": {"type": "integer","description": "(Optional) Starting line to read from, 0-based"},"end_line": {"type": "integer","description": "(Optional) Ending line number (exclusive)"},"sudo": {"type": "boolean","description": "(Optional) Whether to use sudo privileges"}},"required": ["file"]}}}
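
The schema describes start_line as 0-based and end_line as exclusive, which happens to match Python slice semantics. The following is a minimal local sketch of that line-range behavior, not the actual Manus implementation; the sudo option is left out, and UTF-8 encoding is an assumption.

```python
from typing import Optional


def file_read(file: str, start_line: Optional[int] = None,
              end_line: Optional[int] = None) -> str:
    """Return the lines in [start_line, end_line) of a text file."""
    with open(file, encoding="utf-8") as handle:
        lines = handle.readlines()
    # A 0-based start and an exclusive end map directly onto a slice;
    # None on either side means "from the beginning" / "to the end".
    return "".join(lines[start_line:end_line])
```

For example, file_read(path, 1, 3) returns the second and third lines of the file.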

file_write

A tool that overwrites a file or appends content to it.

Used to create new files, append content, or modify existing files.

  {"type": "function","function": {"name": "file_write","description": "Overwrite or append content to a file. Use for creating new files, appending content, or modifying existing files.","parameters": {"type": "object","properties": {"file": {"type": "string","description": "Absolute path of the file to write to"},"content": {"type": "string","description": "Text content to write"},"append": {"type": "boolean","description": "(Optional) Whether to use append mode"},"leading_newline": {"type": "boolean","description": "(Optional) Whether to add a leading newline"},"trailing_newline": {"type": "boolean","description": "(Optional) Whether to add a trailing newline"},"sudo": {"type": "boolean","description": "(Optional) Whether to use sudo privileges"}},"required": ["file", "content"]}}}
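
To make the three optional flags concrete, here is a small sketch of how append, leading_newline, and trailing_newline could combine. It is an assumption about the semantics based only on the parameter descriptions above; sudo is omitted and UTF-8 is assumed.

```python
def file_write(file: str, content: str, append: bool = False,
               leading_newline: bool = False,
               trailing_newline: bool = False) -> None:
    """Overwrite a file, or append to it, with optional surrounding newlines."""
    text = ("\n" if leading_newline else "") + content + ("\n" if trailing_newline else "")
    mode = "a" if append else "w"
    # newline="" writes the text exactly as built, without platform translation.
    with open(file, mode, encoding="utf-8", newline="") as handle:
        handle.write(text)
```

Under this reading, leading_newline is mostly useful together with append, when the existing file does not end with a newline.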

file_str_replace

A tool that replaces a specified string in a file.

Used to update specific content in a file or fix errors in code.

  {"type": "function","function": {"name": "file_str_replace","description": "Replace specified string in a file. Use for updating specific content in files or fixing errors in code.","parameters": {"type": "object","properties": {"file": {"type": "string","description": "Absolute path of the file to perform replacement on"},"old_str": {"type": "string","description": "Original string to be replaced"},"new_str": {"type": "string","description": "New string to replace with"},"sudo": {"type": "boolean","description": "(Optional) Whether to use sudo privileges"}},"required": ["file", "old_str", "new_str"]}}}
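
The schema does not say whether only the first occurrence or every occurrence of old_str is replaced, nor what happens when old_str is missing. The sketch below assumes "replace all occurrences, fail if none is found"; both choices are assumptions made for illustration, and sudo is omitted.

```python
def file_str_replace(file: str, old_str: str, new_str: str) -> int:
    """Replace every occurrence of old_str in the file; return how many were replaced."""
    if not old_str:
        raise ValueError("old_str must not be empty")
    with open(file, encoding="utf-8") as handle:
        text = handle.read()
    count = text.count(old_str)
    if count == 0:
        # Failing loudly lets the caller notice that the edit did not apply.
        raise ValueError(f"{old_str!r} not found in {file}")
    with open(file, "w", encoding="utf-8") as handle:
        handle.write(text.replace(old_str, new_str))
    return count
```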

file_find_in_content

A tool that searches for matching text within file content.

Used to find specific content or patterns in a file.

  {"type": "function","function": {"name": "file_find_in_content","description": "Search for matching text within file content. Use for finding specific content or patterns in files.","parameters": {"type": "object","properties": {"file": {"type": "string","description": "Absolute path of the file to search within"},"regex": {"type": "string","description": "Regular expression pattern to match"},"sudo": {"type": "boolean","description": "(Optional) Whether to use sudo privileges"}},"required": ["file", "regex"]}}}
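
A regex search over a file can be sketched with Python's re module as follows. The return shape (a list of 0-based line numbers with the matching lines) is an assumption chosen so that results line up with file_read's 0-based start_line; the schema itself does not specify the output format.

```python
import re


def file_find_in_content(file: str, regex: str) -> list:
    """Return (line_number, line) pairs for every line matching the pattern."""
    pattern = re.compile(regex)
    matches = []
    with open(file, encoding="utf-8") as handle:
        for number, line in enumerate(handle):  # 0-based line numbers
            if pattern.search(line):
                matches.append((number, line.rstrip("\n")))
    return matches
```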

file_find_by_name

A tool that finds files by name pattern in a specified directory.

Used to locate files that follow a specific naming pattern.

  {"type": "function","function": {"name": "file_find_by_name","description": "Find files by name pattern in specified directory. Use for locating files with specific naming patterns.","parameters": {"type": "object","properties": {"path": {"type": "string","description": "Absolute path of directory to search"},"glob": {"type": "string","description": "Filename pattern using glob syntax wildcards"}},"required": ["path", "glob"]}}}
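
Glob-based lookup maps naturally onto pathlib. The sketch below is one possible local equivalent, not the Manus implementation; whether the real tool searches subdirectories is not stated in the schema, so here recursion happens only if the pattern itself asks for it (for example "**/*.txt").

```python
from pathlib import Path


def file_find_by_name(path: str, glob: str) -> list:
    """Return the sorted paths under `path` whose names match the glob pattern."""
    return sorted(str(match) for match in Path(path).glob(glob))
```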

Bash/Shell Tools

Tools for command-line and related operations.

shell_exec

A tool that executes commands in a specified shell session.

Used to run code, install packages, or manage files.

  {"type": "function","function": {"name": "shell_exec","description": "Execute commands in a specified shell session. Use for running code, installing packages, or managing files.","parameters": {"type": "object","properties": {"id": {"type": "string","description": "Unique identifier of the target shell session"},"exec_dir": {"type": "string","description": "Working directory for command execution (must use absolute path)"},"command": {"type": "string","description": "Shell command to execute"}},"required": ["id", "exec_dir", "command"]}}}

shell_view

A tool that views the content of a specified shell session.

Used to check command execution results or monitor output.

  {"type": "function","function": {"name": "shell_view","description": "View the content of a specified shell session. Use for checking command execution results or monitoring output.","parameters": {"type": "object","properties": {"id": {"type": "string","description": "Unique identifier of the target shell session"}},"required": ["id"]}}}

shell_wait

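A tool that waits for the running process in a specified shell session to return.

Use it after running commands that need a longer time to complete.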

  {"type": "function","function": {"name": "shell_wait","description": "Wait for the running process in a specified shell session to return. Use after running commands that require longer runtime.","parameters": {"type": "object","properties": {"id": {"type": "string","description": "Unique identifier of the target shell session"},"seconds": {"type": "integer","description": "Wait duration in seconds"}},"required": ["id"]}}}

shell_write_to_process

A tool that writes input to a running process in a specified shell session.

Used to respond to interactive command prompts.

  {"type": "function","function": {"name": "shell_write_to_process","description": "Write input to a running process in a specified shell session. Use for responding to interactive command prompts.","parameters": {"type": "object","properties": {"id": {"type": "string","description": "Unique identifier of the target shell session"},"input": {"type": "string","description": "Input content to write to the process"},"press_enter": {"type": "boolean","description": "Whether to press Enter key after input"}},"required": ["id", "input", "press_enter"]}}}

shell_kill_process

A tool that terminates a running process in a specified shell session.

Used to stop long-running processes or handle frozen commands.

  {"type": "function","function": {"name": "shell_kill_process","description": "Terminate a running process in a specified shell session. Use for stopping long-running processes or handling frozen commands.","parameters": {"type": "object","properties": {"id": {"type": "string","description": "Unique identifier of the target shell session"}},"required": ["id"]}}}
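
All five shell tools share a session id, which suggests that some component keeps per-session state. The sketch below shows one way such a registry could look, using subprocess and a log file per session. It is a guess at the shape of the design and not Manus's code: the ShellSessions class is hypothetical, a POSIX shell is assumed, and the parameters are named session_id and text instead of the schema's id and input to avoid shadowing Python built-ins.

```python
import os
import subprocess
import tempfile
from typing import Optional


class ShellSessions:
    """Tracks one process and one output log per session id (POSIX shell assumed)."""

    def __init__(self) -> None:
        self._procs = {}
        self._logs = {}

    def shell_exec(self, session_id: str, exec_dir: str, command: str) -> None:
        if session_id not in self._logs:
            fd, path = tempfile.mkstemp(prefix="shell-session-", suffix=".log")
            os.close(fd)
            self._logs[session_id] = path
        # The child keeps its own copy of the log descriptor after we close ours.
        with open(self._logs[session_id], "ab") as log:
            self._procs[session_id] = subprocess.Popen(
                command, shell=True, cwd=exec_dir,
                stdin=subprocess.PIPE, stdout=log, stderr=subprocess.STDOUT,
            )

    def shell_view(self, session_id: str) -> str:
        # Output accumulates in the log, so it can be read while the process runs.
        with open(self._logs[session_id], encoding="utf-8", errors="replace") as log:
            return log.read()

    def shell_wait(self, session_id: str, seconds: Optional[float] = None) -> bool:
        """Return True if the process finished within the given time."""
        try:
            self._procs[session_id].wait(timeout=seconds)
            return True
        except subprocess.TimeoutExpired:
            return False

    def shell_write_to_process(self, session_id: str, text: str, press_enter: bool) -> None:
        stdin = self._procs[session_id].stdin
        stdin.write((text + ("\n" if press_enter else "")).encode("utf-8"))
        stdin.flush()

    def shell_kill_process(self, session_id: str) -> None:
        # Kills the shell itself; grandchildren it spawned may need extra handling.
        proc = self._procs[session_id]
        proc.kill()
        proc.wait()
```

A typical sequence under this sketch is shell_exec to start a command, shell_wait to give it time, shell_view to read the output, and shell_kill_process if it never returns.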

Browser-use Tools

Tools for browser-related operations.

browser_view

A tool that views the content of the current browser page. Used to check the latest state of previously opened pages.

  {"type": "function","function": {"name": "browser_view","description": "View content of the current browser page. Use for checking the latest state of previously opened pages.","parameters": {"type": "object"}}}

browser_navigate

A tool that navigates the browser to a specified URL.

Use it when a new page needs to be accessed.

  {"type": "function","function": {"name": "browser_navigate","description": "Navigate browser to specified URL. Use when accessing new pages is needed.","parameters": {"type": "object","properties": {"url": {"type": "string","description": "Complete URL to visit. Must include protocol prefix."}},"required": ["url"]}}}

browser_restart

A tool that restarts the browser and navigates to a specified URL.

Use it when the browser state needs to be reset.

  {"type": "function","function": {"name": "browser_restart","description": "Restart browser and navigate to specified URL. Use when browser state needs to be reset.","parameters": {"type": "object","properties": {"url": {"type": "string","description": "Complete URL to visit after restart. Must include protocol prefix."}},"required": ["url"]}}}

browser_click

A tool that clicks on elements in the current browser page.

Use it when a page element needs to be clicked.

  {"type": "function","function": {"name": "browser_click","description": "Click on elements in the current browser page. Use when clicking page elements is needed.","parameters": {"type": "object","properties": {"index": {"type": "integer","description": "(Optional) Index number of the element to click"},"coordinate_x": {"type": "number","description": "(Optional) X coordinate of click position"},"coordinate_y": {"type": "number","description": "(Optional) Y coordinate of click position"}}}}}

browser_input

A tool that overwrites the text in editable elements on the current browser page.

Use it when filling in input fields.

  {"type": "function","function": {"name": "browser_input","description": "Overwrite text in editable elements on the current browser page. Use when filling content in input fields.","parameters": {"type": "object","properties": {"index": {"type": "integer","description": "(Optional) Index number of the element to overwrite text"},"coordinate_x": {"type": "number","description": "(Optional) X coordinate of the element to overwrite text"},"coordinate_y": {"type": "number","description": "(Optional) Y coordinate of the element to overwrite text"},"text": {"type": "string","description": "Complete text content to overwrite"},"press_enter": {"type": "boolean","description": "Whether to press Enter key after input"}},"required": ["text", "press_enter"]}}}
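
In both browser_click and browser_input, the element index and the two coordinates are all marked optional, so the schema alone does not prevent a call that names no target at all. A handler would likely need a check along these lines. The resolve_target helper is hypothetical, and giving the index precedence over coordinates is an assumption rather than documented Manus behavior.

```python
from typing import Optional, Tuple, Union

Target = Tuple[str, Union[int, Tuple[float, float]]]


def resolve_target(index: Optional[int] = None,
                   coordinate_x: Optional[float] = None,
                   coordinate_y: Optional[float] = None) -> Target:
    """Pick the element index if given, else a complete (x, y) point, else fail."""
    if index is not None:
        return ("index", index)
    if coordinate_x is not None and coordinate_y is not None:
        return ("point", (coordinate_x, coordinate_y))
    raise ValueError("provide either index or both coordinate_x and coordinate_y")
```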

browser_move_mouse

A tool that moves the cursor to a specified position on the current browser page.

Used to simulate user mouse movement.

  {"type": "function","function": {"name": "browser_move_mouse","description": "Move cursor to specified position on the current browser page. Use when simulating user mouse movement.","parameters": {"type": "object","properties": {"coordinate_x": {"type": "number","description": "X coordinate of target cursor position"},"coordinate_y": {"type": "number","description": "Y coordinate of target cursor position"}},"required": ["coordinate_x", "coordinate_y"]}}}

browser_press_key

A tool that simulates a key press in the current browser page.

Use it when specific keyboard operations are needed.

  {"type": "function","function": {"name": "browser_press_key","description": "Simulate key press in the current browser page. Use when specific keyboard operations are needed.","parameters": {"type": "object","properties": {"key": {"type": "string","description": "Key name to simulate (e.g., Enter, Tab, ArrowUp), supports key combinations (e.g., Control+Enter)."}},"required": ["key"]}}}

browser_select_option

A tool that selects a specified option from a dropdown list element in the current browser page.

Used to select dropdown menu options.

  {"type": "function","function": {"name": "browser_select_option","description": "Select specified option from dropdown list element in the current browser page. Use when selecting dropdown menu options.","parameters": {"type": "object","properties": {"index": {"type": "integer","description": "Index number of the dropdown list element"},"option": {"type": "integer","description": "Option number to select, starting from 0."}},"required": ["index", "option"]}}}

browser_scroll_up

A tool that scrolls the current browser page up.

Used to view content above or return to the top of the page.

  {"type": "function","function": {"name": "browser_scroll_up","description": "Scroll up the current browser page. Use when viewing content above or returning to page top.","parameters": {"type": "object","properties": {"to_top": {"type": "boolean","description": "(Optional) Whether to scroll directly to page top instead of one viewport up."}}}}}

browser_scroll_down

A tool that scrolls the current browser page down.

Used to view content below or jump to the bottom of the page.

  {"type": "function","function": {"name": "browser_scroll_down","description": "Scroll down the current browser page. Use when viewing content below or jumping to page bottom.","parameters": {"type": "object","properties": {"to_bottom": {"type": "boolean","description": "(Optional) Whether to scroll directly to page bottom instead of one viewport down."}}}}}

browser_console_exec

A tool that executes JavaScript code in the browser console.

Use it when custom scripts need to be executed.

  {"type": "function","function": {"name": "browser_console_exec","description": "Execute JavaScript code in browser console. Use when custom scripts need to be executed.","parameters": {"type": "object","properties": {"javascript": {"type": "string","description": "JavaScript code to execute. Note that the runtime environment is browser console."}},"required": ["javascript"]}}}

browser_console_view

A tool that views the browser console output.

Used to check JavaScript logs or debug page errors.

  {"type": "function","function": {"name": "browser_console_view","description": "View browser console output. Use when checking JavaScript logs or debugging page errors.","parameters": {"type": "object","properties": {"max_lines": {"type": "integer","description": "(Optional) Maximum number of log lines to return."}}}}}

Web Search Tools

Tools for searching the web.

info_search_web

A tool that searches web pages using a search engine.

Used to obtain the latest information or find references.

  {"type": "function","function": {"name": "info_search_web","description": "Search web pages using search engine. Use for obtaining latest information or finding references.","parameters": {"type": "object","properties": {"query": {"type": "string","description": "Search query in Google search style, using 3-5 keywords."},"date_range": {"type": "string","enum": ["all", "past_hour", "past_day", "past_week", "past_month", "past_year"],"description": "(Optional) Time range filter for search results."}},"required": ["query"]}}}

Deploy Tools

Tools for deploying and launching projects.

deploy_expose_port

A tool that exposes a specified local port for temporary public access.

Used to give services temporary public access.

  {"type": "function","function": {"name": "deploy_expose_port","description": "Expose specified local port for temporary public access. Use when providing temporary public access for services.","parameters": {"type": "object","properties": {"port": {"type": "integer","description": "Local port number to expose"}},"required": ["port"]}}}

deploy_apply_deployment

A tool that deploys a website or application to a public production environment.

Used to deploy or update static websites or applications.

  {"type": "function","function": {"name": "deploy_apply_deployment","description": "Deploy website or application to public production environment. Use when deploying or updating static websites or applications.","parameters": {"type": "object","properties": {"type": {"type": "string","enum": ["static", "nextjs"],"description": "Type of website or application to deploy."},"local_dir": {"type": "string","description": "Absolute path of local directory to deploy."}},"required": ["type", "local_dir"]}}}

Other Tools

Tools that supplement the categories above with other essential and non-essential operations.

make_manus_page

A tool that makes a Manus Page from a local MDX file.

  {"type": "function","function": {"name": "make_manus_page","description": "Make a Manus Page from a local MDX file.","parameters": {"type": "object","properties": {"mdx_file_path": {"type": "string","description": "Absolute path of the source MDX file"}},"required": ["mdx_file_path"]}}}

idle

A special tool that indicates the agent has completed all tasks and is about to enter the idle state.

  {"type": "function","function": {"name": "idle","description": "A special tool to indicate you have completed all tasks and are about to enter idle state.","parameters": {"type": "object"}}}
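
Since idle signals that all tasks are finished, it can act as the stop condition of a tool-dispatch loop. The sketch below illustrates that idea only; run_agent_loop, next_tool_call, and the handlers mapping are hypothetical names invented for this example, and the real Manus agent loop is not shown in the tool definitions.

```python
from typing import Callable, Dict


def run_agent_loop(next_tool_call: Callable[[], dict],
                   handlers: Dict[str, Callable[..., object]],
                   max_steps: int = 50) -> int:
    """Dispatch tool calls until `idle` is chosen; return how many tools were run."""
    for step in range(max_steps):
        call = next_tool_call()  # e.g. {"name": "file_read", "arguments": {...}}
        name = call["name"]
        if name == "idle":
            # The model declares that all tasks are done, so the loop ends here.
            return step
        handlers[name](**call.get("arguments", {}))
    raise RuntimeError("max_steps reached without an idle call")
```

The max_steps guard is a safety net added in this sketch so that a model that never calls idle cannot loop forever.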

Appendix

Browser-use Framework

[Figure: Browser-use Framework]

OpenManus Architecture Diagram

[Figure: OpenManus architecture diagram]

AI Agent Basic Architecture

[Figure: AI Agent basic architecture]
