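Monitoring Server Metrics with Prometheus
Requirements
I recently needed to monitor server metrics such as CPU, memory, and disk, plus a few less common ones such as fan speed and CPU temperature. After some research I settled on Prometheus, an open-source monitoring platform. This post records how the development went.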
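A Brief Look at Prometheus
There is plenty of material online, so I will keep this short. Prometheus is only the monitoring piece; a typical setup combines Prometheus + Grafana + Alertmanager. For details see: https://www.iocoder.cn/Prometheus/install/?yudao
Put simply, with server B as the business server and server A as the machine to monitor, roughly the following needs to be deployed:
1. On server A, deploy an exporter, which collects raw data from server A and converts it into the Prometheus metric format for Prometheus to scrape.
2. On server B, deploy Prometheus and Grafana, and register the servers to monitor by editing the Prometheus configuration file.
3. If alerting is also needed, for example pushing an alert through a webhook when a server is detected as down, deploy Alertmanager. Its configuration file mainly covers alert routing and the notification targets; the alerting rules themselves are defined in Prometheus rule files.
4. Because hardware parameters also had to be monitored, I deployed IPMI Exporter as well. Internally, the exporter invokes standard command-line tools (most commonly ipmitool) or a dedicated library (such as go-ipmi for Go) to run IPMI commands (for example ipmitool sensor).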
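Trying to embed Grafana panels
The end result shown to customers had to be charts, and Grafana's panels fit the need and look good.
So at first I tried embedding them in the front end with iframes. Grafana provides a share/embed option for every panel, which is very convenient and saves a lot of front-end work.
Problem: after building a few panels, loading them from our own platform turned out to be very slow every time, and Grafana's own logo appeared while each panel loaded.
After many attempts I could not fix the slowness, so I abandoned this approach.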
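Prometheus API
After some more exploration, I found that Prometheus exposes an HTTP API. By assembling PromQL expressions you can fetch the same data the Grafana panels display, and in testing the queries were fast enough to be entirely acceptable, so I implemented the feature through API calls instead.
The endpoints are simple; here is a quick summary: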
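Fetching real-time data via the API:
Instant query API:
- Request URL:
http://ip:9090/api/v1/query
- Method: GET
- Request parameters:
Parameter | Required | Meaning |
---|---|---|
query | Yes | PromQL expression to evaluate |
time | No | Evaluation timestamp (RFC 3339 or Unix timestamp); defaults to the current server time |
timeout | No | Evaluation timeout |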
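A PromQL expression contains characters such as {, " and = that should be percent-encoded in a real request. The following is a minimal Python sketch of building the instant-query URL with the standard library; the helper name build_instant_query_url is made up for illustration, and ip:9090 is the same placeholder host used throughout this post.

```python
from typing import Optional
from urllib.parse import urlencode


def build_instant_query_url(base_url: str, promql: str, time: Optional[str] = None) -> str:
    """Build a /api/v1/query URL with the PromQL expression percent-encoded."""
    params = {"query": promql}
    if time is not None:
        # Optional evaluation time (RFC 3339 or Unix timestamp).
        params["time"] = time
    return f"{base_url}/api/v1/query?{urlencode(params)}"


url = build_instant_query_url(
    "http://ip:9090",  # placeholder host, not a real address
    'node_memory_MemTotal_bytes{instance="ip:9100"}',
)
print(url)
```

The resulting string can be passed to any HTTP client; the sketch stops at URL construction so that it runs without a Prometheus server.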
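Fetching historical data via the API:
Range query API:
- Request URL:
http://ip:9090/api/v1/query_range
- Method: GET
- Request parameters:
Parameter | Required | Meaning |
---|---|---|
query | Yes | PromQL expression to evaluate |
start | Yes | Start time (2025-08-19T19:10:30.781Z) |
end | Yes | End time (2025-08-20T20:20:30.781Z) |
step | Yes | Query resolution step (a duration such as 3600s, or a number of seconds) |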
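As a rough illustration of how the four range-query parameters fit together, here is a small Python sketch that formats the start and end times as RFC 3339 strings and assembles the URL. The helper names to_rfc3339 and build_range_query_url are hypothetical, and the host is a placeholder.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def to_rfc3339(dt: datetime) -> str:
    """Format an aware datetime as UTC RFC 3339 with millisecond precision."""
    utc = dt.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S") + f".{utc.microsecond // 1000:03d}Z"


def build_range_query_url(base_url, promql, start, end, step):
    """Build a /api/v1/query_range URL from query, start, end, and step."""
    params = {
        "query": promql,
        "start": to_rfc3339(start),
        "end": to_rfc3339(end),
        "step": step,
    }
    return f"{base_url}/api/v1/query_range?{urlencode(params)}"


start = datetime(2025, 8, 20, 19, 10, 30, 781000, tzinfo=timezone.utc)
end = datetime(2025, 8, 20, 20, 20, 30, 781000, tzinfo=timezone.utc)
range_url = build_range_query_url(
    "http://ip:9090",  # placeholder host
    'node_netstat_Tcp_CurrEstab{instance="ip:9100"}',
    start,
    end,
    "3600s",
)
print(range_url)
```

With a 70-minute window and a 3600s step, a query like this yields two points per series, which matches the two-element values arrays in the responses recorded later in this post.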
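There are some write-ups online about using this API. Most are direct translations of the official Prometheus documentation, but they are still somewhat helpful:
https://blog.csdn.net/u013235026/article/details/39833223
Below are the requests I actually used during development: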
Fetching real-time data via the API
Server metrics
- Total memory:
GET
http://ip:9090/api/v1/query?query=node_memory_MemTotal_bytes{instance="ip:9100"}
{"status": "success","data": {"resultType": "vector","result": [{"metric": {"__name__": "node_memory_MemTotal_bytes","environment": "production","instance": "ip:9100","job": "node-exporter3","server": "server-ip"},"value": [1753868141.687,"134598856704"]}]} }
- Server uptime (the query returns seconds since boot):
GET
http://ip:9090/api/v1/query?query=time() - node_boot_time_seconds{instance="ip:9100"}
{"status": "success","data": {"resultType": "vector","result": [{"metric": {"environment": "production","instance": "ip:9100","job": "node-exporter3","server": "server-ip"},"value": [1753870023.095,"5286545.095000029"]}]} }
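The uptime query returns seconds, not days, so the value still needs converting before display. A minimal Python sketch of reading an instant-vector response and doing that conversion, using the response above trimmed to a single label; the helper name first_value is illustrative only.

```python
import json

# Response from the uptime example above, trimmed to the instance label.
body = json.loads(
    '{"status": "success","data": {"resultType": "vector","result": [{"metric": '
    '{"instance": "ip:9100"},"value": [1753870023.095,"5286545.095000029"]}]}}'
)


def first_value(body: dict) -> float:
    """Return the sample value of the first series in an instant-vector response."""
    if body["status"] != "success":
        raise ValueError("query failed")
    _timestamp, value = body["data"]["result"][0]["value"]
    return float(value)  # sample values arrive as strings


uptime_seconds = first_value(body)
uptime_days = uptime_seconds / 86400  # 86400 seconds per day
print(f"{uptime_days:.1f} days")
```

Dividing inside the PromQL expression instead, for example by appending / 86400, would be an equally reasonable choice.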
- CPU core count:
GET
http://ip:9090/api/v1/query?query=count(node_cpu_seconds_total{mode="idle",instance="ip:9100"})
{"status": "success","data": {"resultType": "vector","result": [{"metric": {},"value": [1753870172.652,"64"]}]} }
- CPU utilization (irate over a 5-minute window):
GET
http://ip:9090/api/v1/query?query=100 - (avg by(instance)(irate(node_cpu_seconds_total{mode="idle",instance="ip:9100"}[5m])) * 100)
{"status": "success","data": {"resultType": "vector","result": [{"metric": {"instance": "ip:9100"},"value": [1753870280.965,"1.5908854165900266"]}]} }
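To make the CPU utilization expression more concrete, here is a simplified Python model of what it computes: irate takes the last two samples of each core's idle counter inside the window, and the result is 100 minus the average idle rate times 100. The sample counters are invented for illustration, and the model ignores details such as staleness handling.

```python
def irate(samples):
    """Per-second rate from the last two (timestamp, value) samples of a counter."""
    (t0, v0), (t1, v1) = samples[-2:]
    # After a counter reset the new value is treated as the whole increase.
    increase = v1 if v1 < v0 else v1 - v0
    return increase / (t1 - t0)


def cpu_utilization(idle_counters):
    """Model of 100 - avg(irate(idle)) * 100 across all cores of one instance."""
    rates = [irate(samples) for samples in idle_counters.values()]
    return 100 - (sum(rates) / len(rates)) * 100


# Hypothetical idle-time counters (in seconds) for two cores, scraped 15 s apart.
idle = {
    "cpu0": [(0.0, 1000.0), (15.0, 1014.7), (30.0, 1029.4)],
    "cpu1": [(0.0, 2000.0), (15.0, 2014.85), (30.0, 2029.7)],
}
print(round(cpu_utilization(idle), 2))
```

Because only the last two samples are used, the [5m] range mostly determines how far back Prometheus may look for them; rate() would average over the whole window instead.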
- Memory utilization:
GET
http://ip:9090/api/v1/query?query=(node_memory_MemTotal_bytes{instance="ip:9100"} - node_memory_MemAvailable_bytes{instance="ip:9100"}) / node_memory_MemTotal_bytes{instance="ip:9100"} * 100
{"status": "success","data": {"resultType": "vector","result": [{"metric": {"environment": "production","instance": "ip:9100","job": "node-exporter3","server": "server-ip"},"value": [1753870419.388,"35.561521483991584"]}]} }
- Disk read rate:
GET
http://ip:9090/api/v1/query?query=irate(node_disk_read_bytes_total{instance="ip:9100",device="sda"}[5m])
{"status": "success","data": {"resultType": "vector","result": [{"metric": {"device": "sda","environment": "production","instance": "ip:9100","job": "node-exporter3","server": "server-ip"},"value": [1753870514.682,"0"]}]} }
- Disk write rate (the value is in bytes per second, B/s):
GET
http://ip:9090/api/v1/query?query=irate(node_disk_written_bytes_total{instance="ip:9100",device="sda"}[5m])
{"status": "success","data": {"resultType": "vector","result": [{"metric": {"device": "sda","environment": "production","instance": "ip:9100","job": "node-exporter3","server": "server-ip"},"value": [1753870654.165,"858794.6666666666"]}]} }
- Connection count (the value is the number of currently established TCP connections):
GET
http://ip:9090/api/v1/query?query=node_netstat_Tcp_CurrEstab{instance="ip:9100"}
{"status": "success","data": {"resultType": "vector","result": [{"metric": {"__name__": "node_netstat_Tcp_CurrEstab","environment": "production","instance": "ip:9100","job": "node-exporter3","server": "server-ip"},"value": [1753870741.001,"640"]}]} }
Combined queries
- Querying several basic metrics in one request:
http://ip:9090/api/v1/query?query={instance="ip:9100", __name__=~"node_memory_MemTotal_bytes|node_netstat_Tcp_CurrEstab|node_boot_time_seconds|node_memory_MemAvailable_bytes"}
{"status": "success","data": {"resultType": "vector","result": [{"metric": {"__name__": "node_boot_time_seconds","environment": "production","instance": "ip:9100","job": "node-exporter3","server": "server-ip"},"value": [1753926624.297,"1748583478"]},{"metric": {"__name__": "node_memory_MemAvailable_bytes","environment": "production","instance": "ip:9100","job": "node-exporter3","server": "server-ip"},"value": [1753926624.297,"86579154944"]},{"metric": {"__name__": "node_memory_MemTotal_bytes","environment": "production","instance": "ip:9100","job": "node-exporter3","server": "server-ip"},"value": [1753926624.297,"134598856704"]},{"metric": {"__name__": "node_netstat_Tcp_CurrEstab","environment": "production","instance": "ip:9100","job": "node-exporter3","server": "server-ip"},"value": [1753926624.297,"616"]}]} }
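- A working combination:
node_memory_MemTotal_bytes{instance="ip:9100"} or count(node_cpu_seconds_total{mode="idle",instance="ip:9100"}) or 100 - (avg by(instance)(irate(node_cpu_seconds_total{mode="idle",instance="ip:9100"}[5m])) * 100)
About the or operator: when several queries are combined with or, Prometheus behaves in a way that is easy to trip over:
- or is left-associative (evaluated from left to right).
- A result set may contain only one series per label set, and the comparison ignores the metric name. When a later operand returns a series whose labels exactly match one already in the result, or keeps the left-hand (first) series and drops the later one. No error is raised; the series is silently ignored. To avoid this, give each operand a distinguishing label (such as metric_type).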
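The matching rule can be modeled in a few lines of Python. This is a simplified sketch of the or semantics described above, not Prometheus code; the helper names and the sample values are invented for illustration.

```python
def series_key(sample):
    """Label set used for matching; the metric name (__name__) is ignored."""
    return frozenset((k, v) for k, v in sample["metric"].items() if k != "__name__")


def promql_or(left, right):
    """Keep every left sample, plus right samples whose label set is not on the left."""
    seen = {series_key(s) for s in left}
    return left + [s for s in right if series_key(s) not in seen]


labels = {"instance": "ip:9100", "job": "node-exporter3"}
uptime = [{"metric": dict(labels), "value": "5286545"}]
tcp = [{"metric": {"__name__": "node_netstat_Tcp_CurrEstab", **labels}, "value": "640"}]

# Same labels on both sides: the right-hand series is silently dropped.
print(len(promql_or(uptime, tcp)))

# A distinguishing label (the role label_replace plays in this post) keeps both.
uptime_tagged = [{"metric": {**labels, "metric_type": "uptime"}, "value": "5286545"}]
tcp_tagged = [{"metric": {**tcp[0]["metric"], "metric_type": "tcp"}, "value": "640"}]
combined = promql_or(uptime_tagged, tcp_tagged)
by_type = {s["metric"]["metric_type"]: float(s["value"]) for s in combined}
print(by_type)
```

Indexing the combined result by metric_type, as in the last lines, is also a convenient way for the calling code to pick individual values out of a single response.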
- Final combined query:
- Input:
# System uptime (tagged with a type label)
label_replace(time() - node_boot_time_seconds{instance="ip:9100"},"metric_type", "系统运行总时长", "", "" )
# TCP connection count (tagged with a different type label)
or label_replace(node_netstat_Tcp_CurrEstab{instance="ip:9100"},"metric_type", "TCP连接数", "", "" )
# Total memory
or label_replace(node_memory_MemTotal_bytes{instance="ip:9100"},"metric_type", "内存总大小", "", "" )
# Total CPU cores
or label_replace(count(node_cpu_seconds_total{mode="idle",instance="ip:9100"}),"metric_type", "CPU总核数", "", "" )
# CPU utilization
or label_replace(100 - (avg by(instance)(irate(node_cpu_seconds_total{mode="idle",instance="ip:9100"}[5m])) * 100),"metric_type", "CPU利用率", "", "" )
# Memory utilization
or label_replace((node_memory_MemTotal_bytes{instance="ip:9100"} - node_memory_MemAvailable_bytes{instance="ip:9100"}) / node_memory_MemTotal_bytes{instance="ip:9100"} * 100,"metric_type", "内存利用率", "", "" )
# Disk read rate
or label_replace(irate(node_disk_read_bytes_total{instance="ip:9100",device="sda"}[5m]),"metric_type", "磁盘读速率", "", "" )
# Disk write rate
or label_replace(irate(node_disk_written_bytes_total{instance="ip:9100",device="sda"}[5m]),"metric_type", "磁盘写速率", "", "" )
- Request URL and response:
http://ip:9090/api/v1/query?query= label_replace(time() - node_boot_time_seconds{instance="ip:9100"},"metric_type", "系统运行总时长", "", "" ) or label_replace(node_netstat_Tcp_CurrEstab{instance="ip:9100"},"metric_type", "TCP连接数", "", "" )or label_replace(node_memory_MemTotal_bytes{instance="ip:9100"},"metric_type", "内存总大小", "", "" )or label_replace(count(node_cpu_seconds_total{mode="idle",instance="ip:9100"}),"metric_type", "CPU总核数", "", "" )or label_replace(100 - (avg by(instance)(irate(node_cpu_seconds_total{mode="idle",instance="ip:9100"}[5m])) * 100),"metric_type", "CPU利用率", "", "" )or label_replace((node_memory_MemTotal_bytes{instance="ip:9100"} - node_memory_MemAvailable_bytes{instance="ip:9100"}) / node_memory_MemTotal_bytes{instance="ip:9100"} * 100,"metric_type", "内存利用率", "", "" )or label_replace(irate(node_disk_read_bytes_total{instance="ip:9100",device="sda"}[5m]),"metric_type", "磁盘读速率", "", "" )or label_replace(irate(node_disk_written_bytes_total{instance="ip:9100",device="sda"}[5m]),"metric_type", "磁盘写速率", "", "" )
A separate composite expression weighting CPU idle, available memory, and disk I/O time (not one of the eight or operands; the result below covers only the labeled series):
((1-(1 - avg(irate(node_cpu_seconds_total{mode="idle",instance="ip:9100"}[5m])) by (instance))^1.3)^(1/3)*0.5 + (1-(1 - avg(node_memory_MemAvailable_bytes{instance="ip:9100"} / node_memory_MemTotal_bytes{instance="ip:9100"})by (instance))^6)^(1/3)*0.3 + (1 - max(irate(node_disk_io_time_seconds_total{instance="ip:9100"}[5m]))by (instance)^1.1)^(1/2)*0.2)*100
Result:
{"status": "success","data": {"resultType": "vector","result": [{"metric": {"environment": "production","instance": "ip:9100","job": "node-exporter3","metric_type": "系统运行总时长","server": "server-ip"},"value": [1753930316.288,"5346838.288000107"]},{"metric": {"__name__": "node_netstat_Tcp_CurrEstab","environment": "production","instance": "ip:9100","job": "node-exporter3","metric_type": "TCP连接数","server": "server-ip"},"value": [1753930316.288,"628"]},{"metric": {"__name__": "node_memory_MemTotal_bytes","environment": "production","instance": "ip:9100","job": "node-exporter3","metric_type": "内存总大小","server": "server-ip"},"value": [1753930316.288,"134598856704"]},{"metric": {"metric_type": "CPU总核数"},"value": [1753930316.288,"64"]},{"metric": {"instance": "ip:9100","metric_type": "CPU利用率"},"value": [1753930316.288,"1.7414062501241716"]},{"metric": {"environment": "production","instance": "ip:9100","job": "node-exporter3","metric_type": "内存利用率","server": "server-ip"},"value": [1753930316.288,"35.69893645208953"]},{"metric": {"device": "sda","environment": "production","instance": "ip:9100","job": "node-exporter3","metric_type": "磁盘读速率","server": "server-ip"},"value": [1753930316.288,"273.06666666666666"]},{"metric": {"device": "sda","environment": "production","instance": "ip:9100","job": "node-exporter3","metric_type": "磁盘写速率","server": "server-ip"},"value": [1753930316.288,"960170.6666666666"]}]} }
Fetching historical data via the API
Key metrics overview
Total CPU usage
- Expression:
(1 - avg(irate(node_cpu_seconds_total{instance="ip:9100",mode="idle"}[5m])) by (instance))*100
- Meaning: 1 minus the fraction of CPU time spent idle, averaged across cores and expressed as a percentage
  - node_cpu_seconds_total: cumulative CPU time counter (Counter)
  - mode="idle": selects the CPU time spent idle
  - irate()[interval]: the instantaneous per-second rate, computed from the last two samples inside the time window
- Request example:
http://ip:9090/api/v1/query_range?query=(1 - avg(irate(node_cpu_seconds_total{instance="ip:9100",mode="idle"}[5m])) by (instance))*100&start=2025-08-20T19:10:30.781Z&end=2025-08-20T20:20:30.781Z&step=3600s
Response:
{"status": "success","data": {"resultType": "matrix","result": [{"metric": {"instance": "ip:9100"},"values": [[1755717030.781,"5.37638888885138"],[1755720630.781,"7.025000000075021"]]}]} }
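A range query returns a matrix: each series carries a values array of [timestamp, string] pairs. A minimal Python sketch of turning one series into chart-ready points, using the response above; the helper name to_points is illustrative only.

```python
import json
from datetime import datetime, timezone

# Response body from the total CPU usage example above.
body = json.loads(
    '{"status": "success","data": {"resultType": "matrix","result": [{"metric": '
    '{"instance": "ip:9100"},"values": [[1755717030.781,"5.37638888885138"],'
    '[1755720630.781,"7.025000000075021"]]}]}}'
)


def to_points(series):
    """Convert one matrix series into (UTC datetime, float) pairs for charting."""
    return [
        (datetime.fromtimestamp(ts, tz=timezone.utc), float(value))
        for ts, value in series["values"]
    ]


points = to_points(body["data"]["result"][0])
for when, value in points:
    print(when.strftime("%Y-%m-%d %H:%M:%S"), round(value, 2))
```

The same conversion applies to every query_range response in this post; responses with several series (for example one per disk) need one call per element of result.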
Memory usage
- Expression:
(1 - (node_memory_MemAvailable_bytes{instance="ip:9100"} / (node_memory_MemTotal_bytes{instance="ip:9100"})))* 100
- Meaning: memory usage as a percentage
- Request example:
http://ip:9090/api/v1/query_range?query=(1 - (node_memory_MemAvailable_bytes{instance="ip:9100"} / (node_memory_MemTotal_bytes{instance="ip:9100"})))* 100&start=2025-08-20T19:10:30.781Z&end=2025-08-20T20:20:30.781Z&step=3600s
Response:
{"status": "success","data": {"resultType": "matrix","result": [{"metric": {"environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"38.715711476735024"],[1755720630.781,"38.63129705705032"]]}]} }
Disk usage
- Expression:
(node_filesystem_size_bytes{instance="ip:9100",fstype=~"ext.*|xfs|nfs",mountpoint !~".*pod.*"}-node_filesystem_free_bytes{instance="ip:9100",fstype=~"ext.*|xfs|nfs",mountpoint !~".*pod.*"}) *100/(node_filesystem_avail_bytes {instance="ip:9100",fstype=~"ext.*|xfs|nfs",mountpoint !~".*pod.*"}+(node_filesystem_size_bytes{instance="ip:9100",fstype=~"ext.*|xfs|nfs",mountpoint !~".*pod.*"}-node_filesystem_free_bytes{instance="ip:9100",fstype=~"ext.*|xfs|nfs",mountpoint !~".*pod.*"}))
- Meaning: disk usage from the user's point of view, that is, used space / total space available to the user
  - fstype=~"ext.*|xfs|nfs": covers the mainstream file systems (excluding in-memory file systems such as tmpfs)
  - mountpoint !~".*pod.*": excludes Kubernetes Pod mount points (specific to containerized environments)
- Note: in a URL, + is decoded as a space, so it must be written as %2B
- Request example:
http://ip:9090/api/v1/query_range?query=(node_filesystem_size_bytes{instance="ip:9100",fstype=~"ext.*|xfs|nfs",mountpoint !~".*pod.*"}-node_filesystem_free_bytes{instance="ip:9100",fstype=~"ext.*|xfs|nfs",mountpoint !~".*pod.*"}) *100/(node_filesystem_avail_bytes {instance="ip:9100",fstype=~"ext.*|xfs|nfs",mountpoint !~".*pod.*"}%2B(node_filesystem_size_bytes{instance="ip:9100",fstype=~"ext.*|xfs|nfs",mountpoint !~".*pod.*"}-node_filesystem_free_bytes{instance="ip:9100",fstype=~"ext.*|xfs|nfs",mountpoint !~".*pod.*"}))&start=2025-08-20T19:10:30.781Z&end=2025-08-20T20:20:30.781Z&step=3600s
Response:
{"status": "success","data": {"resultType": "matrix","result": [{"metric": {"device": "/dev/mapper/centos-home","environment": "production","fstype": "xfs","instance": "ip:9100","job": "node-exporter3","mountpoint": "/home","server": "server-ip"},"values": [[1755717030.781,"6.383665437345736"],[1755720630.781,"6.400828182136928"]]},{"metric": {"device": "/dev/mapper/centos-root","environment": "production","fstype": "xfs","instance": "ip:9100","job": "node-exporter3","mountpoint": "/","server": "server-ip"},"values": [[1755717030.781,"36.219364650253205"],[1755720630.781,"36.311770803426654"]]},{"metric": {"device": "/dev/sda2","environment": "production","fstype": "xfs","instance": "ip:9100","job": "node-exporter3","mountpoint": "/boot","server": "server-ip"},"values": [[1755717030.781,"16.144092728832007"],[1755720630.781,"16.144092728832007"]]}]} }
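The + problem can be reproduced with Python's standard library, which applies the same form-decoding convention to query strings. The sketch below uses a deliberately simplified expression (the real one is the disk usage expression above) and assumes the server decodes query strings in the usual way, as the behavior described above suggests.

```python
from urllib.parse import parse_qs, quote

# A raw "+" in the query string is decoded as a space by form decoding...
raw = parse_qs("query=a+b")["query"][0]
# ...while %2B survives as a literal plus sign.
escaped = parse_qs("query=a%2Bb")["query"][0]
print(repr(raw), repr(escaped))

# Simplified stand-in for the disk usage expression.
expr = "node_filesystem_avail_bytes + node_filesystem_free_bytes"
# quote() percent-encodes "+" and spaces so the expression arrives intact.
encoded = quote(expr, safe="")
print(encoded)
```

Encoding the whole expression programmatically, rather than replacing + by hand, also takes care of the other reserved characters in PromQL such as {, } and ".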
Swap usage
- Expression:
(1 - ((node_memory_SwapFree_bytes{instance="ip:9100"} + 1)/ (node_memory_SwapTotal_bytes{instance="ip:9100"} + 1))) * 100
- Note: in the request URL, + must be written as %2B
- Request example:
http://ip:9090/api/v1/query_range?query=(1 - ((node_memory_SwapFree_bytes{instance="ip:9100"} %2B 1)/ (node_memory_SwapTotal_bytes{instance="ip:9100"} %2B 1))) * 100&start=2025-08-20T19:10:30.781Z&end=2025-08-20T20:20:30.781Z&step=3600s
Response:
{"status": "success","data": {"resultType": "matrix","result": [{"metric": {"environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"2.1484702825546154"],[1755720630.781,"2.1484702825546154"]]}]} }
Available partition space
Total file system size
- Expression:
node_filesystem_size_bytes{instance="ip:9100",fstype=~"ext.*|xfs|nfs",mountpoint !~".*pod.*"}-0
- Meaning: total size of each file system in bytes; subtracting 0 leaves the values unchanged and drops the metric name from the result
- Request example:
http://ip:9090/api/v1/query_range?query=node_filesystem_size_bytes{instance="ip:9100",fstype=~"ext.*|xfs|nfs",mountpoint !~".*pod.*"}-0&start=2025-08-20T19:10:30.781Z&end=2025-08-20T20:20:30.781Z&step=3600s
Response:
{"status": "success","data": {"resultType": "matrix","result": [{"metric": {"device": "/dev/mapper/klas-root","environment": "production","fstype": "xfs","instance": "ip:9100","job": "node-exporter1","mountpoint": "/","server": "server-ip"},"values": [[1755717030.781,"4417954775040"],[1755720630.781,"4417954775040"]]},{"metric": {"device": "/dev/sda2","environment": "production","fstype": "xfs","instance": "ip:9100","job": "node-exporter1","mountpoint": "/boot","server": "server-ip"},"values": [[1755717030.781,"1063256064"],[1755720630.781,"1063256064"]]}]} }
CPU utilization
System usage
- Expression:
avg(irate(node_cpu_seconds_total{instance="ip:9100",mode="system"}[5m])) by (instance) *100
- Meaning: the percentage of CPU time spent running system (kernel) code, as an instantaneous rate within a 5-minute window
  - node_cpu_seconds_total: cumulative CPU time spent in each mode
  - mode="system": selects the CPU time spent in kernel mode (running operating system kernel code)
  - irate()[interval]: the instantaneous per-second rate, computed from the last two samples inside the time window
- Request example:
http://ip:9090/api/v1/query_range?query=avg(irate(node_cpu_seconds_total{instance="ip:9100",mode="system"}[5m])) by (instance) *100&start=2025-08-20T19:10:30.781Z&end=2025-08-20T20:20:30.781Z&step=3600s
Response:
{"status": "success","data": {"resultType": "matrix","result": [{"metric": {"instance": "ip:9100"},"values": [[1755717030.781,"0.3062499999999899"],[1755720630.781,"0.65833333333324"]]}]} }
User usage
- Expression:
avg(irate(node_cpu_seconds_total{instance="ip:9100",mode="user"}[5m])) by (instance) *100
- Meaning: the percentage of CPU time spent running application code, as an instantaneous rate within a 5-minute window
  - node_cpu_seconds_total: cumulative CPU time spent in each mode
  - mode="user": selects the CPU time spent in user mode (running application code)
  - irate()[interval]: the instantaneous per-second rate, computed from the last two samples inside the time window
- Request example:
http://ip:9090/api/v1/query_range?query=avg(irate(node_cpu_seconds_total{instance="ip:9100",mode="user"}[5m])) by (instance) *100&start=2025-08-20T19:10:30.781Z&end=2025-08-20T20:20:30.781Z&step=3600s
Response:
{"status": "success","data": {"resultType": "matrix","result": [{"metric": {"instance": "ip:9100"},"values": [[1755717030.781,"4.801388888888849"],[1755720630.781,"6.027083333332131"]]}]} }
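Disk I/O wait (CPU iowait percentage)
- Expression:
avg(irate(node_cpu_seconds_total{instance="ip:9100",mode="iowait"}[5m])) by (instance) *100
- Meaning: the percentage of CPU time spent waiting on I/O, as an instantaneous rate within a 5-minute window
  - I/O wait: the share of time the CPU sits idle while disk I/O is still outstanding.
  - When a process issues a disk read or write, it is suspended until the I/O completes.
  - If the CPU has nothing else to run in the meantime, it is "idle but blocked", and that time is recorded as iowait.
  - node_cpu_seconds_total: cumulative CPU time spent in each mode
  - mode="iowait": cumulative counter of I/O wait time
  - irate()[interval]: the instantaneous per-second rate, computed from the last two samples inside the time window
- Request example:
http://ip:9090/api/v1/query_range?query=avg(irate(node_cpu_seconds_total{instance="ip:9100",mode="iowait"}[5m])) by (instance) *100&start=2025-08-20T19:10:30.781Z&end=2025-08-20T20:20:30.781Z&step=3600s
Response:
{"status": "success","data": {"resultType": "matrix","result": [{"metric": {"instance": "ip:9100"},"values": [[1755717030.781,"0.040277777777777655"],[1755720630.781,"0"]]}]} }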
Memory charts
Total memory
- Expression:
node_memory_MemTotal_bytes{instance="ip:9100"}
- Meaning: total memory
- Request example (the request and response captured here query node_memory_MemAvailable_bytes; use node_memory_MemTotal_bytes in the query to fetch total memory):
http://ip:9090/api/v1/query_range?query=node_memory_MemAvailable_bytes{instance="ip:9100"}&start=2025-08-20T19:10:30.781Z&end=2025-08-20T20:20:30.781Z&step=3600s
Response:
{"status": "success","data": {"resultType": "matrix","result": [{"metric": {"__name__": "node_memory_MemAvailable_bytes","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"83072385024"],[1755720630.781,"83186810880"]]}]} }
Available memory
- Expression:
node_memory_MemAvailable_bytes{instance="ip:9100"}
- Meaning: available memory
- Request example:
http://ip:9090/api/v1/query_range?query=node_memory_MemAvailable_bytes{instance="ip:9100"}&start=2025-08-20T19:10:30.781Z&end=2025-08-20T20:20:30.781Z&step=3600s
Response:
{"status": "success","data": {"resultType": "matrix","result": [{"metric": {"__name__": "node_memory_MemAvailable_bytes","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"83072385024"],[1755720630.781,"83186810880"]]}]} }
Used memory
- Expression:
node_memory_MemTotal_bytes{instance="ip:9100"} - node_memory_MemAvailable_bytes{instance="ip:9100"}
- Meaning: used memory (total minus available)
- Request example:
http://ip:9090/api/v1/query_range?query=node_memory_MemTotal_bytes{instance="ip:9100"} - node_memory_MemAvailable_bytes{instance="ip:9100"}&start=2025-08-20T19:10:30.781Z&end=2025-08-20T20:20:30.781Z&step=3600s
Response:
{"status": "success","data": {"resultType": "matrix","result": [{"metric": {"environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"52480114688"],[1755720630.781,"52365688832"]]}]} }
Disk usage
File system inode usage percentage
- Expression:
(1 - node_filesystem_files_free{instance="ip:9100",fstype=~"ext.?|xfs|nfs",mountpoint !~".*pod.*"} / node_filesystem_files{instance="ip:9100",fstype=~"ext.?|xfs|nfs",mountpoint !~".*pod.*"}) * 100
- Meaning: the percentage of inodes in use. An inode is the file system data structure that stores a file's metadata (size, permissions, location, and so on); every file or directory occupies one inode.
  - fstype=~"ext.?|xfs|nfs": covers the mainstream file systems (excluding in-memory file systems such as tmpfs)
  - mountpoint !~".*pod.*": excludes Kubernetes Pod mount points (specific to containerized environments)
- Request example:
http://ip:9090/api/v1/query_range?query=(1 - node_filesystem_files_free{instance="ip:9100",fstype=~"ext.?|xfs|nfs",mountpoint !~".*pod.*"} / node_filesystem_files{instance="ip:9100",fstype=~"ext.?|xfs|nfs",mountpoint !~".*pod.*"}) * 100&start=2025-08-20T19:10:30.781Z&end=2025-08-20T20:20:30.781Z&step=3600s
Response:
{"status": "success","data": {"resultType": "matrix","result": [{"metric": {"device": "/dev/mapper/klas-root","environment": "production","fstype": "xfs","instance": "ip:9100","job": "node-exporter1","mountpoint": "/","server": "server-ip"},"values": [[1755717030.781,"0.9976753455840592"],[1755720630.781,"0.998009876234407"]]},{"metric": {"device": "/dev/sda2","environment": "production","fstype": "xfs","instance": "ip:9100","job": "node-exporter1","mountpoint": "/boot","server": "server-ip"},"values": [[1755717030.781,"0.02384185791015625"],[1755720630.781,"0.02384185791015625"]]}]} }
Disk read/write rate
Disk read rate
- Expression:
irate(node_disk_reads_completed_total{instance="ip:9100"}[5m])
- Meaning: disk read rate, measured as completed read operations per second for each device
- Request example:
http://ip:9090/api/v1/query_range?query=irate(node_disk_reads_completed_total{instance="ip:9100"}[5m])&start=2025-08-20T19:10:30.781Z&end=2025-08-20T20:20:30.781Z&step=3600s
Response:
{"status": "success","data": {"resultType": "matrix","result": [{"metric": {"device": "dm-0","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"1.8333333333333333"],[1755720630.781,"0.016666666666666666"]]},{"metric": {"device": "dm-1","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"0"],[1755720630.781,"0"]]},{"metric": {"device": "dm-2","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"0"],[1755720630.781,"0"]]},{"metric": {"device": "sda","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"1.85"],[1755720630.781,"0"]]},{"metric": {"device": "sdb","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"0"],[1755720630.781,"0.016666666666666666"]]}]} }
Disk write rate
- Expression:
irate(node_disk_writes_completed_total{instance="ip:9100"}[5m])
- Meaning: disk write rate, measured as completed write operations per second for each device
- Request example:
http://ip:9090/api/v1/query_range?query=irate(node_disk_writes_completed_total{instance="ip:9100"}[5m])&start=2025-08-20T19:10:30.781Z&end=2025-08-20T20:20:30.781Z&step=3600s
Response:
{"status": "success","data": {"resultType": "matrix","result": [{"metric": {"device": "dm-0","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"27.3"],[1755720630.781,"22.483333333333334"]]},{"metric": {"device": "dm-1","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"0.08333333333333333"],[1755720630.781,"0"]]},{"metric": {"device": "dm-2","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"0"],[1755720630.781,"0"]]},{"metric": {"device": "sda","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"24.85"],[1755720630.781,"20.233333333333334"]]},{"metric": {"device": "sdb","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"1.65"],[1755720630.781,"0.8333333333333334"]]}]} }
Current disk queue depth
- Expression:
node_disk_io_now{instance="ip:9100"}
- Meaning: the number of I/O requests currently outstanding, that is, the I/O operations being processed or queued on the target disk device at the moment of the query (an instantaneous value)
- Request example:
http://ip:9090/api/v1/query_range?query=node_disk_io_now{instance="ip:9100"}&start=2025-08-20T19:10:30.781Z&end=2025-08-20T20:20:30.781Z&step=3600s
Response:
{"status": "success","data": {"resultType": "matrix","result": [{"metric": {"__name__": "node_disk_io_now","device": "dm-0","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"0"],[1755720630.781,"0"]]},{"metric": {"__name__": "node_disk_io_now","device": "dm-1","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"0"],[1755720630.781,"0"]]},{"metric": {"__name__": "node_disk_io_now","device": "dm-2","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"0"],[1755720630.781,"0"]]},{"metric": {"__name__": "node_disk_io_now","device": "sda","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"0"],[1755720630.781,"0"]]},{"metric": {"__name__": "node_disk_io_now","device": "sdb","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"0"],[1755720630.781,"0"]]}]} }
Disk read/write throughput per second
Disk read throughput
- Expression:
irate(node_disk_read_bytes_total{instance='ip:9100'}[5m])
- Meaning: bytes read from each disk per second
- Request example:
http://ip:9090/api/v1/query_range?query=irate(node_disk_read_bytes_total{instance='ip:9100'}[5m])&start=2025-08-20T19:10:30.781Z&end=2025-08-20T20:20:30.781Z&step=3600s
Response:
{"status": "success","data": {"resultType": "matrix","result": [{"metric": {"device": "dm-0","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"334233.6"],[1755720630.781,"273.06666666666666"]]},{"metric": {"device": "dm-1","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"0"],[1755720630.781,"0"]]},{"metric": {"device": "dm-2","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"0"],[1755720630.781,"0"]]},{"metric": {"device": "sda","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"334233.6"],[1755720630.781,"0"]]},{"metric": {"device": "sdb","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"0"],[1755720630.781,"273.06666666666666"]]}]} }
Disk write throughput
- Expression:
irate(node_disk_written_bytes_total{instance="ip:9100"}[5m])
- Meaning: bytes written to each disk per second
- Request example:
http://ip:9090/api/v1/query_range?query=irate(node_disk_written_bytes_total{instance="ip:9100"}[5m])&start=2025-08-20T19:10:30.781Z&end=2025-08-20T20:20:30.781Z&step=3600s
Response:
{"status": "success","data": {"resultType": "matrix","result": [{"metric": {"device": "dm-0","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"1687825.0666666667"],[1755720630.781,"947746.1333333333"]]},{"metric": {"device": "dm-1","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"5461.333333333333"],[1755720630.781,"0"]]},{"metric": {"device": "dm-2","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"0"],[1755720630.781,"0"]]},{"metric": {"device": "sda","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"1652940.8"],[1755720630.781,"933410.1333333333"]]},{"metric": {"device": "sdb","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"40345.6"],[1755720630.781,"14336"]]}]} }
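The throughput values come back as raw bytes per second, which are awkward to read on a chart axis. One possible way to render them, sketched in Python; the helper name format_rate and the choice of binary units (KiB, MiB) are assumptions for illustration, not something the API prescribes.

```python
def format_rate(bytes_per_second: float) -> str:
    """Render a B/s value with a binary unit suffix and two decimals."""
    units = ["B/s", "KiB/s", "MiB/s", "GiB/s"]
    value = float(bytes_per_second)
    for unit in units:
        # Stop at the first unit that keeps the number below 1024,
        # or at the largest unit available.
        if value < 1024 or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024


# Values taken from the dm-0 and sdb series in the responses above.
print(format_rate(float("1687825.0666666667")))
print(format_rate(float("273.06666666666666")))
```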
Network socket connection information
Current TCP connections
- Expression:
node_netstat_Tcp_CurrEstab{instance="ip:9100"}
- Meaning: number of currently established TCP connections.
- Request example:
http://ip:9090/api/v1/query_range?query=node_netstat_Tcp_CurrEstab{instance="ip:9100"}&start=2025-08-20T19:10:30.781Z&end=2025-08-20T20:20:30.781Z&step=3600s
Response:
{"status": "success","data": {"resultType": "matrix","result": [{"metric": {"__name__": "node_netstat_Tcp_CurrEstab","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"100"],[1755720630.781,"100"]]}]} }
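Every range query in this post returns the same `matrix` shape: a list of series, each with a `metric` label set and a list of `[timestamp, "value"]` pairs where the value is a string. Here is a minimal sketch of turning that into typed Python data, using the response above as input. `parse_matrix` is a hypothetical helper name.

```python
import json
from datetime import datetime, timezone

# Response copied from the request example above.
RAW = '{"status": "success","data": {"resultType": "matrix","result": [{"metric": {"__name__": "node_netstat_Tcp_CurrEstab","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"100"],[1755720630.781,"100"]]}]} }'


def parse_matrix(raw):
    """Return a list of (labels, [(datetime, float), ...]) tuples, one per series."""
    body = json.loads(raw)
    if body["status"] != "success":
        raise ValueError("query failed")
    parsed = []
    for series in body["data"]["result"]:
        points = [
            # Timestamps are Unix seconds; sample values arrive as strings.
            (datetime.fromtimestamp(ts, tz=timezone.utc), float(value))
            for ts, value in series["values"]
        ]
        parsed.append((series["metric"], points))
    return parsed


for labels, points in parse_matrix(RAW):
    for when, value in points:
        print(labels["instance"], when.isoformat(), value)
```

With `step=3600s` and a 70-minute window, the response carries two points per series, one at the start time and one an hour later.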
TCP connections in the TIME_WAIT state
- Expression:
node_sockstat_TCP_tw{instance='ip:9100'}
- Meaning: number of TCP connections in the TIME_WAIT state.
- In TCP, TIME_WAIT is the state that the side closing the connection first must pass through after the close. Its purpose is:
- To make sure the final ACK reaches the peer, and to keep stray packets from the old connection from interfering with a new one.
- The resources are released automatically after waiting 2MSL (MSL = Maximum Segment Lifetime); on Linux this wait is 60 seconds.
- Request example:
http://ip:9090/api/v1/query_range?query=node_sockstat_TCP_tw{instance='ip:9100'}&start=2025-08-20T19:10:30.781Z&end=2025-08-20T20:20:30.781Z&step=3600s
Response:
{"status": "success","data": {"resultType": "matrix","result": [{"metric": {"__name__": "node_sockstat_TCP_tw","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"8"],[1755720630.781,"8"]]}]} }
Sockets in use
- Expression:
node_sockstat_sockets_used{instance='ip:9100'}
- Meaning: number of sockets currently in use.
- Request example:
http://ip:9090/api/v1/query_range?query=node_sockstat_sockets_used{instance='ip:9100'}&start=2025-08-20T19:10:30.781Z&end=2025-08-20T20:20:30.781Z&step=3600s
Response:
{"status": "success","data": {"resultType": "matrix","result": [{"metric": {"__name__": "node_sockstat_sockets_used","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"1813"],[1755720630.781,"1812"]]}]} }
UDP sockets in use
- Expression:
node_sockstat_UDP_inuse{instance='ip:9100'}
- Meaning: number of UDP sockets currently in use.
- Request example:
http://ip:9090/api/v1/query_range?query=node_sockstat_UDP_inuse{instance='ip:9100'}&start=2025-08-20T19:10:30.781Z&end=2025-08-20T20:20:30.781Z&step=3600s
Response:
{"status": "success","data": {"resultType": "matrix","result": [{"metric": {"__name__": "node_sockstat_UDP_inuse","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"10"],[1755720630.781,"10"]]}]} }
Allocated TCP sockets
- Expression:
node_sockstat_TCP_alloc{instance='ip:9100'}
- Meaning: total number of TCP sockets allocated by the system.
- Request example:
http://ip:9090/api/v1/query_range?query=node_sockstat_TCP_alloc{instance='ip:9100'}&start=2025-08-20T19:10:30.781Z&end=2025-08-20T20:20:30.781Z&step=3600s
Response:
{"status": "success","data": {"resultType": "matrix","result": [{"metric": {"__name__": "node_sockstat_TCP_alloc","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"554"],[1755720630.781,"554"]]}]} }
TCP passive opens per second
- Expression:
irate(node_netstat_Tcp_PassiveOpens{instance="ip:9100"}[5m])
- Meaning: instantaneous rate of TCP passive opens (connections accepted from remote peers), per second.
- Request example:
http://ip:9090/api/v1/query_range?query=irate(node_netstat_Tcp_PassiveOpens{instance="ip:9100"}[5m])&start=2025-08-20T19:10:30.781Z&end=2025-08-20T20:20:30.781Z&step=3600s
Response:
{"status": "success","data": {"resultType": "matrix","result": [{"metric": {"environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"0.16666666666666666"],[1755720630.781,"0.16666666666666666"]]}]} }
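Many of the expressions in this section wrap a counter in `irate(...[5m])`. As documented, `irate` computes a per-second rate from the last two samples inside the window, which is why it is described here as an "instantaneous" rate. The sketch below mimics that arithmetic on hand-made data. It is a simplified illustration, not Prometheus's actual implementation; the counter-reset handling reflects my understanding of it, and the samples (one scrape every 60 seconds) are hypothetical.

```python
def irate(samples):
    """Per-second rate from the last two (timestamp, value) samples of a counter."""
    if len(samples) < 2:
        return None  # not enough samples in the window to compute a rate
    (t_prev, v_prev), (t_last, v_last) = samples[-2], samples[-1]
    # A counter that went down is assumed to have reset and restarted from zero.
    delta = v_last if v_last < v_prev else v_last - v_prev
    return delta / (t_last - t_prev)


# Hypothetical counter samples: (seconds, cumulative passive opens).
window = [(0, 1200), (60, 1215), (120, 1225)]
print(irate(window))  # only the last two samples, (60, 1215) and (120, 1225), are used
```

Because only two samples count, `irate` reacts quickly to spikes but can look noisy when plotted with a large `step` such as the `3600s` used in these examples.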
TCP active opens per second
- Expression:
irate(node_netstat_Tcp_ActiveOpens{instance='ip:9100'}[5m])
- Meaning: instantaneous rate of actively initiated TCP connections, per second.
- Request example:
http://ip:9090/api/v1/query_range?query=irate(node_netstat_Tcp_ActiveOpens{instance='ip:9100'}[5m])&start=2025-08-20T19:10:30.781Z&end=2025-08-20T20:20:30.781Z&step=3600s
Response:
{"status": "success","data": {"resultType": "matrix","result": [{"metric": {"environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"0.18333333333333332"],[1755720630.781,"0.18333333333333332"]]}]} }
TCP segments received per second
- Expression:
irate(node_netstat_Tcp_InSegs{instance='ip:9100'}[5m])
- Meaning: instantaneous rate of received TCP segments, per second.
- Request example:
http://ip:9090/api/v1/query_range?query=irate(node_netstat_Tcp_InSegs{instance='ip:9100'}[5m])&start=2025-08-20T19:10:30.781Z&end=2025-08-20T20:20:30.781Z&step=3600s
Response:
{"status": "success","data": {"resultType": "matrix","result": [{"metric": {"environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"23.466666666666665"],[1755720630.781,"24.983333333333334"]]}]} }
TCP segments sent per second
- Expression:
irate(node_netstat_Tcp_OutSegs{instance='ip:9100'}[5m])
- Meaning: instantaneous rate of sent TCP segments, per second.
- Request example:
http://ip:9090/api/v1/query_range?query=irate(node_netstat_Tcp_OutSegs{instance='ip:9100'}[5m])&start=2025-08-20T19:10:30.781Z&end=2025-08-20T20:20:30.781Z&step=3600s
Response:
{"status": "success","data": {"resultType": "matrix","result": [{"metric": {"environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"23.75"],[1755720630.781,"25.466666666666665"]]}]} }
TCP segments retransmitted per second
- Expression:
irate(node_netstat_Tcp_RetransSegs{instance="ip:9100"}[5m])
- Meaning: instantaneous rate of retransmitted TCP segments, per second.
- Request example:
http://ip:9090/api/v1/query_range?query=irate(node_netstat_Tcp_RetransSegs{instance="ip:9100"}[5m])&start=2025-08-20T19:10:30.781Z&end=2025-08-20T20:20:30.781Z&step=3600s
Response:
{"status": "success","data": {"resultType": "matrix","result": [{"metric": {"environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"0"],[1755720630.781,"0"]]}]} }
TCP connections dropped at listening sockets per second
- Expression:
irate(node_netstat_TcpExt_ListenDrops{instance="ip:9100"}[5m])
- Meaning: instantaneous rate of TCP connections dropped at listening sockets, typically because the listen queue overflowed, per second.
- Request example:
http://ip:9090/api/v1/query_range?query=irate(node_netstat_TcpExt_ListenDrops{instance="ip:9100"}[5m])&start=2025-08-20T19:10:30.781Z&end=2025-08-20T20:20:30.781Z&step=3600s
Response:
{"status": "success","data": {"resultType": "matrix","result": [{"metric": {"environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"0"],[1755720630.781,"0"]]}]} }
Network bandwidth per second
NIC receive bandwidth
- Expression:
irate(node_network_receive_bytes_total{instance=~"ip:9100",device!~"docker.*|veth.*|lo|br-.*|virbr.*|tun.*|tap.*"}[5m])*8
- Meaning: NIC receive bandwidth in bits per second (the byte rate multiplied by 8). The device matcher excludes virtual and loopback interfaces.
- Request example:
http://ip:9090/api/v1/query_range?query=irate(node_network_receive_bytes_total{instance=~"ip:9100",device!~"docker.*|veth.*|lo|br-.*|virbr.*|tun.*|tap.*"}[5m])*8&start=2025-08-20T19:10:30.781Z&end=2025-08-20T20:20:30.781Z&step=3600s
Response:
{"status": "success","data": {"resultType": "matrix","result": [{"metric": {"device": "endvnic","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"0"],[1755720630.781,"0"]]},{"metric": {"device": "enp125s0f0","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"0"],[1755720630.781,"0"]]},{"metric": {"device": "enp125s0f1","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"0"],[1755720630.781,"0"]]},{"metric": {"device": "enp125s0f2","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"0"],[1755720630.781,"0"]]},{"metric": {"device": "enp125s0f3","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"0"],[1755720630.781,"0"]]},{"metric": {"device": "enp125s0f4","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"0"],[1755720630.781,"0"]]},{"metric": {"device": "enp125s0f5","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"59909.6"],[1755720630.781,"2704437.2"]]}]} }
NIC transmit bandwidth
- Expression:
irate(node_network_transmit_bytes_total{instance=~"ip:9100",device!~"docker.*|veth.*|lo|br-.*|virbr.*|tun.*|tap.*"}[5m])*8
- Meaning: NIC transmit bandwidth in bits per second (the byte rate multiplied by 8). The device matcher excludes virtual and loopback interfaces.
- Request example:
http://ip:9090/api/v1/query_range?query=irate(node_network_transmit_bytes_total{instance=~"ip:9100",device!~"docker.*|veth.*|lo|br-.*|virbr.*|tun.*|tap.*"}[5m])*8&start=2025-08-20T19:10:30.781Z&end=2025-08-20T20:20:30.781Z&step=3600s
Response:
{"status": "success","data": {"resultType": "matrix","result": [{"metric": {"device": "endvnic","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"0"],[1755720630.781,"0"]]},{"metric": {"device": "enp125s0f0","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"0"],[1755720630.781,"0"]]},{"metric": {"device": "enp125s0f1","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"0"],[1755720630.781,"0"]]},{"metric": {"device": "enp125s0f2","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"0"],[1755720630.781,"0"]]},{"metric": {"device": "enp125s0f3","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"0"],[1755720630.781,"0"]]},{"metric": {"device": "enp125s0f4","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"0"],[1755720630.781,"0"]]},{"metric": {"device": "enp125s0f5","environment": "production","instance": "ip:9100","job": "node-exporter1","server": "server-ip"},"values": [[1755717030.781,"64835.73333333333"],[1755720630.781,"124779.33333333333"]]}]} }
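The bandwidth queries return one series per physical interface, and most of them are idle on this server. For a chart it can help to reduce each series to a single figure in a friendlier unit. Below is a minimal sketch that takes the peak value per device and converts bits per second to Mbit/s. The input is a trimmed copy of the transmit response above (two of the seven interfaces, with fewer labels), and `peak_mbps_by_device` is a hypothetical helper name.

```python
import json

# Trimmed copy of the transmit response above: two interfaces, fewer labels.
RAW = '{"status": "success","data": {"resultType": "matrix","result": [{"metric": {"device": "enp125s0f4","instance": "ip:9100"},"values": [[1755717030.781,"0"],[1755720630.781,"0"]]},{"metric": {"device": "enp125s0f5","instance": "ip:9100"},"values": [[1755717030.781,"64835.73333333333"],[1755720630.781,"124779.33333333333"]]}]}}'


def peak_mbps_by_device(raw):
    """Return {device: peak bandwidth in Mbit/s} for a bits-per-second matrix."""
    body = json.loads(raw)
    peaks = {}
    for series in body["data"]["result"]:
        device = series["metric"]["device"]
        peak_bps = max(float(value) for _, value in series["values"])
        peaks[device] = peak_bps / 1_000_000  # bits/s -> Mbit/s
    return peaks


for device, mbps in peak_mbps_by_device(RAW).items():
    print(f"{device}: {mbps:.3f} Mbit/s")
```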
Fan speed
- Expression:
ipmi_fan_speed_rpm{instance='ip:9290'}
- Meaning: fan speed in RPM, one series per fan sensor.
- Request example:
http://ip:9090/api/v1/query_range?query=ipmi_fan_speed_rpm{instance='ip:9290'}&start=2025-08-20T19:10:30.781Z&end=2025-08-20T20:20:30.781Z&step=3600s
Response:
{"status": "success","data": {"resultType": "matrix","result": [{"metric": {"__name__": "ipmi_fan_speed_rpm","id": "31","instance": "ip:9290","job": "ipmi_monitor_1.38","name": "FAN1 Speed"},"values": [[1755717030.781,"10425"],[1755720630.781,"10200"]]},{"metric": {"__name__": "ipmi_fan_speed_rpm","id": "32","instance": "ip:9290","job": "ipmi_monitor_1.38","name": "FAN2 Speed"},"values": [[1755717030.781,"10500"],[1755720630.781,"10200"]]},{"metric": {"__name__": "ipmi_fan_speed_rpm","id": "33","instance": "ip:9290","job": "ipmi_monitor_1.38","name": "FAN3 Speed"},"values": [[1755717030.781,"10500"],[1755720630.781,"10275"]]},{"metric": {"__name__": "ipmi_fan_speed_rpm","id": "34","instance": "ip:9290","job": "ipmi_monitor_1.38","name": "FAN4 Speed"},"values": [[1755717030.781,"10500"],[1755720630.781,"10200"]]}]} }
CPU temperature
- Expression:
ipmi_temperature_celsius{instance='ip:9290', name=~"CPU.*|.*Core.*"}
- Meaning: CPU temperature in degrees Celsius, for sensors whose name starts with "CPU" or contains "Core".
- Request example:
http://ip:9090/api/v1/query_range?query=ipmi_temperature_celsius{instance='ip:9290', name=~"CPU.*|.*Core.*"}&start=2025-08-20T19:10:30.781Z&end=2025-08-20T20:20:30.781Z&step=3600s
Response:
{"status": "success","data": {"resultType": "matrix","result": [{"metric": {"__name__": "ipmi_temperature_celsius","id": "27","instance": "ip:9290","job": "ipmi_monitor_1.38","name": "CPU1 VDDQ Temp"},"values": [[1755717030.781,"9"],[1755720630.781,"9"]]},{"metric": {"__name__": "ipmi_temperature_celsius","id": "28","instance": "ip:9290","job": "ipmi_monitor_1.38","name": "CPU1 VRD Temp"},"values": [[1755717030.781,"9"],[1755720630.781,"9"]]},{"metric": {"__name__": "ipmi_temperature_celsius","id": "3","instance": "ip:9290","job": "ipmi_monitor_1.38","name": "CPU1 Core Rem"},"values": [[1755717030.781,"42"],[1755720630.781,"43"]]},{"metric": {"__name__": "ipmi_temperature_celsius","id": "4","instance": "ip:9290","job": "ipmi_monitor_1.38","name": "CPU1 MEM Temp"},"values": [[1755717030.781,"37"],[1755720630.781,"38"]]},{"metric": {"__name__": "ipmi_temperature_celsius","id": "9","instance": "ip:9290","job": "ipmi_monitor_1.38","name": "1711 Core Temp"},"values": [[1755717030.781,"35"],[1755720630.781,"36"]]}]} }
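The `name=~"CPU.*|.*Core.*"` matcher is a fully anchored regular expression, so it has to match the whole sensor name. Judging by their names, it also picks up sensors such as "CPU1 VDDQ Temp" and "CPU1 MEM Temp" that do not look like the CPU core reading, which is worth knowing before drawing them all on one "CPU temperature" chart. Below is a small sketch that reproduces the matching locally with Python's `re.fullmatch`. The first list holds the sensor names from the response above; the second list holds hypothetical names added only to show a non-match.

```python
import re

# The name matcher from the expression above.
PATTERN = re.compile(r"CPU.*|.*Core.*")


def matches(name):
    """True if the whole sensor name matches, like PromQL's anchored =~ matcher."""
    return PATTERN.fullmatch(name) is not None


# Sensor names returned in the response above.
returned = ["CPU1 VDDQ Temp", "CPU1 VRD Temp", "CPU1 Core Rem", "CPU1 MEM Temp", "1711 Core Temp"]
# Hypothetical sensor names, not taken from the response.
hypothetical = ["Inlet Temp", "PSU1 Temp"]

for name in returned + hypothetical:
    print(f"{name!r}: {matches(name)}")
```

If only the core temperature is wanted, narrowing the pattern to the exact sensor names reported by your own hardware is probably the safer choice, since IPMI sensor names vary between vendors.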