
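Installing and Configuring Prometheus + Grafana Monitoring

Preface
This is a long post and was typed by hand. If you spot an error, please leave a comment.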

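I. Introduction to Prometheus

Official site: https://prometheus.io/docs/introduction/overview/

1. What is Prometheus?

Prometheus is an open-source monitoring and alerting toolkit originally built at SoundCloud. Since its inception in 2012, many companies and organizations have adopted it, and the project has a very active developer and user community. It is now a standalone open-source project, maintained independently of any company. To emphasize this and to clarify the project's governance structure, Prometheus joined the Cloud Native Computing Foundation in 2016 as its second hosted project, after Kubernetes.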

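2. Main features of Prometheus

  • A multi-dimensional data model
  • A flexible query language (PromQL)
  • No reliance on distributed storage; single server nodes are autonomous
  • Time series are collected over HTTP using a pull model
  • Pushing is also supported, through an intermediary gateway
  • Targets are discovered through service discovery or static configuration
  • Multiple modes of graphing and dashboarding are supported, and Grafana supports Prometheus as well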

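3. Components

The Prometheus ecosystem consists of many components, some of which are optional:

  • The main Prometheus server, which scrapes and stores time series data
  • Client libraries for instrumenting application code
  • A push gateway for supporting short-lived jobs
  • A GUI dashboard builder based on Rails/SQL
  • Special-purpose exporters that expose metrics from services such as HAProxy, StatsD, and Graphite in a format Prometheus can scrape
  • An alert manager
  • A command-line query tool
  • Various other support tools

Most Prometheus components are written in Go, which makes them easy to build and deploy.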

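4. Architecture

[Figure: Prometheus architecture diagram]
The Prometheus server scrapes metrics from its targets either directly or, for short-lived jobs, indirectly through an intermediary push gateway. It stores all scraped samples locally and runs rules over this data, either to aggregate and record new time series from existing data or to generate alerts. PromQL and the HTTP API are then used to query and visualize the collected data.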

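II. Installation and configuration

Because of environment restrictions, Prometheus can only be reached through an nginx proxy, so an extra flag is added to the Prometheus start command and an nginx reverse proxy has to be configured. Details follow:

1. Install Prometheus

[root@master supp_app]# wget https://github.com/prometheus/prometheus/releases/download/v3.4.1/prometheus-3.4.1.linux-amd64.tar.gz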
[root@master supp_app]# tar zxf prometheus-3.4.1.linux-amd64.tar.gz
[root@master supp_app]# cd prometheus-3.4.1.linux-amd64/

2. File overview

[root@master prometheus-3.4.1.linux-amd64]# ls -lrt
total 294972
-rwxr-xr-x  1 1001 docker 154802412 May 31 18:46 prometheus ## Prometheus server binary
-rwxr-xr-x  1 1001 docker 146211128 May 31 18:46 promtool   ## Prometheus command-line utility
-rw-r--r--  1 1001 docker      3773 May 31 18:58 NOTICE     ## Notices
-rw-r--r--  1 1001 docker     11357 May 31 18:58 LICENSE    ## License
-rw-r--r--  1 1001 docker      1877 Aug 22 17:47 prometheus.yml  ## Prometheus configuration file
drwxr-xr-x 28 root root        4096 Aug 27 15:00 data       ## Data directory of Prometheus's built-in TSDB

3. Configuration walkthrough

[root@master prometheus-3.4.1.linux-amd64]# cat prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ["localhost:9090"]
        # The label name is added as a label `label_name=<label_value>` to any timeseries scraped from this config.
        labels:
          app: "prometheus"

For more detailed configuration options, see: https://prometheus.io/docs/prometheus/latest/configuration/configuration/

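4. A custom start script

Prometheus does not ship with a start script, and typing a long command for every start and stop is tedious, so here is a simple script to make later operations easier.

[root@master prometheus-3.4.1.linux-amd64]# vim start.sh
#!/bin/bash
# Author: 
# Prometheus start script

# Directory this script lives in; all files are resolved relative to it.
PRO_HOME=$(cd "$(dirname "$0")";pwd)

usage(){
    echo "Usage: /bin/bash start.sh OPTION"
    echo "OPTIONS:start|stop|restart"
}

# Print a message prefixed with a millisecond timestamp.
# %3N (GNU date) avoids the octal-parsing error that $((`date +%N`/1000000))
# hits when the nanosecond value starts with 0.
log_time() {
    local STRING="${1:-"No message provided"}"
    local TIME
    TIME="$(date "+%Y-%m-%d %H:%M:%S.%3N")"
    printf "%s--%s\n" "$TIME" "$STRING"
}

start(){
    # The PID file may not exist on the first run, so silence cat's error.
    PID=$(cat "$PRO_HOME/pro.pid" 2>/dev/null)
    if [[ -n "$PID" ]];then
        log_time "Prometheus service is already running!"
        exit 1
    fi
    ## --web.external-url=prometheus is added because the machine can only expose port 80
    ## and nginx is used as a reverse proxy; see the flag description below.
    nohup "$PRO_HOME/prometheus" --web.external-url=prometheus --config.file="$PRO_HOME/prometheus.yml" >> "$PRO_HOME/pro.log" 2>&1 &
    # Record the PID of the background job directly instead of grepping ps output.
    echo $! > "$PRO_HOME/pro.pid"
    sleep 3
    log_time "Prometheus service started!"
    PID1=$(cat "$PRO_HOME/pro.pid")
    log_time "Prometheus process pid: $PID1"
}

stop(){
    PID=$(cat "$PRO_HOME/pro.pid" 2>/dev/null)
    if [[ -z "$PID" ]];then
        log_time "Prometheus service is not running!"
    else
        kill "$PID"
        # Empty the PID file so the next start is not blocked.
        echo -n > "$PRO_HOME/pro.pid"
        log_time "Prometheus service stoped!"
    fi
}

case $1 in
start)
    start
    ;;
stop)
    stop
    ;;
restart)
    stop
    start
    ;;
*)
    usage
    ;;
esac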
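
The script treats any non-empty pro.pid as "running", so a stale PID left behind after a crash would block the next start. One possible hardening, sketched here with a hypothetical helper named is_running (it is not part of the original script), is to ask the kernel whether the recorded process still exists by sending signal 0:

```shell
#!/bin/bash
# Hypothetical helper (not in the original start.sh): succeeds only if the
# PID stored in the given file belongs to a process that is still alive.
is_running() {
    local pid_file="$1" pid
    [ -s "$pid_file" ] || return 1          # missing or empty PID file
    pid="$(cat "$pid_file")"
    kill -0 "$pid" 2>/dev/null              # signal 0 only checks for existence
}

# Demo with a throwaway PID file that holds this shell's own PID.
demo_pid_file="$(mktemp)"
echo "$$" > "$demo_pid_file"
if is_running "$demo_pid_file"; then
    echo "process $(cat "$demo_pid_file") is running"
fi
```

In start.sh, the `[[ -n "$PID" ]]` test in start() could then be swapped for `is_running "$PRO_HOME/pro.pid"`, assuming the script runs as the same user that started Prometheus.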

--web.external-url=<URL>: The URL under which Prometheus is externally reachable (for example, if Prometheus is served via a reverse proxy). Used for generating relative and absolute links back to Prometheus itself. If the URL has a path portion, it will be used to prefix all HTTP endpoints served by Prometheus. If omitted, relevant URL components will be derived automatically.
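
To make the "prefix all HTTP endpoints" part concrete: with --web.external-url=prometheus, an endpoint such as /-/healthy is served at /prometheus/-/healthy. The sketch below only builds the resulting URLs; prom_url is a hypothetical helper, and localhost:9090 is assumed to be the listen address used in this article.

```shell
#!/bin/bash
# Hypothetical helper: build the local URL of a Prometheus endpoint, given the
# path portion of --web.external-url (empty when no prefix is configured).
prom_url() {
    local prefix="${1#/}" endpoint="${2#/}"   # drop leading slashes
    prefix="${prefix%/}"                      # drop a trailing slash
    if [ -n "$prefix" ]; then
        printf 'http://localhost:9090/%s/%s\n' "$prefix" "$endpoint"
    else
        printf 'http://localhost:9090/%s\n' "$endpoint"
    fi
}

prom_url prometheus /-/healthy   # prints http://localhost:9090/prometheus/-/healthy
prom_url "" /-/healthy           # prints http://localhost:9090/-/healthy
```

Once Prometheus is running, `curl -s "$(prom_url prometheus /-/healthy)"` is a quick way to confirm the prefix is in effect before nginx is involved.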

5. Start the service

[root@master prometheus-3.4.1.linux-amd64]# sh start.sh start
2025-08-27 16:30:42.942--Prometheus service started!
2025-08-27 16:30:42.946--Prometheus process pid: 3585681
[root@master prometheus-3.4.1.linux-amd64]# ps -ef | grep 3585681
root     3585681       1  1 16:30 pts/1    00:00:00 /opt/supp_app/prometheus-3.4.1.linux-amd64/prometheus --web.external-url=prometheus --config.file=/opt/supp_app/prometheus-3.4.1.linux-amd64/prometheus.yml
root     3585850 3444503  0 16:31 pts/1    00:00:00 grep --color=auto 3585681

6. Configure nginx

[root@master prometheus-3.4.1.linux-amd64]# egrep -v '#|^$' /opt/web_app/nginx-1.16.1/conf/nginx.conf
worker_processes  2;
events {
    worker_connections  60000;
}
http {
    include       mime.types;
    default_type  application/octet-stream;
    sendfile        on;
    keepalive_timeout  65;
    server {
        listen       80;
        server_name  127.0.0.1;
        ## Add the following location block
        location /prometheus/ {
            proxy_pass http://localhost:9090/prometheus/;
        }
        location /jenkins {
            proxy_pass  http://localhost:8080;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
        }
        location /status {
            stub_status;
            allow 127.0.0.1;
            deny all;
        }
        error_page   500 502 503 504  /50x.html;
        location = /50x.html {
            root   html;
        }
    }
}

Reload nginx

[root@master prometheus-3.4.1.linux-amd64]# /opt/web_app/nginx-1.16.1/sbin/nginx -t
nginx: the configuration file /opt/web_app/nginx-1.16.1/conf/nginx.conf syntax is ok
nginx: configuration file /opt/web_app/nginx-1.16.1/conf/nginx.conf test is successful
[root@master prometheus-3.4.1.linux-amd64]# /opt/web_app/nginx-1.16.1/sbin/nginx -s reload

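7. Open Prometheus in a browser

Browse to http://IP/prometheus/ (the nginx location configured above); the Prometheus web UI should load.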

III. Install the exporters

This setup needs redis_exporter, node_exporter, and nginx_exporter, all of which are listed on the Prometheus website.
Exporter list: https://prometheus.io/docs/instrumenting/exporters/

1. Download the tarballs

[root@master supp_app]# wget https://github.com/prometheus/node_exporter/releases/download/v1.9.1/node_exporter-1.9.1.linux-amd64.tar.gz
[root@master supp_app]# wget https://github.com/oliver006/redis_exporter/releases/download/v1.69.0/redis_exporter-v1.69.0.linux-amd64.tar.gz
[root@master supp_app]# wget https://github.com/nginx/nginx-prometheus-exporter/releases/download/v1.3.0/nginx-prometheus-exporter_1.3.0_linux_amd64.tar.gz
[root@master supp_app]# tar zxf nginx-prometheus-exporter_1.3.0_linux_amd64.tar.gz 
[root@master supp_app]# tar zxf node_exporter-1.9.1.linux-amd64.tar.gz 
[root@master supp_app]# tar zxf redis_exporter-v1.69.0.linux-amd64.tar.gz 

2. Start the exporters

Start nginx_exporter

[root@master nginx_exporter]# nohup ./nginx-prometheus-exporter -nginx.scrape-uri http://127.0.0.1/status &
[root@master nginx_exporter]# ps -ef | grep nginx-pro
root      481411       1  0 Jul08 ?        00:07:07 /opt/supp_app/nginx_exporter/nginx-prometheus-exporter -nginx.scrape-uri http://127.0.0.1/status
root     3605013 3444503  0 17:23 pts/1    00:00:00 grep --color=auto nginx-pro

Start redis_exporter

[root@master redis_exporter-v1.69.0.linux-amd64]# nohup ./redis_exporter --web.listen-address=localhost:9121 --exclude-latency-histogram-metrics &
[root@master redis_exporter-v1.69.0.linux-amd64]# ps -ef | grep redis_expor
root     3081020       1  0 Aug22 ?        00:13:33 ./redis_exporter --web.listen-address=localhost:9121 --exclude-latency-histogram-metrics
root     3614478 3444503  0 17:49 pts/1    00:00:00 grep --color=auto redis_expor

Start node_exporter

[root@master node_exporter-1.9.1.linux-amd64]# nohup /opt/supp_app/node_exporter-1.9.1.linux-amd64/node_exporter >> node_exporter.log 2>&1 &
[root@master node_exporter-1.9.1.linux-amd64]# ps -ef | grep node_exporter
root     3229621       1  0 Jun19 ?        01:15:01 /opt/supp_app/node_exporter-1.9.1.linux-amd64/node_exporter
root     3616041 3444503  0 17:53 pts/1    00:00:00 grep --color=auto node_exporter

IV. Configure Prometheus to scrape the exporters

1. Edit the configuration

Add the exporter jobs to prometheus.yml so that it looks like this (note that the "prometheus" job now sets metrics_path to match the /prometheus prefix):

[root@master prometheus-3.4.1.linux-amd64]# cat prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    metrics_path: '/prometheus/metrics'
    static_configs:
      - targets: ['localhost:9090']
        # The label name is added as a label `label_name=<label_value>` to any timeseries scraped from this config.
        labels:
          app: "prometheus"

  - job_name: "node"
    static_configs:
      - targets: ['localhost:9100']

  - job_name: "nginx_exporter"
    static_configs:
      - targets: ["localhost:9113"]

  - job_name: 'redis_exporter_targets'
    file_sd_configs:
      - files:
        - targets-redis-instances.json
    metrics_path: /scrape
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: "localhost:9121"

  - job_name: redis_exporter
    static_configs:
      - targets: ['localhost:9121']
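
The redis_exporter_targets job reads its target list from targets-redis-instances.json, a file the article does not show. A minimal sketch of creating it, using Prometheus's file-based service discovery JSON format and assuming a single Redis instance at redis://localhost:6379 (an assumed address; substitute your own):

```shell
#!/bin/bash
# Write a file_sd target list next to prometheus.yml.
# redis://localhost:6379 is an assumed address, not taken from the article.
cat > targets-redis-instances.json <<'EOF'
[
  {
    "targets": ["redis://localhost:6379"],
    "labels": {}
  }
]
EOF
```

With the relabel rules in the job, each entry under "targets" is passed to the exporter as the `target` URL parameter and kept as the `instance` label, while the actual scrape goes to localhost:9121/scrape.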

2. Restart Prometheus

Restart it directly with the start script written earlier.

[root@master prometheus-3.4.1.linux-amd64]# sh start.sh restart
2025-08-27 17:57:45.830--Prometheus service stoped!
2025-08-27 17:57:48.860--Prometheus service started!
2025-08-27 17:57:48.865--Prometheus process pid: 3617592

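3. Check exporter status in the browser

On the target status page of the Prometheus web UI, every target should show state UP; if they all do, scraping is working.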
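
The same check can be scripted against the Prometheus HTTP API, which returns target health at /api/v1/targets (here under the /prometheus prefix). The following is a rough sketch: count_up is a hypothetical helper, the sample JSON is a trimmed, made-up illustration of the response shape, and the pattern assumes the server emits compact JSON without spaces after colons.

```shell
#!/bin/bash
# Hypothetical helper: count targets reported as healthy in a
# /api/v1/targets response read from stdin (assumes compact JSON output).
count_up() {
    grep -o '"health":"up"' | wc -l | tr -d ' '
}

# Trimmed, made-up sample of the response shape, for illustration only.
sample='{"status":"success","data":{"activeTargets":[{"labels":{"job":"node"},"health":"up"},{"labels":{"job":"nginx_exporter"},"health":"down"}]}}'
printf '%s\n' "$sample" | count_up
```

Against the live server this would be `curl -s http://localhost:9090/prometheus/api/v1/targets | count_up`; a JSON-aware tool is the safer choice if one is installed.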

4. Inspect the metrics

Each exporter's endpoint can be queried on the server with curl (the example below fetches node_exporter's metrics on port 9100):

[root@master web_app]# curl http://localhost:9100/metrics > nginx-metric
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 67496    0 67496    0     0  5992k      0 --:--:-- --:--:-- --:--:-- 5992k
[root@master web_app]# head -20 nginx-metric
# HELP go_gc_duration_seconds A summary of the wall-time pause (stop-the-world) duration in garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 3.0135e-05
go_gc_duration_seconds{quantile="0.25"} 3.8797e-05
go_gc_duration_seconds{quantile="0.5"} 4.7997e-05
go_gc_duration_seconds{quantile="0.75"} 5.0852e-05
go_gc_duration_seconds{quantile="1"} 8.6899e-05
go_gc_duration_seconds_sum 12.599998827
go_gc_duration_seconds_count 284197
# HELP go_gc_gogc_percent Heap size target percentage configured by the user, otherwise 100. This value is set by the GOGC environment variable, and the runtime/debug.SetGCPercent function. Sourced from /gc/gogc:percent
# TYPE go_gc_gogc_percent gauge
go_gc_gogc_percent 100
# HELP go_gc_gomemlimit_bytes Go runtime memory limit configured by the user, otherwise math.MaxInt64. This value is set by the GOMEMLIMIT environment variable, and the runtime/debug.SetMemoryLimit function. Sourced from /gc/gomemlimit:bytes
# TYPE go_gc_gomemlimit_bytes gauge
go_gc_gomemlimit_bytes 9.223372036854776e+18
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 8
# HELP go_info Information about the Go environment.
# TYPE go_info gauge
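
The exposition format shown above is line-oriented: `# HELP` and `# TYPE` comments, then `name{labels} value` samples. As a small illustration, the sketch below pulls the value of a label-free metric out of such text with awk; metric_value is a hypothetical helper, and the sample lines are copied from the output above.

```shell
#!/bin/bash
# Hypothetical helper: print the value of a metric that has no labels.
# $1 is the exact metric name; the exposition text is read from stdin.
metric_value() {
    awk -v name="$1" '$1 == name { print $2 }'
}

# Three lines taken from the captured output above.
sample='# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 8'

printf '%s\n' "$sample" | metric_value go_goroutines   # prints 8
```

On the server, `metric_value go_goroutines < nginx-metric` would read the file saved by the curl command.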

You can also pick a metric in the Prometheus web UI and view its data there.

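V. Introduction to Grafana

Official site: https://grafana.com/docs/grafana/latest/introduction/grafana-enterprise/

1. What is Grafana?

Grafana is an open-source data visualization and monitoring analytics platform. It connects to many data sources (such as Prometheus and InfluxDB) and displays and analyzes time series data in real time through interactive dashboards. It is widely used in IT operations, cloud service monitoring, and similar fields.

2. Core features

Multiple data sources:

  • Supports 30+ data sources, including Prometheus, InfluxDB, Elasticsearch, and MySQL, so data can be analyzed in one place without migrating it.
  • Plugins extend compatibility, for example to pull in monitoring data from cloud services such as AWS and Tencent Cloud.

Dynamic dashboards:

  • Provides a drag-and-drop editor for customizing 10+ chart types such as line charts and heatmaps, with interactive features such as time-range filtering and threshold alerts.
  • Dashboards can be shared for team collaboration, with fine-grained access control.

Alerting and automation:

  • Alerts can be triggered on configurable conditions, sent through channels such as email and Slack, and integrated with operations tools such as PagerDuty.

3. Main uses

  • System monitoring: combined with Prometheus, monitor servers and container clusters and display metrics such as CPU and memory in real time.
  • Business analytics: connect to MySQL or a logging service to visualize business metrics such as user growth and transaction data.
  • Cloud-native integration: commonly used as the observability front end for cloud-native technologies such as Kubernetes and OpenTelemetry.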

VI. Install Grafana

1. Download

[root@master grafana]# wget https://dl.grafana.com/enterprise/release/grafana-enterprise-12.0.2.linux-amd64.tar.gz
[root@master grafana]# tar zxf grafana-enterprise-12.0.2.linux-amd64.tar.gz

2. Edit the configuration file

Because of the environment restrictions, nginx has to act as a reverse proxy, so grafana.ini needs to be edited at around line 58 (root_url and serve_from_sub_path):

    49
    50 # The public facing domain name used to access grafana from a browser
    51 domain = localhost
    52
    53 # Redirect to correct domain if host header does not match domain
    54 # Prevents DNS rebinding attacks
    55 enforce_domain = false
    56
    57 # The full public facing url
    58 root_url = %(protocol)s://%(domain)s:%(http_port)s/grafana/
    59
    60 # Serve Grafana from subpath specified in `root_url` setting. By default it is set to `false` for compatibility reasons.
    61 serve_from_sub_path = true
    62
    63 # Log web requests
    64 router_logging = false
    65
    66 # the path relative working path

3. Create the service unit file

Here Grafana is registered as a systemd service.

[root@master grafana]# cp -r grafana-v12.0.2/ /usr/local/grafana/
[root@master grafana]# vim /etc/systemd/system/grafana-server.service
[Unit]
Description=Grafana Server
After=network.target

[Service]
Type=simple
User=grafana
Group=users
ExecStart=/usr/local/grafana/bin/grafana server --config=/usr/local/grafana/conf/grafana.ini --homepath=/usr/local/grafana
Restart=on-failure

[Install]
WantedBy=multi-user.target
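
The unit runs as User=grafana, but the article does not show that account being created, and the status output later reports the unit as "disabled", meaning it will not start at boot. The sketch below lists the steps that would typically fill those gaps. All four commands are assumptions rather than steps from the original post, so the script only prints them unless DRY_RUN=0 is set.

```shell
#!/bin/bash
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 and run as root to actually apply them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Assumed steps, not shown in the article: create the service account named in
# User=, give it the install directory, then load and enable the new unit.
run useradd --system --no-create-home --shell /sbin/nologin --gid users grafana
run chown -R grafana:users /usr/local/grafana
run systemctl daemon-reload
run systemctl enable grafana-server
```

Review the printed commands first; the account name, group, and paths must match whatever the unit file actually uses.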

4. Start Grafana

[root@master grafana]# systemctl start grafana-server
[root@master grafana]# systemctl status grafana-server
● grafana-server.service - Grafana Server
   Loaded: loaded (/etc/systemd/system/grafana-server.service; disabled; vendor preset: disabled)
   Active: active (running) since Mon 2025-07-07 17:37:12 CST; 1 months 20 days ago
 Main PID: 291119 (grafana)
    Tasks: 23 (limit: 11715)
   Memory: 173.1M
   CGroup: /system.slice/grafana-server.service
           └─291119 /usr/local/grafana/bin/grafana server --config=/usr/local/grafana/conf/grafana.ini --homepath=/usr/local/grafana

Aug 27 18:29:24 master grafana[291119]: logger=context userId=1 orgId=1 uname=admin t=2025-08-27T18:29:24.9597876+08:00 level=info msg="Request >
Aug 27 18:29:39 master grafana[291119]: logger=context userId=1 orgId=1 uname=admin t=2025-08-27T18:29:39.987415511+08:00 level=info msg="Reques>
Aug 27 18:29:42 master grafana[291119]: logger=context userId=1 orgId=1 uname=admin t=2025-08-27T18:29:42.96207908+08:00 level=info msg="Request>
Aug 27 18:29:52 master grafana[291119]: logger=context userId=1 orgId=1 uname=admin t=2025-08-27T18:29:52.907149699+08:00 level=info msg="Reques>
Aug 27 18:30:10 master grafana[291119]: logger=context userId=1 orgId=1 uname=admin t=2025-08-27T18:30:10.90623326+08:00 level=info msg="Request>
Aug 27 18:30:13 master grafana[291119]: logger=context userId=1 orgId=1 uname=admin t=2025-08-27T18:30:13.919338485+08:00 level=info msg="Reques>
Aug 27 18:30:19 master grafana[291119]: logger=context userId=1 orgId=1 uname=admin t=2025-08-27T18:30:19.91992532+08:00 level=info msg="Request>
Aug 27 18:30:25 master grafana[291119]: logger=context userId=1 orgId=1 uname=admin t=2025-08-27T18:30:25.917424907+08:00 level=info msg="Reques>

5. Update the nginx configuration

As mentioned earlier, Grafana needs a reverse proxy. The nginx instance used here is the same one that fronts Prometheus.

[root@master conf]# !egrep
egrep -v '#|^$' /opt/web_app/nginx-1.16.1/conf/nginx.conf
worker_processes  2;
events {
    worker_connections  60000;
}
http {
    include       mime.types;
    default_type  application/octet-stream;
    sendfile        on;
    keepalive_timeout  65;
    server {
        listen       80;
        server_name  127.0.0.1;
        ## Add the following location block
        location /grafana/ {
            proxy_pass http://localhost:3000/grafana/;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
        }
        location /prometheus/ {
            proxy_pass http://localhost:9090/prometheus/;
        }
        location /jenkins {
            proxy_pass  http://localhost:8080;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
        }
        location /status {
            stub_status;
            allow 127.0.0.1;
            deny all;
        }
        error_page   500 502 503 504  /50x.html;
        location = /50x.html {
            root   html;
        }
    }
}

Reload nginx

[root@master conf]# /opt/web_app/nginx-1.16.1/sbin/nginx -t
nginx: the configuration file /opt/web_app/nginx-1.16.1/conf/nginx.conf syntax is ok
nginx: configuration file /opt/web_app/nginx-1.16.1/conf/nginx.conf test is successful
[root@master conf]# /opt/web_app/nginx-1.16.1/sbin/nginx -s reload

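VII. Configure Grafana dashboards

1. Log in to Grafana

In a browser, go to: http://IP/grafana
Username: admin
Password: admin
Both can be changed after logging in.

2. Add a data source

Add a Prometheus data source. In this setup Prometheus is served under the /prometheus prefix, so the server URL would be http://localhost:9090/prometheus.

3. Add dashboards

The Grafana website provides dashboard templates that can be used as a starting point.
Dashboards: https://grafana.com/grafana/dashboards/?plcmt=oss-nav
The template IDs used in this test are:
node dashboard ID: 1860
redis dashboard ID: 18345
nginx dashboard ID: 14900
Import each dashboard by its ID. If the network does not allow that, download the dashboard from the website and import the local file instead.

4. View the result

The steps for adding the redis and nginx dashboards are the same: just enter the corresponding ID. The resulting dashboards for each monitored item:
[Screenshot: node monitoring dashboard]
[Screenshot: redis monitoring dashboard]
[Screenshot: nginx monitoring dashboard]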
