
Part 9: Monitoring and Operations - Integrating Actuator Health Checks

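Preface

Monitoring and operations are indispensable in production. This chapter integrates Spring Boot Actuator to add health checks, metrics, and runtime management to the logging framework, giving it enterprise-grade observability.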

Actuator Integration Architecture

[Diagram: Actuator endpoints, LogEndpoint, LogHealthIndicator; log statistics, health status checks, monitoring metrics exposure]

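Integration points:

  • 🏥 Health checks: monitor the framework's running state
  • 📊 Metrics collection: count log calls and gather performance data
  • ⚙️ Runtime management: adjust logging configuration dynamically
  • 🔍 Fault diagnosis: provide debugging and troubleshooting information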

LogHealthIndicator - Health Indicator

package com.simpleflow.log.springboot.actuator;

import com.simpleflow.log.context.ThreadLocalTraceHolder;
import com.simpleflow.log.processor.AnnotationConfigResolver;
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadMXBean;

/**
 * Health indicator for the logging framework.
 *
 * Monitors the framework's running state, performance figures, and resource usage.
 */
@Component
public class LogHealthIndicator implements HealthIndicator {

    private final AnnotationConfigResolver configResolver;
    private volatile long lastCheckTime = System.currentTimeMillis();
    private volatile boolean lastCheckResult = true;

    public LogHealthIndicator(AnnotationConfigResolver configResolver) {
        this.configResolver = configResolver;
    }

    @Override
    public Health health() {
        try {
            Health.Builder builder = new Health.Builder();

            // Check core components, memory, threads, and the config cache
            boolean coreHealthy = checkCoreComponents();
            MemoryStatus memoryStatus = checkMemoryUsage();
            ThreadStatus threadStatus = checkThreadStatus();
            CacheStatus cacheStatus = checkCacheStatus();

            // Overall status is UP only if every check passes
            boolean isHealthy = coreHealthy
                    && memoryStatus.isHealthy()
                    && threadStatus.isHealthy()
                    && cacheStatus.isHealthy();

            if (isHealthy) {
                builder.up();
            } else {
                builder.down();
            }

            // Details are added as strings (via toString()) so that the JSON
            // shows the formatted summaries rather than serialized getter fields.
            builder.withDetail("core", coreHealthy ? "UP" : "DOWN")
                    .withDetail("memory", memoryStatus.toString())
                    .withDetail("threads", threadStatus.toString())
                    .withDetail("cache", cacheStatus.toString())
                    .withDetail("lastCheckTime", lastCheckTime) // time of the previous check
                    .withDetail("uptime", getUptimeInfo());

            lastCheckTime = System.currentTimeMillis();
            lastCheckResult = isHealthy;
            return builder.build();
        } catch (Exception e) {
            // Health.down(e) records the exception under "error"; unlike
            // withDetail("error", e.getMessage()), it tolerates a null message.
            return Health.down(e)
                    .withDetail("lastCheckTime", lastCheckTime)
                    .build();
        }
    }

    /**
     * Checks the core components.
     */
    private boolean checkCoreComponents() {
        try {
            // The config resolver must be present
            if (configResolver == null) {
                return false;
            }
            // Verify that the ThreadLocal trace context works.
            // Note: this initializes and then clears the trace on the calling thread.
            ThreadLocalTraceHolder.initTrace();
            boolean hasContext = ThreadLocalTraceHolder.getCurrentTrace() != null;
            ThreadLocalTraceHolder.clearCurrentTrace();
            return hasContext;
        } catch (Exception e) {
            return false;
        }
    }

    /**
     * Checks heap memory usage.
     */
    private MemoryStatus checkMemoryUsage() {
        // Take one snapshot so that used and max come from the same reading
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long usedMemory = heap.getUsed();
        long maxMemory = heap.getMax(); // -1 if the maximum is undefined
        double usagePercentage = maxMemory > 0 ? (double) usedMemory / maxMemory * 100 : 0;
        return new MemoryStatus(usedMemory, maxMemory, usagePercentage);
    }

    /**
     * Checks thread counts.
     */
    private ThreadStatus checkThreadStatus() {
        ThreadMXBean threadBean = ManagementFactory.getThreadMXBean();
        int threadCount = threadBean.getThreadCount();
        int daemonThreadCount = threadBean.getDaemonThreadCount();
        long totalStartedThreadCount = threadBean.getTotalStartedThreadCount();
        return new ThreadStatus(threadCount, daemonThreadCount, totalStartedThreadCount);
    }

    /**
     * Checks the configuration cache.
     */
    private CacheStatus checkCacheStatus() {
        try {
            String cacheStats = configResolver.getCacheStats();
            return new CacheStatus(true, cacheStats);
        } catch (Exception e) {
            return new CacheStatus(false, "Cache status check failed: " + e.getMessage());
        }
    }

    /**
     * Returns the JVM uptime as a readable string.
     */
    private String getUptimeInfo() {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long hours = uptime / (1000 * 60 * 60);
        long minutes = (uptime % (1000 * 60 * 60)) / (1000 * 60);
        long seconds = (uptime % (1000 * 60)) / 1000;
        return String.format("%dh %dm %ds", hours, minutes, seconds);
    }

    // ========== Internal status classes ==========

    public static class MemoryStatus {
        private final long usedMemory;
        private final long maxMemory;
        private final double usagePercentage;

        public MemoryStatus(long usedMemory, long maxMemory, double usagePercentage) {
            this.usedMemory = usedMemory;
            this.maxMemory = maxMemory;
            this.usagePercentage = usagePercentage;
        }

        public boolean isHealthy() {
            return usagePercentage < 90.0; // healthy while heap usage is below 90%
        }

        public long getUsedMemory() { return usedMemory; }
        public long getMaxMemory() { return maxMemory; }
        public double getUsagePercentage() { return usagePercentage; }

        @Override
        public String toString() {
            return String.format("used: %dMB, max: %dMB, usage: %.2f%%",
                    usedMemory / 1024 / 1024, maxMemory / 1024 / 1024, usagePercentage);
        }
    }

    public static class ThreadStatus {
        private final int threadCount;
        private final int daemonThreadCount;
        private final long totalStartedThreadCount;

        public ThreadStatus(int threadCount, int daemonThreadCount, long totalStartedThreadCount) {
            this.threadCount = threadCount;
            this.daemonThreadCount = daemonThreadCount;
            this.totalStartedThreadCount = totalStartedThreadCount;
        }

        public boolean isHealthy() {
            return threadCount < 1000; // healthy while fewer than 1000 live threads
        }

        public int getThreadCount() { return threadCount; }
        public int getDaemonThreadCount() { return daemonThreadCount; }
        public long getTotalStartedThreadCount() { return totalStartedThreadCount; }

        @Override
        public String toString() {
            return String.format("threads: %d, daemon: %d, total started: %d",
                    threadCount, daemonThreadCount, totalStartedThreadCount);
        }
    }

    public static class CacheStatus {
        private final boolean healthy;
        private final String stats;

        public CacheStatus(boolean healthy, String stats) {
            this.healthy = healthy;
            this.stats = stats;
        }

        public boolean isHealthy() { return healthy; }
        public String getStats() { return stats; }

        @Override
        public String toString() {
            return stats;
        }
    }
}
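The heap check in the indicator depends on one detail: `MemoryUsage.getMax()` returns -1 when the maximum is undefined, which is why the percentage is only computed when the maximum is positive. The calculation can be tried in isolation with a minimal JDK-only sketch. The class name `HeapUsageSketch` and the threshold parameter are illustrative assumptions, not part of the framework.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Hypothetical standalone helper; not part of the SimpleFlow framework. */
final class HeapUsageSketch {

    /** Returns heap usage as a percentage, or 0 when the maximum is undefined (-1). */
    static double usagePercentage(long usedBytes, long maxBytes) {
        return maxBytes > 0 ? (double) usedBytes / maxBytes * 100 : 0;
    }

    /** Mirrors the rule in the article: healthy while usage is below the threshold. */
    static boolean isHealthy(double usagePercentage, double thresholdPercent) {
        return usagePercentage < thresholdPercent;
    }

    public static void main(String[] args) {
        // One snapshot, so used and max belong to the same reading
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        double pct = usagePercentage(heap.getUsed(), heap.getMax());
        System.out.printf("heap usage: %.2f%%, healthy: %b%n", pct, isHealthy(pct, 90.0));
    }
}
```

Passing the threshold as a parameter instead of hard-coding 90.0 is one way to make the limit configurable per deployment. Whether that is worth doing depends on how the alerting thresholds are chosen.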

LogEndpoint - Custom Endpoint

package com.simpleflow.log.springboot.actuator;

import com.simpleflow.log.config.LogConfig;
import com.simpleflow.log.processor.AnnotationConfigResolver;
import com.simpleflow.log.springboot.properties.LogProperties;
import org.springframework.boot.actuate.endpoint.annotation.Endpoint;
import org.springframework.boot.actuate.endpoint.annotation.ReadOperation;
import org.springframework.boot.actuate.endpoint.annotation.Selector;
import org.springframework.boot.actuate.endpoint.annotation.WriteOperation;
import org.springframework.stereotype.Component;

import java.time.Duration;
import java.time.LocalDateTime;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/**
 * Custom endpoint for the logging framework.
 *
 * Exposes log statistics, configuration, and runtime management operations.
 *
 * An endpoint may declare only one operation per HTTP method and path, so the
 * sub-resources (config, stats, clearCache, resetStats) are dispatched through
 * @Selector parameters instead of several parameterless operations, which
 * Actuator rejects at startup as duplicate mappings.
 */
@Component
// Note: Spring Boot logs a warning for endpoint IDs that contain '-' or '.'.
@Endpoint(id = "simpleflow-log")
public class LogEndpoint {

    private final LogProperties logProperties;
    private final AnnotationConfigResolver configResolver;
    private final LogConfig defaultLogConfig;

    // Statistics
    private final AtomicLong totalLogCount = new AtomicLong(0);
    private final AtomicLong errorLogCount = new AtomicLong(0);
    private final AtomicLong cacheHitCount = new AtomicLong(0);
    private final AtomicLong cacheMissCount = new AtomicLong(0);
    private volatile LocalDateTime startTime = LocalDateTime.now();

    public LogEndpoint(LogProperties logProperties,
                       AnnotationConfigResolver configResolver,
                       LogConfig defaultLogConfig) {
        this.logProperties = logProperties;
        this.configResolver = configResolver;
        this.defaultLogConfig = defaultLogConfig;
    }

    /**
     * Framework status overview: GET /actuator/simpleflow-log
     */
    @ReadOperation
    public Map<String, Object> info() {
        Map<String, Object> info = new HashMap<>();
        // Basic information
        info.put("framework", "SimpleFlow Log Framework");
        info.put("version", "1.0.0");
        info.put("startTime", startTime);
        info.put("status", "RUNNING");
        // Configuration, statistics, performance, and cache sections
        info.put("configuration", getConfigurationInfo());
        info.put("statistics", getStatisticsInfo());
        info.put("performance", getPerformanceInfo());
        info.put("cache", getCacheInfo());
        return info;
    }

    /**
     * One section by name: GET /actuator/simpleflow-log/{section}
     * Supported sections: "config" and "stats". Returning null yields HTTP 404.
     * Selector names are taken from parameter names (compile with -parameters).
     */
    @ReadOperation
    public Map<String, Object> section(@Selector String section) {
        switch (section) {
            case "config":
                return config();
            case "stats":
                return stats();
            default:
                return null;
        }
    }

    /**
     * Management action by name: POST /actuator/simpleflow-log/{action}
     * Supported actions: "clearCache" and "resetStats".
     */
    @WriteOperation
    public Map<String, Object> execute(@Selector String action) {
        switch (action) {
            case "clearCache":
                return clearCache();
            case "resetStats":
                return resetStats();
            default:
                return result(false, "Unknown action: " + action);
        }
    }

    /**
     * Returns the effective configuration.
     */
    public Map<String, Object> config() {
        Map<String, Object> config = new HashMap<>();
        config.put("enabled", logProperties.isEnabled());
        config.put("defaultLevel", logProperties.getDefaultLevel());
        config.put("webEnabled", logProperties.isWebEnabled());
        config.put("logArgs", logProperties.isLogArgs());
        config.put("logResult", logProperties.isLogResult());
        config.put("logExecutionTime", logProperties.isLogExecutionTime());
        config.put("globalSensitiveFields", logProperties.getGlobalSensitiveFields());

        // Request log configuration
        Map<String, Object> requestLogConfig = new HashMap<>();
        LogProperties.RequestLog requestLog = logProperties.getRequestLog();
        requestLogConfig.put("enabled", requestLog.isEnabled());
        requestLogConfig.put("logHeaders", requestLog.isLogHeaders());
        requestLogConfig.put("logParameters", requestLog.isLogParameters());
        requestLogConfig.put("excludePatterns", requestLog.getExcludePatterns());
        config.put("requestLog", requestLogConfig);

        // Performance configuration
        Map<String, Object> performanceConfig = new HashMap<>();
        LogProperties.Performance performance = logProperties.getPerformance();
        performanceConfig.put("asyncEnabled", performance.isAsyncEnabled());
        performanceConfig.put("asyncQueueSize", performance.getAsyncQueueSize());
        performanceConfig.put("cacheSize", performance.getCacheSize());
        performanceConfig.put("maxLogLength", performance.getMaxLogLength());
        config.put("performance", performanceConfig);
        return config;
    }

    /**
     * Returns the statistics.
     */
    public Map<String, Object> stats() {
        return getStatisticsInfo();
    }

    /**
     * Clears the configuration cache.
     */
    public Map<String, Object> clearCache() {
        try {
            configResolver.clearAllCache();
            return result(true, "Cache cleared");
        } catch (Exception e) {
            return result(false, "Failed to clear cache: " + e.getMessage());
        }
    }

    /**
     * Resets the statistics.
     */
    public Map<String, Object> resetStats() {
        totalLogCount.set(0);
        errorLogCount.set(0);
        cacheHitCount.set(0);
        cacheMissCount.set(0);
        startTime = LocalDateTime.now();
        return result(true, "Statistics reset");
    }

    // ========== Private helpers ==========

    private Map<String, Object> result(boolean success, String message) {
        Map<String, Object> result = new HashMap<>();
        result.put("success", success);
        result.put("message", message);
        result.put("timestamp", LocalDateTime.now());
        return result;
    }

    private Map<String, Object> getConfigurationInfo() {
        Map<String, Object> config = new HashMap<>();
        config.put("enabled", logProperties.isEnabled());
        config.put("defaultLevel", logProperties.getDefaultLevel());
        config.put("webEnabled", logProperties.isWebEnabled());
        config.put("actuatorEnabled", logProperties.isActuatorEnabled());
        return config;
    }

    private Map<String, Object> getStatisticsInfo() {
        Map<String, Object> stats = new HashMap<>();
        stats.put("totalLogCount", totalLogCount.get());
        stats.put("errorLogCount", errorLogCount.get());
        stats.put("successRate", calculateSuccessRate());
        stats.put("startTime", startTime);
        stats.put("uptime", calculateUptime());
        return stats;
    }

    private Map<String, Object> getPerformanceInfo() {
        Map<String, Object> performance = new HashMap<>();
        // JVM memory figures
        Runtime runtime = Runtime.getRuntime();
        long usedMemory = runtime.totalMemory() - runtime.freeMemory();
        performance.put("totalMemory", runtime.totalMemory());
        performance.put("freeMemory", runtime.freeMemory());
        performance.put("maxMemory", runtime.maxMemory());
        performance.put("usedMemory", usedMemory);
        // Memory usage relative to the maximum heap
        double memoryUsage = (double) usedMemory / runtime.maxMemory() * 100;
        performance.put("memoryUsagePercentage", String.format("%.2f%%", memoryUsage));
        return performance;
    }

    private Map<String, Object> getCacheInfo() {
        Map<String, Object> cache = new HashMap<>();
        cache.put("stats", configResolver.getCacheStats());
        cache.put("hitCount", cacheHitCount.get());
        cache.put("missCount", cacheMissCount.get());
        cache.put("hitRate", calculateHitRate());
        return cache;
    }

    private String calculateSuccessRate() {
        long total = totalLogCount.get();
        if (total == 0) {
            return "0.00%";
        }
        double rate = (double) (total - errorLogCount.get()) / total * 100;
        return String.format("%.2f%%", rate);
    }

    private String calculateUptime() {
        long minutes = Duration.between(startTime, LocalDateTime.now()).toMinutes();
        return String.format("%dh %dm", minutes / 60, minutes % 60);
    }

    private String calculateHitRate() {
        long total = cacheHitCount.get() + cacheMissCount.get();
        if (total == 0) {
            return "0.00%";
        }
        double rate = (double) cacheHitCount.get() / total * 100;
        return String.format("%.2f%%", rate);
    }

    // ========== Counters (called from inside the framework) ==========

    public void incrementLogCount() { totalLogCount.incrementAndGet(); }
    public void incrementErrorCount() { errorLogCount.incrementAndGet(); }
    public void incrementCacheHit() { cacheHitCount.incrementAndGet(); }
    public void incrementCacheMiss() { cacheMissCount.incrementAndGet(); }
}
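The success rate and cache hit rate reported by the endpoint are plain ratios over atomic counters, with a guard for the empty case. A small JDK-only sketch of that arithmetic follows. `LogStatsSketch` and its method names are hypothetical stand-ins, and `Locale.ROOT` is an addition here so the decimal separator does not depend on the server's default locale.

```java
import java.util.Locale;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for the counters kept inside LogEndpoint. */
final class LogStatsSketch {
    private final AtomicLong totalLogCount = new AtomicLong();
    private final AtomicLong errorLogCount = new AtomicLong();

    /** Records one log call that completed normally. */
    void recordSuccess() {
        totalLogCount.incrementAndGet();
    }

    /** Records one log call that failed; it counts toward the total as well. */
    void recordFailure() {
        totalLogCount.incrementAndGet();
        errorLogCount.incrementAndGet();
    }

    /** Formats part/total as a percentage; "0.00%" when nothing has been recorded. */
    static String percent(long part, long total) {
        if (total == 0) {
            return "0.00%";
        }
        return String.format(Locale.ROOT, "%.2f%%", (double) part / total * 100);
    }

    /** Share of recorded calls that did not fail. */
    String successRate() {
        long total = totalLogCount.get();
        return percent(total - errorLogCount.get(), total);
    }

    public static void main(String[] args) {
        LogStatsSketch stats = new LogStatsSketch();
        for (int i = 0; i < 3; i++) stats.recordSuccess();
        stats.recordFailure();
        System.out.println("success rate: " + stats.successRate());
        System.out.println("hit rate: " + percent(980, 980 + 195));
    }
}
```

The two counters are read separately, so under concurrent updates the reported rate is a close approximation rather than an exact snapshot, which is usually acceptable for a monitoring view.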

LogMetrics - Metrics Collector

package com.simpleflow.log.springboot.actuator;

import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.Gauge;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import org.springframework.boot.autoconfigure.condition.ConditionalOnClass;
import org.springframework.stereotype.Component;

import java.util.concurrent.atomic.AtomicLong;

/**
 * Metrics collector for the logging framework.
 *
 * Integrates Micrometer to expose counters, timers, and gauges.
 */
@Component
@ConditionalOnClass(MeterRegistry.class)
public class LogMetrics {

    private final MeterRegistry meterRegistry;

    // Counters
    private final Counter logMethodCallCounter;
    private final Counter logErrorCounter;
    private final Counter cacheHitCounter;
    private final Counter cacheMissCounter;

    // Timers
    private final Timer logExecutionTimer;
    private final Timer configResolveTimer;

    // Gauge backing values
    private final AtomicLong activeLogContexts;
    private final AtomicLong cacheSize;

    public LogMetrics(MeterRegistry meterRegistry) {
        this.meterRegistry = meterRegistry;

        // Counters
        this.logMethodCallCounter = Counter.builder("simpleflow.log.method.calls")
                .description("Total number of log method calls")
                .register(meterRegistry);
        this.logErrorCounter = Counter.builder("simpleflow.log.errors")
                .description("Total number of log errors")
                .register(meterRegistry);
        this.cacheHitCounter = Counter.builder("simpleflow.log.cache.hits")
                .description("Number of cache hits")
                .register(meterRegistry);
        this.cacheMissCounter = Counter.builder("simpleflow.log.cache.misses")
                .description("Number of cache misses")
                .register(meterRegistry);

        // Timers
        this.logExecutionTimer = Timer.builder("simpleflow.log.execution.time")
                .description("Log execution time")
                .register(meterRegistry);
        this.configResolveTimer = Timer.builder("simpleflow.log.config.resolve.time")
                .description("Configuration resolve time")
                .register(meterRegistry);

        // Gauges: Gauge.builder takes the observed object and value function
        // up front; register() takes only the registry.
        this.activeLogContexts = new AtomicLong(0);
        this.cacheSize = new AtomicLong(0);
        Gauge.builder("simpleflow.log.contexts.active", activeLogContexts, AtomicLong::get)
                .description("Number of active log contexts")
                .register(meterRegistry);
        Gauge.builder("simpleflow.log.cache.size", cacheSize, AtomicLong::get)
                .description("Current cache size")
                .register(meterRegistry);
    }

    // ========== Recording methods ==========

    public void recordMethodCall() { logMethodCallCounter.increment(); }
    public void recordError() { logErrorCounter.increment(); }
    public void recordCacheHit() { cacheHitCounter.increment(); }
    public void recordCacheMiss() { cacheMissCounter.increment(); }

    public Timer.Sample startExecutionTimer() {
        return Timer.start(meterRegistry);
    }

    public void stopExecutionTimer(Timer.Sample sample) {
        sample.stop(logExecutionTimer);
    }

    public Timer.Sample startConfigResolveTimer() {
        return Timer.start(meterRegistry);
    }

    public void stopConfigResolveTimer(Timer.Sample sample) {
        sample.stop(configResolveTimer);
    }

    public void setActiveContexts(long count) { activeLogContexts.set(count); }
    public void setCacheSize(long size) { cacheSize.set(size); }
    public void incrementActiveContexts() { activeLogContexts.incrementAndGet(); }
    public void decrementActiveContexts() { activeLogContexts.decrementAndGet(); }
}
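The article does not show where these recording methods are called. One plausible call-site pattern is to count the call, time it, and update the active-context gauge in a `finally` block so the gauge cannot drift when the wrapped code throws. The sketch below illustrates only that control flow. `InstrumentedCallSketch` is hypothetical, and plain JDK types (`LongAdder`, `AtomicLong`, `System.nanoTime`) stand in for the Micrometer counters, timer, and gauge.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

/** Hypothetical call-site sketch; JDK types stand in for Micrometer meters. */
final class InstrumentedCallSketch {
    final LongAdder calls = new LongAdder();            // stands in for logMethodCallCounter
    final LongAdder errors = new LongAdder();           // stands in for logErrorCounter
    final LongAdder totalNanos = new LongAdder();       // stands in for logExecutionTimer
    final AtomicLong activeContexts = new AtomicLong(); // backs the active-contexts gauge

    /** Runs the body while recording call count, errors, elapsed time, and active contexts. */
    <T> T instrument(Supplier<T> body) {
        calls.increment();
        activeContexts.incrementAndGet();
        long start = System.nanoTime();
        try {
            return body.get();
        } catch (RuntimeException e) {
            errors.increment();
            throw e; // metrics must not swallow the caller's exception
        } finally {
            // Runs on both success and failure, keeping the gauge balanced
            totalNanos.add(System.nanoTime() - start);
            activeContexts.decrementAndGet();
        }
    }

    public static void main(String[] args) {
        InstrumentedCallSketch metrics = new InstrumentedCallSketch();
        String value = metrics.instrument(() -> "done");
        System.out.println(value + ", calls=" + metrics.calls.sum()
                + ", active=" + metrics.activeContexts.get());
    }
}
```

With the real `LogMetrics`, the same shape would pair `startExecutionTimer()` with `stopExecutionTimer(sample)` in the `finally` block, and `incrementActiveContexts()` with `decrementActiveContexts()`.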

Usage and Testing

1. Enable in configuration

# application.yml
management:
  endpoints:
    web:
      exposure:
        include: health,info,simpleflow-log,metrics
  endpoint:
    health:
      show-details: always
  health:
    log:
      enabled: true

simpleflow:
  log:
    actuator-enabled: true

2. Access the endpoints

# Health check
curl http://localhost:8080/actuator/health/log

# View logging framework info
curl http://localhost:8080/actuator/simpleflow-log

# View configuration
curl http://localhost:8080/actuator/simpleflow-log/config

# View statistics
curl http://localhost:8080/actuator/simpleflow-log/stats

# Clear the cache
curl -X POST http://localhost:8080/actuator/simpleflow-log/clearCache

# View a Micrometer metric through the Actuator metrics endpoint
curl http://localhost:8080/actuator/metrics/simpleflow.log.method.calls

3. Sample output

Health check output:

{
  "status": "UP",
  "details": {
    "core": "UP",
    "memory": "used: 256MB, max: 1024MB, usage: 25.00%",
    "threads": "threads: 45, daemon: 12, total started: 67",
    "cache": "MethodCache: 150, ClassCache: 25",
    "lastCheckTime": 1692766815123,
    "uptime": "2h 15m 30s"
  }
}

Framework info output:

{
  "framework": "SimpleFlow Log Framework",
  "version": "1.0.0",
  "startTime": "2024-08-23T08:30:15",
  "status": "RUNNING",
  "configuration": {
    "enabled": true,
    "defaultLevel": "INFO",
    "webEnabled": true,
    "actuatorEnabled": true
  },
  "statistics": {
    "totalLogCount": 1250,
    "errorLogCount": 3,
    "successRate": "99.76%",
    "startTime": "2024-08-23T08:30:15",
    "uptime": "2h 15m"
  },
  "performance": {
    "totalMemory": 268435456,
    "freeMemory": 201326592,
    "maxMemory": 268435456,
    "usedMemory": 67108864,
    "memoryUsagePercentage": "25.00%"
  },
  "cache": {
    "stats": "MethodCache: 150, ClassCache: 25",
    "hitCount": 980,
    "missCount": 195,
    "hitRate": "83.40%"
  }
}

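Chapter Summary

✅ Completed tasks

  1. Health checks: implemented LogHealthIndicator to monitor the framework's state
  2. Custom endpoint: created LogEndpoint to provide management functions
  3. Metrics collection: integrated Micrometer to collect performance metrics
  4. Runtime management: support for clearing the cache, resetting statistics, and similar operations
  5. Monitoring integration: a complete Actuator integration

🎯 Key takeaways

  • The right way to extend Actuator
  • Design principles for health indicators
  • Techniques for implementing custom endpoints
  • Integrating metrics collection with monitoring systems
  • Security considerations for runtime management features

💡 Questions to consider

  1. How would you design finer-grained health checks?
  2. How should alerting thresholds for the metrics be determined?
  3. How can the management endpoints be secured?

🚀 Next chapter

In the final chapter we build a complete sample application that brings all the modules together, test it thoroughly, and show how the framework behaves in a real project.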


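💡 Design principle: a good monitoring system finds problems proactively, provides detailed information, and supports a fast response. With the Actuator integration, the framework gains production-grade observability.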
