当前位置: 首页 > backend >正文

【大数据技术-联邦集群RBF】DFSRouter日志一直打印修改Membership为EXPIRED状态的日志分析

生产环境遇到下面报错

2025-04-23 17:44:15,780 INFO  store.CachedRecordStore (CachedRecordStore.java:overrideExpiredRecords(192)) - Override State Store record MembershipState: router1:8888->hh-fed-sub25:nn2:nn2:8020-EXPIRED
2025-04-23 17:44:15,781 INFO  store.CachedRecordStore (CachedRecordStore.java:overrideExpiredRecords(192)) - Override State Store record MembershipState: router1:8888->hh-fed-sub25:nn1:nn1:8020-EXPIRED
2025-04-23 17:44:15,781 INFO  store.CachedRecordStore (CachedRecordStore.java:overrideExpiredRecords(192)) - Override State Store record MembershipState: router2:8888->hh-fed-sub25:nn1:nn1:8020-EXPIRED
2025-04-23 17:44:15,781 INFO  store.CachedRecordStore (CachedRecordStore.java:overrideExpiredRecords(192)) - Override State Store record MembershipState: router2:8888->hh-fed-sub25:nn2:nn2:8020-EXPIRED

报错原因是,之前子集群配置了3个router,2个nn,然后会向StateStore中存储6个MembershipState。

后来,将子集群的router停了两个,只运行一个router,这样的后果就是会在运行的router日志发现上面报错。

因为router会周期性下载MembershipState,每次都会去检查是否过期,而我们停了2个Router,这俩Router之前和NameNode形成Membership并上报到了StateStore,并且我们关闭了删除过期记录的参数dfs.federation.router.store.membership.expiration.deletion,所以,会在运行的Router中打印上面报错。

修复做法,选择下面之一都可以:

  1. 开启删除过期参数
    1. dfs.federation.router.store.membership.expiration默认未5min,若设置dfs.federation.router.store.membership.expiration.deletion=2min,则表示membership过期了(超过5min没汇报),在等2min就删除它。
  2. 启动已停止的router

参考源码

org.apache.hadoop.hdfs.server.federation.store.CachedRecordStore#overrideExpiredRecords

  public void overrideExpiredRecords(QueryResult<R> query) throws IOException {List<R> commitRecords = new ArrayList<>();List<R> deleteRecords = new ArrayList<>();List<R> newRecords = query.getRecords();long currentDriverTime = query.getTimestamp();if (newRecords == null || currentDriverTime <= 0) {LOG.error("Cannot check overrides for record");return;}for (R record : newRecords) {if (record.shouldBeDeleted(currentDriverTime)) {String recordName = StateStoreUtils.getRecordName(record.getClass());if (getDriver().remove(record)) {deleteRecords.add(record);LOG.info("Deleted State Store record {}: {}", recordName, record);} else {LOG.warn("Couldn't delete State Store record {}: {}", recordName,record);}} else if (record.checkExpired(currentDriverTime)) {String recordName = StateStoreUtils.getRecordName(record.getClass());LOG.info("Override State Store record {}: {}", recordName, record);commitRecords.add(record);}}if (commitRecords.size() > 0) {getDriver().putAll(commitRecords, true, false);}if (deleteRecords.size() > 0) {newRecords.removeAll(deleteRecords);}}

org.apache.hadoop.hdfs.server.federation.store.records.BaseRecord#checkExpired

   @Overridepublic boolean checkExpired(long currentTime) {if (super.checkExpired(currentTime)) {this.setState(EXPIRED);// Commit itreturn true;}return false;}public boolean checkExpired(long currentTime) {long expiration = getExpirationMs();long modifiedTime = getDateModified();if (modifiedTime > 0 && expiration > 0) {return (modifiedTime + expiration) < currentTime;}return false;}

org.apache.hadoop.hdfs.server.federation.store.records.BaseRecord#shouldBeDeleted

public boolean shouldBeDeleted(long currentTime) {long deletionTime = getDeletionMs();if (isExpired() && deletionTime > 0) {long elapsedTime = currentTime - (getDateModified() + getExpirationMs());return elapsedTime > deletionTime;} else {return false;}
}
http://www.xdnf.cn/news/1544.html

相关文章:

  • UniGoal 具身导航 | 通用零样本目标导航 CVPR 2025
  • Sql文件处理SQLDumpSplitter
  • mojoPortal 接口imagehandler任意文件读取漏洞(CVE-2025-28367)
  • 【Java面试笔记:基础】9.对比Hashtable、HashMap、TreeMap有什么不同?
  • OpenCV 图形API(55)颜色空间转换-----将图像从 RGB 色彩空间转换为 I420 格式函数RGB2I420()
  • 计算机网络笔记(七)——1.7计算机网络体系结构
  • Linux系统学习----概述与目录结构
  • JSON实现动态按钮管理的Python应用
  • Java大师成长计划之第1天:Java编程基础入门
  • docker在windows下wsl存储路径的变更与数据迁移
  • 18487.1-2015-解读笔记之四-交流充电之流程分析
  • 大数据人工智能概述
  • Saas、Paas、Faas、Baas的概念学习与对比
  • 什么是鸿蒙南向开发?什么是北向开发?
  • 信息学奥赛一本通 1505:【例 2】双调路径 | 洛谷 P5530 [BalticOI 2002] 双调路径
  • bedtools coverage 获取每个位置的测序深度
  • springboot-基于Web企业短信息发送系统(源码+lw+部署文档+讲解),源码可白嫖!
  • 前端基础之《Vue(8)—内置组件》
  • 第二批流程!合肥市各地文化创意园区认定申报条件时间和申报材料方式归纳
  • 通信安全员考试重难点考哪些?
  • 如何计算光伏电站需要多少光伏板
  • 4.19-4.26学习周报
  • 【Tools】Git常见操作
  • 从并发问题衍生出的Spring的七种事务传播行为
  • 4.7 字符串到整形的相互转换
  • 机器学习分类算法详解:原理、应用场景与测试用例
  • 深入理解 Python 协程:单线程下的高效并发方案
  • JWT的token泄露要如何应对
  • 关于编译原理——语义翻译器的设计
  • 异构迁移学习(无创脑机接口中的跨脑电帽迁移学习)