
Oracle Cluster: Abnormal Multiplexed Control File Copies

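Problem Description

Basic information about the problem:

System name: Cluster
IP address: (not provided)
Operating system: Linux
Database: Oracle RAC 11.2.0.4
Time discovered: (not provided)
How discovered: (not provided)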

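Symptoms

The symptoms were as follows.

The database instances on both clusters restarted frequently, and the alert logs of both instances contained the following message: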

The controlfile header block returned by the OS
has a sequence number that is too old.
The controlfile might be corrupted.
PLEASE DO NOT ATTEMPT TO START UP THE INSTANCE
without following the steps below.
RE-STARTING THE INSTANCE CAN CAUSE SERIOUS DAMAGE
TO THE DATABASE, if the controlfile is truly corrupted.
In order to re-start the instance safely,
please do the following:
(1) Save all copies of the controlfile for later
analysis and contact your OS vendor and Oracle support.
(2) Mount the instance and issue:
ALTER DATABASE BACKUP CONTROLFILE TO TRACE;
(3) Unmount the instance.
(4) Use the script in the trace file to
RE-CREATE THE CONTROLFILE and open the database.
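Because the instances restarted repeatedly, it can help to list when each occurrence of this warning was logged. The following is a minimal Python sketch, not part of the original troubleshooting. It assumes the 11g plain-text alert log format, in which each entry is preceded by a timestamp line such as `Wed Sep 11 23:26:39 2013`, and an English locale for day and month names. The function name `find_controlfile_warnings` is hypothetical.

```python
import re
from datetime import datetime

# Assumption: timestamp lines look like "Wed Sep 11 23:26:39 2013".
TIMESTAMP = re.compile(
    r"^[A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} \d{4}$"
)
# Distinctive fragment of the warning text quoted above.
MARKER = "sequence number that is too old"


def find_controlfile_warnings(lines):
    """Return the timestamp of each alert-log entry containing the warning.

    An entry with no preceding timestamp line is reported as None.
    """
    hits = []
    current = None  # most recent timestamp line seen so far
    for raw in lines:
        line = raw.strip()
        if TIMESTAMP.match(line):
            current = datetime.strptime(line, "%a %b %d %H:%M:%S %Y")
        elif MARKER in line:
            hits.append(current)
    return hits
```

To use it, pass an open alert log file (or any iterable of lines) to `find_controlfile_warnings` and compare the returned timestamps with the instance restart times and with any storage-side events.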

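Troubleshooting Process

The troubleshooting steps should preferably be listed in chronological order, giving the time and the action taken at each step.

Searching MOS (My Oracle Support) for the key phrases turned up the following matching case: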

The controlfile header block returned by the OS has a sequence number that is too old. (Doc ID 1589355.1)

APPLIES TO:
Oracle Database Cloud Schema Service - Version N/A and later
Gen 1 Exadata Cloud at Customer (Oracle Exadata Database Cloud Machine) - Version N/A and later
Oracle Database Exadata Express Cloud Service - Version N/A and later
Oracle Cloud Infrastructure - Database Service - Version N/A and later
Oracle Database Cloud Exadata Service - Version N/A and later
Information in this document applies to any platform.

SYMPTOMS
Database instance went down with following error message in alert log:
---
Wed Sep 11 23:26:39 2013
********************* ATTENTION: ********************
The controlfile header block returned by the OS
has a sequence number that is too old.
The controlfile might be corrupted.
PLEASE DO NOT ATTEMPT TO START UP THE INSTANCE
without following the steps below.
RE-STARTING THE INSTANCE CAN CAUSE SERIOUS DAMAGE
TO THE DATABASE, if the controlfile is truly corrupted.
In order to re-start the instance safely,
please do the following:
(1) Save all copies of the controlfile for later
analysis and contact your OS vendor and Oracle support.
(2) Mount the instance and issue:
ALTER DATABASE BACKUP CONTROLFILE TO TRACE;
(3) Unmount the instance.
(4) Use the script in the trace file to
RE-CREATE THE CONTROLFILE and open the database.
*****************************************************
USER (ospid: 24051722): terminating the instance
---

CAUSE
BUG 14281768 - CONTROL FILE GETS CORRUPTED
Which was closed as Vendor OS/Software/Framework Problem

SOLUTION
Error is typically raised when the Controlfile is overwritten by an older copy of the Controlfile. Most likely this happened due to Storage OR I/O error.
All copies of the control file must have the same internal sequence number for Oracle to start up the database or shut it down in normal or immediate mode.

The solution is actually given with the accompanying message:
(1) Save all copies of the controlfile for later
analysis and contact your OS vendor and Oracle support.
(2) Mount the instance and issue:
ALTER DATABASE BACKUP CONTROLFILE TO TRACE;
(3) Unmount the instance.
(4) Use the script in the trace file to
RE-CREATE THE CONTROLFILE and open the database.

To make a sanity check in the future, please set the following parameter:
SQL> alter system set "_controlfile_update_check"='HIGH' scope=spfile; -- then bounce the database.

Please check with your OS System/Storage admin regarding the issue.
The precaution is to relocate the control file on a fast and direct I/O enabled disk; the main target is not letting the OS write an old copy (a cached copy of the controlfile) to it.

To reverse the parameter setting:
SQL> alter system set "_controlfile_update_check"='OFF' scope=spfile; -- then bounce the database.

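Root Cause

The cause of the problem is as follows.

We asked the customer to check, and the failing instance does have multiple control file copies. They are stored on different disks, but the disk groups involved all sit on the same storage system.

Combined with the hints in the official MOS note, we suspect a fault in the customer's storage.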
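As a rough illustration of the check described above, the following Python sketch groups the paths from a `control_files` parameter value by ASM disk group. It assumes ASM-style paths that begin with `+DISKGROUP/`. The function name `group_by_disk_group` and the example paths in the usage note are hypothetical. Note that the paths alone cannot show whether two disk groups share one storage array; that still has to be confirmed with the storage administrator.

```python
from collections import defaultdict


def group_by_disk_group(control_files):
    """Map each ASM disk group name to the control file paths stored in it.

    `control_files` is the comma-separated parameter value as a string.
    Paths that do not start with "+" are collected under "(non-ASM)".
    """
    groups = defaultdict(list)
    for path in control_files.split(","):
        path = path.strip().strip("'\"")  # drop whitespace and quotes
        if not path:
            continue
        if path.startswith("+"):
            # Assumption: ASM paths look like +DISKGROUP/db/controlfile/...
            group = path[1:].split("/", 1)[0].upper()
        else:
            group = "(non-ASM)"
        groups[group].append(path)
    return dict(groups)
```

For example, a value such as `+DATA/orcl/controlfile/current.260.1, +FRA/orcl/controlfile/current.256.1` yields one entry for `DATA` and one for `FRA`. If both disk groups are carved from the same array, as in this case, multiplexing does not protect against a fault in that array.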

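Resolution

The resolution is as follows.

Investigate the storage first. Once the storage investigation is complete, recreating the control file is recommended, but whether to do so depends on the outcome of the storage repair: if the error no longer appears after the storage problem has been ruled out, no further action is needed.

In the meantime, if the customer has no backup, we recommend first taking a trace backup of the control file. Run the following while the database can still be opened or mounted: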

alter database backup controlfile to trace;

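Recreating the control file raises several issues that must be considered and requires some testing beforehand; see the separate document for details.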

http://www.xdnf.cn/news/5676.html
