2017-08-17
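Environment: OEL 5.7 + Oracle 10.2.0.5 Clusterware + Oracle 10.2.0.5 RAC
Problem: opening the database fails with ORA-1172 and ORA-1151
A 10g RAC lab environment from several years ago was migrated to a new environment as a whole via cold backup. Afterwards the database would not start normally, and manual attempts failed as well, with the following errors in the alert log: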
Mon Aug 14 10:04:13 EDT 2017
ALTER DATABASE OPEN
This instance was first to open
Block change tracking file is current.
Mon Aug 14 10:04:14 EDT 2017
Beginning crash recovery of 2 threads
Mon Aug 14 10:04:14 EDT 2017
Started redo scan
Mon Aug 14 10:04:14 EDT 2017
Completed redo scan
 337 redo blocks read, 85 data blocks need recovery
Mon Aug 14 10:04:14 EDT 2017
Started redo application at
 Thread 1: logseq 21, block 71672
 Thread 2: logseq 17, block 33379
Mon Aug 14 10:04:14 EDT 2017
Recovery of Online Redo Log: Thread 1 Group 1 Seq 21 Reading mem 0
  Mem# 0: +ZHAOJINGYU/jy/onlinelog/group_1.262.839673937
  Mem# 1: +ZHAOJINGYU/jy/onlinelog/group_1.263.839673939
Mon Aug 14 10:04:14 EDT 2017
Recovery of Online Redo Log: Thread 2 Group 3 Seq 17 Reading mem 0
  Mem# 0: +ZHAOJINGYU/jy/onlinelog/group_3.269.839674171
  Mem# 1: +ZHAOJINGYU/jy/onlinelog/group_3.270.839674173
RECOVERY OF THREAD 1 STUCK AT BLOCK 41 OF FILE 2
Mon Aug 14 10:04:27 EDT 2017
Abort recovery for domain 0
Mon Aug 14 10:04:27 EDT 2017
Aborting crash recovery due to error 1172
Mon Aug 14 10:04:27 EDT 2017
Errors in file /s01/oracle/admin/jy/udump/jy1_ora_18982.trc:
ORA-01172: recovery of thread 1 stuck at block 41 of file 2
ORA-01151: use media recovery to recover block, restore backup if needed
ORA-1172 signalled during: ALTER DATABASE OPEN...
Mon Aug 14 10:04:30 EDT 2017
Shutting down instance (abort)
License high water mark = 1
Instance terminated by USER, pid = 19144
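When the alert log is long, it can help to pull out just the stuck-block message and the ORA- codes. The following is a minimal sketch, not part of the original procedure: the function name summarize_recovery_errors is hypothetical, and it assumes the log lines have the same shape as the excerpt above.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): summarize recovery
# failures found in an alert-log file given as the first argument.
summarize_recovery_errors() {
    log_file=$1

    # Lines of the form: RECOVERY OF THREAD 1 STUCK AT BLOCK 41 OF FILE 2
    sed -n 's/^RECOVERY OF THREAD \([0-9][0-9]*\) STUCK AT BLOCK \([0-9][0-9]*\) OF FILE \([0-9][0-9]*\).*/thread=\1 block=\2 file=\3/p' "$log_file"

    # Distinct error codes from lines of the form "ORA-01172: ...".
    # Lines like "ORA-1172 signalled during ..." have no colon after the
    # code and are deliberately skipped.
    sed -n 's/^\(ORA-[0-9][0-9]*\):.*/\1/p' "$log_file" | sort -u
}
```

It would be called with the path of the alert log, for example `summarize_recovery_errors alert_jy1.log`; that file name is an assumption, since the post only shows the trace file path.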
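The relevant MOS note is "Error ORA-01219 , ORA-01172, ORA-01151, ORA-01033" (Doc ID 1605148.1).
Combining the note with the situation here, I suspect the database had not been shut down cleanly beforehand.
The approach is to run recover database while the database is mounted: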
startup mount
recover database;
ALTER DATABASE OPEN;
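Before running the recovery, it may be worth checking which datafiles the database considers inconsistent. The sketch below only prints read-only diagnostic queries. The function name print_recovery_checks is hypothetical and this step is an addition, not something the original post performed. It assumes the instance is already mounted.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not from the original post):
# print read-only diagnostic SQL to run against a mounted database.
print_recovery_checks() {
    cat <<'EOF'
set linesize 200 pagesize 100
-- Datafile headers: FUZZY = YES hints that a file was not closed cleanly.
select file#, status, fuzzy, checkpoint_change# from v$datafile_header order by file#;
-- Files the control file reports as needing media recovery.
select file#, online_status, error, change# from v$recover_file;
exit
EOF
}
```

One possible use is `print_recovery_checks | sqlplus -S "/ as sysdba"` on the node where the instance is mounted. If v$recover_file returns no rows and no header is fuzzy, media recovery may not be what is needed, so the output is worth reading before proceeding.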
The actual resolution in my case went as follows:
3.1 Checking the status shows that the database and instance resources never came up:
[oracle@rac1-server crsd]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.jy.db      application    ONLINE    OFFLINE
ora....y1.inst application    ONLINE    OFFLINE
ora....y2.inst application    ONLINE    OFFLINE
ora....SM1.asm application    ONLINE    ONLINE    rac1-server
ora....ER.lsnr application    ONLINE    ONLINE    rac1-server
ora....ver.gsd application    ONLINE    ONLINE    rac1-server
ora....ver.ons application    ONLINE    ONLINE    rac1-server
ora....ver.vip application    ONLINE    ONLINE    rac1-server
ora....SM2.asm application    ONLINE    ONLINE    rac2-server
ora....ER.lsnr application    ONLINE    ONLINE    rac2-server
ora....ver.gsd application    ONLINE    ONLINE    rac2-server
ora....ver.ons application    ONLINE    ONLINE    rac2-server
ora....ver.vip application    ONLINE    ONLINE    rac2-server
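The resources that matter in this listing are the ones whose Target is ONLINE but whose State is OFFLINE. A small filter can pick those out. This is a sketch that assumes the five-column layout shown above, and the function name list_down_resources is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): read `crs_stat -t`
# output on stdin and print the names of resources that should be
# running (Target = ONLINE) but are not (State = OFFLINE).
list_down_resources() {
    # Columns: Name Type Target State Host. The header and separator
    # lines never match both conditions, so they are skipped implicitly.
    awk '$3 == "ONLINE" && $4 == "OFFLINE" { print $1 }'
}
```

It is meant to be used as `crs_stat -t | list_down_resources`. Note that crs_stat -t abbreviates long resource names, so the printed names are the abbreviated ones.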
3.2 Try to start the database manually
[oracle@rac1-server crsd]$ srvctl start database -d jy
PRKP-1001 : Error starting instance jy1 on node rac1-server
CRS-0215: Could not start resource 'ora.jy.jy1.inst'.
PRKP-1001 : Error starting instance jy2 on node rac2-server
CRS-0215: Could not start resource 'ora.jy.jy2.inst'.
The manual start failed. Checking the status again:
[oracle@rac1-server crsd]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.jy.db      application    ONLINE    OFFLINE
ora....y1.inst application    ONLINE    OFFLINE
ora....y2.inst application    ONLINE    OFFLINE
ora....SM1.asm application    ONLINE    ONLINE    rac1-server
ora....ER.lsnr application    ONLINE    ONLINE    rac1-server
ora....ver.gsd application    ONLINE    ONLINE    rac1-server
ora....ver.ons application    ONLINE    ONLINE    rac1-server
ora....ver.vip application    ONLINE    ONLINE    rac1-server
ora....SM2.asm application    ONLINE    ONLINE    rac2-server
ora....ER.lsnr application    ONLINE    ONLINE    rac2-server
ora....ver.gsd application    ONLINE    ONLINE    rac2-server
ora....ver.ons application    ONLINE    ONLINE    rac2-server
ora....ver.vip application    ONLINE    ONLINE    rac2-server
3.3 Following the approach in the MOS note, start the database to the mount state
[oracle@rac1-server crsd]$ srvctl start database -d jy -o mount;
[oracle@rac1-server crsd]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.jy.db      application    ONLINE    ONLINE    rac1-server
ora....y1.inst application    ONLINE    ONLINE    rac1-server
ora....y2.inst application    ONLINE    ONLINE    rac2-server
ora....SM1.asm application    ONLINE    ONLINE    rac1-server
ora....ER.lsnr application    ONLINE    ONLINE    rac1-server
ora....ver.gsd application    ONLINE    ONLINE    rac1-server
ora....ver.ons application    ONLINE    ONLINE    rac1-server
ora....ver.vip application    ONLINE    ONLINE    rac1-server
ora....SM2.asm application    ONLINE    ONLINE    rac2-server
ora....ER.lsnr application    ONLINE    ONLINE    rac2-server
ora....ver.gsd application    ONLINE    ONLINE    rac2-server
ora....ver.ons application    ONLINE    ONLINE    rac2-server
ora....ver.vip application    ONLINE    ONLINE    rac2-server
3.4 Try recover database in the mount state
[oracle@rac1-server crsd]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.5.0 - Production on Mon Aug 14 22:43:03 2017
Copyright (c) 1982, 2010, Oracle. All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
SQL> recover database;
Media recovery complete.
SQL> alter database open;
Database altered.
SQL>
The recovery succeeded and the database opened.
3.5 Open the database on node 2 as well
The instance on node 2 is still only mounted, so open it there too:
[oracle@rac2-server ~]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.5.0 - Production on Mon Aug 14 22:45:17 2017
Copyright (c) 1982, 2010, Oracle. All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
SQL> select open_mode from v$database;
OPEN_MODE
----------
MOUNTED
SQL> alter database open;
Database altered.
SQL>
With that, this 10g RAC lab database environment is back to normal.