Changing the size of /dev/shm causes an Oracle 12c cluster startup failure
Published: 2019-05-10


Maintenance staff changed the size of /dev/shm on Oracle Linux 7, leaving it smaller than the Oracle instance's MEMORY_TARGET or SGA_TARGET. As a result the cluster could not start (CRS-4535, CRS-4000).

[grid@jtp1 ~]$ crsctl stat res -t
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.

First check whether the ASM disk permissions are the problem. They turn out to be normal.

[root@jtp3 ~]# ls -lrt /dev/asm*
brw-rw----. 1 grid oinstall 8, 128 Apr  3  2018 /dev/asmdisk07
brw-rw----. 1 grid oinstall 8,  48 Apr  3  2018 /dev/asmdisk02
brw-rw----. 1 grid oinstall 8,  96 Apr  3  2018 /dev/asmdisk05
brw-rw----. 1 grid oinstall 8, 112 Apr  3  2018 /dev/asmdisk06
brw-rw----. 1 grid oinstall 8,  64 Apr  3  2018 /dev/asmdisk03
brw-rw----. 1 grid oinstall 8,  80 Apr  3  2018 /dev/asmdisk04
brw-rw----. 1 grid oinstall 8,  32 Apr  3  2018 /dev/asmdisk01
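With many disks, the same permission check can be scripted instead of read by eye. The following is only a sketch: the function name `asm_perm_mismatches` is hypothetical, and the expected mode, owner, and group (`brw-rw----`, `grid`, `oinstall`) are assumptions taken from the healthy listing above, so adjust them for other environments.

```shell
# Read `ls -l` output on stdin and print every block/character device whose
# mode, owner, or group differs from the expected values.
# Assumption: the expected owner/group default to grid/oinstall as in the
# listing above; pass other values as $1 and $2 if your setup differs.
asm_perm_mismatches() {
  awk -v owner="${1:-grid}" -v group="${2:-oinstall}" '
    /^[bc]/ {
      mode = substr($1, 1, 10)   # drop the trailing SELinux "." marker
      if (mode != "brw-rw----" || $3 != owner || $4 != group)
        print $NF                # last field is the device path
    }'
}

# Example usage:
# ls -l /dev/asm* | asm_perm_mismatches
```

An empty result means every listed device matches the expected mode and ownership.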

Restart CRS

[root@jtp1 bin]# ./crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'jtp1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'jtp1'
CRS-2673: Attempting to stop 'ora.gpnpd' on 'jtp1'
CRS-2677: Stop of 'ora.mdnsd' on 'jtp1' succeeded
CRS-2677: Stop of 'ora.gpnpd' on 'jtp1' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'jtp1'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'jtp1'
CRS-2677: Stop of 'ora.drivers.acfs' on 'jtp1' succeeded
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'jtp1' succeeded
CRS-2673: Attempting to stop 'ora.ctssd' on 'jtp1'
CRS-2673: Attempting to stop 'ora.evmd' on 'jtp1'
CRS-2677: Stop of 'ora.ctssd' on 'jtp1' succeeded
CRS-2677: Stop of 'ora.evmd' on 'jtp1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'jtp1'
CRS-2677: Stop of 'ora.cssd' on 'jtp1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'jtp1'
CRS-2673: Attempting to stop 'ora.driver.afd' on 'jtp1'
CRS-2677: Stop of 'ora.driver.afd' on 'jtp1' succeeded
CRS-2677: Stop of 'ora.gipcd' on 'jtp1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'jtp1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
[root@jtp1 bin]# ./crsctl start crs
CRS-4123: Oracle High Availability Services has been started.

Check the CRS alert.log. It shows that the disk group cannot be mounted.

[root@jtp1 ~]# tail -f /u01/app/grid/diag/crs/jtp1/crs/trace/alert.log
2018-04-02 18:30:21.227 [OHASD(8143)]CRS-8500: Oracle Clusterware OHASD process is starting with operating system process ID 8143
2018-04-02 18:30:21.230 [OHASD(8143)]CRS-0714: Oracle Clusterware Release 12.2.0.1.0.
2018-04-02 18:30:21.245 [OHASD(8143)]CRS-2112: The OLR service started on node jtp1.
2018-04-02 18:30:21.262 [OHASD(8143)]CRS-8017: location: /etc/oracle/lastgasp has 2 reboot advisory log files, 0 were announced and 0 errors occurred
2018-04-02 18:30:21.262 [OHASD(8143)]CRS-1301: Oracle High Availability Service started on node jtp1.
2018-04-02 18:30:21.567 [ORAROOTAGENT(8214)]CRS-8500: Oracle Clusterware ORAROOTAGENT process is starting with operating system process ID 8214
2018-04-02 18:30:21.600 [CSSDAGENT(8231)]CRS-8500: Oracle Clusterware CSSDAGENT process is starting with operating system process ID 8231
2018-04-02 18:30:21.607 [CSSDMONITOR(8241)]CRS-8500: Oracle Clusterware CSSDMONITOR process is starting with operating system process ID 8241
2018-04-02 18:30:21.620 [ORAAGENT(8225)]CRS-8500: Oracle Clusterware ORAAGENT process is starting with operating system process ID 8225
2018-04-02 18:30:22.146 [ORAAGENT(8316)]CRS-8500: Oracle Clusterware ORAAGENT process is starting with operating system process ID 8316
2018-04-02 18:30:22.211 [MDNSD(8335)]CRS-8500: Oracle Clusterware MDNSD process is starting with operating system process ID 8335
2018-04-02 18:30:22.215 [EVMD(8337)]CRS-8500: Oracle Clusterware EVMD process is starting with operating system process ID 8337
2018-04-02 18:30:23.259 [GPNPD(8369)]CRS-8500: Oracle Clusterware GPNPD process is starting with operating system process ID 8369
2018-04-02 18:30:24.275 [GPNPD(8369)]CRS-2328: GPNPD started on node jtp1.
2018-04-02 18:30:24.283 [GIPCD(8433)]CRS-8500: Oracle Clusterware GIPCD process is starting with operating system process ID 8433
2018-04-02 18:30:26.296 [CSSDMONITOR(8464)]CRS-8500: Oracle Clusterware CSSDMONITOR process is starting with operating system process ID 8464
2018-04-02 18:30:28.299 [CSSDAGENT(8482)]CRS-8500: Oracle Clusterware CSSDAGENT process is starting with operating system process ID 8482
2018-04-02 18:30:28.496 [OCSSD(8497)]CRS-8500: Oracle Clusterware OCSSD process is starting with operating system process ID 8497
2018-04-02 18:30:29.538 [OCSSD(8497)]CRS-1713: CSSD daemon is started in hub mode
2018-04-02 18:30:36.015 [OCSSD(8497)]CRS-1707: Lease acquisition for node jtp1 number 1 completed
2018-04-02 18:30:37.087 [OCSSD(8497)]CRS-1605: CSSD voting file is online: AFD:CRS1; details in /u01/app/grid/diag/crs/jtp1/crs/trace/ocssd.trc.
2018-04-02 18:30:37.103 [OCSSD(8497)]CRS-1672: The number of voting files currently available 1 has fallen to the minimum number of voting files required 1.
2018-04-02 18:30:46.237 [OCSSD(8497)]CRS-1601: CSSD Reconfiguration complete. Active nodes are jtp1 .
2018-04-02 18:30:48.514 [OCTSSD(9302)]CRS-8500: Oracle Clusterware OCTSSD process is starting with operating system process ID 9302
2018-04-02 18:30:48.535 [OCSSD(8497)]CRS-1720: Cluster Synchronization Services daemon (CSSD) is ready for operation.
2018-04-02 18:30:50.626 [OCTSSD(9302)]CRS-2407: The new Cluster Time Synchronization Service reference node is host jtp1.
2018-04-02 18:30:50.627 [OCTSSD(9302)]CRS-2401: The Cluster Time Synchronization Service started on host jtp1.
2018-04-02 18:31:04.202 [ORAROOTAGENT(8214)]CRS-5019: All OCR locations are on ASM disk groups [CRS], and none of these disk groups are mounted. Details are at "(:CLSN00140:)" in "/u01/app/grid/diag/crs/jtp1/crs/trace/ohasd_orarootagent_root.trc".
2018-04-02 18:41:00.225 [ORAROOTAGENT(8214)]CRS-5818: Aborted command 'start' for resource 'ora.storage'. Details at (:CRSAGF00113:) {0:9:3} in /u01/app/grid/diag/crs/jtp1/crs/trace/ohasd_orarootagent_root.trc.
2018-04-02 18:41:03.757 [ORAROOTAGENT(8214)]CRS-5017: The resource action "ora.storage start" encountered the following error:
2018-04-02 18:41:03.757+Storage agent start action aborted. For details refer to "(:CLSN00107:)" in "/u01/app/grid/diag/crs/jtp1/crs/trace/ohasd_orarootagent_root.trc".
2018-04-02 18:41:03.760 [OHASD(8143)]CRS-2757: Command 'Start' timed out waiting for response from the resource 'ora.storage'. Details at (:CRSPE00221:) {0:9:3} in /u01/app/grid/diag/crs/jtp1/crs/trace/ohasd.trc.
2018-04-02 18:42:09.921 [ORAROOTAGENT(8214)]CRS-5019: All OCR locations are on ASM disk groups [CRS], and none of these disk groups are mounted. Details are at "(:CLSN00140:)" in "/u01/app/grid/diag/crs/jtp1/crs/trace/ohasd_orarootagent_root.trc".
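Most of that log is routine startup messages. Only the CRS-5019, CRS-5818, CRS-5017, and CRS-2757 entries point at the storage problem. A minimal sketch of a filter that isolates them follows; the function name `crs_storage_errors` is hypothetical, and the code list is taken only from this incident, not from any complete catalogue of storage-related codes.

```shell
# Print only the alert.log lines that carry the storage-related CRS codes
# seen in this incident. Reads the log from stdin, so it works with a file
# redirect or a pipe.
crs_storage_errors() {
  grep -E 'CRS-(5019|5818|5017|2757):'
}

# Example usage (log path as in the session above):
# crs_storage_errors < /u01/app/grid/diag/crs/jtp1/crs/trace/alert.log
```

Each matching line names a trace file, which is the next place to look.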

Check the trace file. It shows errors (Error 4) when querying the ASM_DISCOVERY_ADDRESS attribute.

[root@jtp1 ~]# more /u01/app/grid/diag/crs/jtp1/crs/trace/ohasd_orarootagent_root.trc
Trace file /u01/app/grid/diag/crs/jtp1/crs/trace/ohasd_orarootagent_root.trc
Oracle Database 12c Clusterware Release 12.2.0.1.0 - Production Copyright 1996, 2016 Oracle. All rights reserved.
*** TRACE CONTINUED FROM FILE /u01/app/grid/diag/crs/jtp1/crs/trace/ohasd_orarootagent_root_93.trc ***
2018-04-02 18:42:09.165 : CSSCLNT:3554666240: clsssterm: terminating context (0x7f03c0229390)
2018-04-02 18:42:09.165 : default:3554666240: clsCredDomClose: Credctx deleted 0x7f03c0459470
2018-04-02 18:42:09.166 :    GPNP:3554666240: clsgpnp_dbmsGetItem_profile: [at clsgpnp_dbms.c:399] Result: (0) CLSGPNP_OK. (:GPNP00401:)got ASM-Profile.Mode='remote'
2018-04-02 18:42:09.253 : CSSCLNT:3554666240: clsssinit: initialized context: (0x7f03c045c2c0) flags 0x115
2018-04-02 18:42:09.253 : CSSCLNT:3554666240: clsssterm: terminating context (0x7f03c045c2c0)
2018-04-02 18:42:09.254 :   CLSNS:3554666240: clsns_SetTraceLevel:trace level set to 1.
2018-04-02 18:42:09.254 :    GPNP:3554666240: clsgpnp_dbmsGetItem_profile: [at clsgpnp_dbms.c:399] Result: (0) CLSGPNP_OK. (:GPNP00401:)got ASM-Profile.Mode='remote'
2018-04-02 18:42:09.257 : default:3554666240: Inited LSF context: 0x7f03c04f0420
2018-04-02 18:42:09.260 : CLSCRED:3554666240: clsCredCommonInit: Inited singleton credctx.
2018-04-02 18:42:09.260 : CLSCRED:3554666240: (:CLSCRED0101:)clsCredDomInitRootDom: Using user given storage context for repository access.
2018-04-02 18:42:09.294 : USRTHRD:3554666240: {0:9:3} 8033 Error 4 querying length of attr ASM_DISCOVERY_ADDRESS
2018-04-02 18:42:09.300 : USRTHRD:3554666240: {0:9:3} 8033 Error 4 querying length of attr ASM_DISCOVERY_ADDRESS
2018-04-02 18:42:09.356 : CLSCRED:3554666240: (:CLSCRED1079:)clsCredOcrKeyExists: Obj dom : SYSTEM.credentials.domains.root.ASM.Self.5c82286a084bcf37ffa014144074e5dd.root not found
2018-04-02 18:42:09.356 : USRTHRD:3554666240: {0:9:3} 7755 Error 4 opening dom root in 0x7f03c064c980

Check the ASM alert.log. It reports that /dev/shm is smaller than MEMORY_TARGET and states the minimum size that /dev/shm must have.

[root@jtp1 ~]# tail -f /u01/app/grid/diag/asm/+asm/+ASM1/trace/alert_+ASM1.log
WARNING: ASM does not support ipclw. Switching to skgxp
WARNING: ASM does not support ipclw. Switching to skgxp
WARNING: ASM does not support ipclw. Switching to skgxp
* instance_number obtained from CSS = 1, checking for the existence of node 0...
* node 0 does not exist. instance_number = 1
Starting ORACLE instance (normal) (OS id: 9343)
2018-04-02T18:31:00.187055+08:00
CLI notifier numLatches:7 maxDescs:2301
2018-04-02T18:31:00.193961+08:00
WARNING: You are trying to use the MEMORY_TARGET feature. This feature requires the /dev/shm file system to be mounted for at least 1140850688 bytes. /dev/shm is either not mounted or is mounted with available space less than this size. Please fix this so that MEMORY_TARGET can work as expected. Current available is 1073573888 and used is 167936 bytes. Ensure that the mount point is /dev/shm for this directory.
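The numbers in that warning tell the whole story: 1140850688 bytes is 1088 MiB, while /dev/shm holds 1073573888 + 167936 = 1073741824 bytes, exactly 1 GiB. The sketch below pulls the two figures out of the warning text and prints the shortfall. The function name `shm_shortfall` is hypothetical, and the patterns assume the wording shown above, which may differ in other Oracle releases.

```shell
# Read ASM alert-log text on stdin, extract the required and the available
# byte counts from the MEMORY_TARGET warning, and print the shortfall.
# Assumption: the warning uses the exact wording seen in this incident.
shm_shortfall() {
  local text required available
  text=$(cat)
  required=$(printf '%s\n' "$text" |
    sed -n 's/.*mounted for at least \([0-9][0-9]*\) bytes.*/\1/p' | head -n 1)
  available=$(printf '%s\n' "$text" |
    sed -n 's/.*Current available is \([0-9][0-9]*\) and used.*/\1/p' | head -n 1)
  if [ -z "$required" ] || [ -z "$available" ]; then
    echo "MEMORY_TARGET warning not found" >&2
    return 1
  fi
  echo $(( required - available ))
}

# Example usage (log path as in the session above):
# shm_shortfall < /u01/app/grid/diag/asm/+asm/+ASM1/trace/alert_+ASM1.log
```

For the warning above the shortfall is 67276800 bytes, a little over 64 MiB.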

The size of /dev/shm can be changed by editing /etc/fstab. Here it is set to 12G.

[root@jtp1 bin]# df -h
Filesystem           Size  Used Avail Use% Mounted on
/dev/mapper/ol-root   49G   42G  7.9G  85% /
devtmpfs              12G   28K   12G   1% /dev
tmpfs                1.0G  164K  1.0G   1% /dev/shm
tmpfs                1.0G  9.3M 1015M   1% /run
tmpfs                1.0G     0  1.0G   0% /sys/fs/cgroup
/dev/sda1           1014M  141M  874M  14% /boot
[root@jtp1 bin]# vi /etc/fstab
#
# /etc/fstab
# Created by anaconda on Sat Mar 18 15:27:13 2017
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
/dev/mapper/ol-root     /                       xfs     defaults        0 0
UUID=ca5854cd-0125-4954-a5c4-1ac42c9a0f70 /boot                   xfs     defaults        0 0
/dev/mapper/ol-swap     swap                    swap    defaults        0 0
tmpfs                   /dev/shm                tmpfs   defaults,size=12G        0 0
tmpfs                   /run                    tmpfs   defaults,size=12G        0 0
tmpfs                  /sys/fs/cgroup           tmpfs   defaults,size=12G        0 0
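An fstab entry only takes effect the next time the filesystem is mounted, so it is worth confirming what the file actually says before restarting anything. The following sketch reads the `size=` option configured for /dev/shm. The function name `shm_fstab_size` is hypothetical and not part of any Oracle or Linux tooling.

```shell
# Print the size= option configured for /dev/shm in an fstab-format file.
# $1 is the file to read; it defaults to /etc/fstab.
shm_fstab_size() {
  awk '$1 !~ /^#/ && $2 == "/dev/shm" {
         n = split($4, opts, ",")          # fourth field holds mount options
         for (i = 1; i <= n; i++)
           if (opts[i] ~ /^size=/) {
             sub(/^size=/, "", opts[i])
             print opts[i]
           }
       }' "${1:-/etc/fstab}"
}

# Example usage:
# shm_fstab_size              # reads /etc/fstab
# As root, a remount should apply the new size without a reboot
# (assumption: mount picks the options up from /etc/fstab when only the
# mount point is given):
# mount -o remount /dev/shm && df -h /dev/shm
```

The source does not show how the new size was applied on this system. The remount above is one possible way, and a reboot would also pick up the change.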

After restarting the cluster, check the cluster resource status again. It is back to normal.

[grid@jtp1 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr
               ONLINE  ONLINE       jtp1                  STABLE
               ONLINE  ONLINE       jtp2                  STABLE
ora.CRS.dg
               ONLINE  ONLINE       jtp1                  STABLE
               ONLINE  ONLINE       jtp2                  STABLE
ora.DATA.dg
               ONLINE  ONLINE       jtp1                  STABLE
               ONLINE  ONLINE       jtp2                  STABLE
ora.FRA.dg
               ONLINE  ONLINE       jtp1                  STABLE
               ONLINE  ONLINE       jtp2                  STABLE
ora.LISTENER.lsnr
               ONLINE  ONLINE       jtp1                  STABLE
               ONLINE  ONLINE       jtp2                  STABLE
ora.TEST.dg
               ONLINE  ONLINE       jtp1                  STABLE
               ONLINE  ONLINE       jtp2                  STABLE
ora.chad
               ONLINE  ONLINE       jtp1                  STABLE
               ONLINE  ONLINE       jtp2                  STABLE
ora.net1.network
               ONLINE  ONLINE       jtp1                  STABLE
               ONLINE  ONLINE       jtp2                  STABLE
ora.ons
               ONLINE  ONLINE       jtp1                  STABLE
               ONLINE  ONLINE       jtp2                  STABLE
ora.proxy_advm
               OFFLINE OFFLINE      jtp1                  STABLE
               OFFLINE OFFLINE      jtp2                  STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       jtp1                  STABLE
ora.LISTENER_SCAN2.lsnr
      1        ONLINE  ONLINE       jtp2                  STABLE
ora.LISTENER_SCAN3.lsnr
      1        ONLINE  ONLINE       jtp2                  STABLE
ora.MGMTLSNR
      1        ONLINE  ONLINE       jtp2                  169.254.237.250 88.8
                                                             8.88.2,STABLE
ora.asm
      1        ONLINE  ONLINE       jtp1                  Started,STABLE
      2        ONLINE  ONLINE       jtp2                  Started,STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.cvu
      1        ONLINE  ONLINE       jtp2                  STABLE
ora.jy.db
      1        ONLINE  OFFLINE                               STABLE
      2        ONLINE  OFFLINE                               STABLE
ora.jtp1.vip
      1        ONLINE  ONLINE       jtp1                  STABLE
ora.jtp2.vip
      1        ONLINE  ONLINE       jtp2                  STABLE
ora.mgmtdb
      1        ONLINE  ONLINE       jtp2                  Open,STABLE
ora.qosmserver
      1        ONLINE  ONLINE       jtp2                  STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       jtp1                  STABLE
ora.scan2.vip
      1        ONLINE  ONLINE       jtp2                  STABLE
ora.scan3.vip
      1        ONLINE  ONLINE       jtp2                  STABLE
--------------------------------------------------------------------------------

At this point the cluster is back to normal.

Source: "ITPUB博客", link: http://blog.itpub.net/26015009/viewspace-2152657/. Please credit the source when reposting; otherwise legal liability will be pursued.

