Cluster Ready Services and ASM
(OP)
I was going to post this as a question, but by the time I had installed Oracle Grid for a standalone server and tried to create an ASM instance on Red Hat Enterprise Linux ES 5.2, I had somehow resolved the problem.
The problem was this. I could install the ASM drivers, configure ASMLib and assign volumes. I could then install the Grid software for a standalone server. An ASM instance was created and started, and I could shut it down and start it up again. The required SPFILE was created on the ASM disk group. But after rebooting the Linux host, the grid would not come up! I tried all sorts of things as both oracle and root, with little joy. Trying to start ASM gave the following error:
ORA-01078: failure in processing system parameters
ORA-29701: unable to connect to Cluster Synchronization Service
I got past this while logged in as the oracle UNIX user. Go to the directory $ORACLE_HOME/grid/bin and run the following crsctl command:
crsctl start resource -all
CRS-5702: Resource 'ora.LISTENER.lsnr' is already running on 'rhes5'
CRS-2672: Attempting to start 'ora.cssd' on 'rhes5'
CRS-2679: Attempting to clean 'ora.diskmon' on 'rhes5'
CRS-2681: Clean of 'ora.diskmon' on 'rhes5' succeeded
CRS-2672: Attempting to start 'ora.diskmon' on 'rhes5'
CRS-2676: Start of 'ora.diskmon' on 'rhes5' succeeded
CRS-2676: Start of 'ora.cssd' on 'rhes5' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'rhes5'
CRS-2676: Start of 'ora.asm' on 'rhes5' succeeded
CRS-2672: Attempting to start 'ora.ORACLE_DG.dg' on 'rhes5'
CRS-2676: Start of 'ora.ORACLE_DG.dg' on 'rhes5' succeeded
CRS-4000: Command Start failed, or completed with errors.
Although the last line says 'Command Start failed', the command actually works: both the grid stack and the ASM instance are started.
idle> select * from v$asm_diskgroup;
GROUP_NUMBER NAME SECTOR_SIZE BLOCK_SIZE ALLOCATION_UNIT_SIZE STATE TYPE TOTAL_MB FREE_MB
------------ ------------------------------ ----------- ---------- -------------------- ----------- ------ ---------- ----------
HOT_USED_MB COLD_USED_MB REQUIRED_MIRROR_FREE_MB USABLE_FILE_MB OFFLINE_DISKS
----------- ------------ ----------------------- -------------- -------------
COMPATIBILITY DATABASE_COMPATIBILITY V
------------------------------------------------------------ ------------------------------------------------------------ -
1 ORACLE_DG 512 4096 1048576 MOUNTED NORMAL 185978 185799
0 179 60 92869 0
11.2.0.0.0 10.1.0.0.0 N
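Because the final CRS-4000 line makes the whole run look like a failure, it can help to reduce the crsctl output to one line per resource. The following is only a minimal sketch: the function name summarize_crs_start is made up for illustration, and it assumes the message layout is exactly as in the output quoted above (CRS-2676 for a successful start, CRS-5702 for a resource that was already running).

```shell
#!/bin/sh
# Hypothetical helper (not part of Oracle Grid): summarise the output of
# `crsctl start resource -all`, read from stdin.
# Prints "<resource> started" or "<resource> already-running" per resource.
# Assumes the CRS-2676 / CRS-5702 message layout shown in this thread.
summarize_crs_start() {
    # Split each line on single quotes; the resource name is the 2nd field.
    awk -F"'" '
        /^CRS-2676: Start of/  { print $2, "started" }
        /^CRS-5702: Resource/  { print $2, "already-running" }
    '
}

# Usage on the Grid host (left as a comment so nothing is started by accident):
#   crsctl start resource -all 2>&1 | summarize_crs_start
```

If every resource shows up as either started or already-running, the CRS-4000 line most likely reflects only the "already running" messages rather than a real failure.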
If anyone has better ideas, please let me know. Is this a bug? The OS and Oracle versions are as follows:
Linux rhes5 2.6.18-92.el5xen #1 SMP Tue Apr 29 13:45:57 EDT 2008 i686 i686 i386 GNU/Linux
Release 11.2.0.1.0
HTH
RE: Cluster Ready Services and ASM
If you enable crs, you most likely won't have to manually issue any start commands at all after a reboot.
If you want to continue starting crs manually, I don't see any major problem with what you're currently doing. It looks as if you're only getting the "Command Start failed, or completed with errors" message because your listener was already started. If you want a more granular method of starting only what is needed, try something like the following:
CODE
crsctl start crs
(as oracle)
crs_stat -t (to get a list of which resources are currently offline)
crs_start ora.asm
crs_start ora.ORACLE_DG.dg
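To start only what is actually down, the offline resources can be picked out of the status output first. This is a rough sketch, not a tested procedure: the function name offline_resources is invented here, and it assumes the NAME=/STATE= record layout that `crsctl status resource` prints in 11.2, since `crs_stat -t` truncates long resource names.

```shell
#!/bin/sh
# Hypothetical helper: print the names of resources whose STATE is OFFLINE.
# Reads NAME=... / STATE=... records from stdin (assumed to be the layout
# of `crsctl status resource` output in 11.2).
offline_resources() {
    awk -F= '
        $1 == "NAME"                      { name = $2 }   # remember current resource
        $1 == "STATE" && $2 ~ /^OFFLINE/  { print name }  # report it if it is down
    '
}

# Usage on the Grid host (comment only, so nothing runs by accident):
#   crsctl status resource | offline_resources
```

Each name printed could then be passed to a start command one at a time, which avoids the "already running" message for resources that are up.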
RE: Cluster Ready Services and ASM
I decided to reboot the host and first check what the daemons start:
ps -fuoracle
oracle 5755 1 0 14:50 ? 00:00:02 /u01/app/oracle/product/11.2.0/grid/bin/ohasd.bin reboot
So Oracle Grid (ohasd) is running. Next I tried your suggestion:
oracle@rhes5:/home/oracle% crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....ER.lsnr ora....er.type OFFLINE OFFLINE
ora....E_DG.dg ora....up.type OFFLINE OFFLINE
ora.asm ora.asm.type OFFLINE OFFLINE
ora.cssd ora.cssd.type OFFLINE OFFLINE
ora.diskmon ora....on.type OFFLINE OFFLINE
ora.mydb.db ora....se.type OFFLINE OFFLINE
So everything is effectively down now.
oracle@rhes5:/home/oracle% crs_start ora.asm
Attempting to start `ora.LISTENER.lsnr` on member `rhes5`
Attempting to start `ora.cssd` on member `rhes5`
Attempting to stop `ora.diskmon` on member `rhes5`
Stop of `ora.diskmon` on member `rhes5` succeeded.
Attempting to start `ora.diskmon` on member `rhes5`
Start of `ora.LISTENER.lsnr` on member `rhes5` succeeded.
Start of `ora.diskmon` on member `rhes5` succeeded.
Start of `ora.cssd` on member `rhes5` succeeded.
Attempting to start `ora.asm` on member `rhes5`
Start of `ora.asm` on member `rhes5` succeeded.
So far so good. Now this one:
oracle@rhes5:/home/oracle% crs_start ora.ORACLE_DG.dg
CRS-5702: Resource 'ora.ORACLE_DG.dg' is already running on 'rhes5'
CRS-0223: Resource 'ora.ORACLE_DG.dg' has placement error.
Any idea what is happening?
Cheers
RE: Cluster Ready Services and ASM
Trying as you suggested.
First, check what is running under oracle:
oracle@rhes5:/home/oracle% ps -fuoracle
UID PID PPID C STIME TTY TIME CMD
oracle 5755 1 0 Nov17 ? 00:00:13 /u01/app/oracle/product/11.2.0/grid/bin/ohasd.bin reboot
oracle 6847 6845 0 Nov17 ? 00:00:02 sshd: oracle@pts/1
oracle 6848 6847 0 Nov17 pts/1 00:00:00 -ksh
oracle 6869 6848 0 Nov17 pts/1 00:00:00 -sh
oracle 10715 6869 0 11:42 pts/1 00:00:00 ps -fuoracle
Next I run crs_stat -t:
oracle@rhes5:/home/oracle% crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....ER.lsnr ora....er.type ONLINE ONLINE rhes5
ora....E_DG.dg ora....up.type ONLINE ONLINE rhes5
ora.asm ora.asm.type ONLINE ONLINE rhes5
ora.cssd ora.cssd.type ONLINE ONLINE rhes5
ora.diskmon ora....on.type ONLINE ONLINE rhes5
ora.mydb.db ora....se.type OFFLINE OFFLINE
So the only thing that is down is the Oracle instance mydb!
oracle@rhes5:/home/oracle% crs_start ora.ORACLE_DG.dg
CRS-5702: Resource 'ora.ORACLE_DG.dg' is already running on 'rhes5'
CRS-0223: Resource 'ora.ORACLE_DG.dg' has placement error.
We get the same error. Let us check whether ASM is up and the disks are mounted:
oracle@rhes5:/home/oracle% ps -fuoracle
UID PID PPID C STIME TTY TIME CMD
oracle 5755 1 0 Nov17 ? 00:00:13 /u01/app/oracle/product/11.2.0/grid/bin/ohasd.bin reboot
oracle 6847 6845 0 Nov17 ? 00:00:02 sshd: oracle@pts/1
oracle 6848 6847 0 Nov17 pts/1 00:00:00 -ksh
oracle 6869 6848 0 Nov17 pts/1 00:00:00 -sh
oracle 10721 1 0 11:43 ? 00:00:00 /u01/app/oracle/product/11.2.0/grid/bin/oraagent.bin
oracle 10724 1 0 11:43 ? 00:00:00 /u01/app/oracle/product/11.2.0/grid/bin/cssdagent
oracle 10726 1 0 11:43 ? 00:00:00 /u01/app/oracle/product/11.2.0/grid/bin/orarootagent.bin
oracle 10742 1 0 11:43 ? 00:00:00 /u01/app/oracle/product/11.2.0/grid/bin/diskmon.bin -d -f
oracle 10755 1 0 11:43 ? 00:00:00 /u01/app/oracle/product/11.2.0/grid/bin/ocssd.bin
oracle 10779 1 0 11:43 ? 00:00:00 /u01/app/oracle/product/11.2.0/grid/bin/tnslsnr LISTENER -inherit
oracle 10845 1 0 11:44 ? 00:00:00 asm_pmon_+ASM
oracle 10847 1 0 11:44 ? 00:00:00 asm_vktm_+ASM
oracle 10851 1 0 11:44 ? 00:00:00 asm_gen0_+ASM
oracle 10853 1 0 11:44 ? 00:00:00 asm_diag_+ASM
oracle 10855 1 0 11:44 ? 00:00:00 asm_psp0_+ASM
oracle 10857 1 0 11:44 ? 00:00:00 asm_dia0_+ASM
oracle 10859 1 0 11:44 ? 00:00:00 asm_mman_+ASM
oracle 10861 1 0 11:44 ? 00:00:00 asm_dbw0_+ASM
oracle 10863 1 0 11:44 ? 00:00:00 asm_lgwr_+ASM
oracle 10865 1 0 11:44 ? 00:00:00 asm_ckpt_+ASM
oracle 10867 1 0 11:44 ? 00:00:00 asm_smon_+ASM
oracle 10869 1 0 11:44 ? 00:00:00 asm_rbal_+ASM
oracle 10871 1 0 11:44 ? 00:00:00 asm_gmon_+ASM
oracle 10873 1 0 11:44 ? 00:00:00 asm_mmon_+ASM
oracle 10875 1 0 11:44 ? 00:00:00 asm_mmnl_+ASM
oracle 10892 6869 0 11:45 pts/1 00:00:00 ps -fuoracle
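Scanning a long process listing by eye is error-prone, so the check for a running ASM instance can be scripted. A minimal sketch follows; the function name asm_pmon_present is made up, and it assumes that an asm_pmon_ process in the listing (as in the output above) is a good enough sign that the ASM instance is up.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) if the ps listing on stdin
# contains an ASM PMON background process such as asm_pmon_+ASM.
asm_pmon_present() {
    grep -q 'asm_pmon_'
}

# Usage on the host (comment only):
#   ps -fu oracle | asm_pmon_present && echo "ASM instance is up"
```

Note that this only shows the instance is running; whether a disk group is mounted still has to be confirmed from v$asm_diskgroup.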
Now I try to start the Oracle instance manually:
Connected to an idle instance.
idle> startup
ORACLE instance started.
Total System Global Area 1640484864 bytes
Fixed Size 1336876 bytes
Variable Size 973081044 bytes
Database Buffers 654311424 bytes
Redo Buffers 11755520 bytes
Database mounted.
Database opened.
So I guess that error is spurious and the disks are mounted; otherwise I would not be able to start the instance!
Hope this makes sense.