REPLACING A FAILED DISK ON A MIRRORED SYSTEM

Status
Not open for further replies.

ponetguy2

does this look okay?

REPLACING A FAILED DISK ON A MIRRORED SYSTEM
============================================

1. Physically replace the failed disk (RTFM, be extra cautious)

2. Run format to verify if OS is able to find the new disk

3. cfgadm -al (to find label and type status)

4. cfgadm -c configure c1::dsk/c1t1d0

5. cfgadm -al (confirm if type shows disk)

6. prtvtoc -h /dev/rdsk/c1t0d0s2 | fmthard -s - /dev/rdsk/c1t1d0s2

7. recreate file system on each slice of new disk

newfs -r 1000 /dev/rdsk/c1t1d0s0
newfs -r 1000 /dev/rdsk/c1t1d0s3
newfs -r 1000 /dev/rdsk/c1t1d0s4
newfs -r 1000 /dev/rdsk/c1t1d0s5
newfs -r 1000 /dev/rdsk/c1t1d0s6
newfs -r 1000 /dev/rdsk/c1t1d0s7

8. recreate database replica on new disk

metadb -a -f -c 3 c1t1d0s4


9. Recreate Submirror
=====================

1. First, Mirror root

e. metainit d2 1 1 c1t1d0s0

f. metattach d0 d2 (Do not reboot until State process is okay. Check w/ metastat)

g. shutdown -y -g0 -i6

2. /swap

b. metainit d12 1 1 c1t1d0s1 (second sub-mirror)

f. metattach d10 d12 (Do not reboot until State process is okay. Check w/ metastat)

g. shutdown -y -g0 -i6


3. /var

b. metainit d22 1 1 c1t1d0s3

f. metattach d20 d22 (Do not reboot until State process is okay. Check w/ metastat)

g. shutdown -y -g0 -i6

4. /opt

b. metainit d32 1 1 c1t1d0s5

f. metattach d30 d32 (Do not reboot until State process is okay. Check w/ metastat)

g. shutdown -y -g0 -i6

5. /usr

b. metainit d42 1 1 c1t1d0s6

f. metattach d40 d42 (Do not reboot until State process is okay. Check w/ metastat)

g. shutdown -y -g0 -i6

6. /home

b. metainit d52 1 1 c1t1d0s7

f. metattach d50 d52 (Do not reboot until State process is okay. Check w/ metastat)

g. shutdown -y -g0 -i6

 
Never used the -r option for newfs... I am assuming you want the value for rpm set to 10000 not 1000...
 
Why is the newfs -r option required ? What would happen if you did not specify the speed ?
 
The only time I would think you would need to use this option is when you are using a disk that is not in the HCL...
 
Revised:


REPLACING A FAILED DISK ON A MIRRORED SYSTEM
============================================

1. Physically replace the failed disk (RTFM, be extra cautious)

2. Run format to verify if OS is able to find the new disk

3. cfgadm -al (to find label and type status)

a. cfgadm -c configure c1::dsk/c1t1d0 (if type shows as undefined)

b. cfgadm -al (confirm if type shows disk)

4. prtvtoc -h /dev/rdsk/c1t0d0s2 | fmthard -s - /dev/rdsk/c1t1d0s2

5. verify w/ format that both disks are identical

6. recreate file system on each slice of new disk

newfs -r 1000 /dev/rdsk/c1t1d0s0
newfs -r 1000 /dev/rdsk/c1t1d0s3
newfs -r 1000 /dev/rdsk/c1t1d0s4
newfs -r 1000 /dev/rdsk/c1t1d0s5
newfs -r 1000 /dev/rdsk/c1t1d0s6
newfs -r 1000 /dev/rdsk/c1t1d0s7

7. metadb -d c1t1d0s4

8. recreate database replica on new disk

metadb -a -f -c 3 c1t1d0s4


9. Replace Bad Submirror (metareplace -e)
=========================================

1. root (/)

a. metastat d0

b. metareplace -e d0 c1t1d0s0 (c1t1d0s0 is the mirror which needs maintenance)

c. metastat d0 (wait until state is okay)

2. /swap

a. metastat d10

b. metareplace -e d10 c1t1d0s1 (c1t1d0s1 is the mirror which needs maintenance)

c. metastat d10 (wait until state is okay)

3. /var

a. metastat d20

b. metareplace -e d20 c1t1d0s3 (c1t1d0s3 is the mirror which needs maintenance)

c. metastat d20 (wait until state is okay)


4. /opt

a. metastat d30

b. metareplace -e d30 c1t1d0s5 (c1t1d0s5 is the mirror which needs maintenance)

c. metastat d30 (wait until state is okay)


5. /usr

a. metastat d40

b. metareplace -e d40 c1t1d0s6 (c1t1d0s6 is the mirror which needs maintenance)

c. metastat d40 (wait until state is okay)


6. /home

a. metastat d50

b. metareplace -e d50 c1t1d0s7 (c1t1d0s7 is the mirror which needs maintenance)

c. metastat d50 (wait until state is okay)

7. shutdown -y -g0 -i6
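
The "are both disks identical" check in step 5 could also be done by comparing the slice tables directly instead of reading format output by eye. A minimal sketch, assuming c1t0d0 is the good disk and c1t1d0 the replacement as in the steps above; the function name same_vtoc and the /tmp file names are made up for illustration:

```sh
#!/bin/sh
# same_vtoc FILE_A FILE_B
# Each file holds the output of `prtvtoc -h` for one disk.
# Returns 0 when the two slice tables are byte-identical, 1 otherwise.
same_vtoc() {
    # cmp -s prints nothing and only sets the exit status
    cmp -s "$1" "$2"
}

# Intended use on the Solaris host (not run here):
#   prtvtoc -h /dev/rdsk/c1t0d0s2 > /tmp/vtoc.good
#   prtvtoc -h /dev/rdsk/c1t1d0s2 > /tmp/vtoc.new
#   same_vtoc /tmp/vtoc.good /tmp/vtoc.new && echo "slice tables match"
```

This only tells you the labels match; it says nothing about the data on the slices.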
 
shxt!!! machine would not boot anymore!!!

both disks are unbootable!!!

what did i do wrong?

should i have done the first idea? the second one is obviously incorrect.
 
You could have done a metadb -d -f /slice/of/bad/drive
then replace drive
then metareplace -e
OR
metadetach
metaclear
metadb
metainit
metattach
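
The first option listed above (metadb -d, swap the drive, metareplace -e) can be laid out as a dry run that only prints the commands, so the order can be reviewed before anything is typed for real. This is a sketch, not a tested procedure: the disk names, the replica slice and the mirror-to-slice pairs are taken from the posts earlier in the thread, and the prtvtoc | fmthard and metadb -a lines are borrowed from the original poster's steps rather than from this list.

```sh
#!/bin/sh
# Print (dry run) one possible command order for replacing the failed disk.
# Nothing is executed; every command is only echoed for review.

GOOD=c1t0d0        # surviving disk (assumed from the prtvtoc step)
BAD=c1t1d0         # disk being replaced
REPLICA_SLICE=s4   # slice holding the state database replicas

# mirror:slice pairs, taken from the metareplace steps in the thread
PAIRS="d0:s0 d10:s1 d20:s3 d30:s5 d40:s6 d50:s7"

plan_replace() {
    echo "metadb -d ${BAD}${REPLICA_SLICE}"
    echo "# physically replace ${BAD} here"
    echo "prtvtoc -h /dev/rdsk/${GOOD}s2 | fmthard -s - /dev/rdsk/${BAD}s2"
    echo "metadb -a -c 3 ${BAD}${REPLICA_SLICE}"
    for pair in $PAIRS; do
        mirror=${pair%%:*}   # text before the colon, e.g. d0
        slice=${pair##*:}    # text after the colon, e.g. s0
        echo "metareplace -e ${mirror} ${BAD}${slice}"
    done
}

plan_replace
```

Note there is no newfs and no reboot anywhere in the printed plan.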

I don't know why you did a newfs. Why do you shut down after each slice is done syncing? You can let all of them sync at the same time; it just slows everything down. If you want, you can sync the metaroot first and then do all the rest at the same time.
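
If the mirrors are left to sync together, the waiting can be scripted rather than watched. A rough sketch, assuming metastat prints "Resyncing" / "Resync in progress" while a submirror is syncing and "Needs maintenance" for a failed component; check those strings against real output on your release. The function name mirror_state is made up.

```sh
#!/bin/sh
# mirror_state reads `metastat` output on stdin and prints one word:
#   degraded   - some component is flagged "Needs maintenance"
#   resyncing  - nothing needs maintenance, but a resync is still running
#   okay       - neither of the above was seen
mirror_state() {
    awk '
        /Needs maintenance/ { bad = 1 }
        /Resync/            { sync = 1 }
        END {
            if (bad)       print "degraded"
            else if (sync) print "resyncing"
            else           print "okay"
        }'
}

# Intended use on the Solaris host (not run here):
#   while [ "$(metastat | mirror_state)" = "resyncing" ]; do sleep 60; done
```

The loop stops on either "okay" or "degraded", so look at metastat yourself before deciding the box is healthy.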

At the ok> did you type 'boot disk' and see if it will boot from that?

Otherwise, run printenv at the ok> prompt to see which aliases you are set to boot from.

If you ran a metastat it would have told you the correct command (metareplace) to run.

Again, where did you get all that other stuff? You didn't need to reboot at all after replacing the bad disk.

Boot off the CDROM and then change your /etc/vfstab to /dev/dsk... instead of the md... and then create the mirrors again.
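
The vfstab change described above amounts to swapping each /dev/md/... path for the plain slice on the disk you trust. A sketch of that substitution as a filter, assuming the surviving disk is c1t0d0 and that the metadevices map to slices the way the earlier posts suggest (d0 to s0, d10 to s1, and so on); both are assumptions to verify on the real box. The function name unmirror_vfstab and the /a mount point in the usage comment are also made up.

```sh
#!/bin/sh
# unmirror_vfstab rewrites metadevice paths in a vfstab read from stdin
# back to plain slices on the surviving disk, writing the result to stdout.
DISK=c1t0d0   # surviving disk (assumption)

unmirror_vfstab() {
    set --   # start with an empty argument list for sed
    for pair in d0:s0 d10:s1 d20:s3 d30:s5 d40:s6 d50:s7; do
        md=${pair%%:*}
        sl=${pair##*:}
        # match the metadevice name only when whitespace follows it,
        # so d0 does not also match inside another metadevice name
        set -- "$@" -e "s|/dev/md/\\(r*dsk\\)/${md}\\([[:space:]]\\)|/dev/\\1/${DISK}${sl}\\2|g"
    done
    sed "$@"
}

# Intended use after booting from CD with root mounted on /a (not run here):
#   unmirror_vfstab < /a/etc/vfstab > /tmp/vfstab.new
```

Review the output before copying it into place. metaroot normally also adds a rootdev line to /etc/system, so that entry would likely need to come out as well before the system will boot from the plain slice.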
 
You sure you didn't run newfs on the good disk?
 
ok, i didn't reboot after each metareplace. i only put that in my doc because my boss told me to reboot after each metareplace.

as for newfs, he told me to do that as well.

before doing a metareplace for each submirror, i waited until the state was done syncing. once it was done, i proceeded to the next.

i also checked w/ metastat if state was all okay. then i did a reboot.
 
Is your server back up?

Did you run 'boot disk' from ok>
 
i was able to boot to single user mode. i was going to boot to cdrom and fsck root, but my boss took over. he was pissed.

he tried to boot from both disks from the ok prompt and it didn't work. we keep getting to single user mode and we get a prompt to fsck root.

he wanted to know how he can fsck root. i told him he can boot cdrom -s from the ok prompt and from there he can fsck root. he didn't believe me. i just left him alone. he looks too frustrated to talk to.


my ego is totally hurt. i've only been working here for a month and a half. :+(

 
In your step number 8 you are running metadb with -f; that should only be done the first time you set up the replicas, when none exist yet. I'm not sure why, or what happens otherwise, but that is in the Solaris documentation.
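
One way to act on the -f point above is to decide on the flag from how many replicas already exist. A small sketch that only prints the command; the function name metadb_add_cmd is made up, and the replica count is assumed to come from counting the lines metadb prints under its header.

```sh
#!/bin/sh
# metadb_add_cmd COUNT SLICE
# COUNT is how many state database replicas already exist.
# Prints the command to add three replicas on SLICE, using -f only
# when no replicas exist yet.
metadb_add_cmd() {
    if [ "$1" -eq 0 ]; then
        echo "metadb -a -f -c 3 $2"
    else
        echo "metadb -a -c 3 $2"
    fi
}

# With replicas still present on the good disk, no -f is printed.
metadb_add_cmd 3 c1t1d0s4
```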
 
It sounds like your boss is the one with the ego problem, don't stress!

Like kHz said, running the newfs was pointless. The disks are mirrored, so when you resynchronise the mirrors the filesystem on the working disk would be copied to the replacement disk. Also, no reboots should have been required, except perhaps to physically replace the disk if it was not hot-swappable. That said, if it was the boot disk, it would have been a good idea to reboot afterwards to make sure the disk was bootable.

I presume it is a boot disk, in which case you would also need to install a boot block on the disk using this command:

installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/cXtXdXsX

Annihilannic.
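
For this thread's layout, the installboot line above could be filled in roughly as follows. This is a sketch that only prints the command: c1t1d0s0 is assumed to be the root slice of the replacement disk, SUNW,Ultra-60 is just an example of what uname -i might return, and the function name bootblk_cmd is made up.

```sh
#!/bin/sh
# bootblk_cmd PLATFORM SLICE
# PLATFORM is what `uname -i` prints on the target host; SLICE is the
# root slice of the replacement disk. Prints the command for review.
bootblk_cmd() {
    echo "installboot /usr/platform/$1/lib/fs/ufs/bootblk /dev/rdsk/$2"
}

# Example values only; on the real host pass "$(uname -i)" instead.
bootblk_cmd "SUNW,Ultra-60" c1t1d0s0
```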
 
Bad news indeed. I take it by 'pissed' you mean in the US idiom, not the Brit? Otherwise you might be in even more trouble! You do have a backup? Might be better to rebuild if it's possible. Use this as a learning experience and don't beat yourself up over it.
 
my boss is cool. he is just under a lot of stress.
 
Ponetguy,
Another thing you may want to do is set up NVRAM so your server would be able to boot off root's mirror. This is for failover purposes. If your primary root mirror fails, you would automatically boot off the mirror.

Here are the steps...

1. Get the device path for root mirror:
ls -l /dev/dsk/c#t#d#s#
Result should look like "../../devices/pci@1f,0/.../.../"
Copy down the path after /devices

2. Run init 0 to get you to the ok prompt.

3. Enter the following command. Your nvalias name can be anything you want:
nvalias <alias_name> <copied device path>
Example: nvalias backup_root /pci@1f,0/pci@1/..../..../

4. From the ok prompt, get the current NVRAM boot-device setting:
printenv boot-device
Result should look something like "boot-device= disk net"

5. Set new nvalias into NVRAM boot-device:
setenv boot-device disk backup_root net
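
Step 1 above ("copy down the path after /devices") can be done with a shell parameter expansion instead of by hand. A minimal sketch; the function name obp_path is made up and the link target in the example is hypothetical, not taken from this server.

```sh
#!/bin/sh
# obp_path TARGET
# TARGET is the symlink target shown by `ls -l /dev/dsk/c#t#d#s#`.
# Prints everything after "/devices", which is the part to give nvalias.
obp_path() {
    # strip the shortest leading match of "*/devices"
    printf '%s\n' "${1#*/devices}"
}

# hypothetical link target, for illustration only
obp_path "../../devices/pci@1f,0/pci@1/scsi@8/sd@1,0:a"
```

On some systems the last element has to be written with disk in place of sd at the ok prompt, so compare the result with what show-disks or devalias reports before relying on it.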

 
you can also do that with the os running by using 'eeprom' and it will set the changes at the next boot.
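
As a sketch of the eeprom route, the boot-device change from step 5 would look something like the command printed below. The helper only prints the command; the name eeprom_bootdev_cmd is made up, and backup_root is the alias name from the example above.

```sh
#!/bin/sh
# eeprom_bootdev_cmd DEVICE...
# Prints the eeprom command that sets boot-device to the given list,
# in the order the devices should be tried.
eeprom_bootdev_cmd() {
    # "$*" joins all arguments with single spaces
    echo "eeprom boot-device=\"$*\""
}

eeprom_bootdev_cmd disk backup_root net
```

The backup_root alias still has to exist in NVRAM for this to be useful, so the nvalias step is not skipped by doing it this way.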
 