
LED 554

Status
Not open for further replies.

khalidaaa

Technical User
Joined
Jan 19, 2006
Messages
2,323
Location
BH
Gents,

I had an LPAR with a SAN boot disk. I moved the Fibre Channel adapter of this device to another LPAR (which is in production for now) and assigned a new Fibre Channel adapter to the original LPAR, but when I activate it, it hangs with LED 554 (which suggests either a corrupt JFS log or a corrupt superblock).

I went into maintenance mode and ran fsck -y -V on all the /dev/hd? logical volumes, but when I exit I get a series of messages that look like this:

/etc/getrootfs [586]: 3*** Killed

(the stars are different digits every time)

Any idea how to start this LPAR up?

Regards
Khalid
 
When the superblock is corrupted, there is a second copy of the superblock that fsck can copy back over the primary with:

fsck -p <filesystem>

If it's the log, you could try creating a new log, formatting it, and pointing the filesystem at it:

mklv -t jfs2log -y <newlog> <vgname>
logform /dev/<newlog>
chfs -a log=/dev/<newlog> <filesystem>
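(For example, applied to the root filesystem of rootvg; a minimal sketch only. fslog01 is a made-up log name, and since this rootvg appears to use plain JFS (hd8 is a jfslog), the log type here is jfslog rather than jfs2log.)

Code:
fsck -p /dev/hd4                    # let fsck restore the superblock from its backup copy
mklv -t jfslog -y fslog01 rootvg 1  # create a new 1-LP log LV; fslog01 is a hypothetical name
logform /dev/fslog01                # format the new log
chfs -a log=/dev/fslog01 /          # switch / over to the new log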


rgds,

R.
 
Thanks RMGBELGIUM,

Right now I'm at home and don't have access to the server, but would those commands work while I'm in maintenance mode?

fsck will work, I guess, but what do you mean by <filesystem>?

I tried to make a new log with the logform command, but it wasn't successful; I still had the problem after doing this.

I just entered maintenance mode and tried to access rootvg without mounting the filesystems, but when I exit from there to mount the filesystems, the error I mentioned above (/etc/getrootfs [586]: 3*** Killed) loops forever.
 
khalidaaa, the answer I provided was just for fixing a corrupted superblock or JFS log, but I looked up the error code:

LED 554:

The IPL device could not be opened or a read failed (hardware not configured or missing). The system halts.

So this really doesn't look good...
My knowledge of LPARs unfortunately doesn't extend far enough to help you out here...

rgds,

R.
 
Is your zoning still correct? You mentioned a replaced/newly assigned FC adapter. Can the LPAR still access its LUNs?

It isn't trying to boot from another LPAR's LUNs, I hope?
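(One quick way to check this from the maintenance shell, sketched on the assumption that the replacement FC adapter was configured when the LPAR was booted from the install media:)

Code:
lsdev -Cc adapter | grep fcs   # is the new Fibre Channel adapter Available?
lsdev -Cc disk                 # are the SAN LUNs visible as hdisks?
lspv                           # does one of them carry the rootvg PVID?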


HTH,

p5wizard
 
Thank you RMGBELGIUM :) You've already tried, and I appreciate it.

Yes, you're right p5wizard, I did fix the zoning on the SAN switch, and I believe the LPAR does see the AIX 5.2 boot disk through this Fibre Channel adapter, because it was showing as a possible boot device!
 
I found an article that suggests running this command if the superblock is corrupt:

dd count=1 bs=4k skip=31 seek=1 if=/dev/hd4 of=/dev/hd4
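(That dd copies the backup superblock, which JFS keeps at block 31, over the primary superblock in block 1 of the same logical volume. A minimal sketch of the repair plus a re-check afterwards, assuming /dev/hd4 is the JFS root filesystem:)

Code:
dd count=1 bs=4k skip=31 seek=1 if=/dev/hd4 of=/dev/hd4   # restore the backup superblock
fsck -y /dev/hd4                                          # re-check the filesystem afterwards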

but even when I exit afterwards to mount the rootvg filesystems, I get the previous message (/etc/getrootfs [586]: 4*** Killed) (note that this time it starts with 4***)

and the LED is showing 0c48!?!

Any idea?
 
I rebooted the system into maintenance mode again and accessed rootvg without mounting the filesystems.

I issued

lslv -m hd5

and the output showed that I have hdisk15!?!

lspv gave me a series of disks from hdisk0 all the way up to hdisk15!?!

I think it's all because I have 15 LUNs in the SAN for this LPAR's data and one LUN for rootvg! But I don't know why rootvg is now hdisk15?!?

Anyway, I changed the bootlist (normal and service) to hdisk15, and when I tried bosboot -ad /dev/hdisk15

it said /usr/bin/ksh: bosboot: not found

Then I mounted /usr

mount /dev/hd2

and tried bosboot again, but whenever I issue this command a "Killed" message appears.
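(For reference, a minimal sketch of the usual order in the maintenance shell, assuming hdisk15 really is the rootvg boot LUN: /usr has to be mounted first so bosboot is available, then the boot image is rebuilt, then the boot lists are set:)

Code:
mount /dev/hd2               # mounts /usr so bosboot (and ls) are available
bosboot -ad /dev/hdisk15     # rebuild the boot image on the boot LUN
bootlist -m normal hdisk15   # normal-mode boot list
bootlist -m service hdisk15  # service-mode boot list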

So I tried to exit (to mount all of rootvg's filesystems) and I got the following:

Code:
/etc/getrootfs[542]: 4552 Killed
The mount of /dev/hd4 did not succed/
Exiting to shell.

INIT: EXECUTING /sbin/rc.boot 2

INIT: SINGLE USER MODE
/usr/bin/sh killed by signal 9

INIT: FATAL ERROR IN /usr/bin/sh

INIT: FATAL ERROR IN /usr/bin/sh
XIX s-shell
#

And then it gave me back the shell!

And whenever I issue any command, it says

Code:
# df
df
df killed by signal 9

What the hell should I do with that now :(

Any help is appreciated.

Regards,
Khalid
 
Go to maintenance mode and do an fsck on the following: hd1, hd2, hd3, hd4, hd9var. Then run /usr/sbin/logform /dev/hd8 and answer yes to the question.
Then do lslv -m hd5; this will tell you where the ODM thinks the boot logical volume lives. Do an ls -l /dev/hdisk(number from the lslv command) and compare it to ls -l /dev/ipldevice; the major/minor numbers should be the same. If not, remove ipldevice, then do ln /dev/hdisk(number) /dev/ipldevice, then do a bosboot -ad /dev/ipldevice. If after all this it still will not boot, you need to reinstall.
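(A command-by-command sketch of that sequence from the maintenance shell, assuming the standard rootvg layout and, per the lslv output reported later in this thread, hdisk15 as the boot disk:)

Code:
fsck -y /dev/hd1             # /home
fsck -y /dev/hd2             # /usr
fsck -y /dev/hd3             # /tmp
fsck -y /dev/hd4             # /
fsck -y /dev/hd9var          # /var
/usr/sbin/logform /dev/hd8   # reformat the JFS log; answer yes
lslv -m hd5                  # where does the ODM think the boot LV lives?
ls -l /dev/hdisk15 /dev/ipldevice   # compare major/minor numbers
rm /dev/ipldevice                   # only if the numbers differ
ln /dev/hdisk15 /dev/ipldevice      # relink ipldevice to the boot disk
bosboot -ad /dev/ipldevice          # rebuild the boot image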
 
I can't do ls in the shell because the filesystems are not mounted?!?

I did this

lspv -l hdisk15

and it gave me the filesystems that are on hdisk15 (/tmp, /usr, /home, /var, /opt, /, paging)

but when I issue the ls command I get this

/usr/bin/ksh: ls: not found

because /usr is not mounted?!?

When I mount /usr, anything I issue after that gets Killed!

Any other suggestions?

Regards,
Khalid
 
Why don't you follow the directions? Boot into maintenance mode from the boot media, then select maintenance, then select the boot disk(s), select option 2 (start a shell before mounting), then follow the instructions I gave.
 
Well, I followed exactly what you said, plamb :) If you read my previous comments, I was doing exactly what you describe above, but it's not working. OK, I will do it again, in exactly the sequence you mentioned, and I'll let you know.
 
plamb, I did exactly what you said, in the sequence you gave me.

All the filesystems returned the message "File system is clean".

When I ran logform on hd8, it said it would destroy the log, and I answered yes.

Then I did lslv -m hd5 and it showed that the boot disk is hdisk15.

I tried to do ls -l /dev/hdisk15 but it returned
/usr/bin/ksh: ls: not found?!?!

bosboot returned a similar message as above.

So what's next now?

Regards
Khalid
 
Now I know why it was pointing to hdisk15!

It is all because of my LUN assignments to the host in the SAN config. So I removed the host, created it again, and then added the rootvg LUN first (so it becomes hdisk0), followed by the remaining data LUNs.

Now I get hdisk0 as the boot disk when I go into maintenance mode, but this still didn't solve the problem!!!
 
OMG :)

It was a silly mistake I made :p

When I created the LUNs in the same sequence they were in before I deleted the host on the SAN, it now boots with no problems!?! I guess this is because the ODM has hdisk0 as the boot disk, not hdisk15 as it was before!

Now it is running like a charm :)

Thanks guys for the help

Regards,
Khalid
 