teqmod (TechnicalUser)
22 Jun 09 16:22
Over the weekend we lost an array twice. Here are the specs of the system:

Dell PE 1750 w/onboard Perc and Raid 1 136GB drives
Powervault 220 Array enclosure
7 x 136GB HDD drives
Adaptec 2120S RAID controller

This machine is an Exchange server.

The enclosure showed that we lost 4 drives within the same 30-second period, and we lost the array. IDs 9, 12, 13 and 14 were reported lost. We shut down the system and reseated all drives. It came up with drive 12 failed and the array degraded. According to the log, drive 12 was the first to fail when all the drives went down. We reseated drive 12 again, and the array rebuilt and was up and running as normal.

Two hours after the rebuild we lost drive 5, and the logs showed only drive 5 as failed. This drive has since been replaced and the array is currently rebuilding. Since this was multiple hardware failures, we grabbed another temporary enclosure, attached it to the onboard RAID controller, configured an array, and moved the Exchange DBs off the problematic array onto the temp array.

The problem is that I now do not know where to go to resolve the issue. I am not convinced it is a simple drive failure, since different drives have been reported each time. Now that none of the DBs are on the array in question, it seems fine, but it is also just sitting there not doing anything. Does anyone know of any test software to do read/write tests on this array? Has anyone seen an issue like this before?

 
technome (IS/IT--Management)
23 Jun 09 8:38
After using multiple diag programs on RAID arrays over the last 20 years, I have yet to find one that works reliably on RAID arrays, even if the individual drives are placed on standard drive interfaces for testing. Some diags may pick up bad sectors and will pick up very obvious drive failures, but you would be surprised how many drives pass all tests with no errors, hanging off a standard disk interface in constant testing for weeks, only to fail once placed back into an array. The only reliable testing is with a hardware drive-testing device. That said, it could be any one of the drives that have not been replaced.

This situation could be caused by a bug in the RAID adapter firmware, so make sure it is the most up-to-date version. Less likely is hard drive firmware, unless the drives are certain Seagate models with known issues. Mixed firmware revisions across the drives do not help the situation.

Reseat all cables.

Look at the RAID management software logs; any drive which has soft/hard errors is more likely to be an offending drive than a drive which shows no errors.
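
As a rough illustration of that log check, here is a minimal Python sketch that tallies logged errors per drive ID so the worst offenders stand out. The `(drive_id, severity)` tuples and the function name `rank_drives_by_errors` are hypothetical; this is not the Adaptec log format, and the sample entries are made up for the example. You would have to pull the real entries out of the management software yourself.

```python
from collections import Counter

def rank_drives_by_errors(events):
    """Return (drive_id, error_count) pairs, most errors first.

    `events` is an iterable of (drive_id, severity) tuples. This shape is
    hypothetical and is not the Adaptec log format.
    """
    counts = Counter(drive_id for drive_id, _severity in events)
    return counts.most_common()

# Made-up sample entries, for illustration only.
sample = [(12, "hard"), (9, "soft"), (12, "soft"), (5, "hard"), (12, "hard")]
for drive_id, count in rank_drives_by_errors(sample):
    print(f"drive ID {drive_id}: {count} logged error(s)")
```

A drive at the top of this list is only a suspect; a drive with no logged errors can still be the one at fault.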

Pull the drives out with the power off. Make sure you know which slot each drive comes from; number them with a magic marker. Examine each drive's PCB carefully: do any chips show abnormal hot spots?

 

........................................
Chernobyl disaster..a must see pictorial
http://www.kiddofspeed.com/default.htm

jkupski (MIS)
23 Jun 09 13:50
I agree in general with technome, but I'll go on to say that the failure mode makes a controller or enclosure related problem far more likely than a problem with one or more of your disks.
teqmod (TechnicalUser)
23 Jun 09 14:03
I am replacing the one drive just because. I am just not very confident in the array at this time and want to test it before I put it back in production. I was hoping there would be test software, or even disk burn-in software, that I could run through the array to see if any errors come up. I am also considering just moving the array off the Adaptec card and onto the onboard PERC card. This would at least take one of the possible components out of the picture.
technome (IS/IT--Management)
23 Jun 09 14:20
The onboard PERC will not accept the array from the Adaptec; you may cause more problems by trying.
Dell has diags you can download from its support site or get from the disks you received upon purchase.

teqmod (TechnicalUser)
23 Jun 09 14:52
Currently there is no data on the array. After it failed, all data was moved to a separate array that was temporarily attached. I was thinking of attaching the drive enclosure to the onboard controller, creating a new array, and migrating the data to it.

Not sure if the Dell diagnostics will work through the current Adaptec. Getting it out of the picture might be a good idea just from this perspective.
technome (IS/IT--Management)
23 Jun 09 16:40
Sounds like a good move, literally. It might be good to create the array, let the initialization take place, and repeat the process a number of times; this will stress the drives, heating up the electronics. At least if it is going to fail again, you will have spent very little time finding out.
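
For the read/write test asked about earlier, a file-level check can complement the repeated initialization. What follows is only a minimal Python sketch, not a vetted burn-in tool: it writes random data to a file on the array, reads it back, and compares SHA-256 checksums over several passes. The function names and the example path are assumptions. Keep in mind that the read-back may be served from the operating system's cache rather than from the disks, so the test size should be well above the server's RAM for the reads to mean much.

```python
import hashlib
import os

def write_verify_pass(path, total_bytes, chunk_bytes=1024 * 1024):
    """Write random data to `path`, read it back, and compare checksums."""
    written = hashlib.sha256()
    remaining = total_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            block = os.urandom(min(chunk_bytes, remaining))
            f.write(block)
            written.update(block)
            remaining -= len(block)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data out to the device

    read_back = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_bytes), b""):
            read_back.update(block)
    return written.digest() == read_back.digest()

def burn_in(path, total_bytes, passes):
    """Run several write/verify passes; return how many passes mismatched."""
    failures = 0
    for _ in range(passes):
        if not write_verify_pass(path, total_bytes):
            failures += 1
    if os.path.exists(path):
        os.remove(path)  # clean up the test file
    return failures
```

A call such as `burn_in(r"E:\burnin.bin", 64 * 1024**3, passes=5)` (hypothetical drive letter and size) would return the number of passes whose data did not read back intact. A zero result does not prove the array is healthy, so watch the controller logs while it runs.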
