
SQL Server 2000 Failover Cluster Logic

Status
Not open for further replies.

JabbaTheNut (Programmer, US; joined Jul 29, 2002; 176 messages)
I am a relative newbie, so forgive me.

I have been reading some material on how to set up a SQL Server 2000 failover cluster. The material describes two SQL Server 2000 servers and a shared storage device. The logic is that if one server fails, the other will take over. This sounds great for a failure of the SQL Server machine itself. But what about the shared storage device? The descriptions I've read illustrate only one shared device. Wouldn't I still be just as much at risk of a hardware failure? I understand that shared storage devices have system boards, controller cards, power supplies, etc. This sounds like just another server to me. How does adding two more servers reduce the risk if I am only using one shared device?

What am I missing? Is my assumption that a shared storage device is simply another server with more disks incorrect? Please educate this newbie. I am trying to decide on a course of action for a database-oriented website. Thanks :)
Game Over, Man!
 
JabbaTheNut,

I'm no expert, but I think the main point of failover clustering is that you get not just the redundancy of RAID on the storage device (which isn't necessarily in a server) but also high availability, because you are protected against other hardware failures (e.g. a network card failure) and software failures, whether in W2K or SQL Server. When the primary node fails, the secondary node takes over the workload.
 
Microsoft clustering is really very "basic". It protects against server failures only. To protect against disk storage failures, look for a SAN storage solution ...
 
Your points are well taken. However, I still wonder about the shared storage device. Isn't there a risk of hardware failure (e.g. controller cards)? How do you protect yourself from this? Consider the following two scenarios:

1. A single SQL Server machine with a RAID-5 disk array.

2. Two clustered SQL Server machines with a separate RAID-5 shared storage device.

In both cases there is RAID-5 disk failure protection.

In the case of scenario 1, if there is a controller card failure, you are down. There is no failover to protect you.

In the case of scenario 2, if there is a controller card failure in the shared storage device, you are also down. There is no failover to protect you. Granted, you have failover protection if one of the SQL Server machines goes down. But it seems that you don't have adequate protection for the shared storage device.
Game Over, Man!
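The comparison of the two scenarios above can be put into rough numbers. The following is only a sketch: it assumes components fail independently, treats failover as instant and perfect, and uses made-up availability figures (`SERVER`, `CONTROLLER`, `RAID5_DISKS` are illustrative assumptions, not measurements from any real hardware).

```python
from math import prod


def series(*availabilities):
    """Availability when every component must be up."""
    return prod(availabilities)


def parallel(*availabilities):
    """Availability when at least one redundant component must be up."""
    return 1 - prod(1 - a for a in availabilities)


# Illustrative, assumed availabilities (fraction of time each part is up).
SERVER = 0.99        # one SQL Server machine (hardware + OS + SQL Server)
CONTROLLER = 0.995   # one RAID controller
RAID5_DISKS = 0.9999  # the RAID-5 disk set itself

# Scenario 1: a single server with its own RAID-5 array.
scenario_1 = series(SERVER, CONTROLLER, RAID5_DISKS)

# Scenario 2: two clustered servers, one shared array with one controller.
scenario_2 = series(parallel(SERVER, SERVER), CONTROLLER, RAID5_DISKS)

# Scenario 2 with a second, redundant controller in the shared array.
scenario_2_dual = series(
    parallel(SERVER, SERVER),
    parallel(CONTROLLER, CONTROLLER),
    RAID5_DISKS,
)

for name, value in [
    ("scenario 1", scenario_1),
    ("scenario 2", scenario_2),
    ("scenario 2, dual controllers", scenario_2_dual),
]:
    print(f"{name}: {value:.6f}")
```

Under these assumptions the cluster improves on the single server, but its availability can never exceed that of the single controller, which is the concern raised in the post. Duplicating the controller removes that ceiling in this idealized model.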
 
You can always write scenarios where you can't fail over, etc. We had the scenario you mentioned happen to us: a two-node cluster with two shared arrays, and one of the shared arrays lost a RAID controller. Each shared array has two RAID controllers, but in this case the second controller was unable to take over the job of both. We replaced the bad controller and were back up without any data loss due to the hardware failure. (We had other problems that did cause us data loss, but it wasn't due to the configuration... it was a software issue.)

What's the best solution, if you could have whatever you wanted and money wasn't a concern? Have an on-site active-passive cluster and an off-site active-passive cluster that mirrors your operational cluster. That way, if you lose the operational one, you can immediately bring the off-site one into operation. However, what happens if you lose both systems at the same time? Like I said at the beginning, you can keep making up scenarios where, no matter what you do, the system is lost.

-SQLBill
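One way to make this reasoning concrete is to list which single component failures take the whole system down. The sketch below is a toy model, not a description of any real cluster product: the tuple-based structure (`"part"`, `"all"`, `"any"`), the `cluster` helper, and the component names are all hypothetical.

```python
def is_up(node, failed):
    """Return True if the (sub)system is up given a set of failed part names."""
    kind = node[0]
    if kind == "part":
        return node[1] not in failed
    results = [is_up(child, failed) for child in node[1:]]
    # "all": every child is required; "any": children are redundant.
    return all(results) if kind == "all" else any(results)


def parts(node):
    """Collect the names of all parts in the (sub)system."""
    if node[0] == "part":
        return {node[1]}
    return set().union(*(parts(child) for child in node[1:]))


def single_points_of_failure(system):
    """Parts whose failure alone brings the whole system down."""
    return sorted(p for p in parts(system) if not is_up(system, {p}))


def cluster(site):
    """A two-node cluster with one shared array (hypothetical layout)."""
    return (
        "all",
        ("any", ("part", f"{site} node A"), ("part", f"{site} node B")),
        ("part", f"{site} shared array"),
    )


one_site = cluster("on-site")
two_sites = ("any", cluster("on-site"), cluster("off-site"))

print(single_points_of_failure(one_site))
print(single_points_of_failure(two_sites))

# Losing both arrays at once still takes everything down.
print(is_up(two_sites, {"on-site shared array", "off-site shared array"}))
```

In this model the single-site cluster has one single point of failure (the shared array), the mirrored two-site design has none, and a simultaneous loss at both sites still brings the system down, which matches the point that you can always construct a scenario the design does not cover.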
 
I concur with SQLBill, but may I add that a SAN storage solution will protect against controller failures, though it is expensive. Many server vendors offer 100% redundant servers with protection against even CPU failures.

Dell has this resource on SAN:

Read about NEC redundant servers for Win2K:

Last, Stratus gave impetus to redundant Windows servers and is still leading the pack in this area. You may find it helpful to visit their web site. Great stuff!
By the way, I do not work for any of the companies mentioned above :)

Good luck
 