
General Exchange 2007 CCR question

Status
Not open for further replies.

SirCam (MIS) · Aug 27, 2002 · 70 · AU
Hi, I am running Exchange 2007 SP1 in a new environment. There are no active users, so usage is very light. We are using CCR for our mailbox server. Last week the network cables were pulled on the servers and the cluster resources failed. Restarting them today, one storage group came up fine, but the other two failed with replication issues and I had to perform a manual reseed on the passive node. Thinking back, this is the fourth reseed I have done in this environment in the last few months, which I think is four too many. It seems to crap itself far too easily. Has anyone noticed this with CCR?
Thanks, Cam.
 
I think it's fair to say that you are in the category of users for whom clustering will undoubtedly reduce availability.
CCR doesn't do what you're seeing so you do have a problem that you've introduced rather than one that's a flaw in Exchange.
 
Maybe. But I would expect CCR to be more resilient than what it is at the moment. I think when it comes time to implement 2007 into prod we will do log shipping at the SAN level and not use CCR. More reliable and faster.
 
The point is that CCR is resilient. The fact is that you have a configuration problem somewhere. You can't do "log shipping" at the SAN level. What you can do is snapshot and replicate the volume/LUN (whatever terminology you use on your SAN), which amounts to the same thing.
Using SAN replication rather than CCR has its benefits but also drawbacks. Remember that you won't be able to do an instant failover between nodes/servers when you're replicating.

I would suggest you format the boxes and start again. Perhaps follow a completely separate set of guides - heck, there are enough of them around - to implement it and test failovers.
 
Maybe it is a configuration issue, but we have seen these problems in the test network and in a brand new domain with basic services and no users. If we lose a server and it requires a reseed 20% of the time, that tells me that it is not very resilient and has much room for improvement.

Have you had any occasions where you've had to reseed?
 
Oh, I'd suspect that the database copy has diverged; that's why you need to reseed. If your drive is undersized from a performance perspective, it's easy to get behind on log replay, and the database diverges.

 
No, the servers are brand new puppies, dual CPU, 16GB RAM, connected via crossover cable for cluster traffic and 1Gb for normal traffic (They are kept in the same room). 4 x 300GB RAID 5 disks, so performance shouldn't be an issue.
 
Actually, no. Four 300GB disks are massively wrong for an Exchange 2007 database. Even if you mean you have five disks and 1.2TB of storage available to you, the performance will just suck, and suck badly!

If I was your consultant called in to have a look at this I would tell you flat that your server was wrong, wrong, wrong for Exchange 2007. Your disk latency (queueing) is going to be a major issue once you get past a handful of users.

Forget about the CCR for now, the server is out of whack.
 
Lol, which hardware company do you have shares in? Trust me, latency is not an issue across RAID 5 300GB disks. No, it's not ideal for a busy system, but it will be more than adequate for a handful of users, especially now when there are no active users.

Test domain has same new servers but with 146GB 15k RAID 1 disks (2 disks per OS, db, and log volumes), no users, same issue.

CCR seems to be a bit touchy. Just wondering if you have had to reseed and under what circumstances.
 
Just you keep a good eye on the old Version Buckets then and you'll be ok.

But as far as reseeding goes, I've done a lot of lab failovers and failbacks, etc. Under normal circumstances I've never seen the problem you seem to be having, and I've never heard of similar issues in any of the forums that I watch. And this is the kind of thing that, if you don't see it in the lab, you're never going to see in production, because firstly you have decent kit in production and secondly you don't treat Exchange anywhere near as badly as in the lab.
 
"4 x 300GB RAID 5 disks"

Mistake #1. I'd say performance is an issue.

If P is the IOPS of a single spindle and N is the number of spindles in the array, then for RAID 5:

Write performance = P * (N-1) / 4, or, with 15K spindles, 130 * (4-1) / 4 = 97.5 IOPS

Read performance = P * (N-1), or 390 IOPS with 15K spindles.

With the 1:1 read:write ratio of Outlook users in cached mode, that works out to roughly 124 IOPS.
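A quick sketch of the arithmetic above, using the post's own formulas; the ~130 IOPS per 15K spindle figure is the post's assumption, not a benchmark:

```python
def raid5_write_iops(p, n):
    """Write IOPS for a RAID 5 set: each host write costs four back-end
    I/Os (read data, read parity, write data, write parity), spread over
    the N-1 data spindles."""
    return p * (n - 1) / 4

def raid5_read_iops(p, n):
    """Read IOPS: reads are served in parallel by the data spindles."""
    return p * (n - 1)

P, N = 130, 4  # ~130 IOPS per 15K spindle, four-disk RAID 5 set
print(raid5_write_iops(P, N))  # 97.5
print(raid5_read_iops(P, N))   # 390
```

The write numbers fall off a cliff because of the parity read-modify-write cycle; the read path is comparatively healthy, which is why RAID 5 only looks good for read-heavy workloads.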



With more than just a few users (I'd assume you have that if you have 900GB available for the DBs) your performance will suck. Even with the improvements in log replay in SP1, your performance will still suck. See for yourself: use Perfmon to collect the PhysicalDisk counters for your database and log drives (Avg. Disk sec/Write and Avg. Disk sec/Read). Pay particular attention to them during log replay.
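A minimal sketch of how you might eyeball the counter data once it's exported from Perfmon. The 0.020 s (20 ms) threshold is an assumption on my part (a commonly cited guideline for Exchange database volumes), not a value from the thread:

```python
def latency_ok(samples_sec, threshold=0.020):
    """Average a list of Avg. Disk sec/Read or Avg. Disk sec/Write
    samples (in seconds) and report whether the average stays at or
    under the assumed 20 ms guideline."""
    avg = sum(samples_sec) / len(samples_sec)
    return avg, avg <= threshold

# Hypothetical samples captured during log replay (seconds per I/O)
avg, ok = latency_ok([0.010, 0.050, 0.100, 0.080])
print(ok)  # False: averaging ~60 ms, the disks are falling behind
```

If the averages spike like this during log replay, the passive copy can't keep up, the copy diverges, and a reseed is the eventual result.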

 
See now, xmsre, I was trying to be diplomatic :)
Anyway, since the gloves are off, you are so right. Performance will be utterly, utterly god awful.
 
Thanks xmsre, that gives me something to work with. I'll check it out. Cheers.
 
Prior to SP1 there was only one weak excuse for RAID 5, and that was on the replication target, and only if you used a lot of small drives. With the log replay improvements in SP1, even that argument fell apart.

RAID 5 is well suited to applications that read a lot and write very little (read:write ratio over 5:1). A file server is potentially an application that would benefit from RAID 5. Exchange 2007 writes as much as it reads. RAID 5 will require many more spindles than RAID 1 or 10 to meet the write requirements. Probably the most common mistake in sizing storage for Exchange is to forget about the performance requirements and focus only on space. It's a path that results in RAID 5 sets with few large spindles and always leads to performance problems.
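To put a rough number on "many more spindles": here is a sizing sketch comparing RAID 5 (write penalty 4) with RAID 10 (write penalty 2). The 130 IOPS-per-spindle figure and the 500-write-IOPS workload are illustrative assumptions, not numbers from the thread:

```python
import math

def spindles_needed(write_iops, spindle_iops=130, write_penalty=4):
    """Spindles required to absorb write_iops of host writes, given
    each host write costs write_penalty back-end I/Os on the array."""
    return math.ceil(write_iops * write_penalty / spindle_iops)

# Hypothetical write-heavy load of 500 host write IOPS
for name, penalty in [("RAID 5", 4), ("RAID 10", 2)]:
    print(name, spindles_needed(500, write_penalty=penalty))
# RAID 5 16
# RAID 10 8
```

Sizing by capacity alone hides this: a few large RAID 5 spindles can hold the databases easily while delivering half the write throughput of a RAID 10 set with the same spindle count.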



 
RAID5 = no brainer. Strap all disks together, never run out of room.
RAID1 = "but it's only 2 disks".

From now on, Exchange servers should be scaled using IOPS not size even by small companies.
 
RAID 1 (2 drives, write penalty of 2), RAID 10 or 0+1 (more than 2 drives, write penalty of 2), or RAID DP (3 or more drives, the more the merrier; no write penalty). The lower the write penalty, the better for applications that write a lot.
 