Hello all –
I have a mob of disenchanted users about to lynch me because of a problem that I cannot get to the bottom of. To simplify, let me summarize one aspect of it: I recently migrated from Microsoft Exchange Server 5.5 to Exchange 2003 (2K3), consolidating three sites in the process. Now, once every few days, the users at the remote sites can no longer connect until I restart the centralized Exchange server.
Although the problem shows up in one particular application (Exchange), I suspect that it might be a problem with the connecting routers and firewalls (I have posted this in the MS Exchange forum as well). Setting Exchange aside for a moment, can someone tell me how I might determine whether a particular router (3640 or 2610) or PIX firewall (515 or 501) is overburdened at a given time?
I’ve run “show process” and “show tcp stat” on the routers, and “show conn” and “show local-host” on the PIXes, and everything appears normal. Even so, I suspect that some kind of lingering connections might be causing this. Exchange opens four RPC connections per client, yet the connections all seem to time out per the defaults on the PIXes (rpc 10 minutes). The 501s (at the remote sites) have a 50 local-host limit; the 515 has unlimited inside hosts. Is there a practical limit to the number of connections involving one host (the Exchange server) that can be open on the routers or PIXes?
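To check the lingering-connection theory, I could capture “show conn” from the 515 to a text file and tally it offline. Below is a rough Python sketch of that idea, not a tested tool. It assumes each connection line looks like `TCP out <ip>:<port> in <ip>:<port> idle h:mm:ss Bytes N flags ...`, that the Exchange server is on the “in” side, and that 600 seconds (the 10-minute rpc timeout) is a sensible cutoff for “stale”. The function name and the sample addresses are made up for illustration.

```python
import re
from collections import Counter

# Assumed shape of a PIX "show conn" line (adjust the pattern if yours differs):
# TCP out 192.168.10.25:1045 in 10.1.1.5:135 idle 0:00:22 Bytes 1774 flags UIO
CONN_RE = re.compile(
    r"^\s*(TCP|UDP)\s+out\s+(\S+?):(\d+)\s+in\s+(\S+?):(\d+)"
    r"\s+idle\s+(\d+):(\d+):(\d+)"
)


def tally_connections(text, idle_threshold_s=600):
    """Count connections per inside host and list those idle past the threshold.

    Returns (Counter keyed by inside IP, list of (outside IP, inside IP, idle seconds)).
    """
    per_inside = Counter()
    stale = []
    for line in text.splitlines():
        m = CONN_RE.match(line)
        if not m:
            continue  # skip the "N in use" summary line and anything unrecognized
        _proto, out_ip, _out_port, in_ip, _in_port, h, mi, s = m.groups()
        idle = int(h) * 3600 + int(mi) * 60 + int(s)
        per_inside[in_ip] += 1
        if idle >= idle_threshold_s:
            stale.append((out_ip, in_ip, idle))
    return per_inside, stale


# Hypothetical capture; 10.1.1.5 stands in for the Exchange server.
SAMPLE = """\
3 in use, 12 most used
TCP out 192.168.10.25:1045 in 10.1.1.5:135 idle 0:00:22 Bytes 1774 flags UIO
TCP out 192.168.10.25:1046 in 10.1.1.5:1026 idle 0:12:05 Bytes 90210 flags UIO
TCP out 192.168.20.7:2301 in 10.1.1.5:1026 idle 0:00:03 Bytes 512 flags UIO
"""

if __name__ == "__main__":
    counts, stale = tally_connections(SAMPLE)
    for ip, n in counts.most_common():
        print(f"{ip}: {n} connections")
    for out_ip, in_ip, idle in stale:
        print(f"stale: {out_ip} -> {in_ip}, idle {idle}s")
```

Comparing the per-host count against roughly four connections per remote client, just before and just after a failure, might show whether connections to the Exchange server are piling up or lingering past the timeout.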
Maybe memory is a problem. The 3640 is maxed out at 128 MB of RAM, but the (local) PIX 515 has only 32 MB, and the (remote) 501s have 16 MB each. I’m at a loss as to how the Cisco devices might be contributing to this, but I don’t know what else to try.
This is a tough problem in that it affects only one application, MS Exchange, and only the remote users of that application. I just don’t have enough troubleshooting tools or experience to nail this one down. One obvious thing I will try the next time this happens is restarting the WAN devices one by one to see if that has any effect (assuming the users don’t kill me for the delays). I’m not sure what that will prove, though, without a better idea of the causes.
Any suggestions would be appreciated.