Need help with AD Communication between sites

Status
Not open for further replies.

knappagh1

Technical User
Sep 16, 2004
82
GB
I'll try to explain the setup and the problem as best I can, but please be patient with me. I'm new to AD troubleshooting and it's taken me about a week to fully figure out the setup alone.

We have 2 external sites, each with a 2Mb BIP connection to our main site. Initially, I believe the plan was to have a DC on each site and for each site to be a child domain of the domain in the main site. I'm not sure how this is achieved, but we were led to believe that each site would be independent, in that if the link between one of the external sites and the main site went down, the external site would not be affected. Its users would still be able to log on to their local domain without any hassle.

The first time the connection to a site went down, the users were frozen out. They were unable to log on to their local domain, nor were they able to access any information stored on the local servers.

So the decision was then taken to change the setup in this site so that its users log on to the domain in the main site. The DC in that site, I am led to believe, is now an equal DC within the domain. So now, when the users in that site log on, they are doing so to the main domain, but through the local DC. This works fine until the link is broken. Once that happens, the same problem outlined above occurs.

Complications.
The main complication in the external site with all the problems is that there are 2 companies on the site. We have some of our own staff there, who share the site with a sister company. It's only a small company run by the brother of our MD, but we are IT support for them. That's OK until even the smallest thing goes wrong; then they make noise and IT takes all the flak. So when they cannot access their information, IT should just leave the country!!!! As a result we tend to keep a low profile in that site and would like to do as much as possible from our main site.

And to complicate things further, the guy who set everything up has since left the company. He was the only one in the IT dept with AD experience. He had also drafted in some outside help (a guy who "understood" AD and how to set it up). That guy has been out on site 3 times now, but is still unable to solve the problem, and won't be back either.

If anyone can help, or if you need more information, just let me know.

Thanks for reading, and sorry for the long post.

Paddy.
 
Ok. Well, it sounds to me like there's some misunderstanding going on, but it should not be that difficult to fix. Let me try to get a grip on what you have. Please correct anything that's wrong:

3 sites (1 main and 2 external)
3 servers (1 at each site)
1 Domain

Is this correct?

If this is correct, I would suggest using a small hub-and-spoke topology for AD. This should give you the ability to fail over if something goes wrong at a single site. Are all of your links through the same provider, so they could be considered a WAN? Or are they completely separate networks? Is a DC at one site able to communicate completely with a DC at another site, without interruption from a firewall or something?
 
I'll give you a full run down of exactly what we have on each site.

Site A (Main Site)
3 servers running 2003: one as the DC and the other 2 as file servers. We also have 3 2000 servers running mail, backup etc.

Site B (Problem site)
3 servers: 1 running 2003 (DC), 1 running 2000 (file server) and an NT4 server acting as a mail server.

Site C (2nd External Site)
1 Server running 2000 (DC)

The connections between the 3 sites are all identical and maintained by the same company. All the sites have a different IP range.

As far as I'm aware, there is/was a relationship set up between the DCs on each site, so they are able to communicate fine.

Paddy
 
How many people are at each site? Does each site have static IPs, so the servers can be easily located without any DNS tricks to find them?

Before beginning, make SURE you have your sites and subnets set up in Active Directory Sites and Services (ADSS). If you do not, AD will not know how to fail over to another DC properly. Make sure you set up all of your subnets and make sure the sites are identified properly. In ADSS, under Inter-Site Transports\IP, make sure your DEFAULTIPSITELINK has all your sites included. Now you should create a link under Inter-Site Transports\IP for each site you have set up. For example, you should have one named From Site 1 to Site 2. This should be the same for all sites. I would set all replication intervals to 15 minutes. The cost is not really important. You then just select the sites that will replicate in that link. So, for the Site 1 to Site 2 link, you would select both sites. This is just a redundant way to make sure replication occurs. Once this is done, you are all set to proceed.
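
To make the site link layout a bit more concrete, here is a rough Python sketch that only models the links on paper; it does not talk to AD at all. The site names, the `SiteLink` class and the `hub_and_spoke_links` helper are made up for illustration, and the cost of 100 is just a placeholder since the cost is not important here.

```python
from dataclasses import dataclass
from typing import List, Tuple


# A paper model of one site link entry. This is NOT an AD object,
# just a record of what you would type into the site link dialog.
@dataclass(frozen=True)
class SiteLink:
    name: str
    sites: Tuple[str, str]
    cost: int
    interval_minutes: int


def hub_and_spoke_links(hub: str, spokes: List[str],
                        cost: int = 100,
                        interval_minutes: int = 15) -> List[SiteLink]:
    """Build one link from the hub site to each remote site."""
    return [
        SiteLink(name=f"From {hub} to {spoke}",
                 sites=(hub, spoke),
                 cost=cost,
                 interval_minutes=interval_minutes)
        for spoke in spokes
    ]


# Hypothetical site names: Site 1 is the hub, Sites 2 and 3 are remote.
links = hub_and_spoke_links("Site 1", ["Site 2", "Site 3"])
for link in links:
    print(link.name, link.sites, f"every {link.interval_minutes} min")
```

Each printed line corresponds to one link you would create by hand, with both member sites selected.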

BTW, make SURE that your DCs and clients are using IPs that fall within the subnets defined in Sites and Services. If not, failover will not work.

I would try a setup like this:

Site A

1 DC with all FSMO roles. If you have available hardware, I would split the PDC RID Master role from the Infrastructure role. This is a Microsoft best practice. This server will be a "HUB" DC. It should run DNS as well and function as a "master" DNS. "Master" DNS is only figurative, so don't take it literally. I would also consider standing up a WINS server. I know that without Exchange it is not said to be necessary, but in my experience AD is much happier with WINS around. All DCs will run DNS and will forward all unknown requests to this server. This server will then forward all unknown requests to your ISP's DNS or another upstream DNS. The DNS client on this box should point to itself as primary, and that would be it unless you have another DNS server in the same site.

Site B

1 DC running DNS. This DNS server will forward unknown requests to the DC at the main site. The DNS client on the DC will point to itself as primary and the Site A server as secondary. All clients in this location will point to DNS the same way. All machines should point to the single WINS server as well, wherever it is.

Site C

1 DC running DNS. This DNS server will forward unknown requests to the DC at the main site. The DNS client on the DC will point to itself as primary and the Site A server as secondary. All clients in this location will point to DNS the same way. All machines should point to the single WINS server as well, wherever it is.
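
If it helps to see the DNS layout for the three sites in one place, here is a small Python sketch that writes the plan down as data and follows the forwarding chain. It is only a sanity check on paper, not a DNS tool; the server labels ("Site A DC", "ISP DNS" and so on) and the `forwarding_path` helper are invented for the example.

```python
from typing import Dict, List

# The plan as data. For each DC: which DNS servers its own client uses
# (in order) and where its DNS service forwards unknown requests.
DNS_PLAN: Dict[str, Dict[str, object]] = {
    "Site A DC": {"client_dns": ["Site A DC"], "forwarder": "ISP DNS"},
    "Site B DC": {"client_dns": ["Site B DC", "Site A DC"], "forwarder": "Site A DC"},
    "Site C DC": {"client_dns": ["Site C DC", "Site A DC"], "forwarder": "Site A DC"},
}


def forwarding_path(server: str, plan: Dict[str, Dict[str, object]]) -> List[str]:
    """Follow the forwarders from one DC until we leave the plan."""
    path = [server]
    while path[-1] in plan:
        nxt = plan[path[-1]]["forwarder"]
        if nxt in path:
            # Two servers forwarding to each other would never resolve anything.
            raise ValueError(f"forwarding loop at {nxt}")
        path.append(nxt)
    return path


for dc in DNS_PLAN:
    print(dc, "->", " -> ".join(forwarding_path(dc, DNS_PLAN)[1:]))
```

The remote DCs should each show a path through the main site DC and then out to the upstream DNS, and every remote DC should list itself first in `client_dns`.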

Once this is all set up, I would force the KCC to run. You can do it by running this command:

repadmin /kcc DC_server_name

When this is done, wait a few minutes and go into Sites and Services. Under each site you defined, drill all the way down to NTDS Settings. You should now see an automatically generated replication object. This is good.
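
If you have several DCs and do not want to type the command for each one, a small wrapper along these lines could do it. This is a sketch only: it assumes repadmin is installed and on the PATH of the machine you run it from, and the DC names and helper functions are placeholders I made up. The only repadmin switch it uses is the /kcc one shown above.

```python
import subprocess
from typing import Callable, Dict, List


def kcc_command(dc_name: str) -> List[str]:
    """Build the argument list for: repadmin /kcc <dc_name>."""
    return ["repadmin", "/kcc", dc_name]


def run_command(cmd: List[str]) -> int:
    """Run one command and return its exit code (0 normally means success)."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    print(result.stdout or result.stderr)
    return result.returncode


def refresh_topology(dc_names: List[str],
                     runner: Callable[[List[str]], int] = run_command) -> Dict[str, int]:
    """Ask the KCC on each DC to recalculate; returns exit codes by DC."""
    return {dc: runner(kcc_command(dc)) for dc in dc_names}


# Example call (placeholder DC names, uncomment on a machine with repadmin):
# refresh_topology(["dc-site-a", "dc-site-b", "dc-site-c"])
```

Nothing runs until you call `refresh_topology` yourself, so you can read through it first.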

Once this is all complete, it should work correctly. If the line at site B goes down, they should be able to authenticate against their local DC. If their DC goes down, they should be able to authenticate against the other DCs. I would also consider removing the 2000 DC or reloading it as 2003. A native 2003 environment is always a better way to go. Another note would be to get rid of the NT servers and tighten AD security. That NT hole in your AD security could end up being a major downfall.
 
Hi djtech2k,

Thanks for the reply. Answers to your questions first then a few queries.

Site 1 - 80 users.
Site 2 - 15 users.(including both companies)
Site 3 - 5 users.

In each site the servers all have static IPs.

Sites and Subnets.
You mention setting up sites and subnets in the sites and subnets section of AD. As far as I'm aware, the subnet masks for all the sites are set to 255.255.255.0. Each site has a different IP range.

Site 1: 192.168.3.xxx
Site 2: 192.168.1.xxx
Site 3: 192.168.10.xxx

Is this setup correct?

Also, you mentioned FSMO roles and the PDC RID Master role. What are these and what are they used for?

I will start looking into this in more detail tomorrow. I have only an hour left till home time, so all being well you will have a couple of stars winging their way to you!!!!!


 
Ok. The user distribution is OK, but I would maybe reconsider having a DC at site 3 since there are just 5 users. That's not really a major deal as of now, but it's food for thought.

As for sites and subnets. Here is what I am saying:

1) Go into ADSS and make sure that you have a site defined for each location you have.

2) Make sure that you have a site link set up for replication from the HUB site to the remote sites. There should be an entry for each. This can be viewed under Inter-Site Transports\IP.

3) Go under SUBNETS and make sure that all of your subnets are defined. For example, if your site 1 uses 192.168.3.xxx, then you will need to define 2 subnets, 1 for 192.168.3.low and 1 for 192.168.3.high. So, when you create your subnets for site 1, you should create them with the following choices:

(192.168.3.LOW)
network = 192.168.3.0
subnet = 255.255.255.128
site = site1 (you should have already created this site)

(192.168.3.HIGH)
network = 192.168.3.128
subnet = 255.255.255.128
site = site1 (you should have already created this site)

Follow this method for all sites and subnets. Once this is all correct, you should force the KCC to run with the command I posted. This will have AD automatically review your replication topology and create the appropriate connection objects for replication. If this does not work within a few minutes, something is wrong.
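
If you want to double-check the network and mask values for the other sites before typing them in, Python's standard `ipaddress` module can work them out. This is just a calculator sketch and the `split_in_half` helper is my own name for it. Note that the two halves together cover exactly the same addresses as the single 255.255.255.0 range.

```python
import ipaddress
from typing import List, Tuple


def split_in_half(cidr: str) -> List[Tuple[str, str]]:
    """Split a network into its low and high halves.

    Returns (network address, subnet mask) pairs as strings.
    """
    network = ipaddress.ip_network(cidr)
    return [(str(half.network_address), str(half.netmask))
            for half in network.subnets(prefixlen_diff=1)]


# Site 1 uses 192.168.3.xxx with mask 255.255.255.0, i.e. 192.168.3.0/24.
for net, mask in split_in_half("192.168.3.0/24"):
    print(f"network = {net}   subnet = {mask}")
# network = 192.168.3.0   subnet = 255.255.255.128
# network = 192.168.3.128   subnet = 255.255.255.128
```

Swap in 192.168.1.0/24 or 192.168.10.0/24 to get the values for the other two sites.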

Do you understand the method behind this madness? :)

The whole idea is to have subnets listed and have them associated to a site. Then, you need to make sure that each machine at each location is in a defined subnet in ADSS. This way when a machine asks for authentication, AD consults the topology and knows that you have a DC in your site because you are using a subnet assigned to your site and the DC is using a subnet in your site. Then your machine knows it can authenticate to your local DC. If it is not online, it will consult ADSS for another DC to authenticate to.
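
Here is the same idea as a toy Python lookup, in case it makes the logic clearer. It is not how AD is implemented, only an illustration: the table uses the three site ranges mentioned in this thread with a 255.255.255.0 mask, the client addresses are invented, and picking the most specific matching subnet is an assumption of the sketch.

```python
import ipaddress
from typing import Dict, Optional

# Subnet -> site table, like the entries under SUBNETS.
SUBNET_TO_SITE: Dict[str, str] = {
    "192.168.3.0/24": "site1",
    "192.168.1.0/24": "site2",
    "192.168.10.0/24": "site3",
}


def site_for_ip(ip: str, table: Dict[str, str] = SUBNET_TO_SITE) -> Optional[str]:
    """Return the site whose subnet contains ip, or None if no subnet matches."""
    address = ipaddress.ip_address(ip)
    matches = [
        (ipaddress.ip_network(cidr), site)
        for cidr, site in table.items()
        if address in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None  # the machine is not in any defined subnet
    # Prefer the most specific (longest prefix) subnet.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def same_site(client_ip: str, dc_ip: str) -> bool:
    """True only when both addresses map to the same defined site."""
    client_site = site_for_ip(client_ip)
    return client_site is not None and client_site == site_for_ip(dc_ip)


print(site_for_ip("192.168.1.25"))                # site2
print(same_site("192.168.1.25", "192.168.1.2"))   # True
print(site_for_ip("10.0.0.5"))                    # None
```

A `None` result is the case to watch for: a machine whose address is not covered by any subnet has no site to look for a local DC in.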

In the event that a site loses connectivity to the other sites, it should still be able to authenticate users for a time, until the outage goes on too long. In that type of emergency, you could tell the local DC to assume its own FSMO roles, such as PDC Emulator. Then, when the link came back up, you would have to drop those roles so it would not conflict with the HUB DC in your main site. It is very unlikely that you should ever have to do that. If you have small outages every once in a while, you should be fine with your local DC at each site.

As for FSMO roles, if you want to know the details, I would read up on them at MS. Basically, there are a few roles, each performing a specific task, that can be spread across your DCs. For example, one role is PDC Emulator. This is kind of an offshoot of what used to be a PDC in NT. The PDC Emulator is the "parent" DC, if you will. These roles can reside on 1 server, or be split amongst different servers.
 
Hi djtech2k,

I managed to get the above looked at today. I got the sites and subnets set up the way you suggested and went to run the kcc command. Where exactly do I run this from? I went to Start > Run and typed "repadmin /kcc mgn-dc" (mgn-dc is the DC's name), but I get an error message telling me "Windows cannot find repadmin..."

Also, the DC on site 3 will be needed in the future. It's only a new site and will expand quite quickly, so we are trying to be proactive and get the setup right before it gets too many more users!!!
 
You will want to run the repadmin command from a command prompt.

To make sure you have all of the tools available, I would install the latest resource kit and support tools on your server and your workstation. Repadmin is in the support tools for 2003. I think replmon is as well and I use it all the time.
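
A quick way to confirm whether the tools are actually reachable from the command prompt is to check the PATH. The sketch below uses Python's standard `shutil.which`; the `missing_tools` helper is just my own wrapper, and it assumes Python is available on the machine you are checking, which may not be the case on your server.

```python
import shutil
from typing import Dict, List, Optional


def locate_tools(names: List[str]) -> Dict[str, Optional[str]]:
    """Map each tool name to its full path, or None if it is not on the PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(names: List[str]) -> List[str]:
    """Return only the tools that could not be found."""
    return [name for name, path in locate_tools(names).items() if path is None]


if __name__ == "__main__":
    for tool, path in locate_tools(["repadmin", "replmon"]).items():
        print(f"{tool}: {path or 'not found on PATH'}")
```

If repadmin shows as not found even after installing the support tools, the install folder is probably not on the PATH, and running the command from that folder should work.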
 
We're just testing the connection between the sites at the minute, now that everyone is on lunch. I'm not at site two, but I believe they can log on to the computers but are unable to access the information on the servers.

OK, upon further investigation, it appears that when the connection goes down, they are logging on locally, and because of this and the way the shares are set up, they cannot access the information. The shares are currently set up so that username/domain-name can access them, but not username/local-computer. I have instructed them to try to log on to the domain when the connection is down and see what happens then, so at the minute I'm waiting for some feedback.
 
They may be able to get cached credentials from their local machines. Are you saying that the internet connection is down, or just their DC? And where is this file server they are accessing?
 
We simulated the connection being down between the sites by turning off the router at their end. So, after following your above instructions, they should be logging on to the DC in that site. That's what we were testing over lunch. The connection has since been restored and I'm waiting on a report at the minute.

The file server is in site 2.

And now, on top of everything else, since before lunch and before we turned the router off, one of the users at site 2 cannot log on. I have checked and I'm able to log on at this site using the username/password combo, but they're having no joy over there. You can, however, log on to the computer as administrator. We checked and site 2 is now logging on to its local DC. Is there something I have overlooked that would cause this?
 
Ok, so a remote site loses its internet connection. It has a local DC, but users cannot log in. How is the client machine getting its IP? Are its IP and the IP of the local DC defined in sites and subnets?
 
I assume they're still pulling their IPs from the server in Site 1. All of the IPs are defined as above in the sites and subnets.
 
I would check the IP on the station that the person cannot login to. It is critical that the station IP and the local DC IP are in the same site in AD.

Also, I would check the event viewer on the local DC to see if it recorded anything specifically about the user unable to login.
 
I have been talking to my colleague who was in Site 2 yesterday. It appears that they are getting the same error (i.e. not being able to log on with the correct username/password combo) when the connection is down, and unfortunately when the connection is up as well. We have since turned the DC off in site 2, to allow the users to log on to the domain. They can log on without any problems. It would appear that the DC in site 2 is not recognising the usernames or passwords or both, and it's not forwarding them to the DC at the main site for authentication. Would this be a correct assessment?

Also, when the connection goes down, they are unable to log on with cached credentials. This means they can only log on to the local workstation, which in turn means they will not be able to get access to the shares on the local file server. This is a permissions issue which we are looking into, but it's a workaround we don't want to have to rely on.
 
Your assessment seems fair, but I do not know why it would be that way. The DC in site 2 that is causing problems: is that DC 2003? How long ago was it built? Are all DCs running SP1?
 
The DC in site 2 is a new machine and build. We originally had an old server, but about 6/7 weeks back it developed terminal problems with its RAID controllers and is now a rather large paperweight!!! So we got a fairly beefy PC to use as a DC in site 2. So the machine itself is, I would think, 3 maybe 4 weeks old at the most, with 2003 Server installed. The machine itself was built at site 1 and has since been moved to site 2. Would this cause the mentioned conflicts?
 
I doubt it, but with the symptoms you are having it almost seems like a configuration problem with this DC or with AD in general. So if you unplug this DC from the network the users at site 2 can log on, but if you put it online they cannot? If that's so, what's the error they get? Are there any errors in Event Viewer?
 
I'm not sure of the exact error message, but I'm being told that it's similar to the "wrong password" error message received when a wrong password is entered. I'm also unable to get hold of the event logs from the machine. It's turned off at the moment, and I have no way of getting hold of it until Monday morning.

We're starting to think along the lines of it being something in the AD setup and not so much the DC at site 2 (although we're hoping it's the site 2 DC), but neither of us here has much AD experience and no training whatsoever. I'm heading to a training course on "Planning, Implementing & Maintaining a MS Windows Server 2003 Active Directory Infrastructure" in a couple of weeks, but if we could get this sorted before then, it would be ideal. Is there any way to check the AD setup for something that might be obvious to yourself?
 