Tek-Tips is the largest IT community on the Internet today!

Suddenly our entire domain can't print or obtain their profiles from s

Status
Not open for further replies.

MichealC4

Programmer
Jun 26, 2003
457
I'm losing serious sleep, and hair, over this. I've worked one case with Microsoft and am about to work another, and can't get this resolved.

Here's the rundown of what's happening.

- Profiled and non-profiled* machines can't print. It isn't everybody (I've not yet had trouble printing, knock on wood), but it is a lot of people and seems to be getting worse.

- The first case was about new users logging in to a machine for the first time: they would get an error saying the profile could not be found, and Windows would give up logging in. One of the machines was fixed by putting the Default User profile on the network and letting Windows download it during login. That seemed to work for a few others as well, and to my knowledge they aren't having any trouble. However, that hasn't done much else for us. The printing problem appeared after I enabled ICMP echo and echo reply as appropriate to and from our servers so that Group Policy would work.

- Profiles are not downloading for a lot of profiled* users.

* = profiled/non-profiled meaning profiles are stored on the server with a mandatory setting (ntuser.man) where appropriate. Otherwise profiles are stored on the given machine, such as mine. If you need more information, let me know.

----------------------------
"Will work for bandwidth" - Thinkgeek T-shirt
 
I'm just going to throw this out there, and I am trying to assist with your issue. Because you posted in this forum I am assuming you are running Active Directory. The ntuser.man file tells me that you are setting up mandatory profiles because you want to manage the user desktop environment. I remember using mandatory profiles in the old NT4 days. My suggestion, because of the issues that mandatory profiles have been known to cause, is to start utilizing Group Policy within AD. Create a plan to ditch the ntuser.man and move to managing your desktop environments via GPOs. I know it is probably not what you want to hear right now, I just don't want to see you losing more hair.

If you are not Running Active Directory, there is an NT4 Forum that you can post to. Maybe someone there will be able to assist.

Good luck to you...
 
Yes, I am running Active Directory. Running mandatory profiles (which is what I stated above) is not my choice. What problems are associated with mandatory profiles? I'm not aware of anything, and Microsoft hasn't mentioned migrating to a different solution. Thanks.

----------------------------
"Will work for bandwidth" - Thinkgeek T-shirt
 
Mainly the problem that I have seen with mandatory profiles in NT was corruption, especially when accessing them over a network share. I am surprised that Microsoft has not mentioned the use of Group Policy Objects, as this is the newer method of managing the user environment. GPOs, from my understanding, were the answer to replacing mandatory and roaming user profiles. Again, I know that this information does not solve the issue at hand. But if you are able to convince those in charge that this is the direction to go in, it would make your job much easier and the users' experience less of a headache.

I do wish you the best of luck in resolving the issue with Microsoft, I can only imagine that Management and Users are giving you a hard time. We have all been there....
 
Thanks. A hard time is a bit of an understatement. Pretty much everything is at a standstill. Network drives can be accessed only occasionally, printing hardly works on shared printers, profiles aren't working correctly. Big mess.

----------------------------
"Will work for bandwidth" - Thinkgeek T-shirt
 
You don't provide any details about your network. How many servers are involved? How many are Domain Controllers and how many are Global Catalogs? How is DNS configured?

Roaming or mandatory profiles serve a different purpose than Group Policies. Group Policies can be used to lock down an environment, but that is different from a mandatory profile, where the user is not allowed to save documents or add icons to the desktop.

I would begin troubleshooting this by looking at the server event logs of each DC to see if there are any replication problems. I have seen the problem you are talking about many times and there can be different causes of the problem.

1. If the server profile is corrupt, you will have problems.
2. If there are problems with replication and your users are authenticated by one server one time and another at a later time, you might have intermittent problems.
3. If the server is running low on disk space you will have unexpected results.

These are just a few things for you to begin investigating, but the place to start is the event logs, looking for an indication of what might be wrong. Check the local workstation logs too after a failed login attempt.
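
Of the three causes listed above, low disk space is the easiest one to rule out with a script. The following is only a rough sketch: the 10% threshold and the path are assumptions made for illustration, not values from this thread, and it needs to be pointed at the volume that actually holds the profile share.

```python
import shutil


def is_low(free_bytes, total_bytes, min_free_fraction=0.10):
    """Return True when the free share of a volume is below the threshold."""
    if total_bytes <= 0:
        # Size unknown; treat it as something worth a manual look.
        return True
    return free_bytes / total_bytes < min_free_fraction


def volume_is_low(path, min_free_fraction=0.10):
    """Check the volume that holds `path` using the standard library."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return is_low(usage.free, usage.total, min_free_fraction)


if __name__ == "__main__":
    # Hypothetical path: replace "." with the drive or share holding the profiles.
    profile_volume = "."
    print(profile_volume, "low on space:", volume_is_low(profile_volume))
```

This only covers point 3. Corruption and replication problems still have to be found in the event logs on each DC.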

I hope you find this post helpful.

Regards,

Mark

Check out my scripting solutions at
 
To answer in order:

(I don't mean to be terse, I'm just running around trying to fix this and other problems)

We have 8 DCs, 4 of which are DNS servers. 2 of the 8 DCs are Global Catalog servers, and one of those two also serves DNS.

We have updated the NIC drivers on two of the servers, and that helped things a bit, but things are still not fixed. And it only helped when the network load was low (i.e., at the end of the day with fewer students). When the network load was higher (i.e., morning/afternoon when lots of students were using computer labs and wireless), things still failed. Another concern is that this happened suddenly. It didn't happen bit by bit, but all at once.

I've checked the profiles on the server to ensure there isn't corruption, but that still doesn't explain why those who don't have profiles on the server are having problems.

----------------------------
"Will work for bandwidth" - Thinkgeek T-shirt
 
Verify how DNS is configured....

DNS Settings:

Configure the server NIC to list only itself or other DCs; no ISP DNS should be configured in the NIC's TCP/IP properties.

In DHCP, set the DNS scope option to only provide the IP of your local DNS server

For any statically configured IPs, make sure the DNS only lists local DNS servers and not ISP DNS.

In the DNS snap-in on the forwarders tab enter your ISP DNS.
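
If you have the DNS server lists from the DHCP scopes and the statically configured machines written down somewhere, the rule above (local DNS only, ISP DNS only as a forwarder) can be checked mechanically. This is a minimal sketch, assuming the internal address ranges are known. The addresses below are made up for illustration, and 203.0.113.53 is a documentation address standing in for an ISP resolver.

```python
import ipaddress


def external_dns_entries(configured, internal_networks):
    """Return the configured DNS addresses that fall outside every internal network."""
    nets = [ipaddress.ip_network(n) for n in internal_networks]
    return [
        addr
        for addr in configured
        if not any(ipaddress.ip_address(addr) in net for net in nets)
    ]


if __name__ == "__main__":
    # Hypothetical values: an internal 10.x range and one client's DNS list.
    internal = ["10.0.0.0/8"]
    client_dns = ["10.1.1.10", "10.1.1.11", "203.0.113.53"]
    print(external_dns_entries(client_dns, internal))  # ['203.0.113.53']
```

Anything the function returns is an entry that should live on the forwarders tab rather than on a client or server NIC.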

I hope you find this post helpful.

Regards,

Mark

Check out my scripting solutions at
 
Yes, that's all set correctly.

----------------------------
"Will work for bandwidth" - Thinkgeek T-shirt
 
Have you checked the event logs on each of your 8 DCs for any clues? If so, what errors have been recorded?

Also, what kind of network infrastructure do you have? Is this a switched environment? Have the workstations and switches been set for a specific speed? If both are set to Auto-Detect it will cause problems.

I hope you find this post helpful.

Regards,

Mark

Check out my scripting solutions at
 
Yes, switched environment. Kinda hard to run such a large environment without 'em. :p

As for speed, some are, some aren't (don't ask me why, I haven't a clue), but that's not caused issues, and wouldn't explain why all of a sudden we've had problems.

We did replace a switch that serves our servers in this location, but we are still not sure if that's fixed the issues.

----------------------------
"Will work for bandwidth" - Thinkgeek T-shirt
 
Couple of other things...

What was the last change made before having the issue? Did this issue occur after you replaced the switch?

Have you scanned servers and workstations for viruses and worms?
 
The last changes that were made were, per Microsoft's request, we enabled ICMP to and from our servers and put the Default User folder on the netlogon share.

----------------------------
"Will work for bandwidth" - Thinkgeek T-shirt
 
As for speed, some are, some aren't (don't ask me why, I haven't a clue), but that's not caused issues, and wouldn't explain why all of a sudden we've had problems.

We did replace a switch that serves our servers in this location, but we are still not sure if that's fixed the issues.

Did you replace the switch before or after you started to have these problems? Different switches behave differently with respect to the auto-negotiation. In my experience when using the same brand of NIC as the switch it tends to work well but when you cross manufacturers there are problems. I suggest you specifically set the port and NIC on one test user and the same for all servers and see if that resolves the issue.

I hope you find this post helpful.

Regards,

Mark

Check out my scripting solutions at
 
The replacing of the switch was something we did yesterday. We've had this problem for a few days now.

----------------------------
"Will work for bandwidth" - Thinkgeek T-shirt
 
Do you have a network sniffer that can determine if one of your devices is generating excessive network activity?

And how about viruses and worms?

And I agree with Mark - I would at least attempt to hard code the server NICs and the switch ports they are connected to. Check to see if your switch has an error log and look for an abundance of connect/disconnect errors.
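
If the sniffer can export a capture as a list of source addresses and byte counts, picking out a device that dominates the traffic takes only a few lines. The sketch below assumes that export format (pairs of source and size) and a 50% share as the cutoff; both are assumptions for illustration, not features of any particular sniffer.

```python
from collections import Counter


def top_talkers(records, share_threshold=0.5):
    """Return (source, bytes) pairs for sources at or above the given share of all traffic."""
    totals = Counter()
    for source, size in records:
        totals[source] += size
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    return [
        (source, count)
        for source, count in totals.most_common()
        if count / grand_total >= share_threshold
    ]


if __name__ == "__main__":
    # Hypothetical capture summary: (source address, bytes seen).
    capture = [("10.1.1.50", 900), ("10.1.1.51", 50), ("10.1.1.52", 50)]
    print(top_talkers(capture))  # [('10.1.1.50', 900)]
```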
 
Yes, we've already checked for worms, viruses, and other malware, including running rootkit checks on the servers.

As for connect/disconnect errors on the switch, that's what prompted us to replace the switch in the first place.

----------------------------
"Will work for bandwidth" - Thinkgeek T-shirt
 
As suggested, try hard coding the server and switch ports. As Mark has seen, I have also seen errors when the switch and the server are set to auto; in the Cisco world I believe they refer to this as flapping. The switch spends most of its time negotiating the link speed and duplex instead of just sending data. Hard coding eliminates the need to auto-negotiate.
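
To see how much work the hard coding would be, it may help to list the speed/duplex setting on each server NIC next to the switch port it plugs into. The sketch below follows the advice in this thread: it flags any link where the two ends differ and any link where both ends are left on auto. The server names and the "speed/duplex" strings are invented for the example, and the settings have to be collected by hand or from whatever inventory you have.

```python
def link_findings(links):
    """Map each link name to a note when its two ends differ or both are on auto.

    `links` maps a name to a (switch_port_setting, nic_setting) pair,
    where each setting is a string such as "auto" or "100/full".
    """
    findings = {}
    for name, (switch_side, nic_side) in links.items():
        if switch_side != nic_side:
            findings[name] = "mismatch: switch=%s, nic=%s" % (switch_side, nic_side)
        elif switch_side == "auto":
            findings[name] = "both ends on auto"
    return findings


if __name__ == "__main__":
    # Hypothetical inventory of three server links.
    inventory = {
        "dc1": ("100/full", "100/full"),
        "dc2": ("auto", "100/full"),
        "dc3": ("auto", "auto"),
    }
    for name, note in link_findings(inventory).items():
        print(name, "->", note)
```

Links that do not show up in the result are already set to the same fixed value on both ends.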
 
I understand that and I agree. But explaining that to management is a little more difficult than that. However, this has never been a problem before. Right now things are holding just fine, so we'll see Tuesday, once our students come back and load is back on the network, whether or not things hold.

----------------------------
"Will work for bandwidth" - Thinkgeek T-shirt
 
Setting the switch now while load is low is the best time to do it. Changing this won't have any adverse effect. You are simply telling the switch what to support rather than having it try to figure it out.

It seems you are asking for advice but then just discounting it. The fact that something was never a problem before is irrelevant. Things change, drivers are updated, hotfixes are installed. You have now heard from two forum MVPs who have each observed the same behavior before. While there are no guarantees this would resolve your issue, it does eliminate a factor and should not be summarily dismissed because you don't want to talk to management.

I hope you find this post helpful.

Regards,

Mark

Check out my scripting solutions at
 