Enkrypted (TechnicalUser)
17 May 11 9:39
I've been trying to figure out why my system randomly blue-screens (BSOD). It doesn't happen often, but when it does it seems to get stuck in a loop for a while. The last time this happened (about a month ago), this is what I did.

Power Supply - Recently replaced, because the old one was possibly delivering too little or too much power to the system
Video Card - Recently updated to the latest drivers

Memory/Motherboard (Detailed information below)

So I decided to test the memory with Memtest (Crucial Ballistix PC8500 - 4GB in 1GB sticks). At first I kept all memory in the system. Here are the steps I tried:

4 sticks in system - Memtest failed
Removed memory and tried 1 stick at a time - Memtest passed on all 4
Tried 1 stick in every memory slot on MB - Memtest passed on all 4
Tried 2 sticks in both dual channel slots (1 DC slot at a time) - Memtest passed on both
Tried all 4 sticks in system again - Memtest passed

After that I unplugged the system for a minute and started it up. It worked fine until this most recent blue screen/reboot loop. The memory passed in Memtest in several different tests as well as the slots on the motherboard. I'm kind of stumped as to what the problem can be.

I looked at the event log and it is listed as a bugcheck. Here is the event log:

Description:
The computer has rebooted from a bugcheck.  The bugcheck was: 0x0000001e (0xffffffffc0000005, 0xfffff80002907f60, 0x0000000000000000, 0xffffffffffffffff). A dump was saved in: C:\Windows\MEMORY.DMP

The file is around 550MB, and every time I try to open it, the program I use (Notepad) stops responding.

Any help with this is appreciated. TIA

Enkrypted
A+
 

stduc (Programmer)
17 May 11 11:07
Run memtest again without touching the hardware. If you get a fail then the issue is with one or more memory slots. You may be suffering from thermal creep. If it fails without re-seating the RAM and succeeds when you do re-seat the RAM this is almost certainly the issue. You could try cleaning the connections on the edge of the memory boards very carefully with a soft, clean eraser. Also see if you can get the clips to engage as fully as possible. Ultimately you may have to replace the RAM or the mobo though.

Quote (Enkrypted):

When looking at the file, it is around 550MB and everytime I try to open it, the program I try to open with (Notepad) stops responding.

It won't do you much good to open the file in Notepad, or any text editor. But thanks for the tip on Notepad's file size limit! Useful to know. LOL

You can analyse the dump with WinDbg, a free download from Microsoft. It will tell you what was being executed when the RAM failed. But with memory errors you typically get a different result for each dump, so you won't really learn much.
Enkrypted (TechnicalUser)
17 May 11 12:05
Why would it fail initially and then pass the rest of the tests? (Each stick and slot was tested several times and they all passed afterwards). If it was an issue with a slot or particular stick, then I could see it failing consistently and not just one time.

Enkrypted
A+
 

stduc (Programmer)
17 May 11 13:19
OK - I am assuming that the first test that failed was done BEFORE you touched any of the RAM. I am then assuming you did remove/re-insert RAM sticks.

If that is the case then it is likely that one or more slots are suffering from thermal creep. That is to say, every time the PC is used and warms up, the sticks move a bit and eventually stop making a reliable connection. Hence my advice.

You could replace all of the RAM, but that is probably a waste of time and money. You could also try increasing the RAM wait states (loosening the memory timings) in the BIOS; that might be worth a go. Personally, I would re-test without touching the RAM. If it failed, I would re-seat it all and re-test. If the second test passed, I would use the PC until it failed again. I would then re-test the RAM, and if that was again the failure, I would dump the RAM and the mobo, because time is money and you can spend hours on these intermittent faults.
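The re-test sequence described above can be written down as a small decision function. This is only a sketch in Python: the function name, its arguments and the wording of each step are made up for illustration, and the two branches marked as assumptions are not spelled out in the post.

```python
def next_step(untouched_retest_failed, reseated_retest_failed=None,
              failed_again_in_use=False):
    """Return the next action in the re-test sequence (illustrative only).

    untouched_retest_failed: Memtest result with the RAM left untouched.
    reseated_retest_failed:  Memtest result after re-seating (None = not run yet).
    failed_again_in_use:     RAM failed Memtest again after a later crash.
    """
    if not untouched_retest_failed:
        # Assumption: the post does not say what to do if this test passes.
        return "keep using the PC and re-test after the next crash"
    if reseated_retest_failed is None:
        return "re-seat all RAM and re-test"
    if reseated_retest_failed:
        # Assumption: fall back to testing sticks and slots one at a time.
        return "test sticks and slots one at a time"
    if failed_again_in_use:
        return "replace the RAM and the motherboard"
    return "use the PC until it fails again, then re-test the RAM"


if __name__ == "__main__":
    # Untouched test fails, re-seated test passes, RAM later fails again.
    print(next_step(True, False, failed_again_in_use=True))
```

The point of the ordering is that the hardware is only touched after a failure has been reproduced, so a pass after re-seating can be attributed to the re-seating itself.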

If you have the time, try testing one stick at a time in slot one overnight in case that shows up a bad stick. If your mobo will accept one stick in any slot, you could then test the same known-good stick in each slot, again overnight, to see if you have a bad slot. But that could take you a week!
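To see where "a week" comes from, the overnight plan can be listed out. A minimal Python sketch, assuming four sticks and four slots as in the original post; the stick and slot labels are placeholders, and it assumes the first stick passes and is reused as the known-good one, with slot one already covered by the first phase.

```python
def overnight_plan(sticks, slots):
    """List (stick, slot) pairs to test, one pair per night."""
    first_slot = slots[0]
    # Phase 1: every stick, one at a time, in the first slot.
    stick_nights = [(stick, first_slot) for stick in sticks]
    # Phase 2: one known-good stick in each remaining slot.
    # Assumption: the first stick passed phase 1 and is used as known-good.
    known_good = sticks[0]
    slot_nights = [(known_good, slot) for slot in slots[1:]]
    return stick_nights + slot_nights


if __name__ == "__main__":
    plan = overnight_plan(["A", "B", "C", "D"], [1, 2, 3, 4])
    for night, (stick, slot) in enumerate(plan, start=1):
        print(f"night {night}: stick {stick} in slot {slot}")
```

With four sticks and four slots that is four nights for the sticks plus three for the remaining slots, so seven nights in total.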
FredWagner (MIS)
17 May 11 13:27
If you're getting thermal creep as stduc suggests, which seems likely, it could be that your system is running too hot. You've been in the case a number of times, so dust shouldn't be an issue. You ARE closing the case each time you test, I hope - proper airflow from the fans requires that the case be closed so the fans can circulate the air the way they are intended to. Leaving the cover off the case is not a way to help with the cooling - counter-intuitive as that may seem....

Fred Wagner

  

rclarke250 (TechnicalUser)
17 May 11 14:56
I would get the debugging tools from Microsoft to read the dump file; you will also need to download the symbol pack. Then use the tool to read the file. I also agree with Fred, but only to an extent: his advice about keeping the case closed only holds for well-designed cases.
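For anyone who prefers the command line, the same debugging tools package includes cdb, the console debugger, which can run the standard analysis in one shot. Below is a minimal Python sketch that only builds the command; it assumes `cdb.exe` is on the PATH (the install location varies) and uses `.symfix` to pull symbols from Microsoft's public symbol server rather than a downloaded symbol pack. The helper names are illustrative.

```python
import subprocess


def build_cdb_command(dump_path, cdb_exe="cdb.exe"):
    """Build the argument list for a one-shot crash-dump analysis with cdb."""
    debugger_script = "; ".join([
        ".symfix",      # point the symbol path at Microsoft's public symbol server
        ".reload",      # reload symbols for the modules in the dump
        "!analyze -v",  # verbose automatic analysis of the bugcheck
        "q",            # quit the debugger when done
    ])
    # -z opens a dump file, -c runs the given debugger commands at startup.
    return [cdb_exe, "-z", dump_path, "-c", debugger_script]


def analyze_dump(dump_path, cdb_exe="cdb.exe"):
    """Run cdb on the dump and return its text output (needs the tools installed)."""
    result = subprocess.run(build_cdb_command(dump_path, cdb_exe),
                            capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    # Only prints the command; call analyze_dump() on a machine with cdb installed.
    print(build_cdb_command(r"C:\Windows\MEMORY.DMP"))
```

The same three commands (`.symfix`, `.reload`, `!analyze -v`) can be typed into the WinDbg command window after opening the dump there.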
BadBigBen (MIS)
17 May 11 18:07
1. Instead of using Notepad to open the file, use WordPad; it can handle larger files.

2. As to the 0x0000001E: KMODE_EXCEPTION_NOT_HANDLED BSOD, it is most likely related to the RAM module creepage.

Possible Resolutions to STOP 0x0A, 0x01E, and 0x50 Errors
http://support.microsoft.com/kb/183169
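
The stop code and parameters from the event log entry can also be decoded mechanically. A minimal Python sketch, assuming the parameter layout Microsoft documents for bug check 0x1E (first parameter is the unhandled exception code, second is the address where the exception occurred); the lookup tables hold only the values seen in this thread, and the function names are illustrative.

```python
import re

# Small lookup tables; only the values that appear in this thread are included.
BUGCHECK_NAMES = {0x1E: "KMODE_EXCEPTION_NOT_HANDLED"}
NTSTATUS_NAMES = {0xC0000005: "STATUS_ACCESS_VIOLATION"}


def parse_bugcheck(message):
    """Extract the stop code and its parameters from an event-log message."""
    match = re.search(r"bugcheck was:\s*(0x[0-9a-fA-F]+)\s*\(([^)]*)\)", message)
    if match is None:
        raise ValueError("no bugcheck found in message")
    code = int(match.group(1), 16)
    params = [int(p.strip(), 16) for p in match.group(2).split(",")]
    return code, params


def describe(message):
    """Return a readable summary of a 0x1E-style bugcheck line."""
    code, params = parse_bugcheck(message)
    # The log shows the 32-bit status sign-extended to 64 bits, so mask it.
    status = params[0] & 0xFFFFFFFF
    return {
        "stop_code": BUGCHECK_NAMES.get(code, hex(code)),
        "exception": NTSTATUS_NAMES.get(status, hex(status)),
        "faulting_address": hex(params[1]),
    }


event = ("The computer has rebooted from a bugcheck.  The bugcheck was: "
         "0x0000001e (0xffffffffc0000005, 0xfffff80002907f60, "
         "0x0000000000000000, 0xffffffffffffffff).")

if __name__ == "__main__":
    print(describe(event))
```

For the entry posted above this reports an access violation raised in kernel mode, which fits bad or badly seated RAM but can equally come from a faulty driver, so it does not settle the question by itself.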

Ben
"If it works don't fix it! If it doesn't use a sledgehammer..."
How to ask a question, when posting them to a professional forum.
Only ask questions with yes/no answers if you want "yes" or "no"

tf1 (TechnicalUser)
18 May 11 7:07
I spent two months on a similar mission trying to track down a BSOD that happened maybe once or twice a week. When it happened, the system froze for around 5-10 seconds and then either went to a BSOD or, at other times, the screen went black. It actually needed a power off to recover: pressing the reset button did nothing.

The logs contained either nothing helpful or nothing at all, and a dump was only rarely written successfully; when it was, it was uninformative.

My MEM tests (which took up to 6 hours to complete) showed nothing. Then, still suspecting memory, I removed two of the modules and hey presto! It didn't matter which two I used or which pair of slots I populated. Crazy!

So I upgraded my modules to 2 x 4GB Crucial Ballistix (PC2-6400) and it works fine (again in any pair of slots).

Strangely, the original memory was Crucial Ballistix 4 x 2GB modules. Fathom that out!

The remainder of my system is an Asus P5Q-VM with a Quad 9550 processor. Graphics were not the problem because it happened with both the HD 5750 and the on-board Intel graphics. All drivers, firmware and software are up to date. So the real problem remains a mystery.
 

Regards: Terry

goombawaho (MIS)
18 May 11 7:45
How come I had never heard of "thermal creep" (as opposed to neighborhood creep and mission creep, which I hear all the time)?

Is it very common?  I must have been living under a mushroom.

Wondering what percentage of RAM testing errors could be attributed to that vs. the memory actually being bad.
stduc (Programmer)
18 May 11 8:15
You probably haven't heard of it because it's a lot less common than it used to be. Once upon a time daughter boards would be visibly 1/2 out of their slots when you opened the case up! Score one for experience and two for age - LOL
goombawaho (MIS)
18 May 11 14:10
I guess I'd heard of it, I just thought it wasn't very much in play any longer.
hairlessupportmonkey (IS/IT--Management)
18 May 11 18:22
Ultra SCSI drives are another example of "creep", in their case more to do with the high-frequency vibration generated by platters spinning at 10K and 15K RPM, and they tend to error out from time to time. Often you can pull them out, put them back, let the array rebuild, and they are fine.

ACSS - SME
