Server 2008 X64 Unexpected Reboots

kensiko

Hello,

I will start this thread without giving all the information; I will add details as needed.

We built a new server two months ago and it is now in use. It's a normal PC box, but it does have server hardware. There is one addition: an OCZ Vertex 2 180GB SSD, which holds only the database. The OS is stored on a RAID 1 array of 500GB hard drives.

This server runs Oracle 11g 64-bit.

I'm an advanced PC user, but I am not an IT professional. This new server was built on my recommendation because we had problems with the limited memory available with 32-bit.

So here is the issue: two weeks ago we discovered that the server was rebooting unexpectedly and at random. There is no memory dump, so I suppose there is no BSOD. I'm still not sure whether this is a power supply problem, a motherboard problem or a Windows problem. We have tried a few things. One thing we haven't done yet is test the RAM. I think we should do that now, but if the RAM were at fault I don't see why no memory dump would be created.

Anyway, as I said, I will add more information as we go. Any suggestion is appreciated.
 
Here are the hardware specs:
[Four screenshots of the hardware specs; images not available]

The video card is a Matrox G200.
 
Hi,

Can you check the Event Viewer when the server reboots?

If there is an entry saying "Last reboot was unexpected", your server hit a BSOD; if not, the other entries should help you understand what is happening.

Just so I know, is this server a DC?
 
Yes, the Event Viewer is how we detected the problem.

We get Event ID 41: Kernel-Power Critical
The system has rebooted without cleanly shutting down first. This error could be caused if the system stopped responding, crashed, or lost power unexpectedly.

I tried to find a memory dump but could not. I checked whether memory dumps are enabled and which folder they are written to: they are enabled, but there is nothing in the specified folder.
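
For anyone who wants to double-check the dump configuration outside the GUI, here is a rough Python sketch. It assumes the standard `CrashControl` registry key (`HKLM\SYSTEM\CurrentControlSet\Control\CrashControl`) with its `CrashDumpEnabled`, `DumpFile` and `AutoReboot` values. The function names are made up for illustration, and the registry read only works when run on the Windows server itself.

```python
import os

# Meaning of the CrashDumpEnabled registry value on Server 2008-era Windows.
DUMP_TYPES = {
    0: "none (no dump will be written)",
    1: "complete memory dump",
    2: "kernel memory dump",
    3: "small memory dump (minidump)",
}

CRASH_CONTROL_KEY = r"SYSTEM\CurrentControlSet\Control\CrashControl"


def describe_dump_setting(crash_dump_enabled):
    """Translate the numeric CrashDumpEnabled value into readable text."""
    return DUMP_TYPES.get(
        crash_dump_enabled, "unknown value %r" % (crash_dump_enabled,)
    )


def read_crash_control():
    """Read the dump settings from the registry (hypothetical helper, Windows only)."""
    import winreg  # imported here so the rest of the file loads on any OS

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, CRASH_CONTROL_KEY) as key:
        enabled, _ = winreg.QueryValueEx(key, "CrashDumpEnabled")
        dump_file, _ = winreg.QueryValueEx(key, "DumpFile")
        auto_reboot, _ = winreg.QueryValueEx(key, "AutoReboot")

    # DumpFile usually contains %SystemRoot%, so expand it before checking.
    dump_path = os.path.expandvars(dump_file)
    return {
        "type": describe_dump_setting(enabled),
        "dump_file": dump_path,
        "dump_exists": os.path.exists(dump_path),
        "auto_reboot": bool(auto_reboot),
    }


if __name__ == "__main__" and os.name == "nt":
    for name, value in read_crash_control().items():
        print(name, "=", value)
```

If the type is anything other than "none" and `dump_exists` is still false after a reboot, that would be consistent with a reset or a loss of power rather than a BSOD, since a Stop error normally gets the chance to write the file.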

What is a server DC?
 
Mhhh, the Kernel-Power error is one of the many unhelpful errors provided by Microsoft.

As you can read in this article:

http://support.microsoft.com/kb/2028504/en-us

Your case looks like a hardware issue. A memory fault, overheating or the power supply are the most common causes. Check the RAM with MemTest (http://www.memtest.org/#downiso), then make sure your server is not running too hot, and finally try to determine whether the power supply delivers enough wattage. If you are not sure, post your server config.
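
The KB article above tells the cases apart using two fields in the Event ID 41 details, `BugcheckCode` and `PowerButtonTimestamp`. Here is a rough Python sketch of that decision logic. The function name and the wording of the results are made up for illustration, and the three cases are paraphrased from the article, so check them against it before relying on this.

```python
def classify_event_41(bugcheck_code, power_button_timestamp):
    """Rough reading of an Event ID 41 entry, following the KB 2028504 cases.

    Both arguments are the integer values shown in the event's Details tab.
    """
    if bugcheck_code != 0:
        # A Stop error (BSOD) happened; the code says which one.
        return "stop error 0x%08X" % bugcheck_code
    if power_button_timestamp != 0:
        # The power button was pressed and held to force the machine off.
        return "power button held"
    # Nothing was recorded: hard hang, reset button, or loss of power.
    return "no stop error recorded (hang or power loss, hardware suspected)"


# Example with both fields at zero.
print(classify_event_41(0, 0))
```

When both fields are zero, Windows never got the chance to record anything, which is why the hardware checks above are the next step.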
 
Yes, this is not very useful. It only tells us the PC rebooted without shutting down. It's the exact same message we get when we press Reset or lose power.

RAM is the next test (we will use the one built into Windows). I also saw that there are updates for the EFI BIOS, but that is riskier; we simply can't work without this server.

Heat is surely not the problem: the 20-30 PCs are in a closed, cooled room, and the temperature is monitored and controlled.
 
Don't use the memory test built into Windows; it can't find everything. Use MemTest :)

Updating the BIOS is OK only if the release notes list a fix for a problem you are experiencing now.
 