Posted

Hello,

 

I will start this thread without giving all the information; I will fill in the details as necessary.

 

We built a new server 2 months ago and it is now in use. It's a normal PC case, but it does have server hardware. There is one addition: an OCZ Vertex 2 180GB SSD, which holds only the database. The OS is stored on a RAID1 of 500GB hard drives.

 

This server runs Oracle 11g 64-bit.

 

I'm an advanced PC user but I am not an IT professional. This new server was built following my recommendation because we had problems with the limited memory available on a 32-bit system.

 

So here is the issue: two weeks ago we discovered that the server was rebooting unexpectedly and at random. There is no memory dump, so I suppose there is no BSOD. I'm still not sure whether this is a power supply problem, a motherboard problem or a Windows problem. We tried a few things. One thing we haven't done yet is test the RAM. I think we should do it now, but if the RAM were at fault, I don't know why no memory dump would be created.

 

Anyway, as I said, I will add information as we go. Any suggestion is appreciated.

Posted

Hi,

 

Can you check the Event Viewer when the server reboots?

 

If there is an entry saying "Last reboot was unexpected", your server had a BSOD. If not, the other entries should help you understand what is happening.

 

Just so I know, is this server a DC?

--------------------------------------------------------

You can also write in French.

You can also write in German.

You can also write in Italian.

--------------------------------------------------------

Posted

Yes, the Event Viewer is how we detected the problem.

 

We get Event ID 41: Kernel-Power Critical

The system has rebooted without cleanly shutting down first. This error could be caused if the system stopped responding, crashed, or lost power unexpectedly.
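
For what it's worth, here is a rough Python sketch of how the details of one of these events could be interpreted once it is copied out of Event Viewer as XML. It assumes the event's data section carries fields named `BugcheckCode` and `PowerButtonTimestamp`, which is how I understand the Event 41 layout; the `SAMPLE` record and the function name are made up for illustration, so check the field names against a real exported event before relying on it.

```python
import xml.etree.ElementTree as ET

# Namespace used by Windows event XML (Event Viewer > Details > XML View).
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# Illustrative record only, not taken from the server in this thread.
SAMPLE = """<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System><EventID>41</EventID></System>
  <EventData>
    <Data Name="BugcheckCode">159</Data>
    <Data Name="SleepInProgress">false</Data>
    <Data Name="PowerButtonTimestamp">0</Data>
  </EventData>
</Event>"""


def _as_int(text):
    """Parse a decimal or 0x-prefixed number; treat anything else as 0."""
    try:
        return int((text or "0").strip(), 0)
    except ValueError:
        return 0


def classify_kernel_power_41(event_xml):
    """Return a rough interpretation of one Kernel-Power 41 event."""
    root = ET.fromstring(event_xml)
    event_id = root.findtext("e:System/e:EventID", namespaces=NS)
    if event_id is None or event_id.strip() != "41":
        raise ValueError("not an Event ID 41 record")

    # Collect the named <Data> fields as raw text.
    data = {}
    for node in root.findall("e:EventData/e:Data", NS):
        name = node.get("Name")
        if name:
            data[name] = node.text

    bugcheck = _as_int(data.get("BugcheckCode"))
    if bugcheck != 0:
        # The event stores the stop code in decimal; show it in hex.
        return f"stop error 0x{bugcheck:08X} (a BSOD happened; look for a dump)"
    if _as_int(data.get("PowerButtonTimestamp")) != 0:
        return "power button was held down"
    return "no stop error recorded (power loss, reset, or hard hang)"


print(classify_kernel_power_41(SAMPLE))
```

If every event on the server comes back with a zero bugcheck code, that would fit with no dump being written, since Windows never got the chance to record a stop error.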

 

I tried to find a memory dump but could not. I checked whether memory dumps are enabled and which folder they go to: they are enabled, but there is nothing in the specified folder.
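
In case it helps anyone doing the same check, this is a minimal Python sketch of reading the dump settings straight from the registry instead of the System Properties dialog. It assumes the settings live under `HKLM\SYSTEM\CurrentControlSet\Control\CrashControl` with the usual `CrashDumpEnabled`, `DumpFile` and `MinidumpDir` values; the helper names are my own, and `read_crash_control` only works on Windows.

```python
# Meaning of the CrashDumpEnabled value under
# HKLM\SYSTEM\CurrentControlSet\Control\CrashControl.
DUMP_TYPES = {
    0: "none",
    1: "complete memory dump",
    2: "kernel memory dump",
    3: "small memory dump",
    7: "automatic memory dump",
}

CRASH_CONTROL = r"SYSTEM\CurrentControlSet\Control\CrashControl"


def read_crash_control():
    """Read the dump settings from the registry (Windows only)."""
    import winreg  # imported here so the rest of the file loads on any OS

    settings = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, CRASH_CONTROL) as key:
        for name in ("CrashDumpEnabled", "DumpFile", "MinidumpDir", "AutoReboot"):
            try:
                settings[name], _type = winreg.QueryValueEx(key, name)
            except FileNotFoundError:
                pass  # value not present on this system
    return settings


def describe_dump_settings(settings):
    """Summarise which dump type is configured and where it would be written."""
    mode = settings.get("CrashDumpEnabled", 0)
    if mode == 0:
        return "dumps are disabled"
    kind = DUMP_TYPES.get(mode, f"unknown ({mode})")
    # Small dumps go to a directory; the other types go to a single file.
    where = settings.get("MinidumpDir") if mode == 3 else settings.get("DumpFile")
    return f"{kind}, written to {where or 'an unspecified path'}"
```

On the server itself, `print(describe_dump_settings(read_crash_control()))` would show the configured dump type and path. Keep in mind that even with dumps enabled, a sudden loss of power leaves Windows no time to write one.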

 

What is a server DC?

Posted

Hmm, the Kernel-Power error is one of the many unhelpful errors provided by Microsoft.

 

As you can read in this article:

 

http://support.microsoft.com/kb/2028504/en-us

 

Your case looks like a hardware issue. A memory fault, overheating or the power supply are the most common causes. Check the RAM with MemTest (http://www.memtest.org/#downiso), then make sure your server is not running too hot, and finally try to determine whether the power supply's wattage is sufficient. If you are not sure, post your server configuration.
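
To give an idea of the wattage check, here is a small Python sketch that adds up the component draws and compares the total with the PSU rating after keeping some headroom. Every number in it is a placeholder and the 20% margin is only my assumption; replace them with the figures from the real spec sheets.

```python
def psu_headroom(psu_watts, component_watts, margin_percent=20):
    """Return (total_load, spare_watts) after reserving a safety margin.

    margin_percent is the share of the PSU rating kept in reserve
    (an assumed rule of thumb, not a manufacturer figure).
    """
    total = sum(component_watts.values())
    usable = psu_watts * (100 - margin_percent) // 100
    return total, usable - total


# Placeholder numbers only; they do not describe the server in this thread.
example = {
    "cpu": 95,
    "motherboard": 50,
    "ram": 20,
    "ssd": 3,
    "hdd_x2": 16,
    "fans": 10,
}

total, spare = psu_headroom(400, example)
print(f"estimated load: {total} W, spare after margin: {spare} W")
```

A negative spare value would mean the power supply is undersized for the estimated load.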


Posted

Yes, this is not very useful. It only tells us the PC rebooted without shutting down. We get the exact same message when we press Reset or when the power goes out.

 

RAM is the next test (we will use the one built into Windows). I also saw there are updates for the EFI BIOS, but that is riskier: we simply can't work without this server.

 

Heat is surely not a problem: the 20-30 PCs are cooled in a closed room, and the temperature is monitored and controlled.

Posted

Don't use the memory test built into Windows; it can't find everything. Use MemTest :)

 

Updating the BIOS is OK only if the release notes list a fix for a problem you are experiencing now.

