IPMI seems to be an endless source of “entertainment”…

The system event log is overflowing with EventID 1004 from IPMIDRV: “The IPMI device driver attempted to communicate with the IPMI BMC device during normal operation. However the operation failed due to a timeout.” The frequency may vary from a couple of messages per day up to several messages per minute.

The BMC (Baseboard Management Controller) is a component found on most server motherboards. It is a microcontroller responsible for communication between the motherboard and management software. The BMC is also used for communication between the motherboard and dedicated out of band management boards such as Dell iDRAC. I have seen these error messages on systems from several suppliers, most notably on IBM and Dell blade servers, but most server motherboards have a BMC.

As the error message states, you can resolve this error by increasing the timeout, and this is usually sufficient. I have found that the Windows default settings for the timeouts may cause conflicts, especially on blade servers. Thus an increase in the timeout values may be in order, as described on TechNet.

Lately though, I have found this error to be a symptom of more serious problems. To understand this, we have to look at what is actually happening. If you have some kind of monitoring agent running on the server, such as SCOM or similar, the error could be triggered by said agent trying to read the current voltage levels on the motherboard. If such operations fail routinely during the day, it is a sign of a conflict. This could be competing monitoring agents querying data too frequently, an issue with the BMC itself, or an issue with the out of band management controller. In my experience, this issue is more frequent on blade servers than on rack-based servers. This makes sense, as most blade servers have a local out of band controller that is continuously talking to a chassis management controller to provide a central overview of the chassis.

Post originally from 2010, updated 2014.04.04 and superseded by EventID 1004 from IPMIDRV v2 in 2016.

If the out of band controllers have a problem, this can and will affect the BMC, which in turn may affect the motherboard. Monitoring of server status is the most frequently used feature, but the BMC is also used for remote power control and is able to affect the state of internal components on the motherboard. We recently had an issue on a Dell M820 blade server where a memory leak in iDRAC resulted in the mezzanine slots on the server being intermittently switched off. In this case, the affected card was the Fibre Channel HBA. Further research revealed this to be a recurring issue. This forum thread from 2011 describes a similar issue. As the iDRAC versions in question are different (1.5.3 in 2010 and 1.51.51 in 2014), I theorize that the issue is related to iDRAC memory leaks in general and not a specific bug. Thus, any iDRAC firmware bug resulting in a memory leak may cause these issues.

Increase the timeout values as described on TechNet.
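
To tell an occasional timeout from one that fails routinely during the day, it can help to count how often EventID 1004 is logged per hour. The following is only a minimal sketch in Python, assuming the event timestamps have already been exported from the System log as ISO 8601 strings. The function names, the sample data and the `busy_threshold` value are hypothetical and not taken from any vendor tool.

```python
from collections import Counter
from datetime import datetime


def events_per_hour(timestamps):
    """Count events per clock hour from ISO 8601 timestamp strings."""
    buckets = Counter()
    for ts in timestamps:
        # Truncate each timestamp to the start of its hour.
        hour = datetime.fromisoformat(ts).replace(minute=0, second=0, microsecond=0)
        buckets[hour] += 1
    return dict(sorted(buckets.items()))


def summarize(timestamps, busy_threshold=30):
    """Summarize event frequency; busy_threshold is an arbitrary, hypothetical cut-off."""
    per_hour = events_per_hour(timestamps)
    return {
        "total": len(timestamps),
        "hours_with_events": len(per_hour),
        "peak_per_hour": max(per_hour.values(), default=0),
        # Hours where the count reaches the threshold, i.e. "several per minute" territory.
        "busy_hours": [h for h, n in per_hour.items() if n >= busy_threshold],
    }


# Hypothetical sample: three events in one hour, one event much later.
sample = [
    "2014-04-04T02:15:00",
    "2014-04-04T02:15:20",
    "2014-04-04T02:47:10",
    "2014-04-04T13:05:00",
]

if __name__ == "__main__":
    for key, value in summarize(sample, busy_threshold=3).items():
        print(key, value)
```

A couple of isolated hours with a single event would suggest the timeout increase is enough, while many busy hours point towards a conflict between monitoring agents or a problem with the BMC or out of band controller.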