Must be a hardware problem...


Today I spent some time working on a relatively complex virtual machine setup (private domain + App-V server + App-V sequencing virtual machine + SCCM + SCOM).  I had this all running on Hyper-V configured on a server core computer off in another room - and I was managing it from my Vista desktop.

The problem was that after an hour or so the performance of the virtual machines went down the drain.  Windows were sluggish, simple operations were taking minutes, etc...  I immediately started reviewing my configuration.  I had created all of my virtual machines as dual-processor virtual machines, but as this was a quad core system it should be able to handle that fine.  I was running the system a little tight on memory - but not so tight that it should be causing the sort of problems I was seeing.

I checked out performance counters - and got a lot of odd readings - but nothing that pointed to an obvious problem.  So taking a stab in the dark, I recalled that I had recently changed the hard disk controller that was being used for my system disk - and decided to try moving it back to see if that changed anything.

I powered the system down remotely, grabbed my screwdriver and tottered off to go and tweak the hardware.  After a minute with the hardware the real problem became obvious.  A loose power cord had wedged itself into the CPU fan, and the CPU had throttled itself down to ~10% capacity in an attempt to stop the whole computer from going up in smoke.

Once the cable was removed (and secured well away from the CPU fan) performance immediately returned to normal. 

Now I just need to figure out how to get SCOM to monitor for hardware events so I can be notified about this kind of stuff in the future - without needing to do all the trouble-shooting ahead of time.

Cheers,
Ben

  • OpsMgr can monitor hardware events out of the box. It all comes down to which hardware vendor and model of machine you are using. HP, IBM and Dell have hardware management packs (MPs) that rely on their own management agents being installed on the machines you want to monitor.

    Whilst these agents are primarily designed for servers, I know that in the case of some of the vendors' software (HP for example) the agent can work on some desktop models.

    Apart from that, it won't be easy as you need to get the information into the Operating System (into the event log or a log file) for OpsMgr to easily collect it.

    Andy

  • To add to that with an actual example: just after standing up OpsMgr for my client last week, he received an alert that the CPU temperature was at a critical level on 4 of their LOB servers, followed by shutdown alerts. After going in at 9pm, he found that the rack's fan had died and the servers had shut down automatically from overheating. The imported HP management pack detected this without any configuration needed. I would assume CPU degradation like yours would have triggered an alert before getting down to 10% as well.

     Another thing to look at is OpsMgr+VMM2008+PRO tips. Through OpsMgr, if information comes in about degraded health on any one host system (Hyper-V), it can move the virtual machines to another host or trigger high availability measures. Just ideas...

  • Hi, is there a new version of VPC under development? I like VPC a lot, but it cannot support 64-bit guests, and I don't like Hyper-V, which is so slow. So I am hoping for a VPC 2009 or so :)
