Larry Osterman's WebLog

Confessions of an Old Fogey
WFP is my new best friend

I've mentioned our computer setup a couple of times before - Valorie's got her laptop, Daniel, Sharron and I each have our own desktop computers, and there are a couple of other machines floating around the house.  Since the kids' machines don't have internet access, we've got a dedicated machine sitting in our kitchen whose sole purpose is to let the kids get their email and surf the net.  The theory is that if they're surfing in the kitchen, it's unlikely they'll go to bad places on the net.

It also means we can easily allow them to run as non admins when they surf but be admins on their machines (which is necessary for some of the games they play).

Ok, enough background.  Last night, I was surfing the web from the kitchen machine, and I noticed that the menu bar on IE had disappeared.  Not only that, but I couldn't right-click on any of the toolbars to enable or disable them.  All the IE settings looked reasonable, IE wasn't running in full screen mode, it was just weird.

Other than this one small behavior (no menus in either IE or other HTML applications, like the user manager and other control panel applets), the machine was working perfectly.  The behavior for HTAs was weird - there was a Windows logo in the middle of the window where the menu bar should be, but that was it.

I ran an anti-spyware and virus scan and found nothing. I went to the KB to see if I could find any reason for this happening, but found nothing.

I even tried starting a chat session with PSS but it never succeeded in connecting.

I must have spent about 2 hours trying to figure out what was wrong.

The first inkling of what was actually wrong was when Daniel asked me to get up so he could read his email - he got this weird message about "Outlook Express could not be started because MSOE.DLL could not be initialized".  That was somewhat helpful, and I went to the KB to look it up.  The KB had lots of examples of this for Win98, but not for XP SP2.  So still no luck.

And then I had my Aha!.  I ran chkdsk /f to force a full chkdsk on the drive and rebooted.

Within a second or so on the reboot, chkdsk started finding corruption on the hard disk.  One of the files that was corrupted was one of the OE DLLs, another was something related to browsing, and there were a couple of other corrupted files.

I rebooted after running chkdsk, and now I got a message that msimn.exe was invalid or corrupt.  I looked at the file, and yup, MSIMN.EXE had a 0 length. Obviously it was one of the files corrupted on the disk.
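
A zero-length binary like this is easy to spot with a script. Here is a minimal sketch in Python, not something that was used on this machine; the function name `find_empty_binaries` and the extension list are made up for illustration, and it only catches files truncated to nothing, not files with bad contents.

```python
import os

def find_empty_binaries(root, extensions=(".exe", ".dll")):
    """Return sorted paths under `root` that match `extensions` and are 0 bytes long."""
    suffixes = tuple(ext.lower() for ext in extensions)
    empty = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.lower().endswith(suffixes):
                continue
            path = os.path.join(dirpath, name)
            try:
                if os.path.getsize(path) == 0:
                    empty.append(path)
            except OSError:
                # Files that cannot be examined are skipped, not reported.
                continue
    return sorted(empty)
```

Pointing it at a directory, for example `find_empty_binaries(r"C:\Program Files\Outlook Express")` (a path assumed here for illustration), returns the list of suspicious files without modifying anything.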

So now I had a system that almost was working, but not quite.

During my trolls through the KB, I'd run into the SFC command.  The SFC (System File Checker) is a utility in XP and Win 2K3 that will verify that all files protected by WFP (Windows File Protection) are valid.  If it finds invalid files, it restores them from the cache directory.  As per the KB article, I ran SFC /SCANNOW and waited for a while.  Darned if it didn't find all the files that had been corrupted and repaired them.
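
The verify-and-restore idea can be sketched in a few lines of Python. This is a toy and not how WFP or SFC is implemented: it simply treats a cache directory as the source of truth and compares SHA-256 hashes, and the names `sha256_of` and `verify_and_restore` are hypothetical.

```python
import hashlib
import os
import shutil

def sha256_of(path):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_and_restore(live_dir, cache_dir):
    """Copy each cached file over its live counterpart if the live copy
    is missing or differs. Returns the names that were restored."""
    restored = []
    for name in sorted(os.listdir(cache_dir)):
        good = os.path.join(cache_dir, name)
        if not os.path.isfile(good):
            continue
        live = os.path.join(live_dir, name)
        if not os.path.isfile(live) or sha256_of(live) != sha256_of(good):
            shutil.copyfile(good, live)  # overwrite the damaged or missing file
            restored.append(name)
    return restored
```

Running it a second time on the same pair of directories should restore nothing, which is a quick sanity check that the first pass actually repaired everything it flagged.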

So Daniel got his email back, IE got its menus back, and the machine seems to be back on its feet again!

Man, I love it when stuff works the way it's supposed to.

 

Btw, my guess is that the data corruptions have either been there for a while and we didn't notice them, or they were introduced during a rapid series of power failures we had on Saturday and Sunday (this machine isn't currently on a UPS so...).

  • Had a similar issue a year ago, where fonts and graphics became weird in dialogs, including the logon window, after I woke up my work laptop in the morning. I immediately suspected some *dlg*.dll, which was confirmed after running a chkdsk. I was in a hurry for a meeting so I went to the next cubicle and asked that guy to give me a copy of his dll on a floppy. Good idea to run SFC. I totally forgot about it and WFP, but I had less than 40 minutes to recover and wasn't thinking clearly. I remember I rushed to the meeting while Windows was rebooting after I replaced the file.
  • Yes NTFS can survive a power failure, but hard drives sometimes cannot.

    I've personally had a case where the power cut off right in the middle of a disk write, and the affected sector was permanently damaged. The disk would just cause some confusing errors on a particular file, with none of the automatic sector remapping that was expected to happen in this case.  Scandisk/checkntfs didn't help either, so I had to back up the data and then zero all sectors using a low-level disk management utility from the manufacturer (I think it was an IBM DeskStar from the infamous DTLA series).

    I'd guess that's what happened with Larry's harddrive.
  • When the power fails, all bets are off, no matter what filesystem you are using.

    The fundamental problem is that the vast majority of hard drives lie about write completion due to cache, and even more lie about write cache flushing.  Due to caching, writes can be not only rearranged but also delayed indefinitely under heavy load.  A journaling filesystem can only detect the corruption, not recover the data.

    If you have write caching on, expect to get data loss during a power failure.  Too many drives simply lie about it, period.  Get a UPS, or turn off write caching and accept the performance hit as the price of reliability.  Or, use an enterprise-class RAID controller and/or drive with a battery-backed write cache.

    (FYI, the FreeBSD Handbook cites the same problem, as does the man page for Linux hdparm.  This is not a Windows problem, this is across the entire storage industry.)
  • The menu bar disappearing and the Windows logo appearing in the middle seems to be often caused by win32k running out of desktop heap; at least that's been by far the most frequent cause I've seen for this particularly annoying problem.
  • Larry, I'd definitely put my money on the power failures. I do mainly computer repair work at the moment, and every time there's a storm or power failure we get a rash of people coming in with problems.

    Every time.

    We try to educate people to turn off/unplug everything during a storm or buy a UPS, but most people aren't convinced that $200 to protect that $2000 PC is a good investment... *shakes head*
  • Most decent HDDs take advantage of the few milliseconds they get before power failure to flush and park the heads. I believe that's what they do when they detect the voltage dropping below a certain threshold. Indeed, power supplies advertise the number of milliseconds they hold up after power failure at maximum load.

    Larry, I really hoped you'd give my question an answer. But, well, I guess there are some things no one wants to get close to.
  • I've also needed SFC /SCANNOW and it has saved me.  My question is this: Say I have WinXP SP2 and I've kept it up-to-date with all the fixes and updates.  I have a problem that is solved by running SFC (thanks to WFP!) and it gets the executable from cache or from the installation CD.  Now, all the updates are logged as having been applied, but the ORIGINAL executable exists within the file system.  When I go to Microsoft Update and ask for updates/fixes, does the scan that takes place merely query the log files (thinking I have the updated executable) or does it properly recognize that the old executable I am using needs to be updated because it has since been superseded by a (hopefully) more secure and/or error-free version (update)?

    Not that I'm paranoid about things (ok, I am, but ...).
  • Ashod, Larry's on the audio team, not the NTFS team. Microsoft is a very big company and nobody knows everything about everything. Perhaps you should ask someone on the NTFS team?  Personally, I've never had this problem, and I've never known anyone else to have it either. Maybe you should do a scan for viruses or spyware. Did you try using Process Explorer or something to see who had a handle to the folder?
  • Dean, I've had essentially the same problem Ashod describes.  (Not quite the same, because scandisk did fix it, and I got ACCESS DENIED errors trying to even open the directory.)  In my case, though, I knew exactly when it happened, though I've forgotten which program I was running that did it.  I seem to recall that it was repeatable, too -- apparently the program was doing something low-level with the filesystem that caused problems.

    In any case, I remember that I ran both chkdsk and scandisk on the affected drive, and only one of them reported the problem (and fixed it).  That may work in Ashod's case as well, if he's only run the one that doesn't fix the problem.
  • OK, if I had even a SINGLE corrupted file (sorry, I'm not a native speaker and it's 5 am), I'd just reformat the drive, provided a reasonable time (say, a month) had passed since the last reformat. One idea: if you're only using general stuff on that computer, I'd advise you to run Linux on it. Even a live-CD version would be fine if you've got compatible hardware, so you wouldn't have to do anything about it, and you seem to know enough about computers to set it up to save settings on the hard drive (you can find how-tos everywhere); even a 1 GB drive would be enough. Anyway, this is what I do with the computers I keep for everyone's use, so I thought I'd share...