April, 2013

  • The Old New Thing

    Dangerous setting is dangerous: This is why you shouldn't turn off write cache buffer flushing

    • 71 Comments

    Okay, one more time about the Write-caching policy setting.

    This dialog box takes various forms depending on what version of Windows you are using.

    Windows XP:

      Enable write caching on the disk
    This setting enables write caching in Windows to improve disk performance, but a power outage or equipment failure might result in data loss or corruption.

    Windows Server 2003:

      Enable write caching on the disk
    Recommended only for disks with a backup power supply. This setting further improves disk performance, but it also increases the risk of data loss if the disk loses power.

    Windows Vista:

      Enable advanced performance
    Recommended only for disks with a backup power supply. This setting further improves disk performance, but it also increases the risk of data loss if the disk loses power.

    Windows 7 and 8:

      Turn off Windows write-cache buffer flushing on the device
    To prevent data loss, do not select this check box unless the device has a separate power supply that allows the device to flush its buffer in case of power failure.

    Notice that the warning text gets more and more scary each time it is updated. It starts out just by saying, "If you lose power, you might have data loss or corruption." Then it adds a recommendation, "Recommended only for disks with a backup power supply." And then it comes with a flat-out directive: "Do not select this check box unless the device has a separate power supply."

    The scary warning is there for a reason: If you check the box when your hardware does not satisfy the criteria, you risk data corruption.

    But it seems that even with the sternest warning available, people will still go in and check the box even though their device does not satisfy the criteria, and the dialog box says right there do not select this check box.

    And then they complain, "I checked this box, and my hard drive was corrupted! You need to investigate the issue and release a fix for it."

    Dangerous setting is dangerous.

    At this point, I think the only valid "fix" for this feature would be to remove it entirely. This is why we can't have dangerous things.

  • The Old New Thing

    Some trivia about the //build/ 2011 conference

    • 12 Comments

    Registration for //build/ 2013 opens tomorrow. I have no idea what's in store this year, but I figure I'd whet your appetite by sharing some additional useless information about //build/ 2011.

    The internal code name for the prototype tablets handed out at //build/ 2011 was Nike. I think we did a good job of keeping the code name from public view, but one person messed up and accidentally let it slip to Mary-Jo Foley when they said that the contact email for people having tax problems related to the device is nikedistⓐmicrosoft.com.

    The advance crew spent an entire week preparing those devices. One of the first steps was unloading the devices from the pallettes. This was done in a disassembly line: The boxes were opened, the devices were fished out, then removed from the protective sleeve. At the end of this phase, you had one neat stack of boxes and one neat stack of devices.

    The advance crew also configured the hall so they would be ready to start once Redmond sent down the final bits of the Developer Preview build. The hall was divided into sections, and each section consisted of eight long tables. Four of the tables were arranged in a square, and the other four tables were placed outside the square, one parallel to each side, forming four lanes.

     
     
           
     
     

    Along the inner tables, there were docking stations, each with power, wired access to a private network, and a USB thumb drive. Along the outer tables, there were desk organizers like this one, ready to hold several devices in a vertical position, and next to the organizer was a power strip with power cables at the ready.

    In this phase of the preparation, the person working the station would take a device, pop it into a docking station, and power it on with the magic sequence to boot from USB. The USB stick copied itself to a RAM drive, then ran scripts to reformat the hard drive and copy all the setup files from the private network onto the hard drive, then it installed the build onto the machine, installed Visual Studio, installed the sample applications, flashed the firmware, and otherwise prepared the machine for unboxing. (Not necessarily in that order; I didn't write the scripts, so I don't know what they did exactly. But I figure these were the basic steps.) Once the setup files were copied from the private network, the rest of the installation could proceed autonomously. It didn't need any further access to the USB stick or the network. Everything it needed was on the RAM drive or the hard drive.

    The scripts changed the screen color based on what step of the process it was in, so that the person working the station could glance over all the devices to see which ones needed attention. Once all the files were copied from the network, the devices were unplugged from the docking station and moved to the vertical desk organizer. There, it got hooked up with a power cable and left to finish the installation. Moving the device to the second table freed up the docking station to accept another device.

    Assuming everything went well, the screen turned green to indicate that installation was complete, and the device was unplugged, powered down, and placed in the stack of devices that were ready for quality control.

    The devices that passed quality control then needed to be boxed up so they could be handed out to the conference attendees. Another assembly line formed: The devices were placed back in the protective sleeves, nestled snugly in their boxes, and the boxes closed back up.

    Now, I'm describing this all as if everything ran perfectly smoothly. Of course there were problems which arose, some minor and some serious, and the process got tweaked as the days progressed in order to make things more efficient or to address a problem that was discovered.

    For example, the devices were labeled preview devices, but shortly before the conference was set to begin, the manufacturer registered their objection to the term, since preview implies that the device will actually turn into a retail product. They insisted that the devices be called prototype devices. This meant that mere days before the conference opened, a rush print job of 5000 stickers had to be shipped down to the convention center in order to cover the word preview with the word prototype. A new step was added to the assembly line: place sticker over offending word.

    Another example of problem-solving on the fly: The SIM chip for the wireless data plan was preinstalled in the device. The chip came on a punch-out card, and the manufacturer decided to leave the card shell in the box. Okay, I guess, except that the card shell had the SIM card's account number printed on it. Since the reassembly process didn't match up the devices with the original boxes, you had all these devices with unmatched card shells. In theory, somebody might call the service provider and give the account number on the shell rather than the number on the SIM card. To fix this, a new step was added to the assembly line: Remove the card shells. All the previously-assembled boxes had to be unpacked so the shells could be removed. (At some point, somebody discovered that you could extract the shells without removing the foam padding if you held the box at just the right angle and shook it, so that saved a few seconds.)

    Now about the devices themselves: They were a very limited run of custom hardware, and they were not cheap. I think the manufacturing cost was in the high $2000s per unit, and that doesn't count all the sunk costs. I found it amusing when people wrote, "What do you mean a free tablet? Obviously they baked that into the cost of the conference registration, so you paid for it anyway." Conference registeration was $2,095 (or $1,595 if you registered early), which nowhere near covered the cost of the device.

    Some people whined that Microsoft should have made these devices available to the general public for purchase. First of all, these are developer prototypes, not consumer-quality devices. They are suitable for developing Windows 8 software but aren't ready for prime time. (For one thing, they run hot. More on that later.) Second of all, there aren't any to sell. We gave them all away! It's not like there's a factory sitting there waiting for orders. It was a one-shot production run. When they ran out, they ran out.¹

    Third, these devices, by virtue of being prototypes, had a high infant morality rate. I don't know exactly, but I'm guessing that maybe a quarter of them ended up not being viable. One of the things that the advance crew had to do was burn in the devices to try to catch the dead devices. I remember the team being very worried that the hardware helpdesk at the conference would be overwhelmed by machines that slipped through the on-site testing. Luckily, that didn't happen. (Perhaps they were too successful, because everybody ended up assuming that pumping out these puppies was a piece of cake!)

    Doing a little back-of-the-envelope calculations, let's say that the machines cost around $2,750 to produce, and that a quarter of them failed burn-in. Add on top of that a 25% buffer for administrative overhead, and you're looking at a cost-per-device of over $4,500. I doubt there would be many people interested in buying one at that price.

    Especially since you could buy something very similar for around $1100 to $1400. It won't have the hardware customizations, but it'll be close.

    The hardware glitches that occurred during the keynote never appeared during rehearsals in Redmond. But when rehearing in Anaheim, the hardware started flaking out like crazy and eventually self-destructing. (And like I said, those devices weren't cheap!) One of my colleagues got a call from Los Angeles: "When you come down here, bring as many extra Nikes as you can. We're burning through them like mad!" My colleague ended up pissing off everybody in the airport security line behind her when she got to the X-ray machine and unloaded nine devices onto the conveyer belt. "Great, I just put tens of thousands of dollars worth of top-secret hardware on an airport X-ray machine. I hope nothing happens to them."

    Why did the devices start failing during rehearsals in Anaheim, when the ran just fine in Redmond? Because in Anaheim, the devices were being run at full brightness all the time (so they show up better on camera), and they were driving giant video displays, and they were sitting under hot stage lights for hours on end. On top of that, I'm told that the HDMI protocol is bi-directional, so it's possible that the giant video displays at the convention center were feeding data back into the devices in a way that they couldn't handle. Put all that together, and you can see why the devices would start overheating.

    What made it worse was that in order to cram all the extra doodads and sensors into the device, the intestines had to be rearranged, and the touch processor chip ended up being placed directly over the HDMI processor chip. That meant that when the HDMI chip overheated, it caused the touch processor to overheat, too. If you watched the keynote carefully, you'd see that shortly before the machine on stage blew up, you saw the touch sensor flip out and generate phantom touches all over the screen. That was the clue that the machine was about to die from overheating and it would be in the presenter's best interest to switch to another machine quickly. (The problem, of course, is that the presenter is looking out into the audience giving the talk, not staring at the device's screen the whole time. As a result, this helpful early warning signal typically goes unnoticed by the very person who can do the most about it.)

    The day before the conference officially began, Jensen Harris did a preview presentation to the media. One of the glitches that hit during his presentation was that the system started hallucinating an invisible hand that kept swiping the Word Hunt sample game back onto the screen. Jensen quipped, "This is our new auto-Word Hunt feature. We want to make sure you always have Word Hunt when you need it. We've moved beyond touch. Now you don't even need to touch your PC to get access to Word Hunt."

    Jensen's phenomenal calm in the face of adversity also manifested itself during his keynote presentation. You in the audience never noticed it, but at one point, one of the demo applications hit a bug and hung. Jensen spotted the problem before it became obvious and smoothly transitioned to another device and continued. What's more, while he was talking, he went back to the first device and surreptitiously called up Task Manager, killed the the hung application, and prepared the device for the next demo. All this without skipping a beat.

    We are all in awe of Jensen.

    When he stopped by the booth, Jensen said to me, "I don't know how you can stand it, Raymond. Now I can't walk down the hallway without a dozen people coming up to me and wanting to say something or shake my hand or get my autograph!" (One of the rare times we are both in the same room.)

    Welcome to nerd celebrity, Jensen. You just have to smile and be polite.

    Bonus chatter: What happened to the devices that failed quality control? A good number of them were rejected for cosmetic reasons (scuff marks, mostly). As a thank-you gift to the advance crew for all their hard work, everybody was given their choice of a scuffed-up device to take home. The remaining devices that were rejected for purely cosmetic reasons were taken back to Redmond and distributed to the product team to be used for internal testing purposes.

    ¹ My group had one of these scuffed-up devices that we used for internal testing. Somebody dropped it, and a huge spiderweb crack covered the left third of the screen, so you had to squint to see what was on the screen through the cracks. We couldn't order a replacement because there was nowhere to order replacements from. We just had to continue testing with a device that had a badly cracked screen.

  • The Old New Thing

    How can I figure out which user modified a file?

    • 20 Comments

    The Get­File­Time function will tell you when a file was last modified, but it won't tell you who did it. Neither will Find­First­File, Get­File­Attributes, or Read­Directory­ChangesW, or File­System­Watcher.

    None of these the file system functions will tell you which user modified a file because the file system doesn't keep track of which user modified a file. But there is somebody who does keep track: The security event log.

    To generate an event into the security event log when a file is modified, you first need to enable auditing on the system. In the Local Security Policy administrative tool, go to Local Policies, and then double-click Audit Policy. (These steps haven't changed since Windows 2000; the only thing is that the Administrative Tools folder moves around a bit.) Under Audit Object Access, say that you want an audit raised when access is successfully granted by checking Success (An audited security access attempt that succeeds).

    Once auditing is enabled, you can then mark the files that you want to track modifications to. On the Security tab of each file you are interested in, go to the Auditing page, and select Add to add the user you want to audit. If you want to audit all accesses, then you can choose Everyone; if you are only interested in auditing a specific user or users in specific groups, you can enter the user or group.

    After specifying whose access you want to monitor, you can select what actions should generate security events. In this case, you want to check the Successful box next to Create files / write data. This means "Generate a security event when the user requests and obtains permission to create a file (if this object is a directory) or write data (if this object is a file)."

    If you want to monitor an entire directory, you can set the audit on the directory itself and specify that the audit should apply to objects within the directory as well.

    After you've set up your audits, you can view the results in Event Viewer.

    This technique of using auditing to track who is generating modifications also works for registry keys: Under the Edit menu, select Permissions.

    Exercise: You're trying to debug a problem where a file gets deleted mysteriously, and you're not sure which program is doing it. How can you use this technique to log an event when that specific file gets deleted?

  • The Old New Thing

    Dark corners of C/C++: The typedef keyword doesn't need to be the first word on the line

    • 29 Comments

    Here are some strange but legal declarations in C/C++:

    int typedef a;
    short unsigned typedef b;
    

    By convention, the typedef keyword comes at the beginning of the line, but this is not actually required by the language. The above declarations are equivalent to

    typedef int a;
    typedef short unsigned b;
    

    The C language (but not C++) also permits you to say typedef without actually defining a type!

    typedef enum { c }; // legal in C, not C++
    

    In the above case, the typedef is ignored, and it's the same as just declaring the enum the plain boring way.

    enum { c };
    

    Other weird things you can do with typedef in C:

    typedef;
    typedef int;
    typedef int short;
    

    None of the above statements do anything, but they are technically legal in pre-C89 versions of the C language. They are just alternate manifestations of the quirk in the grammar that permits you to say typedef without actually defining a type. (In C89, this loophole was closed: Clause 6.7 Constraint 2 requires that "A declaration shall declare at least a declarator, a tag, or the members of an enumeration.")

    That last example of typedef int short; is particularly misleading, since at first glance it sounds like it's redefining the short data type. But then you realize that int short and short int are equivalent, and this is just an empty declaration of the short int data type. It doesn't actually widen your shorts. If you need to widen your shorts, go see a tailor.¹

    Note that just because it's legal doesn't mean it's recommended. You should probably stick to using typedef the way most people use it, unless you're looking to enter the IOCCC.

    ¹ The primary purpose of this article was to tell that one stupid joke. And it's not even my joke!

  • The Old New Thing

    Technically not lying, but not exactly admitting fault either

    • 11 Comments

    I observed a spill suspiciously close to a three-year-old's play table. I asked, "How did the floor get wet?"

    She replied, "Water."

    It's not lying, but it's definitely not telling the whole story. She'll probably grow up to become a lawyer.

  • The Old New Thing

    The phenomenon of houses with nobody living inside, for perhaps-unexpected reasons

    • 11 Comments

    In London, some of the most expensive real estate is in neighborhoods where relatively few people actually live. According to one company's estimate, 37% of the the residences have been purchased by people who merely use them as vacation homes, visiting only for a week or two per year and leaving the building empty the remainder of the year. In other words, the people who can afford to live there choose not to.

    This same phenomenon is reported in other cities. For example, only 10% of the condos in the Plaza Hotel are occupied full-time.

    Another example of a house with nobody living inside is the case where the house is a façade for an industrial building, most commonly an electrical substation or a subway ventilation shaft.

    I find both categories fascinating.

  • The Old New Thing

    Dreaming about games based on Unicode

    • 20 Comments

    I dreamed that two of my colleagues were playing a game based on pantomiming Unicode code points. One of them got LOW QUOTATION MARK, and the other got a variety of ARROW POINTING NORTHEAST, ARROW POINTING EAST, ARROW POINTING SOUTHWEST.

    I wonder how you would pantomime ZERO WIDTH NON-JOINER.

  • The Old New Thing

    Where did the research project RedShark get its name?

    • 21 Comments

    Project code names are not arrived at by teams of focus groups who carefully parse out every semantic and etymological nuance of the name they choose. (Though if you read the technology press, you'd believe otherwise, because it turns out that taking a code name apart syllable-by-syllable searching for meaning is a great way to fill column-inches.) Usually, they are just spontaneous decisions, inspired by whatever random thoughts jump to mind.

    Many years ago, there was an internal user interface research project code named RedShark. Not Red Shark but RedShark, accent on the Red. Where did this strange name come from?

    From a red shark, of course.

    When the project started up, the people in charge were sitting around and realized they needed to give the project a name. It so happened that the office they were sitting in belonged to a team member who collected a lot of strange toys. One of those toys was an small inflatable red shark.

    Somebody looked around the room and spotted the red shark. "Let's call it RedShark." Nobody else had a better idea, so the name passed by default.

    That small inflatable red shark became their mascot and hung from the ceiling in the hallway.

    No deep, hidden meaning. Just a $3 cheap plastic toy that happened to be in the right place at the right time.

  • The Old New Thing

    Your tenant and your lover, in your dreams

    • 6 Comments

    I dreamed that a friend of mind said, "Between your tenant and your lover, you should get along with at least one of them."

  • The Old New Thing

    If you're going to use an interlocked operation to generate a unique value, you need to use it before it's gone

    • 35 Comments

    Is the Interlocked­Increment function broken? One person seemed to think so.

    We're finding that the Interlocked­Increment is producing duplicate values. Are there are any know bugs in Interlocked­Increment?

    Because of course when something doesn't work, it's because you are the victim of a vast conspiracy. There is a fundamental flaw in the Interlocked­Increment function that only you can see. You are not a crackpot.

    LONG g_lNextAvailableId = 0;
    
    DWORD GetNextId()
    {
      // Increment atomically
      InterlockedIncrement(&g_lNextAvailableId);
    
      // Subtract 1 from the current value to get the value
      // before the increment occurred.
      return (DWORD)g_lNextAvailableId - 1;
    }
    

    Recall that Interlocked­Increment function increments a value atomically and returns the incremented value. If you are interested in the result of the increment, you need to use the return value directly and not try to read the variable you incremented, because that variable may have been modified by another thread in the interim.

    Consider what happens when two threads call Get­Next­Id simultaneously (or nearly so). Suppose the initial value of g_lNext­Available­Id is 4.

    • First thread calls Interlocked­Increment to increment from 4 to 5. The return value is 5.
    • Second thread calls Interlocked­Increment to increment from 5 to 6. The return value is 6.
    • First thread ignores the return value and instead reads the current value of g_lNext­Available­Id, which is 6. It subtracts 1, leaving 5, and returns it.
    • Second thread ignores the return value and instead reads the current value of g_lNext­Available­Id, which is still 6. It subtracts 1, leaving 5, and returns it.

    Result: Both calls to Get­Next­Id return 5. Interpretation: "Interlocked­Increment is broken."

    Actually, Interlocked­Increment is working just fine. What happened is that the code threw away the unique information that Interlocked­Increment returned and instead went back to the shared variable, even though the shared variable changed its value in the meantime.

    Since this code cares about the result of the increment, it needs to use the value returned by Interlocked­Increment.

    DWORD GetNextId()
    {
      // Increment atomically and subtract 1 from the
      // incremented value to get the value before the
      // increment occurred.
      return (DWORD)InterlockedIncrement(&g_lNextAvailableId) - 1;
    }
    

    Exercise: Criticize this implementation of IUnknown::Release:

    STDMETHODIMP_(ULONG) CObject::Release()
    {
     InterlockedDecrement(&m_cRef);
     if (m_cRef == 0)
     {
      delete this;
      return 0;
     }
     return m_cRef;
    }
    
Page 1 of 3 (30 items) 123