• Ntdebugging Blog

    Ntfs Misreporting Free Space (Part 2)


    Continuing our discussion on the internals of disk usage, we will now shift our focus to internal metadata usage.

    …….. KB in …. Indexes.

     

    Consider for a moment a world without indexes…  The $MFT is a database containing records that are accessed via FRS (file record segment) numbers.  This FRS number includes an embedded sequence number that is updated anytime a file record is deleted & re-used.  A file record must have a new identity once it has been deleted & re-used, so the sequence number is part of this unique identity.  Without indexes, you would have to find files by remembering their FRS / Sequence numbers.  It would be like remembering your favorite web sites by remembering the IP addresses.  In this way, file indexes are like a DNS database so we don’t have to find files using FRS numbers.  The folder structure has a reserved FRS number for the root.  On all NTFS volumes, the root folder is FRS 0x5.  Since the root is in a well known location, it can be accessed without doing any index lookups.

     

    Each folder is like a set of DNS records containing information about a domain in the name space.  The records contain information about the names and FRS numbers for the files “in” the folder.  I put “in” in quotes because the folder itself does not actually contain any files (just records containing basic information about the files).  The files are actually records in the MFT that are accessed by FRS number, so the index entries map names to FRS numbers.  Folders use two types of metadata streams: $INDEX_ROOT:”$I30” and $INDEX_ALLOCATION:”$I30” to track the names that exist in their namespace.  The streams have an attribute type code and a name.  For example, $INDEX_ALLOCATION is the attribute type, and the attribute name is “$I30”.  The “$I30” name is a short tag indicating that the stream contains file name indexes (as opposed to security indexes, reparse indexes, etc.) 

     

    Why “$I30”?  Filenames are largely alphanumeric, and the first alphanumeric character in the UNICODE table is 0x30 (48 for those who are hexadecimally challenged). “$I30” is a shorthand method for saying “Index that’s alphanumeric”.

     

    When a file is created, the $FILE_NAME information is packaged into a name index record which is stored in the parent folder’s $I30 index.  With the exception of having one or two $I30 related streams, there is very little difference between a file and a folder.

     

    Now, let’s create a new folder "NewFolder" in the root, and look at the $I30 index entry created in the root for "NewFolder".

     

    D:\>md d:\NewFolder

     

    D:\>dir d:\

     Volume in drive D is d

     Volume Serial Number is 4447-4F88

     

     Directory of d:\

     

    10/08/2008  02:42 PM    <DIR>          NewFolder

                   0 File(s)              0 bytes

                   1 Dir(s)      68,620,288 bytes free

     

    If you open up an NTFS exploration tool and read the $INDEX_ALLOCATION:$I30 for the root folder, you will find an index entry in the root folder containing the filename “NewFolder”.  In addition to NFI.EXE, there are some data recovery utilities that can be used to examine NTFS metadata, but I am not able to give any brand names on the blog.  NFI.EXE is a useful tool for drilling down into NTFS, and it’s FREE in the OEM Support Tools Phase 3 Service Release 2.  Since NFI is free, it is also not an officially “supported” utility.  Despite this, NFI can tell you a lot of information about the allocated ranges of any file.  Also, you can give it a logical sector number and it will find the file that owns the sector.  For the purpose of this demonstration though, we will be using the standard command line interface “NFI.EXE C:” or “NFI.EXE [drive\path]”.

     

    C:\Windows\system32>c:\shared\Disktools\nfi.exe d:\

    NTFS File Sector Information Utility.

    Copyright (C) Microsoft Corporation 1999. All rights reserved.

     

    Root Directory

        $STANDARD_INFORMATION (resident)

        $FILE_NAME (resident)

        $SECURITY_DESCRIPTOR (resident)

        $INDEX_ROOT $I30 (resident)

        $INDEX_ALLOCATION $I30 (nonresident)

            logical sectors 78920-78927 (0x13448-0x1344f)

        $BITMAP $I30 (resident)

        Attribute Type 0x100 $TXF_DATA (resident)

     

    Here is the sector in the root directory $I30 index allocation that contains our “NewFolder” index entry.

     

    LBN 78922

     

    0x0000   c6 06 3b 47 75 29 c9 01-c6 06 3b 47 75 29 c9 01   ╞.;Gu)╔.╞.;Gu)╔.

    0x0010   c6 06 3b 47 75 29 c9 01-c6 06 3b 47 75 29 c9 01   ╞.;Gu)╔.╞.;Gu)╔.

    0x0020   00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x0030   06 00 00 20 00 00 00 00-07 00 24 00 53 00 65 00   ... ......$.S.e.

    0x0040   63 00 75 00 72 00 65 00-0a 00 00 00 00 00 0a 00   c.u.r.e.........

    0x0050   60 00 50 00 00 00 00 00-05 00 00 00 00 00 05 00   `.P.............

    0x0060   c6 06 3b 47 75 29 c9 01-c6 06 3b 47 75 29 c9 01   ╞.;Gu)╔.╞.;Gu)╔.

    0x0070   c6 06 3b 47 75 29 c9 01-c6 06 3b 47 75 29 c9 01   ╞.;Gu)╔.╞.;Gu)╔.

    0x0080   00 00 02 00 00 00 00 00-00 00 02 00 00 00 00 00   ................

    0x0090   06 00 00 00 00 00 00 00-07 03 24 00 55 00 70 00   ..........$.U.p.

    0x00a0   43 00 61 00 73 00 65 00-03 00 00 00 00 00 03 00   C.a.s.e.........

    0x00b0   60 00 50 00 00 00 00 00-05 00 00 00 00 00 05 00   `.P.............

    0x00c0   c6 06 3b 47 75 29 c9 01-c6 06 3b 47 75 29 c9 01   ╞.;Gu)╔.╞.;Gu)╔.

    0x00d0   c6 06 3b 47 75 29 c9 01-c6 06 3b 47 75 29 c9 01   ╞.;Gu)╔.╞.;Gu)╔.

    0x00e0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x00f0   06 00 00 00 00 00 00 00-07 03 24 00 56 00 6f 00   ..........$.V.o.

    0x0100   6c 00 75 00 6d 00 65 00-05 00 00 00 00 00 05 00   l.u.m.e.........

    0x0110   58 00 44 00 00 00 00 00-05 00 00 00 00 00 05 00   X.D.............

    0x0120   c6 06 3b 47 75 29 c9 01-e7 f8 69 2e d8 38 c9 01   ╞.;Gu)╔.τ°i.╪8╔.

    0x0130   e7 f8 69 2e d8 38 c9 01-e7 f8 69 2e d8 38 c9 01   τ°i.╪8╔.τ°i.╪8╔.

    0x0140   00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x0150   06 00 00 10 00 00 00 00-01 03 2e 00 00 00 00 00   ...►............

    0x0160   25 00 00 00 00 00 01 00-68 00 54 00 00 00 00 00   %.......h.T.....

    0x0170   05 00 00 00 00 00 05 00-36 61 f9 9f 75 29 c9 01   ........6a∙ƒu)╔.

    0x0180   8c 08 e1 8d 61 38 c9 01-8c 08 e1 8d 61 38 c9 01   î.ßìa8╔.î.ßìa8╔.

    0x0190   8c 08 e1 8d 61 38 c9 01-00 00 00 00 00 00 00 00   î.ßìa8╔.........

    0x01a0   00 00 00 00 00 00 00 00-00 00 00 10 00 00 00 00   ...........►....

    0x01b0   09 00 4e 00 65 00 77 00-46 00 6f 00 6c 00 64 00   ..N.e.w.F.o.l.d.

    0x01c0   65 00 72 00 6c 00 75 00-23 00 00 00 00 00 01 00   e.r.l.u.#.......

    0x01d0   88 00 74 00 00 00 00 00-05 00 00 00 00 00 05 00   ê.t.............

    0x01e0   f6 7d 8e 47 75 29 c9 01-a6 06 ab 47 75 29 c9 01   ÷}ÄGu)╔.ª.½Gu)╔.

    0x01f0   a6 06 ab 47 75 29 c9 01-a6 06 ab 47 75 29 44 00   ª.½Gu)╔.ª.½Gu)D.

     

    Below is the $I30 index entry in human readable format.  Notice that it has everything needed to populate a WIN32_FIND_DATA  structure, and most importantly, the FRS number of our newly created folder.  The complete index record contains a duplicate copy of the $FILE_NAME attribute from the file record, and this allows FindFirstFile()/FindNextFile() to get all pertinent information about our found file without actually opening the file.

     

    FileReference            FRS,SEQ <0x25, 0x1> // FRS and Sequence number for "NewFolder"

    ParentDirectory          FRS,SEQ <0x5, 0x5>  // FRS and Sequence number for the root folder.

    CreationTime           : 10/08/2008 LCL 14:40:04.520

    LastModificationTime   : 10/08/2008 LCL 14:40:04.707

    LastChangeTime         : 10/08/2008 LCL 14:40:04.707

    LastAccessTime         : 10/08/2008 LCL 14:40:04.707

    Allocated Length       : 0

    File Size              : 0

    File Attribute Flags   : 0x10000000          // Attribute flags

    File Name              : "NewFolder"
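    As a rough illustration of the record above, the sketch below pulls the FRS and sequence number out of an 8-byte file reference and decodes the name, using bytes copied from the LBN 78922 dump.  It assumes the layout commonly described in public NTFS documentation (low 48 bits are the FRS, high 16 bits are the sequence number; a one-byte name length and a one-byte namespace flag sit ahead of the UTF-16 name).  The function names are made up for this example.

```python
# Bytes copied from the LBN 78922 dump (offsets 0x0160, 0x0170 and 0x01b0).
FILE_REFERENCE = bytes.fromhex("25 00 00 00 00 00 01 00")
PARENT_REFERENCE = bytes.fromhex("05 00 00 00 00 00 05 00")
NAME_FIELD = bytes.fromhex(
    "09 00 4e 00 65 00 77 00 46 00 6f 00 6c 00 64 00 65 00 72 00"
)


def split_file_reference(raw):
    """Return (frs, sequence) from an 8-byte little-endian file reference."""
    value = int.from_bytes(raw, "little")
    frs = value & 0xFFFFFFFFFFFF      # low 48 bits: file record segment number
    sequence = value >> 48            # high 16 bits: sequence number
    return frs, sequence


def decode_name(field):
    """Decode a name field: length in characters, namespace byte, UTF-16LE text."""
    length = field[0]
    return field[2:2 + 2 * length].decode("utf-16-le")


if __name__ == "__main__":
    frs, seq = split_file_reference(FILE_REFERENCE)
    print(f"FileReference   FRS,SEQ <{frs:#x}, {seq:#x}>")
    frs, seq = split_file_reference(PARENT_REFERENCE)
    print(f"ParentDirectory FRS,SEQ <{frs:#x}, {seq:#x}>")
    print(f"File Name       {decode_name(NAME_FIELD)!r}")
```

    The same split applies to any file reference you find in a sector editor, which is why a stale reference to a re-used record can be detected: the FRS matches but the sequence number does not.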

     

    Now let’s do a pop quiz on indexes to see if everyone is on the same page…

     

    Suppose that you write a fancy new application and you call FindFirstFile() / FindNextFile() in a loop.  The cFilename string returned during one of the iterations is “MyFile.txt” (you also have the WIN32_FIND_DATA for the same file).

     

    1.       Where did the name “MyFile.txt” come from?

     

    2.       If you call FindFirstFile()/FindNextFile() with a wildcard “*.*”, is it necessary to open each found file to retrieve the WIN32_FIND_DATA?

     

    3.       When you call FindFirstFile(), what is NTFS doing behind the scenes?

     

    4.       What happens when you close the search handle?

     

    Answers

    1.         If you said “from the file’s parent folder’s $I30 index”, then you are correct.

    2.         If you said “no”, then you are correct.  The WIN32_FIND_DATA is also retrieved from the $I30 index.  There is no need to open the individual files to get this information.

    3.         When you call FindFirstFile, NTFS opens the $I30 index stream(s) for the target folder and scans through the index entries for the first record that matches the specified wild card.  A search context is also created to keep track of the current search location in the index stream.

    4.         The target folder’s index handle is closed and the search context is freed.
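    The quiz answers can be seen from user mode with a short sketch.  Python's os.scandir() is documented as using FindFirstFileW/FindNextFileW on Windows, and its entries carry the name and basic type information returned by the enumeration, so listing a folder does not normally require opening each file.  The helper name list_directory and the temporary folder contents are made up for this example.

```python
import os
import tempfile


def list_directory(path):
    """Return sorted (name, is_dir) pairs taken from the directory enumeration."""
    results = []
    # Leaving the "with" block closes the iterator, which releases the search handle.
    with os.scandir(path) as entries:
        for entry in entries:
            results.append((entry.name, entry.is_dir(follow_symlinks=False)))
    return sorted(results)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "NewFolder"))
        open(os.path.join(root, "MyFile.txt"), "w").close()
        for name, is_dir in list_directory(root):
            print("<DIR>" if is_dir else "     ", name)
```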

     

    If you passed the quiz (or at least understand the answers), you're ready to read on…

     

    In short, high index usage is the result of having a large number of index entries.  Common sense would dictate that you probably have the same number of index entries as you have files - Right?  Well....the answer is not quite that simple.  Suppose that you have 8.3 names turned on and you create a file called "tiny.txt".  This file is both 8.3 and LFN compliant, so there will be exactly one index entry created for this file.  Now consider what happens when you create a file named "MyFileHasAReallyLongName.txt".  This is NOT 8.3 compliant, so NTFS will create an 8.3 name ("MyFile~1.txt").  Now NTFS has to maintain an 8.3 index entry, AND an LFN index entry for a single file.  This effectively doubles index usage (plus, long filenames have to be stored in the index and that also makes the LFN filename index larger than normal).  If you plan to create a large number of files on a volume, then it is a good practice to either use 8.3 compliant names, or disable 8.3 name creation altogether.
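    To make the doubling concrete, here is a deliberately simplified sketch that counts index entries for a list of names.  The 8.3 test only checks lengths, a single dot, and spaces; the real rules also restrict which characters are allowed, so treat fits_8_3 and index_entries as illustrative helpers rather than what NTFS actually runs.

```python
def fits_8_3(name):
    """Simplified check: base of 1-8 characters, extension of up to 3, no spaces."""
    if " " in name or name.count(".") > 1:
        return False
    base, _, ext = name.partition(".")
    return 1 <= len(base) <= 8 and len(ext) <= 3


def index_entries(name, short_names_enabled=True):
    """One entry per file, plus a generated 8.3 entry when the name does not fit."""
    if short_names_enabled and not fits_8_3(name):
        return 2
    return 1


if __name__ == "__main__":
    names = ["tiny.txt", "MyFileHasAReallyLongName.txt"]
    for name in names:
        print(name, index_entries(name))
    print("total with 8.3 on: ", sum(index_entries(n) for n in names))
    print("total with 8.3 off:", sum(index_entries(n, False) for n in names))
```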

     

    If you have a large folder and want to see how many bytes are in use by indexes, then use contig.exe (from http://technet.microsoft.com/en-us/sysinternals/bb545046.aspx) to find out the allocated length of the folder's $INDEX_ALLOCATION.  Then divide this number by how many files are in the folder.  That will give you bytes per index entry. 

     

    Below is an example of how to determine the index stream size for a folder.

    In my “System32” folder, I had a $I30 index allocation which was 536,576 bytes long.  It contained records for 2,460 files, so this averages out to 218 bytes per index entry.  The presence of 8.3 names can be discovered by running “DIR /X”.  On my systems, I don’t have a need for 8.3, so I turned off 8.3 via the registry (refer to KB121007).
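    The arithmetic is simple enough to script.  The sketch below reproduces the System32 figure and then, as a rough extrapolation only, scales it to a larger folder; the one-million-file figure is a hypothetical input, and real entry sizes vary with name length and 8.3 settings.

```python
def bytes_per_index_entry(index_allocation_bytes, file_count):
    """Average index bytes consumed per file in a folder."""
    return index_allocation_bytes / file_count


def estimated_index_bytes(file_count, average_entry_bytes):
    """Rough projection of $INDEX_ALLOCATION size for a folder of file_count files."""
    return file_count * average_entry_bytes


if __name__ == "__main__":
    average = bytes_per_index_entry(536_576, 2_460)   # figures from the System32 example
    print(f"{average:.0f} bytes per index entry")
    projected = estimated_index_bytes(1_000_000, average)  # hypothetical folder size
    print(f"about {projected / 2**20:.0f} MB of index for 1,000,000 files")
```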

    Whenever possible, try to distribute large numbers of files across several volumes.  If you have to put millions of files on a single volume, try to keep your filenames short to save space and improve performance. 

    …….. KB in …. bad sectors.

     

    When a bad sector is detected by CHKDSK /R, or when a write fails because of a bad sector on disk, the cluster that contains the bad sector is added to the allocated range of $Badclus.  If $Badclus contains any allocated ranges, then it's time to consider replacing the hard drive.

     

    IMPORTANT:  If you have a software mirrored volume, and one hard disk has bad sectors, then it is likely that one of the drives in the mirror is going to fail soon.  If this happens, keep in mind that when you replace the failing drive, the regenerated mirror set will still have sectors marked in the $Badclus file even though the mirror is healthy.  Since a mirror is a perfect block-by-block copy of the volume, all information for all files is duplicated between the members (including $Badclus).  For this reason, the $Badclus information is mirrored to the working drive as well as the failing drive.

     

     

    …….. KB in use by the system.

     

    System usage consists of $MFT, $Logfile, $Secure, and all other supporting structures in the MFT.  If you are looking for system usage, you will need to drill down into the NTFS metadata files.

     

    In most cases, high system usage cannot be “fixed”, but it can be kept under control by proper configuration and user education.  NFI will give you the information about the size of the various internal metadata files, and you can research the details on how each of the internal system files work, but there simply isn’t enough room in a blog post to talk about them all.  However, we will discuss the two most common problems that we see:  1. High $MFT usage, and 2. Bloated Security Stream in the $Secure File.

     

    1.       High $MFT Usage

    Every file on the volume is defined by ONE OR MORE file records that are exactly 1KB in size.  If the MFT is large, it's because you have a large number of file records in the MFT (free records are also included in the total MFT size).  Below are two different ways to view the MFT information.  FSUTIL will show you the valid data length, while NFI will give a view of where the fragments of the $MFT:$DATA attribute are laid out on disk.

    Unfortunately, if your $MFT is too big and you want it to be smaller, you will have to reformat the drive.  Just keep in mind that once you restore your files, you will have at least 1KB of MFT allocated for each file on the drive.  (Lots of extra file records are needed to restore your 20GB compressed files, but I will assume that everyone read part 1 and is not going to do that.)  C|;3)

     

    2.       Bloated Security Stream in the $Secure File

    Following good development practices will save you lots of headaches with your $Secure file.  If you write applications that ACL & re-ACL files over and over and over, your $Secure file probably looks like mine (love those logon scripts from the IT department)…

    To those of you who are savvy with file system internals, you probably noticed something was missing in the picture above.  The $Secure file shows $SII/$SDH INDEX_ROOT(s)/INDEX_ALLOCATION(s), but where is the actual security stream $DATA:$SDS?

     

    There simply was no more room in the base file record for the $DATA:$SDS attribute, so it was moved to a child record.  To find the child record, we can read the $ATTRIBUTE_LIST (via sector editor) and find the pointer to the file record(s) that hold the $SDS stream metadata.  To keep the legal department happy, I can’t give you the data types, but I can tell you that my $DATA:$SDS stream (shown below) is split between two child records because the $SDS stream is heavily fragmented.  The first child record is FRS 0x1888, and the other is FRS 0x12d7e.  If you were to read those two file records, they would each contain the mapping information for approximately 400 fragments of my security stream.

     

    LBN 3537832

     

    0x0000   10 00 00 00 20 00 00 1a-00 00 00 00 00 00 00 00   ... ..→........

    0x0010   09 00 00 00 00 00 09 00-00 00 d4 00 09 00 00 00   ...............

    0x0020   30 00 00 00 20 00 00 1a-00 00 00 00 00 00 00 00   0... ..→........

    0x0030   09 00 00 00 00 00 09 00-07 00 24 04 53 65 53 63   ..........$.SeSc

    0x0040   80 00 00 00 28 00 04 1a-00 00 00 00 00 00 00 00   Ç...(..→........

    0x0050   88 18 00 00 00 00 37 00-00 00 24 00 53 00 44 00   ê↑....7...$.S.D.

    0x0060   53 00 02 00 01 01 00 00-80 00 00 00 28 00 04 1a   S.......Ç...(..→

    0x0070   f9 01 00 00 00 00 00 00-7e 2d 01 00 00 00 0f 00   ∙.......~-......

    0x0080   00 00 24 00 53 00 44 00-53 00 00 00 00 00 00 00   ..$.S.D.S.......

    0x0090   90 00 00 00 28 00 04 1a-00 00 00 00 00 00 00 00   É...(..→........

    0x00a0   09 00 00 00 00 00 09 00-60 17 24 00 53 00 44 00   ........`$.S.D.

    0x00b0   48 00 00 00 00 00 00 00-90 00 00 00 28 00 04 1a   H.......É...(..→

     

    My security stream may look scary because it has 400 fragments, but it is only about 3.3MB plus the size of the $SII & $SDH streams.  If it were to grow past the 1GB range, I would start looking for the cause of the growth. 

     

    In theory, you can bloat your $SDS stream by creating lots of unique security descriptors, but this is usually not the cause of bloating.  Instead, most mischief is caused by application developers who call SetFileSecurity() without properly preparing their security descriptor buffer.

     

    Most applications:

     

    1.       Allocate some heap memory.

    2.       Init the SD via InitializeSecurityDescriptor(). 

    3.       Set up the ACE’s.

    4.       Assign security to the target object.

     

    The problem is that heap memory is like recycled paper.  When you call InitializeSecurityDescriptor() the first few bytes of your buffer will say “I’m a security descriptor”, but the ending bytes will have some text from an e-mail you decided not to send to your boss.  As the SD is filled in with ACE’s, the letter to the boss is overwritten with the ACE’s.  At that point, your buffer looks like a valid SD to the system, but there’s still some slack space at the end that says “Porsche destroyed in the fire.  Yours truly, Larry”.  When you send this buffer to SetFileSecurity(), NTFS takes this buffer and computes a hash value to determine whether this SD is unique (the salutation to your boss is also included in the hash).  If the hash is identical to a hash value in the $SDH stream, then we do a comparison between the new & existing SD’s.  If they match in a byte-for-byte comparison, then the existing SD is used.  If not, your new SD is added to the stream (along with the bad news about the boss’ car).  To prevent this, always zero your entire security descriptor buffer prior to calling InitializeSecurityDescriptor().  You will prevent $SDS bloating and your boss will never know about the car.
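    The effect of that slack space can be shown with a toy model.  The sketch below is not NTFS code: the descriptor contents are a made-up string, the buffer size is arbitrary, and SHA-256 stands in for whatever hash NTFS actually uses.  It only demonstrates the mechanism described above, where the same logical descriptor hashes differently when the unused tail of the buffer holds leftover data, and identically when the buffer was zeroed first.

```python
import hashlib

BUFFER_SIZE = 64
# Made-up stand-in for the meaningful part of a security descriptor.
DESCRIPTOR = b"owner=alice;ace=allow-read"


def recycled(stale_text):
    """Simulate heap memory that still holds old data at its tail."""
    return bytearray(stale_text.rjust(BUFFER_SIZE, b"-"))


def fill_descriptor(buffer):
    """Overwrite the front of the buffer; bytes past the descriptor stay as found."""
    buffer[:len(DESCRIPTOR)] = DESCRIPTOR
    return bytes(buffer)


def digest(raw):
    # SHA-256 is only a stand-in; it is not the hash NTFS uses.
    return hashlib.sha256(raw).hexdigest()


# Same descriptor written into two recycled buffers, then into two zeroed buffers.
dirty = [
    fill_descriptor(recycled(b"Porsche destroyed in the fire.")),
    fill_descriptor(recycled(b"Lunch on Friday?")),
]
clean = [fill_descriptor(bytearray(BUFFER_SIZE)) for _ in range(2)]

unique_dirty = len({digest(raw) for raw in dirty})
unique_clean = len({digest(raw) for raw in clean})
print("unique descriptors without zeroing:", unique_dirty)
print("unique descriptors with zeroing:   ", unique_clean)
```

    In a native application the equivalent of bytearray(BUFFER_SIZE) is zeroing the whole allocation before the descriptor is initialized.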

     

     

     

    I hope you all find this information useful in your sleuthing efforts. 

     

    Best regards,

     

    Dennis Middleton “The NTFS Doctor”

     

     

     

  • Ntdebugging Blog

    Remote kernel or user mode debugging of dumps or live systems


     

    GES (Global Escalation Services) is not only responsible for helping our external customers, but we spend a great deal of time collaborating with engineers and developers around the world at our support and development sites.  We often look at large dump files, but in some cases we perform a live debug to determine root cause of a server failure.  In the case of a memory dump, the files are usually very large, so copying the files over the network, even on the fastest WAN connection, can take a LONG time.

    The solution is remote debugging.    

    Here is how you do it!

    First of all it takes two to tango: a remote person and the expert who will help the remote person by debugging the process using the debugger installed on their machine.

    Let’s say you are the expert who is helping the remote person. Here’s how the process works:

    1.       The remote person opens a dump file, debugs a process, or kernel debugs a machine at some remote location using windbg.

    2.       The remote person decides, “I NEED HELP!”

    3.       The remote person simply types in .server tcp:port=9999 at the windbg prompt.

     

     

    Notice the following output.

     

    Server started.  Client can connect with any of these command lines

    0: <debugger> -remote tcp:Port=9999,Server=MyServerName

     

     

    4.       The remote person sends email or IM to the person they want help from with the connection string <debugger> -remote tcp:port=9999,Server=MyServerName

    5.       The expert runs WINDBG -remote tcp:port=9999,Server=MyServerName from the debuggers directory.

    6.       At this point the remote person should see the following message at the remote debugger site.

     

    EXPERTMACHINE\expert (tcp 165.33.5.122:54546) connected at Tue Mar 25 15:36:53 2008

     

    Once connected, the expert can issue any command to debug the dump or target machine remotely.   The great part is many people can connect to the remote debugger session if needed.  It’s a great collaboration tool, and something we use every day at Microsoft.   

     

    When remote debugging, I find it useful to save the debug session in a log file.  It’s as easy as typing .logopen C:\mydebuggersession.log in windbg before the remote debug session starts.  This gives everyone the opportunity to look at the debug session later if necessary.

     

    One last thing to keep in mind about remote debugging is security. I recommend using the .noshell command to prevent the execution of remote shell commands. Without the .noshell command, it is possible for people connected to your session to use the .shell (Command Shell) command to execute an application or a Microsoft MS-DOS command directly from the debugger.

     

    Thanks Jeff-

     

     

  • Ntdebugging Blog

    Windows Hotfixes and Updates - How do they work?


    Today I would like to talk about some of the work the Windows Serviceability (WinSE) team does regarding servicing Windows and releasing updates.

    The operating system is divided into multiple components. Each component can consist of one or more files, registry keys, configuration settings, etc.  WinSE releases updates based on components rather than the entire operating system. This reduces a lot of overhead with having to install updates to components that have not changed. Depending on the severity and applicability of the problem, there are different kinds of release mechanisms. Keep in mind, though, the actual fix still remains the same.

    1.       Updates and Security Updates

    These Updates are typically available on Windows Update. They frequently contain security fixes, and from time to time also contain reliability rollup packages. These updates are thoroughly tested and Microsoft highly recommends that you update your computer with these releases. In fact, most are automatically downloaded to your machine if you have Windows Update turned on. In most cases, Update releases are also available as standalone downloads from the download center.

     

    2.       Hotfixes

    When an individual customer reports a bug to Microsoft for a specific scenario, the WinSE team releases Hotfixes to address these problems. Hotfixes are not meant to be widely distributed and go through a limited amount of testing due to the customer's need for an urgent fix.  Hotfixes are developed in a separate environment than the regular Updates.  This allows Microsoft to release Updates that do not include the Hotfix files, thereby minimizing risk for the customer.

    Once the Hotfix is ready and packaged by WinSE, a KB article is written describing the problem, with instructions on how to obtain the Hotfix.  Microsoft recommends that only customers experiencing the particular problem install the Hotfix for that problem.

    Note: Hotfixes are also sometimes referred to as LDRs, or QFE's (Quick Fix Engineering). The term QFE is an old term that is mostly no longer used in reference to current versions of Windows.

     

    3.       SP  - Service Pack

    The service pack is a major update in the life of an OS. It contains a wide variety of fixes as well as all the GDR and LDR fixes that were released since the previous service pack was shipped. This is a thoroughly tested release and highly recommended by Microsoft for installation. This is usually available as a standalone release, and is then released through Windows Update as well.

     

     

    GDR vs. LDR branches

    Now that we have described the different kinds of updates, let's take a deeper look into how these fixes are built. When a new OS or service pack is released, two branches are created from the release code tree: a GDR (general distribution release) branch and a LDR (limited distribution release) branch. Hotfixes are built solely from the LDR branch, while Updates for broad release are built from the GDR branch.

    Service Packs are built from a third branch that contains all Updates, Hotfixes and additional fixes.  This way the new service pack is shipped with all the fixes from both branches.

    Note – Once the new service pack is shipped, the code branches from the previous release are still active and serviced as necessary.

    Installing a Hotfix

    By default, all components on Windows systems start on the GDR branch following each major release. When you install updates from Windows Update for a GDR component, it gets upgraded with the GDR version.

    When you install a specific Hotfix, the files and components in the Hotfix package are migrated to the LDR branch. At this point, that particular component is marked as a LDR component. If you install a newer Update over this component, the Windows servicing technology will automatically install the appropriate latest version from the LDR branch for you. This is possible because each Update package ships with both the GDR and LDR versions of the component.

    Once a component is marked as a LDR component, the only way to move back to the GDR branch is to uninstall all Hotfixes for that component, or move to the next available service pack.

     

    What would happen if a user installed a Hotfix, and then sometime later installed the next service pack? Well, in that case it depends on the Hotfix and when it was built.

    1.       If the Hotfix was built before the service pack, then the component will be moved to the GDR version contained in the service pack.

    2.       If the Hotfix was built after the service pack, the component will be migrated to the post-service pack version of the component, and will stay on the same branch that it was originally on.

     

    In order to make this work, these packages contain the RTM GDR version, the RTM Hotfix-branch version, and the SP1 Hotfix and GDR versions of each binary.

     

    All fixes built for Windows are cumulative in nature by branch, i.e. a new update will contain the new fix, as well as all the previous fixes for that branch. Referencing the chart above, installing fix #4 can get you fixes #2 and #4 on the GDR branch. If the component is on the LDR branch, then the user would get fixes #1-4.
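    The cumulative rule can be written down as a small model.  The branch membership below is inferred from the paragraph above (fixes #2 and #4 ship on the GDR branch, fixes #1 and #3 exist only as Hotfixes); the table and the function name are illustrative and do not describe the actual servicing code.

```python
# True means the fix was also released as a broad Update (GDR branch).
# Every fix, GDR or not, is present on the LDR branch.
FIX_IS_GDR = {1: False, 2: True, 3: False, 4: True}


def fixes_received(installed_fix, component_branch):
    """List the fixes a component ends up with after installing installed_fix."""
    if component_branch not in ("GDR", "LDR"):
        raise ValueError("component_branch must be 'GDR' or 'LDR'")
    return [
        number
        for number in sorted(FIX_IS_GDR)
        if number <= installed_fix
        and (component_branch == "LDR" or FIX_IS_GDR[number])
    ]


if __name__ == "__main__":
    print("GDR component after fix #4:", fixes_received(4, "GDR"))
    print("LDR component after fix #4:", fixes_received(4, "LDR"))
```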

     

    Finally, the servicing technology has to handle the case where you need the functionality of an older Hotfix (e.g. “Fix #1” in the diagram above) but you may already have installed “Fix #4” which might be a critical security update.  What happens is that when the GDR branch of a fix is installed, it also places a copy of the Hotfix version of the same fix on the system.  When you run the installer for Hotfix #1, it detects that a newer version of the file is already installed, but it also detects that it needs to migrate it to the Hotfix version of the binary that was previously stored on the system. The result is that you end up with the Hotfix binary for Fix #4, which has both the Hotfix you need plus the cumulative set of security fixes.

     

    Stay tuned for more.  In the next blog entry, I will talk about the staging mechanism that Windows uses to install Updates and Hotfixes as well as the uninstall process. Also, I will talk about how to determine the branch a file is built from.

     

    - Omer 

     

    More Information

    Description of the standard terminology that is used to describe Microsoft software updates

    Description of the contents of Windows XP Service Pack 2 and Windows Server 2003 software update packages

     

  • Ntdebugging Blog

    How to Determine Which Resource is Causing the Cluster Resource Monitor to Crash – Possible Deadlock


    Hello, my name is John Marlin, and I am a Support Escalation Engineer on the Microsoft Platform Cluster Services Support team.  I wanted to talk about the Windows 2003 Cluster Resource Monitor and what happens when it crashes. In this blog I’ll show you how to look under the hood to determine why it crashed.

    As a foundation for this article we need to understand the basics of Cluster Resource Monitor.  Below is taken from the Microsoft MSDN site describing the Cluster Resource Monitor.

     

    A Resource Monitor provides a communication, monitoring, and processing layer between the Cluster service and one or more resources. Resource Monitors have the following characteristics:

    ·         A Resource Monitor always runs in a process separate from the Cluster service. If a resource fails, the Resource Monitor isolates the Cluster service from the effects. If the Cluster service fails, the Resource Monitor allows its resources to shut down gracefully.

    ·         To work with a resource, a Resource Monitor loads the resource DLL responsible for that resource type into its process.

    ·         When the Cluster service requests an operation on a resource, the Resource Monitor routes the request to the appropriate entry point function of the resource DLL responsible for the resource. The Resource Monitor performs default processing for some resource operations.

    ·         A Resource Monitor stores synchronized state data, allowing the Cluster service and resource DLLs to operate asynchronously, checking and updating resource status as needed.

    ·         A Resource Monitor periodically checks the operational status of all of its resources. For more information on this process, see Resource Failure.

     

    By default, the Cluster service creates one Resource Monitor per node.

     

    As the MSDN information states, everything currently running on the node is in the one Resource Monitor.  If the Resource Monitor crashes, the system will dump the Resource Monitor Process to a file called RESRCMON.DMP, and create a new instance of the process.  Because it must create a new one, all resources in the monitor are gone and need to be restarted.  When this occurs, you would see the following entry in the Windows System Event Log. 

     

    Event ID:  1146

    Source:  ClusSvc

    Description:  The cluster resource monitor died unexpectedly, an attempt will be made to restart it

     

    After this, you could also see other resource failures (Event ID: 1069) as well as disk related events such as Lost Delayed Writes, etc.  You would see the disk related events because the disk(s) would be considered down and since there is data in the cache of the HBA, it has nowhere to write it.  Hence, lost delayed writes exist until the disk is brought back online.  For our examples here, we will ignore these disk related events as we will focus on why the Resource Monitor crashed.

     

    There are a couple reasons why a Resource Monitor would crash such as an Access Violation (0xC0000005) or a Deadlock (0xC0000194).  In a previous blog, we talked about the Access Violation (0xC0000005).  This blog will focus on deadlocks (0xC0000194) where a particular resource DLL was not responding properly.

     

    Along with the above System Event (Event ID: 1146) where the Resource Monitor died, you will see 0xc0000194 entries in the Cluster Log file. 

     

    NOTE:  The Cluster Log records times in Greenwich Mean Time (GMT), so you must convert the local time of the event to GMT to find the matching location in the Cluster Log.

     

    00000a20.00002238::2008/02/06-20:10:25.762 ERR  [RM] Exception. Code = 0xc0000194, Address = 0x77E4BEE7

    00000a20.00002300::2008/02/06-20:10:25.762 ERR  [RM] Exception parameters: 0, d2fa2000, fffffff4, 7ffda000

    00000a20.00002238::2008/02/06-20:10:25.762 ERR  [RM] CallStack:

    00000a20.00002238::2008/02/06-20:10:25.762 ERR  [RM] Frame      Address

    00000a20.00002300::2008/02/06-20:10:25.762 INFO [RM] GenerateMemoryDump: Start memory dump to file C:\WINNT\Cluster\resrcmon.dmp
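    If you have a large Cluster Log to search, the layout of the lines above can be picked apart programmatically.  The following is only a minimal sketch, assuming each line follows the shape seen in this excerpt (hex process ID, a dot, hex thread ID, "::", timestamp, level, then the message).  The names ClusterLogEntry, parseClusterLogLine and isDeadlockEntry are hypothetical helpers made up for this illustration, not part of any cluster tool.

```cpp
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <string>

// One cluster log line, split into the fields visible in the excerpt above.
// (Hypothetical type for illustration only.)
struct ClusterLogEntry {
    std::uint32_t processId;  // hex value before the dot
    std::uint32_t threadId;   // hex value between the dot and "::"
    std::string timestamp;    // e.g. "2008/02/06-20:10:25.762" (GMT)
    std::string level;        // "ERR", "WARN", "INFO"
    std::string message;      // remainder, starting at the [RM]/[FM] tag
};

// Returns std::nullopt when the line does not match the assumed layout.
std::optional<ClusterLogEntry> parseClusterLogLine(const std::string& line) {
    const std::size_t dot = line.find('.');
    const std::size_t sep = line.find("::");
    if (dot == std::string::npos || sep == std::string::npos || dot > sep) {
        return std::nullopt;
    }
    const std::size_t tsEnd = line.find(' ', sep + 2);
    if (tsEnd == std::string::npos) return std::nullopt;
    const std::size_t levelStart = line.find_first_not_of(' ', tsEnd);
    if (levelStart == std::string::npos) return std::nullopt;
    std::size_t levelEnd = line.find(' ', levelStart);
    if (levelEnd == std::string::npos) levelEnd = line.size();
    const std::size_t msgStart = line.find_first_not_of(' ', levelEnd);

    ClusterLogEntry entry;
    try {
        entry.processId = static_cast<std::uint32_t>(
            std::stoul(line.substr(0, dot), nullptr, 16));
        entry.threadId = static_cast<std::uint32_t>(
            std::stoul(line.substr(dot + 1, sep - dot - 1), nullptr, 16));
    } catch (const std::logic_error&) {  // not a hex number, or out of range
        return std::nullopt;
    }
    entry.timestamp = line.substr(sep + 2, tsEnd - (sep + 2));
    entry.level = line.substr(levelStart, levelEnd - levelStart);
    entry.message = (msgStart == std::string::npos) ? "" : line.substr(msgStart);
    return entry;
}

// True for the Resource Monitor deadlock exception line discussed in this post.
bool isDeadlockEntry(const ClusterLogEntry& entry) {
    return entry.level == "ERR" &&
           entry.message.find("Code = 0xc0000194") != std::string::npos;
}
```

    Feeding each line of the log through parseClusterLogLine and keeping the entries where isDeadlockEntry returns true should land you on the same lines shown above.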

     

    Now that we see this entry in the log, we should take a look at the Resource Monitor dump to see what caused the failure.  The first thing to examine is the register states, specifically the ESP (stack pointer) value.

     

    0:017> r

    eax=00970000 ebx=000f0cd8 ecx=00000007 edx=7c8285ec esi=000f0cb0 edi=000f0d08

    eip=7c8285ec esp=0227ee10 ebp=0227ee20 iopl=0         nv up ei pl zr na pe nc

    cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000246

    ntdll!KiFastSystemCallRet:

    7c8285ec c3              ret

     

    Starting at the stack pointer address 0227ee10, we use the dds command to dump the raw stack.  We are looking for the value on the stack just below the routine resrcmon!GenerateMemoryDump.  It will take several iterations of the dds command to finally get to the value because the call was made much earlier in the stack.

     

    0:017> dds 0227ee10

    0227ee10  00700053 xpsp2res+0xc0053

    0227ee14  00630061

    0227ee18  005f0065

    *** pages removed ***

    0227f650  028a0c90

    0227f654  01ece948

    0227f658  01e94518

    0227f65c  0227f8ac                                           <<-- this address

    0227f660  0100e638 ResrcMon!GenerateMemoryDump+0x180

    0227f664  ffffffff

    0227f668  00001258

    0227f66c  00000018
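    The manual search through the dds output can be described as "find the slot holding the return address into GenerateMemoryDump, then take the value one slot lower."  Here is a small sketch of that rule, assuming the dds rows have already been copied into a list.  StackSlot and valueBelowReturnAddress are hypothetical names for this illustration only; they are not debugger APIs.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// One row of dds output: stack address, the DWORD stored there, and the
// symbol the debugger resolved for that DWORD (empty if none was printed).
struct StackSlot {
    std::uint32_t address;
    std::uint32_t value;
    std::string symbol;
};

// Finds the first slot whose symbol starts with symbolPrefix and returns the
// value stored in the slot just before it (the next lower stack address).
std::optional<std::uint32_t> valueBelowReturnAddress(
        const std::vector<StackSlot>& slots, const std::string& symbolPrefix) {
    for (std::size_t i = 1; i < slots.size(); ++i) {
        if (slots[i].symbol.compare(0, symbolPrefix.size(), symbolPrefix) == 0) {
            return slots[i - 1].value;
        }
    }
    return std::nullopt;  // symbol not found, or found in the very first slot
}
```

    With the rows above and the prefix "ResrcMon!GenerateMemoryDump", this returns 0227f8ac, the value marked in the dump.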

     

    Now that we have our value, we will use the kv= command with the value 0227f8ac to dump out the stack contents.

     

    0:017> kv=0227f8ac

     

    ChildEBP RetAddr  Args to Child             

    0227eea0 7c826d2b 000801a8 0000066c 0227eff4 ntdll!KiFastSystemCallRet (FPO: [0,0,0])

    0227ef10 7c826c9b 7c833c4e ffffffff 0227ef5c ntdll!ZwClose+0xc (FPO: [1,0,0])

    0227efec 77e63f55 ffffffff 02a00000 00000008 ntdll!ZwAllocateVirtualMemory+0xc (FPO: [6,0,0])

    0227f8ac 0100e989 0227fb78 01003024 00000000 kernel32!UnmapViewOfFile+0x14 (FPO: [Non-Fpo])

    0227f8c4 01008b2c 0227fb78 01003024 0227fb78 ResrcMon!GenerateExceptionReport+0x7e (FPO: [Non-Fpo])

    0227f8d8 76348d17 0227fb78 0227fb78 0227f8f8 ResrcMon!RmpExceptionFilter+0x14 (FPO: [Non-Fpo])

    0227f8e8 7786d6d2 0227fb78 77ecb7c0 0227fb50 netshell!__CxxUnhandledExceptionFilter+0x4a (FPO: [Non-Fpo])

    0227f8f8 77e761b7 0227fb78 00000000 00000000 netman!__CxxUnhandledExceptionFilter+0x4a (FPO: [Non-Fpo])

    0227fb50 77e792a3 0227fb78 77e61ac1 0227fb80 kernel32!UnhandledExceptionFilter+0x12a (FPO: [Non-Fpo])

    0227fb58 77e61ac1 0227fb80 00000000 0227fb80 kernel32!BaseThreadStart+0x4a (FPO: [SEH])

    0227fb80 7c828752 0227ff3c 0227ffdc 0227fc5c kernel32!_except_handler3+0x61 (FPO: [Uses EBP] [3,0,7])

    0227fba4 7c828723 0227ff3c 0227ffdc 0227fc5c ntdll!ExecuteHandler2+0x26

    0227fc4c 7c82863c 0227e000 0227fc5c 00010007 ntdll!ExecuteHandler+0x24

    0227ff2c 77e4bee7 0227ff3c 000a7ca0 c0000194 ntdll!RtlRaiseException+0x3d

    0227ff8c 01007ddd c0000194 00000000 00000000 kernel32!RaiseException+0x53 (FPO: [Non-Fpo])

    0227ffb8 77e64829 000009ec 00000000 00000000 ResrcMon!RmpTimerThread+0xa8 (FPO: [Non-Fpo])

    0227ffec 00000000 01007d35 000a7ca0 00000000 kernel32!BaseThreadStart+0x34 (FPO: [Non-Fpo])

     

    Based on the stack above, the exception information is at address 0x0227fb78 (the first argument passed to ResrcMon!RmpExceptionFilter), which we will use to set the failing context.

     

    0:017> dc 0227fb78

    0227fb78  0227ff3c 0227fc5c 0227fba4 7c828752  <.'.\.'...'.R..|    <<-- Exception and Context Records

    0227fb88  0227ff3c 0227ffdc 0227fc5c 0227fc40  <.'...'.\.'.@.'.

    0227fb98  0227ffdc 7c828766 0227ffdc 0227fc4c  ..'.f..|..'.L.'.

    0227fba8  7c828723 0227ff3c 0227ffdc 0227fc5c  #..|<.'...'.\.'.

    0227fbb8  0227fc40 77e61a60 00000000 0227ff3c  @.'.`..w....<.'.

    0227fbc8  0227ffdc 7c8315c2 0227ff3c 0227ffdc  ..'....|<.'...'.

    0227fbd8  0227fc5c 0227fc40 77e61a60 00000102  \.'.@.'.`..w....

    0227fbe8  00000000 00000000 4f464e49 000a0d20  ........INFO ...

     

    The first DWORD is the Exception Record (0x0227ff3c) and the second is the Context Record (0x0227fc5c) that holds our true stack where the problem occurred.   Based on the .exr output, we see that this is a possible deadlock.
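    The two DWORDs being read here match the layout of the Windows EXCEPTION_POINTERS structure, which holds a pointer to the exception record followed by a pointer to the context record.  The sketch below mirrors that layout for a 32-bit dump so the raw bytes can be decoded off-box.  ExceptionPointers32, readLe32 and decodeExceptionPointers are hypothetical names used only for this example.

```cpp
#include <array>
#include <cstdint>

// 32-bit mirror of the two pointer fields, in the order they appear in memory.
struct ExceptionPointers32 {
    std::uint32_t exceptionRecord;  // address to hand to .exr
    std::uint32_t contextRecord;    // address to hand to .cxr
};

// Reads a little-endian DWORD from four consecutive bytes.
std::uint32_t readLe32(const std::uint8_t* p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Decodes the first eight bytes found at the exception pointers address.
ExceptionPointers32 decodeExceptionPointers(const std::array<std::uint8_t, 8>& raw) {
    return ExceptionPointers32{readLe32(raw.data()), readLe32(raw.data() + 4)};
}
```

    For the bytes at 0227fb78 in this dump (3c ff 27 02 5c fc 27 02), the decoded values are 0227ff3c and 0227fc5c, the same addresses used with .exr and .cxr in this walkthrough.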

     

    0:017> .exr 0227ff3c

    ExceptionAddress: 77e4bee7 (kernel32!RaiseException+0x00000053)

       ExceptionCode: c0000194

      ExceptionFlags: 00000000

    NumberParameters: 0

     

    0:017> !error c0000194

    Error code: (NTSTATUS) 0xc0000194 (3221225876) - {EXCEPTION}  Possible deadlock condition.
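    As a side note, the number itself can be broken down without the debugger.  NTSTATUS values are 32-bit: the top two bits are the severity, followed by a customer bit, a reserved bit, a 12-bit facility, and a 16-bit code.  The sketch below applies that layout; NtStatusFields and decodeNtStatus are hypothetical names for this illustration.

```cpp
#include <cstdint>

// The bit fields of a 32-bit NTSTATUS value.
struct NtStatusFields {
    std::uint32_t severity;  // bits 31-30: 0 success, 1 info, 2 warning, 3 error
    bool customer;           // bit 29: set for customer-defined codes
    std::uint32_t facility;  // bits 27-16
    std::uint32_t code;      // bits 15-0
};

constexpr NtStatusFields decodeNtStatus(std::uint32_t status) {
    return NtStatusFields{
        status >> 30,
        ((status >> 29) & 1u) != 0u,
        (status >> 16) & 0xFFFu,
        status & 0xFFFFu};
}
```

    For 0xC0000194 this gives severity 3 (error), facility 0, and code 0x194.  The Access Violation 0xC0000005 decodes the same way with code 0x5.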

     

    So let’s set the context to get the thread that caused the Resource Monitor to crash.

     

    0:017> .cxr 0227fc5c

    eax=0227ff3c ebx=00000000 ecx=77e61d43 edx=7c8285ec esi=00000000 edi=00000102

    eip=77e4bee7 esp=0227ff38 ebp=0227ff8c iopl=0         nv up ei pl zr na pe nc

    cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000246

    kernel32!RaiseException+0x53:

    77e4bee7 5e              pop     esi

     

    0:017> kv

      *** Stack trace for last set context - .thread/.cxr resets it

    ChildEBP RetAddr  Args to Child             

    0227ff8c 01007ddd c0000194 00000000 00000000 kernel32!RaiseException+0x53 (FPO: [Non-Fpo])

    0227ffb8 77e64829 000009ec 00000000 00000000 ResrcMon!RmpTimerThread+0xa8 (FPO: [Non-Fpo])     <<--

    0227ffec 00000000 01007d35 000a7ca0 00000000 kernel32!BaseThreadStart+0x34 (FPO: [Non-Fpo])

     

    Based on the names of the functions in the stack above, we are in the timer thread.  As a side note, with the 0xC0000194 Resource Dumps, we tend to be either in the Timer Thread itself or the Event List that holds the current resource.  In this instance, we are in the timer thread. 

     

    Now let’s look at all the threads using the ~*kv command to find the Event List for this timer thread.  I truncated the results to show only thread 1 and thread 18.

     

    0:001> kv

    ChildEBP RetAddr  Args to Child             

    0095fc5c 7c827d0b 77e61d1e 00000958 00000000 ntdll!KiFastSystemCallRet (FPO: [0,0,0])

    0095fc60 77e61d1e 00000958 00000000 00000000 ntdll!NtWaitForSingleObject+0xc (FPO: [3,0,0])

    0095fcd0 77e61c8d 00000958 ffffffff 00000000 kernel32!WaitForSingleObjectEx+0xac (FPO: [Non-Fpo])

    0095fce4 310079b8 00000958 ffffffff 010119e0 kernel32!WaitForSingleObject+0x12 (FPO: [Non-Fpo])

    WARNING: Stack unwind information not available. Following frames may be wrong.

    0095fd04 31005464 00000958 000ab638 00000000 JohnHungApp!Startup+0x26de

    0095fe28 0100a385 00000002 0095fe58 01006548 JohnHungApp!Startup+0x18a

    0095fe34 01006548 000ab638 01c868a2 00088038 ResrcMon!Resmon_LooksAlive+0x14 (FPO: [Non-Fpo])

    0095fe58 01006728 00000000 00000102 00000000 ResrcMon!RmpPollBucket+0x156 (FPO: [Non-Fpo])

    0095fe78 010068c9 0008d908 00000000 00000000 ResrcMon!RmpPollList+0xa3     <<-- My Poll Event List

    0095ffb8 77e64829 00000102 00000000 00000000 ResrcMon!RmpPollerThread+0x133     <<-- My Timer thread

    0095ffec 00000000 01006796 0008d908 00000000 kernel32!BaseThreadStart+0x34 (FPO: [Non-Fpo])

     

    0:018> kv

    ChildEBP RetAddr  Args to Child             

    022bfbd8 7c827d0b 77e61d1e 000003bc 00000000 ntdll!KiFastSystemCallRet (FPO: [0,0,0])

    022bfbdc 77e61d1e 000003bc 00000000 00000000 ntdll!NtWaitForSingleObject+0xc (FPO: [3,0,0])

    022bfc4c 77e61c8d 000003bc ffffffff 00000000 kernel32!WaitForSingleObjectEx+0xac (FPO: [Non-Fpo])

    022bfc60 10032719 000003bc ffffffff 00000000 kernel32!WaitForSingleObject+0x12 (FPO: [Non-Fpo])

    WARNING: Stack unwind information not available. Following frames may be wrong.

    022bfc78 100fdb0e 000003bc ffffffff 100f7c6c JohnHungApp!VxWaitForEvent+0x22

    022bfd68 100c66d8 ffffffff 00000001 00000000 JohnHungApp!getAllChildNodes+0x8d172

    022bfde4 10012357 0026d838 01ece3f0 022bff44 JohnHungApp!getAllChildNodes+0x89914

    022bffb8 77e64829 0095fd14 00000000 00000000 JohnHungApp!Startup+0x24a1     <<-- My Timer Thread

    022bffec 00000000 3100767d 0095fd14 00000000 kernel32!BaseThreadStart+0x34 (FPO: [Non-Fpo])

    Threads 1 and 18 are of most interest because they show the JohnHungApp Resource DLL waiting while the PollerThread performs a LooksAlive check.  This is the likely cause: the resource did not respond within the expected 80 minute window that the thread waits on it, which caused the crash.

    The next course of action is to review the Poll List to ensure that it is working with JohnHungApp, and to determine the type of resource.  So let’s dump the DWORD values of my list (adding +0x114, which is the offset in the data structure that contains the string). Note: We cannot guarantee offsets in future versions of the component.

     

    0:017> dc 0008d908+0x114 l1

    0008da1c  0008d090                             ....                 <<-- My Resource

     

    0:017> dc 0008d090

    0008d090  63727352 00000001 0009ec20 000a7ca8  Rsrc.... ....|..

    0008d0a0  000a15f0 0008d158 0008d038 000a7070  ....X...8...pp..     <<-- My Resource strings

    0008d0b0  00001388 0000ea60 60180000 000a7b70  ....`......`p{..

    0008d0c0  00000000 00000000 00000000 00000000  ................

    0008d0d0  00000000 00000001 601c1d8d 601c3633  ...........`36.`

    0008d0e0  601c34cf 601c3549 601c35b5 601c21c5  .4.`I5.`.5.`.!.`

    0008d0f0  601c20e1 00000000 00000000 601c3712  . .`.........7.`

    0008d100  601c222f 00000003 00000006 0000000c  /".`............

     

    0:017> du 0x000a7070                         <<-- Resource Displayed in Cluster

    000a7070  "Johns Hung Resource"

     

    0:017> du 0x0008d038                         <<-- GUID in registry (HKLM\Cluster\Resources)

    0008d038  "0502cab5-3e1f-47d4-b490-e5301be7"

    0008d078  "2928"

     

    0:017> du 0x0008d158                         <<-- Resource Type

    0008d158  "Johns Hung App"

     

    0:017> du 0x000a15f0                         <<-- Specific DLL being Used

    000a15f0  "johnhungapp.dll"

     

    So this confirms that Johnhungapp.dll is the resource with the problem.  At this point we know the name and GUID of the resource.  Remember, we put an 80 minute timeout check on the resource.  Sometimes you can get a little more information from the Cluster Log to possibly narrow down the root cause.  So we are back to the original entry where the Resource Monitor dumped.
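    The dc output for the resource can also be read mechanically.  The first DWORD, 63727352, is the ASCII tag "Rsrc" stored little-endian, and in this particular dump the four string pointers we followed with du sit in DWORDs 4 through 7.  The sketch below encodes exactly that; the positions are only what was observed in this one dump and, as noted earlier, are not guaranteed for other versions.  ResourcePointers, dwordToTag and pickResourcePointers are hypothetical names for illustration.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// The four string pointers followed with du in this dump.
struct ResourcePointers {
    std::uint32_t dllName;       // "johnhungapp.dll"
    std::uint32_t resourceType;  // "Johns Hung App"
    std::uint32_t resourceId;    // GUID under HKLM\Cluster\Resources
    std::uint32_t displayName;   // "Johns Hung Resource"
};

// Turns a DWORD into the four ASCII characters it holds, low byte first.
std::string dwordToTag(std::uint32_t value) {
    std::string tag;
    for (int i = 0; i < 4; ++i) {
        tag.push_back(static_cast<char>((value >> (8 * i)) & 0xFFu));
    }
    return tag;
}

// Picks the pointers out of the first eight DWORDs of the dc output.
// Assumption: the layout matches this dump (tag at DWORD 0, strings at 4..7).
std::optional<ResourcePointers> pickResourcePointers(
        const std::vector<std::uint32_t>& dwords) {
    if (dwords.size() < 8 || dwordToTag(dwords[0]) != "Rsrc") {
        return std::nullopt;
    }
    return ResourcePointers{dwords[4], dwords[5], dwords[6], dwords[7]};
}
```

    Checking the tag first is a cheap way to confirm that the address pulled from the Poll List really does point at a resource block before trusting the pointers inside it.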

     

     

    00000a20.00002238::2008/02/06-20:10:25.762 ERR  [RM] Exception. Code = 0xc0000194, Address = 0x77E4BEE7

    00000a20.00002300::2008/02/06-20:10:25.762 ERR  [RM] Exception parameters: 0, d2fa2000, fffffff4, 7ffda000

    00000a20.00002238::2008/02/06-20:10:25.762 ERR  [RM] Exception parameters: 0, d2fa2000, fffffff4, 7ffda000

    00000a20.00002238::2008/02/06-20:10:25.762 ERR  [RM] CallStack:

    00000a20.00002238::2008/02/06-20:10:25.762 ERR  [RM] Frame      Address

    00000a20.00002300::2008/02/06-20:10:25.762 INFO [RM] GenerateMemoryDump: Start memory dump to file C:\WINNT\Cluster\resrcmon.dmp

     

    If we traverse back 80 minutes, we see the following information in the log.

     

    00001d38.00000568::2008/02/06-18:50:10.179 WARN [FM] FmpHandleResourceTransition: Resource Name = 0502cab5-3e1f-47d4-b490-e5301be72928 [Johns Hung Resource] old state=2 new state=4

    00001d38.00000568::2008/02/06-18:50:10.179 INFO [FM] FmpPropagateResourceState: resource 0502cab5-3e1f-47d4-b490-e5301be72928 failed event.

    00001d38.00000568::2008/02/06-18:50:10.179 INFO [FM] FmpHandleResourceFailure: taking resource 0502cab5-3e1f-47d4-b490-e5301be72928 and dependents offline

    00001d38.00000568::2008/02/06-18:50:10.179 INFO [FM] TerminateResource: 0502cab5-3e1f-47d4-b490-e5301be72928 depends on dfc6b244-888c-43b4-ad3e-f1d26853c9a4. Terminating first

    00001d38.00000568::2008/02/06-18:50:10.622 INFO [FM] RestartResourceTree, Restart resource 0502cab5-3e1f-47d4-b490-e5301be72928

    00001d38.00000568::2008/02/06-18:50:10.959 INFO [FM] FmpPropagateResourceState: resource 0502cab5-3e1f-47d4-b490-e5301be72928 online event.

    00001d38.00000568::2008/02/06-18:50:10.959 INFO [FM] FmpOnlineWaitingTree, Start resource 0502cab5-3e1f-47d4-b490-e5301be72928

    00001d38.00000568::2008/02/06-18:50:10.959 INFO [FM] FmpRmOnlineResource: bringing resource 0502cab5-3e1f-47d4-b490-e5301be72928 (resid 1198584) online.

    00001d38.00000568::2008/02/06-18:50:10.959 INFO [CP] CppResourceNotify for resource Johns Hung Resource

    00001d38.00000568::2008/02/06-18:50:10.959 INFO [FM] FmpRmOnlineResource: called InterlockedIncrement on gdwQuoBlockingResources for resource 0502cab5-3e1f-47d4-b490-e5301be72928

     

    After that, there is nothing else in the log other than Cluster issuing a reprieve (wait) every three minutes (default timeout value under the resource) until we dumped the resource monitor.  So now we know the resource actually failed and was attempting to come back online.  Because it was not coming back online, it caused the Resource Monitor to crash.  Since this is a third party resource, you would need to engage the programmer or vendor of the resource to see why it had the issue it did.
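    To double check the 80 minute gap, the two timestamps from the log can simply be subtracted.  This is a minimal sketch assuming both timestamps use the Cluster Log format shown above and fall on the same day; ParsedStamp, parseStamp and elapsedMillis are hypothetical helper names.

```cpp
#include <cstdio>
#include <optional>
#include <string>

// Date plus milliseconds since midnight for one Cluster Log timestamp.
struct ParsedStamp {
    int year, month, day;
    long millisOfDay;
};

// Parses "YYYY/MM/DD-hh:mm:ss.mmm"; returns nullopt on any other shape.
std::optional<ParsedStamp> parseStamp(const std::string& text) {
    int y, mo, d, h, mi, s, ms;
    if (std::sscanf(text.c_str(), "%d/%d/%d-%d:%d:%d.%d",
                    &y, &mo, &d, &h, &mi, &s, &ms) != 7) {
        return std::nullopt;
    }
    const long millis = ((h * 60L + mi) * 60L + s) * 1000L + ms;
    return ParsedStamp{y, mo, d, millis};
}

// Milliseconds from start to end; only handles two stamps on the same date.
std::optional<long> elapsedMillis(const std::string& start, const std::string& end) {
    const auto a = parseStamp(start);
    const auto b = parseStamp(end);
    if (!a || !b) return std::nullopt;
    if (a->year != b->year || a->month != b->month || a->day != b->day) {
        return std::nullopt;  // crossing midnight is out of scope for this sketch
    }
    if (b->millisOfDay < a->millisOfDay) return std::nullopt;
    return b->millisOfDay - a->millisOfDay;
}
```

    From the failure at 2008/02/06-18:50:10.179 to the exception at 2008/02/06-20:10:25.762 is 4,815,583 milliseconds, or a little over 80 minutes and 15 seconds.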

     

     

    Some Additional Information Regarding the Exception Address

    In some cases the steps to determine the exception record may not be necessary. The debugger sometimes provides the exception record information when the dump is first opened. Additionally it could show the stack that we found above by simply entering the .ecxr command...

     

    0:017> .ecxr

    eax=0227ff3c ebx=00000000 ecx=77e61d43 edx=7c8285ec esi=00000000 edi=00000102

    eip=77e4bee7 esp=0227ff38 ebp=0227ff8c iopl=0         nv up ei pl zr na pe nc

    cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000246

    kernel32!RaiseException+0x53:

    77e4bee7 5e              pop     esi

     

    0:017> kb

      *** Stack trace for last set context - .thread/.cxr resets it

    ChildEBP RetAddr  Args to Child             

    0227ff8c 01007ddd c0000194 00000000 00000000 kernel32!RaiseException+0x53

    0227ffb8 77e64829 000009ec 00000000 00000000 ResrcMon!RmpTimerThread+0xa8

    0227ffec 00000000 01007d35 000a7ca0 00000000 kernel32!BaseThreadStart+0x34

     

    You could also get to the same information using the original steps above (dds 0227ee10), but stopping at resrcmon!RmpExceptionFilter (where the Resource Monitor handles the exception), which has the exception as the first parameter.

     

    0:017> dds 0227ee10

    0227ee10  00700053 xpsp2res+0xc0053

    0227ee14  00630061

    0227ee18  005f0065

    0227ee1c  00610043

    0227ee20  00610074

    *** pages removed ***

    0227f650  028a0c90

    0227f654  01ece948

    0227f658  01e94518

    0227f65c  0227f8ac                                        <<-- pointer to Exception address stack

    0227f660  0100e638 ResrcMon!GenerateMemoryDump+0x180

    0227f664  ffffffff

    0227f668  00001258

    0227f66c  00000018

    *** pages removed ***

    0227f874  0100c1fe ResrcMon!except_handler3

    0227f878  01005528 ResrcMon!`string'+0xc

    0227f87c  ffffffff

    0227f880  0100d27b ResrcMon!ClRtlLogPrint+0x499

    0227f884  0100e96c ResrcMon!GenerateExceptionReport+0x61

    0227f888  00000001

    *** pages removed ***

    0227f8b8  01003024 ResrcMon!`string'

    0227f8bc  00000000

    0227f8c0  7786d687 netman!__CxxUnhandledExceptionFilter

    0227f8c4  0227f8d8

    0227f8c8  01008b2c ResrcMon!RmpExceptionFilter+0x14       <<-- Frame 4 in kv= 0227f8ac above

    0227f8cc  0227fb78                                        <<-- Frame 3 in kv= 0227f8ac above

    0227f8d0  01003024 ResrcMon!`string'

  • Ntdebugging Blog

    Unlocking some puzzles requires building a better key... board

    • 2 Comments

    Hi, this is Matt from the Windows Performance team.  Sometimes we are presented with problems that defy our usual troubleshooting and require a creative approach.  In a recent case, we needed a way to test the responsiveness of an application as text was typed into its fields.  Initially, we tested the program using a script that used the SendKeys method to time text entry.  Unfortunately, these tests aren’t completely realistic, since the script can be affected by the processor utilization on the system, and the script can’t create hardware interrupts like a keyboard does.  Realizing that only real keyboard input would be a valid test, and that the rate of typing needed to be reproduced exactly for each test, I set about building an automated keyboard. 

     

    First, I found an old PS/2 style keyboard that hadn’t been used in years and opened it up.  Luckily, it was old enough to use all through-hole components, which made it easier to modify.  The main component I cared about was the keyboard encoder, which was a COP943C.  A search online turned up a datasheet for the keyboard encoder with a sample circuit design that looked very similar to this keyboard.  The document shows there are a couple of steps to determining which pins need to be shorted to generate a particular key. 

     

    Each key has an ID number that is shown in figure one of the PDF (figure one below).  After finding the proper ID, a table is consulted to determine the row and column pins used to create that key code (figure two below).  Finally, those row and column numbers are translated into physical pins on the encoder using the schematic diagram (figure three below).  For example, the letter ‘a’ is number 31.  The matrix shows 31 is made with the L5 (column 6) pin and C6 (row 3) pin.  The pin out shows this to be physical pins 14 and 19.  When tested, shorting these pins creates an ‘a’.
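    The three lookups (key to ID, ID to matrix lines, matrix lines to physical pins) can be collapsed into one small table.  The sketch below holds only the 'a' entry worked through above; any other key would have to be filled in from the datasheet figures.  KeyWiring, keyTable and lookupKey are hypothetical names for this illustration.

```cpp
#include <map>
#include <optional>
#include <string>

// Everything needed to "press" one key by shorting two encoder pins.
struct KeyWiring {
    int keyId;              // ID number from the key code figure
    std::string columnPin;  // matrix column line name
    std::string rowPin;     // matrix row line name
    int physicalPinA;       // the two encoder pins to short together
    int physicalPinB;
};

// Table of known keys. Only 'a' comes from the walkthrough in this post.
const std::map<char, KeyWiring>& keyTable() {
    static const std::map<char, KeyWiring> table = {
        {'a', KeyWiring{31, "L5", "C6", 14, 19}},
    };
    return table;
}

// Returns the wiring for a key, or nullopt if it has not been looked up yet.
std::optional<KeyWiring> lookupKey(char key) {
    const auto it = keyTable().find(key);
    if (it == keyTable().end()) return std::nullopt;
    return it->second;
}
```

    Calling lookupKey('a') returns key ID 31 with physical pins 14 and 19, while a key that has not been added to the table comes back empty instead of guessing.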

     


     

    Figure 1: Key Codes


    Figure 2: Key Code Matrix


    Figure 3: Encoder Pin Out


    Now that we know how the keyboard circuit works, we need a method to generate key “presses.”  For this, I found a board I assembled a year or two ago using a PCB and components from moderndevice.com.  The board is an Arduino clone that is based on Atmel’s ATmega168 microcontroller.  One of the great things about using an Arduino is their IDE, which allows for C programming with a number of pre-defined functions to make development quick.  Also, the boot loader is already taken care of, which makes the work easier. 

     

    Wiring the board to the keyboard was straightforward.  Figure 4 shows how to control a relay with an Arduino, and triggering a keyboard is rather similar.  A resistor is placed between a digital out pin of the Arduino and the base pin of a transistor.  The collector then goes to one pin of the keyboard encoder needed to type the letter desired, and the emitter goes to the other pin.

     


    Figure 4: Arduino-controlled Relay[ii]

    In order to save on solder joints, I decided to chain together the transistors, which affected the key selection.  Additionally, because I wanted to leave the encoder in the original circuit and some of the pins were blocked by other components (resistors, capacitors), specific pins were selected.  Figure 5 shows the layout of the transistors.  These were soldered to a prototyping board.  Hook up wire connects the prototyping board back to a breadboard holding the resistors and the Arduino, and more hook up wire is soldered directly to the pins of the keyboard encoder.

     

    Figure 5: Transistor Layout

     

    The pins selected allowed the characters a, s, z, space, and enter to be typed.  All that remained was to write some software to trigger the transistors.  The code first sets the digital pins to output and logic low, turns on an LED to show it is working, then waits 3 minutes to allow time for the PC to boot and the application in question to be launched.  The LED then goes out for a five-second warning, and then the loop sequence begins.  The loop turns on the LED, types “as z” followed by enter, then turns off the LED and sleeps for 2.5 seconds before starting again.

     

    // Sample code to drive keyboard encoder
    // Matt Burrough
    // September, 2008

    int ledPin = 13;             // Use digital pin 13 for a status LED
    int sPin = 3;                // Connect pin 3 to the transistor connected to the s leads
    int aPin = 4;                // Pin 4 is for a
    int zPin = 5;                // Pin 5 is z
    int enterPin = 6;            // Pin 6 is enter
    int spacePin = 7;            // Pin 7 is space
    int holdKey = 30;            // Milliseconds to "hold" each key down
    int betweenKeys = 50;        // Milliseconds to wait between key presses

    void setup() {               // Initial setup code (runs at power-on)
      setupPin(ledPin);          // Set up each pin with function below
      setupPin(sPin);
      setupPin(aPin);
      setupPin(zPin);
      setupPin(enterPin);
      setupPin(spacePin);
      digitalWrite(ledPin, HIGH);  // Turn on the LED to show the board is on
      delay(180000);               // Wait 3 minutes to allow time for PC to boot
      digitalWrite(ledPin, LOW);   // Turn off the LED
      delay(5000);                 // Wait 5 seconds
    }

    void loop() {
      digitalWrite(ledPin, HIGH);  // Turn the LED on

      typeKey(aPin);               // Type keys
      typeKey(sPin);
      typeKey(spacePin);
      typeKey(zPin);
      typeKey(enterPin);

      digitalWrite(ledPin, LOW);    // Turn the LED off
      delay(2500);                  // Pause 2.5 seconds
    }

    void setupPin(int pin) {        // Used to set up pins...
      pinMode(pin, OUTPUT);         // Set the digital pin as output
      digitalWrite(pin, LOW);       // Turn off the pin
    }

    void typeKey(int pin) {         // Type a key...
      digitalWrite(pin, HIGH);      // "Press down" on a key
      delay(holdKey);               // Hold down the key
      digitalWrite(pin, LOW);       // "Release" the key
      delay(betweenKeys);           // Pause between keys
    }

     

     

    Figure 6: Code Sample
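    As a rough check on the typing rate this sketch produces, the delays can be added up.  The arithmetic below is plain C++ rather than Arduino code, and it ignores the small time spent inside digitalWrite, so treat the numbers as approximate.  typingBurstMs and loopPeriodMs are hypothetical helper names for this illustration.

```cpp
// Time spent typing one burst: each key is held, then followed by a gap.
constexpr long typingBurstMs(long keys, long holdMs, long gapMs) {
    return keys * (holdMs + gapMs);
}

// Time for one full pass through loop(): the burst plus the trailing pause.
constexpr long loopPeriodMs(long keys, long holdMs, long gapMs, long pauseMs) {
    return typingBurstMs(keys, holdMs, gapMs) + pauseMs;
}

// Values taken from the sample: 5 keys, 30 ms hold, 50 ms gap, 2500 ms pause.
constexpr long kBurstMs = typingBurstMs(5, 30, 50);        // 400 ms
constexpr long kPeriodMs = loopPeriodMs(5, 30, 50, 2500);  // 2900 ms
```

    So each pass types five keys in about 0.4 seconds and repeats roughly every 2.9 seconds.  Because the timing comes from fixed delays rather than from the PC under test, the input rate stays the same from run to run, which was the point of building the keyboard.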

     key6

    That’s how I made an automated keyboard.  I hope that you’ve found this post interesting; I’ll leave you with a photo of the finished product. 

     

     

     




