• Ntdebugging Blog

    NTFS And 4K Disks


    Since the 1960s, hard disks have used 512 bytes as the default read/write block size.  Recently, drive manufacturers have been moving toward a larger block size to improve performance and reliability.  Currently there are two types of disks available with a 4KB block size: disks that emulate 512 byte sectors, and disks that expose the 4KB block size directly.

     

    Disks with 4KB block size and 512 bytes per sector emulation

    For performance reasons, drive manufacturers have already produced disks with 4KB native block size, which use firmware to emulate 512 bytes per sector.  Because of the emulated 512 byte sector size, the file system and most disk utilities will be blissfully unaware that they are running on a 4KB disk.  As a result, the on-disk structures will be completely unaffected by the underlying 4KB block size.  This allows for improved performance without altering the bytes per sector presented to the file system.  These disks are referred to as 512e (pronounced “five-twelve-eee”) disks.

     

    Disks with 4KB block size without emulation

    When the logical bytes per sector value is extended to 4KB without emulation, the actual file system will have to adjust to this new environment.  Actually, NTFS is already capable of functioning in this environment provided that no attached FS filter drivers make false assumptions about sector size.  Below are the highlights of what you should expect to see on a disk with a 4KB logical sector size.

     

    1. It will not be possible to format with a cluster size that is smaller than the 4KB native block size.  This is because cluster size is defined as a multiple of sector size.  This multiple will always be a power of two (2^n).

    2. File records will assume the size of the logical block size of 4KB, rather than the previous size of 1KB.  This actually improves scalability to some degree, but the down-side is that each NTFS file record will require 4KB or more in the MFT.

    3. Sparse and compressed files will continue to have 16 clusters per compression unit.

    4. Since file records are 4 times their normal size, it will be possible to encode more mapping pairs per file record.  As a result, larger files can be compressed with NTFS compression without running into file system limitations.

    5. NTFS compression requires a cluster size of 4KB or smaller.  Since the smallest allowable cluster size on these disks is 4KB, NTFS compression will only work on volumes with a 4KB cluster size.

    6. Bytes per index record will be unaffected by the 4K block size since all index records are 4KB in size.  The on-disk folder directory structures will be completely unaffected by the new block size, but a performance increase may be seen while accessing folder structure metadata.

    7. The BIOS Parameter Block (BPB) will continue to have the same format as before, but the only positive value for clusters per File Record Segment (FRS) will be 1.  In the case where clusters per FRS is 1, the FRS byte size is computed by the following equation:

    FRS size in bytes = ClustersPerFRS x SectorsPerCluster x BytesPerSector

     

    NTFS BIOS Parameter Block Information

     

      BytesPerSector      :        4096

      Sectors Per Cluster :           1

      ReservedSectors     :           0

      Fats                :           0

      RootEntries         :           0

      Small Sectors       :           0 ( 0 MB )

      Media Type          :         248 ( 0xf8 )

      SectorsPerFat       :           0

      SectorsPerTrack     :          63

      Heads               :         255

      Hidden Sectors      :          64

      Large Sectors       :           0 ( 0 MB )

     

      ClustersPerFRS      :           1

      Clust/IndxAllocBuf  :           1

      NumberSectors       :                50431 ( 196.996 MB )

      MftStartLcn         :                16810

      Mft2StartLcn        :                    2

      SerialNumber        :  8406742282501311868

      Checksum            :                    0 (0x0)

     

    If the cluster size is larger than the FRS size, then ClustersPerFrs will be a negative number as shown in the example below (0xf4 is -12 decimal).  In this case, the record size is computed with the equation:

    FRS size in bytes = 2^(-1 x ClustersPerFRS), so 0xf4 (-12) gives 2^12 = 4096 bytes

     

    In short, NTFS will always force a 4096 byte file record size on a disk with a 4KB sector size, regardless of the cluster size.

     

    NTFS BIOS Parameter Block Information

     

      BytesPerSector      :        4096

      Sectors Per Cluster :           4

      ReservedSectors     :           0

      Fats                :           0

      RootEntries         :           0

      Small Sectors       :           0 ( 0 MB )

      Media Type          :         248 ( 0xf8 )

      SectorsPerFat       :           0

      SectorsPerTrack     :          63

      Heads               :         255

      Hidden Sectors      :          64

      Large Sectors       :           0 ( 0 MB )

     

      ClustersPerFRS      :         f4

      Clust/IndxAllocBuf  :         f4

      NumberSectors       :                50431 ( 196.996 MB )

      MftStartLcn         :                 4202

      Mft2StartLcn        :                    1

      SerialNumber        :  7270585088516976380

      Checksum            :                    0 (0x0)
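The two cases for ClustersPerFRS can be sketched in a few lines of Python. This is a minimal illustration, not NTFS source code: the function name `frs_size_bytes` and its parameter names are made up for this example, and the raw boot sector byte is assumed to be interpreted as a signed 8-bit value.

```python
def frs_size_bytes(bytes_per_sector: int, sectors_per_cluster: int,
                   clusters_per_frs: int) -> int:
    """Return the file record segment size in bytes from BPB values.

    clusters_per_frs is the raw byte from the boot sector (0..255).
    It is treated as a signed 8-bit value: positive means a count of
    clusters, negative means the size is 2 raised to the absolute value.
    """
    signed = clusters_per_frs - 256 if clusters_per_frs >= 0x80 else clusters_per_frs
    if signed > 0:
        return signed * sectors_per_cluster * bytes_per_sector
    if signed < 0:
        return 1 << -signed
    raise ValueError("ClustersPerFRS must not be zero")


# Values taken from the two BPB dumps above.
print(frs_size_bytes(4096, 1, 0x01))  # 1 cluster of 1 sector of 4096 bytes
print(frs_size_bytes(4096, 4, 0xF4))  # 0xf4 is -12, so 2**12 bytes
```

Both calls yield 4096, which matches the statement that the file record size is 4096 bytes whether the volume uses 4KB or 16KB clusters.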

     

    8. Aside from the 4KB file record size, there are a few other things to know about 4KB drives.  The code for implementing update sequence arrays (USAs) has always assumed a 512 byte sector size, and it will continue to do so.  Since file records are 4 times their normal size, the update sequence arrays for file records now contain 9 entries instead of 3.  One array entry is required for the sequence number (the 02 00 at offset 0x0030 in the dump below) and eight array entries for the trailing bytes, one per 512 byte stride (on disk, the last two bytes of each stride carry the sequence number 02 00).  The original purpose of the USA is to allow NTFS to detect torn writes.  Since the file record size is now equal to the block size, the hardware is capable of writing the entire file record at once, rather than in two parts.

     

        _MULTI_SECTOR_HEADER MultiSectorHeader {

                ULONG      Signature             : 0x454c4946 "FILE"

                USHORT     SequenceArrayOffset   : 0x0030

                USHORT     SequenceArraySize     : 0x0009

        }

     

     

    0x0000   46 49 4c 45 30 00 09 00-dd 24 10 00 00 00 00 00   FILE0...Ý$.....

    0x0010   01 00 01 00 48 00 01 00-b0 01 00 00 00 10 00 00   ....H...°......

    0x0020   00 00 00 00 00 00 00 00-06 00 00 00 00 00 00 00   ................

    0x0030   02 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x0040   00 00 00 00 00 00 00 00-10 00 00 00 60 00 00 00   ...........`...

    0x0050   00 00 18 00 00 00 00 00-48 00 00 00 18 00 00 00   .......H......

    0x0060   f8 f1 5b 89 36 d2 cb 01-f8 f1 5b 89 36 d2 cb 01   øñ[‰6ÒË.øñ[‰6ÒË.

    0x0070   f8 f1 5b 89 36 d2 cb 01-f8 f1 5b 89 36 d2 cb 01   øñ[‰6ÒË.øñ[‰6ÒË.

    0x0080   06 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x0090   00 00 00 00 00 01 00 00-00 00 00 00 00 00 00 00   ................

    0x00a0   00 00 00 00 00 00 00 00-30 00 00 00 68 00 00 00   ........0...h...

    0x00b0   00 00 18 00 00 00 03 00-4a 00 00 00 18 00 01 00   .......J......

    0x00c0   05 00 00 00 00 00 05 00-f8 f1 5b 89 36 d2 cb 01   ........øñ[‰6ÒË.

    0x00d0   f8 f1 5b 89 36 d2 cb 01-f8 f1 5b 89 36 d2 cb 01   øñ[‰6ÒË.øñ[‰6ÒË.

    0x00e0   f8 f1 5b 89 36 d2 cb 01-00 00 01 00 00 00 00 00   øñ[‰6ÒË.........

    0x00f0   00 00 01 00 00 00 00 00-06 00 00 00 00 00 00 00   ................

    0x0100   04 03 24 00 4d 00 46 00-54 00 00 00 00 00 00 00   ..$.M.F.T.......

    0x0110   80 00 00 00 48 00 00 00-01 00 40 00 00 00 01 00   €...H.....@.....

    0x0120   00 00 00 00 00 00 00 00-ff 00 00 00 00 00 00 00   ........ÿ.......

    0x0130   40 00 00 00 00 00 00 00-00 00 10 00 00 00 00 00   @..............

    0x0140   00 00 10 00 00 00 00 00-00 00 10 00 00 00 00 00   ..............

    0x0150   22 00 01 aa 41 00 ff ff-b0 00 00 00 50 00 00 00   "..ªA.ÿÿ°...P...

    0x0160   01 00 40 00 00 00 05 00-00 00 00 00 00 00 00 00   ..@.............

    0x0170   01 00 00 00 00 00 00 00-40 00 00 00 00 00 00 00   ........@.......

    0x0180   00 20 00 00 00 00 00 00-08 10 00 00 00 00 00 00   . .............

    0x0190   08 10 00 00 00 00 00 00-21 01 a9 41 21 01 fd fd   .......!.©A!.ýý

    0x01a0   00 69 b4 05 80 fa ff ff-ff ff ff ff 00 00 00 00   .i´.€úÿÿÿÿÿÿ....

    0x01b0   00 00 10 00 00 00 00 00-22 00 01 aa 41 00 ff ff   ......."..ªA.ÿÿ

    0x01c0   b0 00 00 00 50 00 00 00-01 00 40 00 00 00 05 00   °...P.....@.....

    0x01d0   00 00 00 00 00 00 00 00-01 00 00 00 00 00 00 00   ................

    0x01e0   40 00 00 00 00 00 00 00-00 20 00 00 00 00 00 00   @........ ......

    0x01f0   08 10 00 00 00 00 00 00-08 10 00 00 00 00 02 00   ..............

    .

    .

    .

    0x03c0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x03d0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x03e0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x03f0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 02 00   ................

    .

    .

    .

    0x05d0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x05e0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x05f0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 02 00   ................

    .

    .

    .

    0x07d0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x07e0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x07f0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 02 00   ................

    .

    .

    .

    0x09d0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x09e0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x09f0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 02 00   ................

    .

    .

    .

    0x0bd0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x0be0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x0bf0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 02 00   ................

    .

    .

    .

    0x0dd0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x0de0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x0df0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 02 00   ................

    .

    .

    .

    0x0fd0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x0fe0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x0ff0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 02 00   ................
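The torn-write check that the update sequence array enables can be sketched as follows. This is a simplified illustration under stated assumptions, not the NTFS implementation: `check_usa` and `make_record` are hypothetical helpers, the record here is synthetic rather than read from a disk, and only the header fields shown in the dump above (signature, array offset, array size) are used.

```python
import struct

STRIDE = 512  # the USA logic assumes 512 byte strides, whatever the sector size


def check_usa(record: bytes) -> bool:
    """Return True if every 512 byte stride ends with the update sequence number."""
    signature, usa_offset, usa_count = struct.unpack_from("<4sHH", record, 0)
    if signature != b"FILE":
        raise ValueError("not a file record")
    strides = usa_count - 1  # the first array entry is the sequence number itself
    if strides * STRIDE != len(record):
        raise ValueError("array size does not match record length")
    (usn,) = struct.unpack_from("<H", record, usa_offset)
    for i in range(strides):
        (tail,) = struct.unpack_from("<H", record, (i + 1) * STRIDE - 2)
        if tail != usn:
            return False  # this stride was not written together with the rest
    return True


def make_record(usn: int = 2, size: int = 4096) -> bytearray:
    """Build a synthetic record with the header layout seen in the dump."""
    rec = bytearray(size)
    struct.pack_into("<4sHH", rec, 0, b"FILE", 0x30, size // STRIDE + 1)
    struct.pack_into("<H", rec, 0x30, usn)
    for i in range(size // STRIDE):
        struct.pack_into("<H", rec, (i + 1) * STRIDE - 2, usn)
    return rec


good = make_record()
torn = make_record()
torn[0x05FE:0x0600] = b"\x01\x00"  # simulate a stale third stride
print(check_usa(bytes(good)), check_usa(bytes(torn)))
```

A 4096 byte record gives 8 strides plus 1 sequence number entry, which is the array size of 9 shown in the multi sector header.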

     

    I don’t actually own a 4KB disk, but I was able to give you this preview thanks to a nifty tool called VStorControl.  Vstor is a tool which allows you to create virtualized SCSI disks with arbitrary block sizes and is available for download with the Windows 7 SDK.

     

    That’s all for now,

    Dennis Middleton “The NTFS Doctor”

  • Ntdebugging Blog

    A Classic Case of Whodunit


    Sometimes we encounter problems that just don't make sense.  I don't mean a high powered lawyer talking about Chewbacca, I mean sometimes computers do things that defy logic.

     

    The bugcheck below is one example.  At first glance, some people will blame the first third party code they see and declare "It's the anti-virus!"  That is a classic example of people defying logic, but this article is about computers defying logic, so there must be something else going on here.

    3: kd> .bugcheck

    Bugcheck code 00000050

    Arguments c73fdb0b 00000001 809327b8 00000000

    3: kd> k

    ChildEBP RetAddr

    b7b4d5a8 8085e6cd nt!KeBugCheckEx+0x1b

    b7b4d620 8088bc18 nt!MmAccessFault+0xb25

    b7b4d620 809327b8 nt!_KiTrap0E+0xdc

    b7b4d6bc 808ef973 nt!ObReferenceObjectByHandle+0x16e

    b7b4d75c 80888c7c nt!NtQueryInformationFile+0xcd

    b7b4d75c 8082ea49 nt!KiFastCallEntry+0xfc

    b7b4d7e8 b88db606 nt!ZwQueryInformationFile+0x11

    b7b4d864 b88db6c3 NAVAP+0x2e606

    b7b4d88c b88b30f6 NAVAP+0x2e6c3

    b7b4d8e0 b88b3338 NAVAP+0x60f6

    b7b4d900 b88b6a37 NAVAP+0x6338

    b7b4d948 b8993348 NAVAP+0x9a37

    b7b4d96c b8995af8 SYMEVENT!SYMEvent_GetSubTask+0x1438

    b7b4d9e8 b898fe32 SYMEVENT!EventObjectDestroy+0x338

    b7b4d9f8 b89963e8 SYMEVENT+0x4e32

    b7b4da48 8081dcdf SYMEVENT!EventObjectCreate+0x3e8

    b7b4da5c 808f8275 nt!IofCallDriver+0x45

    b7b4db44 80936b13 nt!IopParseDevice+0xa35

    b7b4dbc4 80932e04 nt!ObpLookupObjectName+0x5a9

    b7b4dc18 808ea231 nt!ObOpenObjectByName+0xea

    b7b4dc94 808eb4cb nt!IopCreateFile+0x447

    b7b4dcf0 808edf4a nt!IoCreateFile+0xa3

    b7b4dd30 80888c7c nt!NtCreateFile+0x30

    b7b4dd30 7c82ed54 nt!KiFastCallEntry+0xfc

    059293e0 00000000 0x7c82ed54

     

    The basic premise of all troubleshooting is logic.  I often use a series of questions to shape the logic for the problem I am investigating.  I start all blue screen debugs with the same question, "Why did the system crash?".  The answer to this question is usually in the bugcheck code.

    3: kd> .bugcheck

    Bugcheck code 00000050

    Arguments c73fdb0b 00000001 809327b8 00000000

     

    The debugger.chm help file has a description of this error under the topic "Bug Check 0x50: PAGE_FAULT_IN_NONPAGED_AREA".  It explains that this error happens when invalid memory is accessed, and it shows what the four bugcheck parameters mean.

    Parameter   Description
    1           Memory address referenced
    2           0: Read operation; 1: Write operation
    3           Address that referenced memory (if known)
    4           Reserved

     

    Interpreting our bugcheck code, address c73fdb0b was written to by the instruction at address 809327b8.
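The same interpretation can be written out as a short script. This is only a sketch: `parse_bugcheck_50` is a hypothetical helper, and it assumes the Arguments line has exactly the format shown in the `.bugcheck` output above.

```python
def parse_bugcheck_50(arguments_line: str) -> dict:
    """Split a '.bugcheck' Arguments line into the four 0x50 parameters."""
    fields = arguments_line.split()
    if len(fields) != 5 or fields[0] != "Arguments":
        raise ValueError("unexpected Arguments line")
    address, operation, instruction, reserved = (int(f, 16) for f in fields[1:])
    return {
        "address_referenced": address,
        "operation": "write" if operation == 1 else "read",
        "referencing_instruction": instruction,
        "reserved": reserved,
    }


info = parse_bugcheck_50("Arguments c73fdb0b 00000001 809327b8 00000000")
print(f"{info['operation']} to {info['address_referenced']:08x} "
      f"by the instruction at {info['referencing_instruction']:08x}")
```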

     

    If you refer to the documentation on page fault handling in x86 you will see that the processor stores the address being faulted on in cr2 prior to calling the page fault handler.  We can use this to reconfirm the data in the bugcheck code.

    3: kd> r cr2

    Last set context:

    cr2=c73fdb0b

     

    We can confirm that virtual address c73fdb0b really is invalid by looking at the PTE.

    3: kd> !pte c73fdb0b

                        VA c73fdb0b

    PDE at C06031C8            PTE at C0639FE8

    contains 000000021BB36963  contains 0000000000000000

    pfn 21bb36    -G-DA--KWEV  not valid
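The PDE and PTE addresses that !pte prints can be reproduced by hand. The sketch below assumes a 32-bit system running with PAE, where entries are 8 bytes wide and the page tables are mapped at C0000000 with the page directories at C0600000. Those base addresses are inferred from the output above, and the function names are made up for this example.

```python
PTE_BASE = 0xC0000000   # assumed start of the page table mapping (x86 PAE)
PDE_BASE = 0xC0600000   # assumed start of the page directory mapping (x86 PAE)
ENTRY_SIZE = 8          # PAE entries are 64 bits wide


def pte_address(va: int) -> int:
    """Virtual address of the PTE that maps va (4KB pages)."""
    return PTE_BASE + (va >> 12) * ENTRY_SIZE


def pde_address(va: int) -> int:
    """Virtual address of the PDE that maps va (each PDE covers 2MB)."""
    return PDE_BASE + (va >> 21) * ENTRY_SIZE


def is_valid(entry: int) -> bool:
    """Bit 0 of a PDE or PTE is the valid (present) bit."""
    return bool(entry & 1)


def pfn(entry: int) -> int:
    """Page frame number, taken here from bits 12 through 51 of the entry."""
    return (entry >> 12) & ((1 << 40) - 1)


va = 0xC73FDB0B
print(f"PDE at {pde_address(va):08X}  PTE at {pte_address(va):08X}")
print(f"PDE valid={is_valid(0x000000021BB36963)} pfn={pfn(0x000000021BB36963):x}")
print(f"PTE valid={is_valid(0x0000000000000000)}")
```

The PDE is valid and points at a page table, but the PTE itself is zero, which is why the debugger reports the address as not valid.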

     

    The next question is why did the instruction at address 809327b8 attempt to write to c73fdb0b?  The call stack and trap frame can answer this question.

    3: kd> kv

    ChildEBP RetAddr  Args to Child

    b7b4d5a8 8085e6cd 00000050 c73fdb0b 00000001 nt!KeBugCheckEx+0x1b (FPO: [Non-Fpo]) (CONV: stdcall)

    b7b4d620 8088bc18 00000001 c73fdb0b 00000000 nt!MmAccessFault+0xb25 (FPO: [Non-Fpo]) (CONV: stdcall)

    b7b4d620 809327b8 00000001 c73fdb0b 00000000 nt!_KiTrap0E+0xdc (FPO: [0,0] TrapFrame @ b7b4d638)

    b7b4d6bc 808ef973 000012e4 00000080 00000180 nt!ObReferenceObjectByHandle+0x16e (FPO: [Non-Fpo]) (CONV: stdcall)

    b7b4d75c 80888c7c 000012e4 b7b4d858 b7b4d810 nt!NtQueryInformationFile+0xcd (FPO: [Non-Fpo]) (CONV: stdcall)

    b7b4d75c 8082ea49 000012e4 b7b4d858 b7b4d810 nt!KiFastCallEntry+0xfc (FPO: [0,0] TrapFrame @ b7b4d778)

    b7b4d7e8 b88db606 000012e4 b7b4d858 b7b4d810 nt!ZwQueryInformationFile+0x11 (FPO: [5,0,0])

    WARNING: Stack unwind information not available. Following frames may be wrong.

    b7b4d864 b88db6c3 000012e4 e1e5ba38 00000000 NAVAP+0x2e606

    b7b4d88c b88b30f6 8b0bff88 b88b3040 b7b4d8fc NAVAP+0x2e6c3

    b7b4d8e0 b88b3338 8b0bff88 00003f80 b88b3040 NAVAP+0x60f6

    b7b4d900 b88b6a37 00000000 b7b4d9ac b88b6a42 NAVAP+0x6338

    b7b4d948 b8993348 e11998b8 b7b4d9a4 00000001 NAVAP+0x9a37

    b7b4d96c b8995af8 00000000 b7b4d9ac b8997526 SYMEVENT!SYMEvent_GetSubTask+0x1438

    b7b4d9e8 b898fe32 b7b4da2c e162be44 b7b4da2c SYMEVENT!EventObjectDestroy+0x338

    b7b4d9f8 b89963e8 b7b4da2c 8b60cc50 b7b4da2c SYMEVENT+0x4e32

    b7b4da48 8081dcdf 8b916f10 8b3557c8 8b3557c8 SYMEVENT!EventObjectCreate+0x3e8

    b7b4da5c 808f8275 b7b4dc04 8cb7dcb0 00000000 nt!IofCallDriver+0x45 (FPO: [Non-Fpo]) (CONV: fastcall)

    b7b4db44 80936b13 8cb7dcc8 00000000 8b452008 nt!IopParseDevice+0xa35 (FPO: [Non-Fpo]) (CONV: stdcall)

    b7b4dbc4 80932e04 00000000 b7b4dc04 00000040 nt!ObpLookupObjectName+0x5a9 (FPO: [Non-Fpo]) (CONV: stdcall)

    b7b4dc18 808ea231 00000000 00000000 2775a801 nt!ObOpenObjectByName+0xea (FPO: [Non-Fpo]) (CONV: stdcall)

    b7b4dc94 808eb4cb 059293e8 80100080 05929384 nt!IopCreateFile+0x447 (FPO: [Non-Fpo]) (CONV: stdcall)

    b7b4dcf0 808edf4a 059293e8 80100080 05929384 nt!IoCreateFile+0xa3 (FPO: [Non-Fpo]) (CONV: stdcall)

    b7b4dd30 80888c7c 059293e8 80100080 05929384 nt!NtCreateFile+0x30 (FPO: [Non-Fpo]) (CONV: stdcall)

    b7b4dd30 7c82ed54 059293e8 80100080 05929384 nt!KiFastCallEntry+0xfc (FPO: [0,0] TrapFrame @ b7b4dd64)

    059293e0 00000000 00000000 00000000 00000000 0x7c82ed54

    3: kd> .trap b7b4d638

    ErrCode = 00000002

    eax=00000000 ebx=8b40af78 ecx=00000180 edx=b7b4d6e8 esi=8b81bb40 edi=e5c265c8

    eip=809327b8 esp=b7b4d6ac ebp=b7b4d6bc iopl=0         nv up ei ng nz na pe nc

    cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000             efl=00010286

    nt!ObReferenceObjectByHandle+0x16e:

    809327b8 894a04          mov     dword ptr [edx+4],ecx ds:0023:b7b4d6ec=00000000
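The ErrCode value in the trap frame is the error code the processor pushes for a page fault, and its low bits can be decoded as sketched below. The bit meanings follow the x86 architecture definition; the function and key names are invented for this illustration.

```python
def decode_page_fault_error(err: int) -> dict:
    """Decode the low bits of an x86 page fault error code."""
    return {
        "page_present": bool(err & 0x01),       # 0 = page not present
        "write": bool(err & 0x02),              # 1 = write access, 0 = read
        "user_mode": bool(err & 0x04),          # 1 = fault raised in user mode
        "reserved_bit": bool(err & 0x08),       # 1 = reserved bit set in a paging entry
        "instruction_fetch": bool(err & 0x10),  # 1 = fault on instruction fetch
    }


decoded = decode_page_fault_error(0x00000002)
for name, value in decoded.items():
    print(f"{name}: {value}")
```

For ErrCode 2, only the write bit is set: a kernel mode write to a page that is not present, which agrees with the second bugcheck parameter.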

     

    Reading the instruction that we failed on, at address 809327b8, we can see that it dereferenced edx+4 where edx is b7b4d6e8.  This would result in a write to the memory at address b7b4d6ec.

    3: kd> !pte b7b4d6ec

                        VA b7b4d6ec

    PDE at C0602DE8            PTE at C05BDA68

    contains 000000010BA77863  contains 00000001BB583963

    pfn 10ba77    ---DA--KWEV  pfn 1bb583    -G-DA--KWEV

     

    3: kd> dd b7b4d6ec

    b7b4d6ec  00000000 00000000 00000000 00000000

    b7b4d6fc  00000000 b7b4d804 808ef01e 00000098

    b7b4d70c  00000005 00000000 8b81bb40 808ef045

    b7b4d71c  b7b4d86c 00000000 b7b4d838 b7b4d800

    b7b4d72c  00000000 00000000 00000007 00000001

    b7b4d73c  00000000 00000000 b7b4d6dc 00000000

    b7b4d74c  ffffffff 80880c90 80802b70 ffffffff

    b7b4d75c  b7b4d778 80888c7c 000012e4 b7b4d858

     

    This is where the logic starts to break down.  The code wanted to write to b7b4d6ec, which is a valid address.  The bugcheck code and cr2 say we failed writing to address c73fdb0b.  This does not make sense.
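The mismatch is easy to state as arithmetic. The sketch below recomputes the address the instruction should have written to from the register value in the trap frame and compares it with the address reported in cr2; `effective_address` is a hypothetical helper and the values are copied from the debugger output above.

```python
def effective_address(base: int, displacement: int) -> int:
    """Address formed by a [reg+disp] operand on a 32-bit processor."""
    return (base + displacement) & 0xFFFFFFFF


edx = 0xB7B4D6E8            # edx from the .trap output
reported = 0xC73FDB0B       # cr2 and bugcheck parameter 1

expected = effective_address(edx, 4)   # mov dword ptr [edx+4],ecx
print(f"expected {expected:08x}, reported {reported:08x}, "
      f"match={expected == reported}")
```

The instruction should have touched b7b4d6ec, so the two values do not match, and nothing in the software state explains the difference.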

     

    The analogy I often use for scenarios such as this one is: If I ask my intern to get me a Mountain Dew from the break room, and he comes back to say we are out of coffee, am I at fault or is my intern broken?  Applying the same logic to this crash, if ObReferenceObjectByHandle asks the hardware to write to address b7b4d6ec and the hardware came back saying it cannot write to address c73fdb0b, is the software at fault or is the hardware broken?  Clearly the hardware is broken if it does not do what the software asks of it.

     

    In this instance, the customer replaced the processor and afterwards the system was stable.
