I always wanted to find a simple way to figure out the answer for this puzzling question. Usually I ended up opening Task Manager to find out what process eats all my CPU resources. But this doesn't work in most of the cases. For example, what about the case when the "System process" (i.e. a kernel-mode component) is eating all my CPU?

But I just found a true gem called Kernrate. This tool (available for free download here) does present accurate CPU statistics at API level! Even in Kernel mode.

All you have to do is to run Kernrate_i386_XP.exe from the Kernrates directory, and then press Ctrl-C after a while. You get all sorts of interesting statistics (context switches, interrupts per second, etc.) like the ones below. But the most interesting part is the CPU percentage spent in all the kernel-mode components, in decreasing order of consumed CPU time. Below, I ran Kernrate while doing a recursive DIR /S C: on a separate command shell:

C:\KrView\Kernrates>Kernrate_i386_XP.exe
 /==============================\
<         KERNRATE LOG           >
 \==============================/
Date: 2004/12/21   Time: 15:33:21
Machine Name: AOLTEAN-H4
Number of Processors: 1
PROCESSOR_ARCHITECTURE: x86
PROCESSOR_LEVEL: 6
PROCESSOR_REVISION: 0800
Physical Memory: 480 MB
Pagefile Total: 1125 MB
Virtual Total: 2047 MB
PageFile1: \??\E:\pagefile.sys, 720MB
OS Version: 5.1 Build 2600 Service-Pack: 2.0
WinDir: E:\WINDOWS

Kernrate User-Specified Command Line:
Kernrate_i386_XP.exe


Kernel Profile (PID = 0): Source= Time,
Using Kernrate Default Rate of 25000 events/hit
Starting to collect profile data

***> Press ctrl-c to finish collecting profile data
===> Finished Collecting Data, Starting to Process Results

------------Overall Summary:--------------

P0     K 0:00:13.656 (38.2%)  U 0:00:02.484 ( 7.0%)  I 0:00:19.578 (54.8%)  DPC 0:00:00.312 ( 0.9%)  Interrupt 0:00:00.296 ( 0.8%)
       Interrupts= 107928, Interrupt Rate= 3022/sec.


Total Profile Time = 35718 msec

                                       BytesStart          BytesStop         BytesDiff.
    Available Physical Memory   ,       107065344,       112259072,         5193728
    Available Pagefile(s)       ,       374460416,       371945472,        -2514944
    Available Virtual           ,      2132889600,      2131841024,        -1048576
    Available Extended Virtual  ,               0,               0,               0

                                  Total      Avg. Rate
    Context Switches     ,       395167,         11063/sec.
    System Calls         ,      1106131,         30968/sec.
    Page Faults          ,        20595,         577/sec.
    I/O Read Operations  ,         2346,         66/sec.
    I/O Write Operations ,         1102,         31/sec.
    I/O Other Operations ,        32185,         901/sec.
    I/O Read Bytes       ,       346338,         148/ I/O
    I/O Write Bytes      ,        83614,         76/ I/O
    I/O Other Bytes      ,      5832690,         181/ I/O

-----------------------------

Results for Kernel Mode:
-----------------------------

OutputResults: KernelModuleCount = 135
Percentage in the following table is based on the Total Hits for the Kernel

Time   13080 hits, 25000 events per hit --------
 Module                                Hits   msec  %Total  Events/Sec
amdk7                                  7381      35718    56 %     5166162
nv4_disp                               3834      35718    29 %     2683520
ntoskrnl                                819      35718     6 %      573240
win32k                                  341      35718     2 %      238675
hal                                     288      35718     2 %      201579
Ntfs                                    180      35718     1 %      125986
NVENET                                   64      35718     0 %       44795
USBPORT                                  38      35718     0 %       26597
atapi                                    20      35718     0 %       13998
ino_fltr                                 19      35718     0 %       13298
nv4_mini                                 17      35718     0 %       11898
usbohci                                  11      35718     0 %        7699
watchdog                                  9      35718     0 %        6299
tcpip                                     7      35718     0 %        4899
HIDPARSE                                  7      35718     0 %        4899
Npfs                                      5      35718     0 %        3499
HIDCLASS                                  4      35718     0 %        2799
NDIS                                      4      35718     0 %        2799
sr                                        4      35718     0 %        2799
ftdisk                                    4      35718     0 %        2799
usbhub                                    3      35718     0 %        2099
PCIIDEX                                   3      35718     0 %        2099
ACPI                                      3      35718     0 %        2099
mouhid                                    2      35718     0 %        1399
hidusb                                    2      35718     0 %        1399
mouclass                                  2      35718     0 %        1399
TDI                                       2      35718     0 %        1399
PartMgr                                   2      35718     0 %        1399
rdbss                                     1      35718     0 %         699
psched                                    1      35718     0 %         699
VIDEOPRT                                  1      35718     0 %         699
imapi                                     1      35718     0 %         699
CLASSPNP                                  1      35718     0 %         699

================================= END OF RUN ==================================
============================== NORMAL END OF RUN ==============================

Funny enough, in the code above we can see that a third-party driver on my machine - nv4_disp.sys - which is consuming 29% of my CPU. (update: this driver is related to my NVidia chipset). But anyway, this is not a truly relevant test because in a "dir /s" the bottleneck is not the CPU but the I/O needed to read the file system metadata into the cache.

Going back to our Kernrate - you can even zoom in a certain module to find more, if you use the "-z <module>" option. This command zooms into the Windows kernel (ntoskrnl.exe):

C:\KrView\Kernrates>Kernrate_i386_XP.exe -z ntoskrnl
 /==============================\
<         KERNRATE LOG           >
 \==============================/
Date: 2004/12/21   Time: 15:37:38
Machine Name: AOLTEAN-H4
Number of Processors: 1
PROCESSOR_ARCHITECTURE: x86
PROCESSOR_LEVEL: 6
PROCESSOR_REVISION: 0800
Physical Memory: 480 MB
Pagefile Total: 1125 MB
Virtual Total: 2047 MB
PageFile1: \??\E:\pagefile.sys, 720MB
OS Version: 5.1 Build 2600 Service-Pack: 2.0
WinDir: E:\WINDOWS

Kernrate User-Specified Command Line:
Kernrate_i386_XP.exe -z ntoskrnl


Kernel Profile (PID = 0): Source= Time,
Using Kernrate Default Rate of 25000 events/hit
CallBack: Finished Attempt to Load symbols for 804d7000 \WINDOWS\system32\ntoskrnl.exe

Starting to collect profile data

***> Press ctrl-c to finish collecting profile data
===> Finished Collecting Data, Starting to Process Results

------------Overall Summary:--------------

P0     K 0:00:01.406 (24.3%)  U 0:00:00.859 (14.8%)  I 0:00:03.531 (60.9%)  DPC 0:00:00.031 ( 0.5%)  Interrupt 0:00:00.062 ( 1.1%)
       Interrupts= 23804, Interrupt Rate= 4106/sec.


Total Profile Time = 5796 msec

                                       BytesStart          BytesStop         BytesDiff.
    Available Physical Memory   ,       117850112,       114122752,        -3727360
    Available Pagefile(s)       ,       370819072,       368578560,        -2240512
    Available Virtual           ,      2132889600,      2130681856,        -2207744
    Available Extended Virtual  ,               0,               0,               0

                                  Total      Avg. Rate
    Context Switches     ,       206440,         35612/sec.
    System Calls         ,       372915,         64330/sec.
    Page Faults          ,        14872,         2566/sec.
    I/O Read Operations  ,          186,         32/sec.
    I/O Write Operations ,          180,         31/sec.
    I/O Other Operations ,        19183,         3309/sec.
    I/O Read Bytes       ,        39296,         211/ I/O
    I/O Write Bytes      ,        11940,         66/ I/O
    I/O Other Bytes      ,      3000748,         156/ I/O

-----------------------------

Results for Kernel Mode:
-----------------------------

OutputResults: KernelModuleCount = 135
Percentage in the following table is based on the Total Hits for the Kernel

Time   1893 hits, 25000 events per hit --------
 Module                                Hits   msec  %Total  Events/Sec
amdk7                                  1309       5796    69 %     5646135
ntoskrnl                                292       5796    15 %     1259489
hal                                      91       5796     4 %      392512
Ntfs                                     79       5796     4 %      340752
win32k                                   56       5796     2 %      241545
NVENET                                   28       5796     1 %      120772
ino_fltr                                  8       5796     0 %       34506
atapi                                     6       5796     0 %       25879
CLASSPNP                                  4       5796     0 %       17253
Npfs                                      3       5796     0 %       12939
PCIIDEX                                   3       5796     0 %       12939
watchdog                                  2       5796     0 %        8626
nv4_mini                                  2       5796     0 %        8626
sr                                        2       5796     0 %        8626
PartMgr                                   2       5796     0 %        8626
ftdisk                                    2       5796     0 %        8626
nv4_disp                                  1       5796     0 %        4313
tcpip                                     1       5796     0 %        4313
USBPORT                                   1       5796     0 %        4313
NDIS                                      1       5796     0 %        4313

===> Processing Zoomed Module ntoskrnl.exe...


----- Zoomed module ntoskrnl.exe (Bucket size = 16 bytes, Rounding Down) --------
Percentage in the following table is based on the Total Hits for this Zoom Module

Time   292 hits, 25000 events per hit --------
 Module                                Hits   msec  %Total  Events/Sec
CcUnpinDataForThread                     32       5796    10 %      138026
KiDispatchInterrupt                      27       5796     8 %      116459
ZwYieldExecution                         18       5796     5 %       77639
FsRtlIsNameInExpression                  14       5796     4 %       60386
KiIpiServiceRoutine                      13       5796     4 %       56073
SeUnlockSubjectContext                    9       5796     2 %       38819

NtAllocateVirtualMemory                   8       5796     2 %       34506
ObReferenceObjectByHandle                 8       5796     2 %       34506
ExAllocatePoolWithTag                     8       5796     2 %       34506
NtRequestWaitReplyPort                    7       5796     2 %       30193
ExInterlockedPopEntrySList                7       5796     2 %       30193
SeDeleteAccessState                       6       5796     1 %       25879
ExAcquireResourceExclusiveLite            6       5796     1 %       25879
ExReleaseResourceLite                     6       5796     1 %       25879
NtOpenProcessTokenEx                      5       5796     1 %       21566
ObCreateObject                            5       5796     1 %       21566
ObfDereferenceObject                      5       5796     1 %       21566
wcstombs                                  4       5796     1 %       17253
MmMapLockedPagesSpecifyCache              4       5796     1 %       17253
IoBuildPartialMdl                         4       5796     1 %       17253
RtlCopyUnicodeString                      4       5796     1 %       17253

This time nv4_dist was not a huge time-consuming module. but ntoskrnl was in the second place instead. We notice here KiDispatchInterrupt (which is probably the atapi!IdePortInterrupt interrupts). We also notice CcUnpinDataForThread, which denotes some cache manager data access (most likely for the cached NTFS metadata - the $MFT streams in special), and FsRtlIsNameInExpression which is the routine that matches file names to DOS pattern expressions in NTFS. In other words, it appears that we use FindFirstFile/FindNextFile which is not a surprise since the command that was run was DIR /S. It is however surprising that we spent 15% * 4% = 0.6% of kernel time when no real pattern was used in our DIR command.

There is also an Excel file called KrView.xls that allows you to generate all sorts of graphical summaries.

Anyway, this was only a quick tour. Download kernrate for yourself and enjoy! 

 

P.S. There is also managed code support but you need a DLL which is not part of kernrate normally.