Kernel Thread Scheduling and Understanding the RunList
Arguably, device hangs are the most challenging and elusive problems facing device developers today. One of the easiest ways to diagnose a device hang (or a perf slowdown) is to inspect the kernel’s RunList. It is less a list than a queue, and knowing which threads are scheduled, and at what priorities, is the key to solving hangs and device slowdowns.
The “scheduler” is in charge of determining which threads run, when, and for how long. To preserve the real-time nature of WinCE, a higher priority thread pre-empts lower priority threads and then runs until it completes or yields. Since the UI runs at a relatively low priority, an apparent hang is usually just higher priority threads taking the processor time the UI would otherwise get.
Luckily, the Scheduler already maintains a sorted list of the threads that are ready to run, and that list is accessible from your debugger:
Poking around a bit, we can get a better understanding of how this list is kept (the actual structure is more complex than the abstraction below):
The pTh pointer points to the thread that was running when you broke into the debugger. The pRunnable pointer is where things start getting interesting: when a new thread is scheduled to run, the Scheduler traverses the list of already scheduled threads, looking for the position that matches the new thread’s priority. The list always begins with the highest priority, followed by lower priorities. This allows the scheduler to simply pick the thread “off the top” when it is time to run a new thread.
Using our diagram above, if we have a new thread (call it NewThread) scheduled to run at priority 250, the Scheduler will begin walking the thread list looking for the right place to insert it. It will move past Thread A (priority 0) and past Thread B (priority 200), then insert NewThread before Thread D (priority 251), which is at a lower priority:
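The insertion walk described above can be sketched in C. This is only an illustration, not the kernel's code: SketchThread and runlist_insert are hypothetical names, the node is far simpler than the real thread structure, and placing a new thread behind threads already queued at the same priority is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified thread node; the real kernel structure is more complex. */
typedef struct SketchThread {
    const char *name;
    int priority;               /* 0 is the highest priority; larger numbers are lower */
    struct SketchThread *next;
} SketchThread;

/* Insert t into the list at *head, keeping the list ordered from the
 * highest priority (smallest number) to the lowest.
 * Assumption: t goes after any threads already queued at its priority. */
void runlist_insert(SketchThread **head, SketchThread *t)
{
    assert(head != NULL && t != NULL);

    SketchThread **link = head;
    /* Walk past every thread whose priority is higher than or equal to t's. */
    while (*link != NULL && (*link)->priority <= t->priority) {
        link = &(*link)->next;
    }
    /* Splice t in front of the first lower priority thread (or at the tail). */
    t->next = *link;
    *link = t;
}
```

With threads A (0), B (200) and D (251) already queued, inserting NewThread at priority 250 walks past A and B and links NewThread in front of D.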
The scheduler “pops” threads off the top of this list as they are run, and re-adds them if necessary. Notably, Threads E, F and G in the example above will not run until the other, higher priority threads have run and yielded to the system.
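A minimal sketch of the pop and re-add cycle, under the same simplifying assumptions (QueuedThread, runlist_pop and runlist_requeue are hypothetical names, not kernel symbols). It shows why a lower priority thread waits: each pop returns whatever sits at the head, and a higher priority thread that is re-added lands ahead of it again.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified queue node; 0 is the highest priority. */
typedef struct QueuedThread {
    int priority;
    struct QueuedThread *next;
} QueuedThread;

/* Remove and return the thread at the head of the list (the highest
 * priority one), or NULL if the list is empty. */
QueuedThread *runlist_pop(QueuedThread **head)
{
    assert(head != NULL);

    QueuedThread *top = *head;
    if (top != NULL) {
        *head = top->next;
        top->next = NULL;
    }
    return top;
}

/* Re-add a thread in priority order. Assumption: it goes behind any
 * threads already queued at the same priority. */
void runlist_requeue(QueuedThread **head, QueuedThread *t)
{
    assert(head != NULL && t != NULL);

    QueuedThread **link = head;
    while (*link != NULL && (*link)->priority <= t->priority) {
        link = &(*link)->next;
    }
    t->next = *link;
    *link = t;
}
```

If a priority 100 thread is popped and re-queued in a loop while a priority 251 thread waits, every pop returns the priority 100 thread and the other never reaches the head.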
This information helps you find hangs: look at which RunList threads could be blocking the (lower priority) UI threads from running. If higher priority threads are continually being scheduled ahead of your lower priority UI threads, the UI will appear hung because its threads never get to run.
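The same check can be expressed as a short C sketch over a simplified copy of the list. RunNode and threads_ahead_of are hypothetical names, and the UI priority is passed in by the caller rather than read from the system; in practice you would do this walk by hand in the debugger.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified run list node; 0 is the highest priority. */
typedef struct RunNode {
    const char *name;
    int priority;
    struct RunNode *next;
} RunNode;

/* Count the threads queued ahead of ui_priority and store up to max of
 * them in out. Because the list is sorted from highest to lowest
 * priority, the walk can stop at the first thread that is not higher
 * priority than the UI. */
size_t threads_ahead_of(const RunNode *head, int ui_priority,
                        const RunNode **out, size_t max)
{
    assert(out != NULL || max == 0);

    size_t count = 0;
    for (const RunNode *n = head;
         n != NULL && n->priority < ui_priority;
         n = n->next) {
        if (count < max) {
            out[count] = n;
        }
        count++;
    }
    return count;
}
```

Threads that keep reappearing in this result across several debugger breaks are the first candidates to investigate.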