HoppeRx - the cure for your ailing device

A community site dedicated to the support of device problems found by Hopper

Kernel Thread scheduling and understanding the RunList

Arguably, device hangs are the most challenging and elusive problems facing device developers today. One of the easiest ways to diagnose a device hang (or a perf slowdown) is to inspect the kernel’s RunList. It is less a list than a queue, and knowing which threads are scheduled (and at what priorities) is the key to solving hangs and device slowdowns.

 

The “scheduler” is in charge of determining which threads run, when, and for how long. To preserve the real-time nature of WinCE, a higher priority thread pre-empts lower priority threads and runs until it completes or yields. Since the UI runs at a relatively low priority, a hang is usually just higher priority threads hogging the UI’s chance at the processor.

 

Luckily, the Scheduler already keeps a sorted list of the threads that are ready to run, and that list is accessible from your debugger:

 

        

Poking around a bit, we can get a better understanding of how this list is kept (the structure is actually more complex than the abstraction below):

        

The pTh pointer points to the thread that was running when you broke into the debugger. The pRunnable pointer is where things start getting interesting: when a new thread is scheduled to run, the Scheduler traverses the list of already scheduled threads, looking for the position that matches the new thread’s priority. The list always begins with the highest priority thread, followed by lower priorities. This allows the Scheduler to simply pick the thread “off the top” when it is time to run a new thread.

 

Using our diagram above, if we have a new thread (call it NewThread) scheduled to run at priority 250, the Scheduler will begin walking the thread list looking for the right place to insert it. It will move past Thread A (priority 0) and past Thread B (priority 200), and insert NewThread before Thread D (priority 251), which is at a lower priority:
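
The walk described above can be sketched in a few lines of Python. This is only a toy model, not the kernel's actual data structure: the `Thread` class, its `next` link and the `insert_runnable` helper are names I made up for illustration, and I am assuming that a new thread queues behind any existing threads of the same priority.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Thread:
    name: str
    priority: int                     # 0 is the highest priority, 255 the lowest
    next: Optional["Thread"] = None   # hypothetical link to the next runnable thread


def insert_runnable(head: Optional[Thread], new: Thread) -> Thread:
    """Insert `new` into the priority-sorted list and return the (possibly new) head."""
    # A strictly higher priority (smaller number) than the head goes on top.
    if head is None or new.priority < head.priority:
        new.next = head
        return new
    # Otherwise walk past every thread of higher or equal priority.
    # Assumption: a thread queues behind existing threads of the same priority.
    cur = head
    while cur.next is not None and cur.next.priority <= new.priority:
        cur = cur.next
    new.next = cur.next
    cur.next = new
    return head


def names(head: Optional[Thread]) -> List[str]:
    """Flatten the list into thread names, top of the list first."""
    out = []
    while head is not None:
        out.append(head.name)
        head = head.next
    return out


# Build the list from the example, then schedule NewThread at priority 250.
run_list = None
for name, prio in [("Thread A", 0), ("Thread B", 200), ("Thread D", 251)]:
    run_list = insert_runnable(run_list, Thread(name, prio))

run_list = insert_runnable(run_list, Thread("NewThread", 250))
print(names(run_list))  # ['Thread A', 'Thread B', 'NewThread', 'Thread D']
```

Because the list stays sorted on every insert, picking the next thread to run never requires a search: it is always the head.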

      

The scheduler “pops” threads off the top of this list as they are run, and re-adds them if necessary. Notably, Threads E, F and G in the example above will not run until the other, higher priority threads have run and yielded to the system.
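
To see why the lower priority threads can wait indefinitely, here is a rough simulation of the pop and re-add cycle. It is a simplified model under my own assumptions: the `simulate` function, the `busy` set (threads that are ready again as soon as they have run) and the priorities given to Threads E, F and G are all hypothetical.

```python
import bisect
from itertools import count


def simulate(threads, busy, slots):
    """Pop the top thread `slots` times and return the names in the order they ran.

    threads: iterable of (name, priority) pairs, 0 being the highest priority
    busy:    names of threads that become runnable again right after running
    """
    seq = count()  # tie-breaker that keeps equal priorities in arrival order
    run_list = sorted((prio, next(seq), name) for name, prio in threads)
    ran = []
    for _ in range(slots):
        if not run_list:
            break
        prio, _seq, name = run_list.pop(0)   # take the thread "off the top"
        ran.append(name)
        if name in busy:
            # Re-add at the position its priority dictates.
            bisect.insort(run_list, (prio, next(seq), name))
    return ran


threads = [("Thread A", 0), ("Thread B", 200),
           ("Thread E", 251), ("Thread F", 252), ("Thread G", 253)]

print(simulate(threads, busy=set(), slots=6))
print(simulate(threads, busy={"Thread B"}, slots=6))
```

In the first run every thread gets its turn. In the second, Thread B lands back on top each time it is re-added, so Threads E, F and G never run.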

 

This information can be used to find hangs: look at which RunList threads are potentially blocking the (lower priority) UI threads from running. If higher priority threads are continually scheduled ahead of your UI threads, the UI will appear hung because its threads never get processor time.
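
One way to put this into practice is to break into the debugger several times, note the RunList each time, and count which threads keep showing up ahead of the UI. The sketch below assumes you have copied each snapshot by hand into a list of (name, priority) pairs. The `suspects` helper, the thread names and the UI priority of 251 are my assumptions for illustration, not something read from a real device.

```python
from collections import Counter


def suspects(snapshots, ui_priority):
    """Count how often each thread sits ahead of the UI across RunList snapshots.

    snapshots:   list of RunList dumps, each a list of (name, priority) pairs
    ui_priority: priority of the UI thread (a smaller number is a higher priority)
    """
    counts = Counter()
    for run_list in snapshots:
        for name, prio in run_list:
            if prio < ui_priority:   # strictly higher priority than the UI
                counts[name] += 1
    return counts.most_common()      # most frequent offenders first


# Three hypothetical snapshots, written down by hand from the debugger.
snapshots = [
    [("Thread A", 0), ("Thread B", 200), ("UI", 251)],
    [("Thread B", 200), ("UI", 251)],
    [("Thread B", 200), ("Thread C", 249), ("UI", 251)],
]

for name, hits in suspects(snapshots, ui_priority=251):
    print(name, hits)
```

A thread that appears ahead of the UI in every snapshot, like Thread B here, is the first place to look.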

 

Comments
  • For several weeks, we have been running hopper testing on our devices.
    We found that 90% of failed devices stopped at transcriber screen.
    The AKU we are using is wm520.

    Would you please give us some tips on transcriber? Can we exclude the transcriber application from hopper testing to isolate the root cause of the hopper failure (is it related to the application, or is it related to the BSP)?

    Thanks.
  • Hey Susan, A transcriber problem was investigated and I think was fixed in one of the AKU drops.
    Please check with your Technical Account Manager or your OEM-PM for more information about that.
  • This is one of the very good tips for hunting down the culprit for system hang.  But the blog only describes the scenario in concept or theory.  It does not say which field in the data structure points to the next thread to run so that a complete list can be rebuilt from raw data.  Do you think we can get this information?

    Thanks!
  • There is no image!

    Please update.

    Thank you for the posting..
