Hi, I am Vivek Gupta, a software developer on the USB team. In this blog post, I am going to talk about why the USB selective suspend mechanism is needed and important, and how to implement it correctly in devices and drivers. I will start by discussing the concept of run-time power management in devices, then discuss the USB-specific mechanism of selective suspend, and finally cover how this mechanism is implemented in USB 3.0.
Before we talk about selective suspend, let’s understand a more generic concept: run-time power management.
One way of conserving power in a system is to send the whole system to a low-power state such as sleep or hibernate. Because this mechanism takes the whole system out of its working state, it is only possible when the whole system is not in use. Even when the system is in use and in working state, it is quite likely that certain components of the system are not active. Those components are said to be in an idle state. Run-time power management refers to sending idle components to a lower power state until they need to be used again. The components can be hardware such as the processor, memory, and so on; however, in this discussion, we are only interested in run-time power management of devices.
Devices and device drivers should aggressively pursue run-time power management of devices because the mechanism can lead to significant power savings. Because a driver stack includes more than one driver, coordination between drivers is required while sending the device to a lower power state and bringing it back to working state. Both Windows Driver Model (WDM) and Kernel-Mode Driver Framework (KMDF) provide mechanisms for this coordination. One driver in the device stack, typically the function driver, is the power policy owner. This power policy owner is responsible for detecting that the device is idle and initiating the process of transitioning the device into a lower power state. The power policy owner is also responsible for bringing the device back to working state (also referred to as waking up the device) when the user needs to use that device.
How would the power policy owner driver know about the user’s intent to use a device, so that the driver can wake up the device? That depends on the kind of device. Consider a storage device that is in a low-power state, and a user who needs to transfer a file from or to that device. When the user initiates the transfer (by using a certain application), the power policy owner gets an I/O request from the application and knows that it needs to wake up the device. However, if the device is a mouse, the device itself must send some sort of signal to initiate the wake-up process. This device-initiated power transition is known as the remote wake-up feature in the USB world. Because the mouse needs to generate such a resume signal, it cannot completely turn itself off. Typically, the ability of the device to generate the resume signal is programmable. Therefore, before putting the device to sleep, the driver must instruct the device to turn on the remote wake-up feature for the time the device is suspended. This process is called arming the device for remote wake-up.
Typically, the bus driver (driver that manages the PDO) in a device stack receives notification about the resume signal from the device and notifies the function driver about the arrival of the signal. In Windows, to get that notification, the function driver sends a wait wake IRP (IRP_MN_WAIT_WAKE) to the bus driver. The bus driver keeps the IRP pending during the time the device is in the low-power state. When the bus driver receives notification about the wake signal from the device, the bus driver notifies the function driver by completing the previously sent wait wake IRP.
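To make the pend-and-complete pattern concrete, here is a minimal sketch in Python. It is only a toy model of the behavior described above, not driver code: the class names, method names, and the "working"/"low-power" state strings are all made up for illustration, and a plain callback stands in for IRP completion.

```python
class BusDriver:
    """Toy bus driver that holds at most one pending wait-wake request."""

    def __init__(self):
        self.pending_wait_wake = None  # completion callback, or None

    def submit_wait_wake(self, on_complete):
        # Keep the request pending; it completes only when the device signals wake.
        if self.pending_wait_wake is not None:
            raise RuntimeError("a wait-wake request is already pending")
        self.pending_wait_wake = on_complete

    def on_resume_signal(self):
        # The device signalled resume: complete the pending request, if any.
        callback, self.pending_wait_wake = self.pending_wait_wake, None
        if callback is not None:
            callback()
        return callback is not None


class FunctionDriver:
    """Toy function driver acting as the power policy owner."""

    def __init__(self, bus):
        self.bus = bus
        self.power_state = "working"
        self.log = []

    def go_idle(self):
        # Ask to be notified about wake first, then drop to low power.
        self.bus.submit_wait_wake(self._wait_wake_completed)
        self.log.append("wait-wake pending")
        self.power_state = "low-power"

    def _wait_wake_completed(self):
        # Completion of the wait-wake request is the cue to power up again.
        self.log.append("wait-wake completed")
        self.power_state = "working"
```

In this model, calling go_idle() and then on_resume_signal() on the bus driver returns the function driver to the "working" state; a second resume signal finds nothing pending and is ignored.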
A device stack managing a parent device cannot transition to lower power state until all of its children are in low-power states. That is because a child device in working state typically requires its parent device to be in working state. For example, if a USB device is connected through a hub, the hub must be in working state for transfers to complete successfully.
After all of the child device stacks enter low-power states, the parent device stack can then go to a lower power state as well. Thus in a tree of devices, first the leaf nodes go to lower power, then their parent nodes go to lower power (assuming that they are idle), and so on until the root of the tree goes to lower power. Note that the wake-up mechanism described earlier is repeated at each level in this tree. Before any device can enter working state, all nodes between that device and the root must be in working state.
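The ordering rules for a device tree can be sketched in a few lines of Python. This is an illustrative model only; the Node class and its methods are hypothetical names, and each node is simply either in working state or not.

```python
class Node:
    """Toy device node: working or suspended, with a parent and children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.working = True
        if parent is not None:
            parent.children.append(self)

    def can_suspend(self):
        # A parent may leave working state only when no child is working.
        return all(not child.working for child in self.children)

    def suspend(self):
        if not self.can_suspend():
            raise RuntimeError(f"{self.name}: a child is still in working state")
        self.working = False

    def wake(self):
        """Wake suspended ancestors from the root down, then this node.

        Returns the names of the nodes that were woken, in order.
        """
        path = []
        node = self
        while node is not None:
            path.append(node)
            node = node.parent
        woken = []
        for node in reversed(path):  # root first, this node last
            if not node.working:
                node.working = True
                woken.append(node.name)
        return woken
```

With a controller, a hub, and a mouse and keyboard below the hub, the hub refuses to suspend until both leaves are suspended, and waking the mouse wakes the controller and hub first while the keyboard stays suspended.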
Now that we understand the concept of run-time power management, let us get into more details about USB devices. A USB 2.0 device is sent to suspend state by first putting the port to which the device is attached, into a suspend state. That is done by sending a control transfer to the hub if the device is attached through a hub, or by manipulating port registers if the device is connected directly to the root hub.
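For the hub case, the control transfer is a hub class SetPortFeature request with the PORT_SUSPEND feature selector. The sketch below builds the 8-byte setup packet in Python. The numeric values are the ones I recall from the hub class chapter of the USB 2.0 specification, so treat them as assumptions to verify against the specification; the function name is made up for illustration.

```python
import struct

# Values as recalled from the USB 2.0 hub class requests (verify before use).
HUB_PORT_REQUEST = 0x23  # host-to-device, class request, recipient "other" (a port)
SET_FEATURE = 0x03       # bRequest
PORT_SUSPEND = 0x02      # wValue: hub class feature selector


def port_suspend_setup_packet(port_number):
    """Build the setup packet that asks a hub to suspend one downstream port."""
    if not 1 <= port_number <= 255:
        raise ValueError("hub port numbers are 1-based and fit in one byte")
    # Setup packet layout: bmRequestType, bRequest, wValue, wIndex, wLength
    # (multi-byte fields are little-endian). wIndex carries the port number.
    return struct.pack("<BBHHH", HUB_PORT_REQUEST, SET_FEATURE,
                       PORT_SUSPEND, port_number, 0)
```

For example, port_suspend_setup_packet(3).hex() gives "2303020003000000". A driver would not normally build this packet by hand; the hub driver in the USB driver stack issues the request.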
After the port is suspended, the parent controller or the hub stops sending SOFs (Start of Frame packets) to the USB device. When a USB device does not receive SOFs for 3 ms, the device goes into suspend state. This mechanism is known as selective suspend. The term selective suspend refers to sending only a part of the USB tree to suspend state, as opposed to global suspend, which refers to sending the entire bus into suspend state by stopping SOFs at the controller level.
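The difference between selective and global suspend can be illustrated with a small Python model. It is a deliberate simplification: time is counted in whole milliseconds, ports are indexed from 0, and the Hub class and its method names are hypothetical. A device counts as suspended once 3 ms have passed since the last SOF it saw.

```python
SUSPEND_IDLE_MS = 3  # a device suspends after this long without SOFs


class Hub:
    """Toy hub (or root hub) that forwards SOFs to non-suspended ports."""

    def __init__(self, port_count):
        self.port_suspended = [False] * port_count
        self.last_sof_ms = [0] * port_count

    def send_sofs(self, now_ms):
        # SOFs reach only the ports that have not been suspended.
        for port, suspended in enumerate(self.port_suspended):
            if not suspended:
                self.last_sof_ms[port] = now_ms

    def suspend_port(self, port):
        # Selective suspend: stop SOFs on one port only.
        self.port_suspended[port] = True

    def suspend_all(self):
        # Rough stand-in for global suspend: stop SOFs everywhere.
        for port in range(len(self.port_suspended)):
            self.port_suspended[port] = True

    def device_suspended(self, port, now_ms):
        return now_ms - self.last_sof_ms[port] >= SUSPEND_IDLE_MS
```

If port 0 is suspended after the SOF at t = 4 ms, the device on port 0 is still awake at t = 6 ms and suspended at t = 7 ms, while the device on port 1 keeps receiving SOFs and stays awake.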
To understand how software and hardware work together in selective suspend, consider an example of a USB mouse attached to a USB host controller as shown in the following illustration.
Typically, multiple device stacks are involved in the management of a USB mouse. However, for the purposes of this discussion, we will collectively refer to all the drivers above the USB driver stack as the “mouse driver”. In the simplified illustration that shows the device stacks, the hub driver is the bus driver for the device and the mouse driver is the function driver. The mouse driver is also the power policy owner for the device. The arrow labeled “Power Policy Owner” in the preceding illustration points to the specific drivers that act as the power policy owner in the device stack for the USB HID device. The device has an interrupt endpoint that it uses to send data about various user-initiated events like button presses. The mouse driver typically keeps one or more interrupt transfers pending with the USB core stack to receive this data from the mouse as it arrives. Because the mouse driver is the power policy owner, the driver keeps track of the mouse usage and detects when the mouse is idle. The preceding diagram illustrates this setup.
After the mouse driver detects that the device is idle, the driver starts the process of sending the mouse to lower power. The mouse driver sends a wait wake IRP to the hub driver. The hub driver pends and stores away that IRP. The mouse driver then initiates transition to lower power state by requesting a set-power IRP (IRP_MN_SET_POWER) with D2 as the target state on its device stack. The D2 IRP first arrives on the mouse driver and as part of handling the IRP, the mouse driver cancels its pending interrupt transfers. Then, the hub driver receives the D2 IRP. Because the wait wake IRP is pending, the hub driver sends a control transfer to the device to arm the device for remote wake-up, and then suspends the port by manipulating certain port registers on the controller. The device transitions to a lower power state and only retains enough power to detect wake-up events and generate a resume signal on those events. The following diagram illustrates this process.
Note: If there is a hub in between the controller and mouse, the port-suspend operation is accomplished by sending a control transfer to that hub rather than performing register operations. Also, the hub driver and other drivers in the USB driver stack are coordinated to interact with the hardware. That interaction is not shown in the preceding illustration for brevity.
When a user wiggles the mouse, the mouse generates a particular resume signal on the wire upstream. The controller receives that resume signal and reflects the signal back downstream to the mouse. The controller then notifies the USB driver stack that the port has resumed by completing the interrupt transfer. The USB driver stack notifies the mouse driver about the wake-up by completing the wait wake IRP that the mouse driver had sent earlier. The mouse driver then sends a set-power IRP requesting D0 power state (D0 IRP) to its device stack to bring the mouse back to working state. The D0 IRP is first handled by the hub driver. Because the port has already resumed, the hub driver completes the IRP without any processing. Upon completion, the D0 IRP then reaches the mouse driver. In the completion routine, the mouse driver can send an interrupt transfer to get mouse activity and resume normal functioning. The following figure illustrates this process.
Note that the ability of the mouse to send the resume signal is only activated after the mouse is in suspend state. If the mouse is not in suspend state, the mouse driver keeps one or more interrupt transfers pending to learn about user events. Thus, software can always respond to user events on the mouse within a reasonable period of time.
If we look at the functionality implemented by the mouse driver in the preceding scenario, it has used the generic mechanisms consisting of D-IRPs and wait wake IRPs, provided by Windows to implement selective suspend. Certain USB client drivers cannot use those mechanisms and need to implement a more involved method that requires sending the IOCTL_INTERNAL_USB_SUBMIT_IDLE_NOTIFICATION I/O control request. Let us see why that is the case.
The USB specification allows a USB device to implement multiple functions that are active simultaneously. Such multi-function devices are also known as composite devices. For example, a composite device might contain a function for a keyboard and another function for a mouse. The Microsoft-provided USBCCGP driver typically loads as the function driver for composite devices and enumerates a PDO for each of the supported functions. Those individual PDOs are managed by their respective USB function drivers, like the mouse driver. The following illustration shows the device stacks for the example.
Now let us see what happens if the function drivers managing individual functions of the composite device in the preceding example tried to use the D-IRP mechanism to do run-time power management. Each of those function drivers will have interrupt transfers pending to their respective interrupt endpoints. Each of the functions can become idle independently. For instance, consider the scenario where the mouse function becomes idle while the keyboard is still being used. The mouse driver sends a wait wake IRP, cancels its interrupt transfer, and then sends a D2 IRP. Because the keyboard is still in the working state, the USBCCGP stack cannot go to a low-power state. In USB 2.0, the only way for any part of the device to go to a low-power state is for the entire device to enter selective suspend. The device cannot turn on its ability to send a resume signal because the device is not in suspend state. So, if a user event occurs on the mouse, there is no way for the mouse to notify the mouse driver, because there are no pending interrupt transfers with the mouse. In other words, the mouse is unusable until the keyboard also becomes idle, which could take indefinitely long!
Based on the preceding scenario, it becomes clear that the only time a composite device can enter suspend state is when all of the functions are idle. In addition, each function driver can initiate transition to lower power only when it knows that all other functions are in idle state. In order to accomplish this, Windows defines a two-step process called the idle IRP mechanism:
1. The function driver notifies the USB driver stack about the driver’s intent to go to a low power state by sending the IOCTL_INTERNAL_USB_SUBMIT_IDLE_NOTIFICATION I/O control request.
2. When all of the functions are ready to go to idle state, USB driver stack notifies the function drivers by invoking the idle callback routines implemented by the function driver. Only then can the function driver go to a low-power state.
In case of a composite device, USBCCGP implements the task of coordinating different function drivers and invoking their callbacks. Note that the Windows USB driver stack is designed in a way such that the function drivers do not need to distinguish between the cases where they are loaded on top of one of the functions of a multi-function device and the case where they are managing a single-function (non-composite) device. To achieve this goal, in case of a non-composite device, the hub driver invokes the idle callback instead of USBCCGP, which is not present.
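The coordination logic behind the idle IRP mechanism can be sketched as follows. This Python model is an illustration only: IdleCoordinator is a hypothetical name that stands in for USBCCGP (or for the hub driver in the non-composite case), a plain callable stands in for the idle callback routine, and the order in which callbacks run is an arbitrary choice of the sketch.

```python
class IdleCoordinator:
    """Toy stand-in for the driver that collects idle notifications."""

    def __init__(self, function_names):
        self.functions = set(function_names)
        self.pending = {}  # function name -> idle callback

    def submit_idle_notification(self, name, idle_callback):
        """Step 1: a function driver declares its intent to go idle.

        Returns True if this submission made every function idle, in which
        case all idle callbacks have been invoked (step 2).
        """
        if name not in self.functions:
            raise KeyError(name)
        self.pending[name] = idle_callback
        return self._invoke_if_all_idle()

    def _invoke_if_all_idle(self):
        if set(self.pending) != self.functions:
            return False  # at least one function is still in use
        for name in sorted(self.pending):
            self.pending[name]()  # only now may each driver power down
        return True
```

With two functions, the first submission returns False and no callback runs; the second submission triggers both callbacks. With a single function, the very first submission triggers the callback, which mirrors why the function driver does not need to know whether it sits on a composite device.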
A function driver need not implement the idle IRP mechanism and can use the D-IRP mechanism in the following cases:
Even though the D-IRP mechanism might be good enough in the above scenarios, a driver might still need to implement the idle IRP mechanism if the driver is intended to run on versions of the Windows operating system earlier than Windows Vista. For information about the combinations of OS versions and device types that require the idle IRP mechanism, see USB Selective Suspend.
Certain function drivers are written to work for both non-composite devices and composite devices, so those drivers end up implementing the idle IRP mechanism and using it regardless of whether they are working on single-function or multi-function devices.
The selective suspend feature conserves power significantly. Therefore it is imperative for USB devices and their function drivers to implement this feature. Any USB device that does not implement the selective suspend feature might prevent other components of the system from transitioning to lower power. For instance, a device that is attached through a hub prevents the hub and also the host controller from going to a low-power state. This could in turn prevent other system components, like the processor, from going to lower power. Thus, any device that does not implement the feature appropriately can lead to a significant power drain on the system depending on the system’s configuration. This is particularly important for USB 2.0 devices because the 2.0 protocol is very chatty. For example, a pending interrupt transfer requires constant polling between the host controller and the device. Even if the device does not have an interrupt endpoint, a hub (between the device and the controller) with an interrupt endpoint keeps the periodic scheduler in the controller active.
Implementing power management correctly in a WDM function driver is not a trivial task. In addition, implementing the idle IRP mechanism adds more complexity. I will not explain these implementation details, but I think it is worth mentioning that WDF greatly reduces the complexity of implementing power management and also takes care of implementing the idle IRP mechanism. I strongly recommend that you use WDF if you are planning to add selective suspend support in the driver. For information about run-time power management in KMDF, see Supporting Idle Power-Down. For implementing selective suspend, choose the appropriate option when calling the run-time power management method WdfDeviceAssignS0IdleSettings. For UMDF drivers, see Power Policy Ownership in UMDF. This blog entry outlines how to implement selective suspend when WinUSB is being used as the function driver.
In a rare scenario where there might really be a valid reason why a driver cannot be written in WDF, I recommend that you carefully read MSDN documentation, Selective Suspend in USB Drivers, and Increase System Power Efficiency with Idle Detection.
Because selective suspend occurs even when there is no user activity, it is very important that transitions to and from selective suspend be completely transparent to the user. Any deviation from such behavior could lead to an inconsistent end-user experience.
For example, if a device is implemented in a way that it drops off the bus in suspend state, the user can see unexpected delays and might hear system sounds due to PnP re-enumeration of the device. Hardware vendors must implement and extensively test selective suspend on devices and hubs.
A device that is in selective suspend should be able to send a resume signal on any user action that it would normally respond to. For example, if a mouse goes into selective suspend, it should be able to send a resume signal not only when the user clicks a button on it but also when the user moves the mouse. Note that the requirement to wake up on any event does not apply to system suspend. For system suspend, it might make sense for the device to wake up only on some special events.
Devices should also have enough buffers to cache the data associated with any user action that initiated the resume. Upon resume, the device driver can start interacting with the device and the device can communicate that data back to the driver. This ensures that the user does not have to repeat an action such as pressing keys of a keyboard.
Now let us see how USB 3.0 affects selective suspend. USB 3.0 defines link power management (LPM) with link states U1 through U3, in increasing order of power savings and of exit latency. For the U1 and U2 link states, after the software performs the initial setup, those states are entered and exited automatically at the hardware level, without further software intervention. The USB driver stack handles this setup, without the intervention of function drivers.
When the link state of a USB 3.0 port goes to U3, the attached device goes to selective suspend state. Let us look at this mechanism in more detail and compare it to the USB 2.0 selective suspend state, particularly from the point of view of drivers. Unlike USB 2.0, there are no SOFs that the hub stops sending. Instead, when the driver sends a particular link to U3 by sending a control transfer to the hub, the downstream device connected to that link goes directly into a suspend state. As in USB 2.0, a device is armed for remote wake-up by sending a control transfer; in USB 3.0, however, that transfer is directed to an interface in the device rather than to the device as a whole (described later in this post). When a device that is in suspend state and armed for remote wake-up detects a user-directed event, the device sends a resume signal (U3 wake-up signal). The hub responds with the same signal back to the device. However, unlike USB 2.0, the hub does not send a port status change interrupt event for the resume event. Instead, the device sends a FUNCTION_WAKE device notification that informs the host about the resume event.
The USB 3.0 specification also defines the function suspend feature, which is related to the selective suspend feature from the point of view of the function drivers. In a USB 3.0 composite device, a function can be suspended independent of whether other functions in the device are in working or idle state. So when a function driver requests a low-power transition, the corresponding function must be sent to suspend state. When all the functions are in suspend state, the device can be sent to suspend state by sending the upstream link to U3. Not only can the individual functions be suspended, they can also be armed for remote wake-up independently. As a result, a function can be suspended and woken up even if other functions are not in a suspended state, and the idle IRP mechanism that was described previously is not needed for USB 3.0 composite devices. When a function wakes up, it sends a wake-up notification packet that tells the host which device, and exactly which function, woke up. Using that information, the USB driver stack can complete the wait wake IRP for the function that woke up and avoid the need to wake other functions that are in suspend state.
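Here is a small Python model of function suspend, to contrast it with the USB 2.0 composite case. It is a sketch under simplifying assumptions: the class and method names are made up, and "U0" is used as the name of the active link state, which the text above does not mention.

```python
class Usb3CompositeDevice:
    """Toy USB 3.0 composite device with independently suspendable functions."""

    def __init__(self, function_names):
        self.suspended = {name: False for name in function_names}
        self.remote_wake_armed = {name: False for name in function_names}
        self.link_state = "U0"  # active link state (assumed name for "working")

    def suspend_function(self, name, arm_remote_wake=False):
        # A function can be suspended regardless of what the others are doing.
        self.suspended[name] = True
        self.remote_wake_armed[name] = arm_remote_wake
        if all(self.suspended.values()):
            self.link_state = "U3"  # every function idle: suspend the device

    def function_wake(self, name):
        """Model a wake event on one function.

        Returns the functions whose wait-wake request should be completed.
        """
        if not (self.suspended[name] and self.remote_wake_armed[name]):
            return []
        self.link_state = "U0"
        self.suspended[name] = False
        return [name]  # other suspended functions are left alone
```

In this model, suspending only the mouse leaves the link in U0 and the mouse can still wake on its own. Once the keyboard is suspended too, the link goes to U3, and a later mouse wake brings the link back while the keyboard function stays suspended.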
A function is armed for remote wake-up and suspended by sending a control transfer to the first interface of the function. Non-composite devices are considered a special case of composite devices. To arm a non-composite device for wake-up, the function driver must send a control transfer to the first interface in the selected configuration. Note that arming the device as a whole for remote wake-up, as is done in USB 2.0, does not apply here. One consequence of this design is that, depending on whether a device is non-composite or composite, different drivers in the USB driver stack are responsible for arming the device for remote wake-up. If a composite device uses a customized driver instead of USBCCGP, the custom driver must inform the USB driver stack that the driver will handle the arming process.
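The interface-directed control transfer is a SET_FEATURE request with the FUNCTION_SUSPEND feature selector. The Python sketch below builds that setup packet and, for contrast, the device-directed DEVICE_REMOTE_WAKEUP request used in USB 2.0. The constants are the values I recall from the standard device requests chapters of the USB 2.0 and USB 3.0 specifications, so please treat them as assumptions and check them against the specifications; the function names are made up for illustration.

```python
import struct

# Values as recalled from the USB 2.0 / USB 3.0 specifications (verify before use).
SET_FEATURE = 0x03
RECIPIENT_DEVICE = 0x00          # bmRequestType: host-to-device, standard, device
RECIPIENT_INTERFACE = 0x01       # bmRequestType: host-to-device, standard, interface
DEVICE_REMOTE_WAKEUP = 0x01      # USB 2.0 feature selector (device recipient)
FUNCTION_SUSPEND = 0x00          # USB 3.0 feature selector (interface recipient)
OPT_LOW_POWER_SUSPEND = 0x01     # suspend option bit 0
OPT_REMOTE_WAKE_ENABLE = 0x02    # suspend option bit 1


def usb3_function_suspend_packet(first_interface, suspend=True, remote_wake=True):
    """Setup packet that suspends and/or arms one function of a USB 3.0 device."""
    if not 0 <= first_interface <= 255:
        raise ValueError("interface number must fit in one byte")
    options = 0
    if suspend:
        options |= OPT_LOW_POWER_SUSPEND
    if remote_wake:
        options |= OPT_REMOTE_WAKE_ENABLE
    # wIndex: low byte = first interface of the function, high byte = options.
    w_index = (options << 8) | first_interface
    return struct.pack("<BBHHH", RECIPIENT_INTERFACE, SET_FEATURE,
                       FUNCTION_SUSPEND, w_index, 0)


def usb2_remote_wakeup_packet():
    """Setup packet that arms a whole USB 2.0 device for remote wake-up."""
    return struct.pack("<BBHHH", RECIPIENT_DEVICE, SET_FEATURE,
                       DEVICE_REMOTE_WAKEUP, 0, 0)
```

For a function whose first interface is 2, usb3_function_suspend_packet(2).hex() gives "0103000002030000", whereas the USB 2.0 request is "0003010000000000". Function drivers normally do not send these requests themselves; the USB driver stack does so on their behalf.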
A USB driver stack for supporting USB 3.0 devices should ideally be designed in a way that few changes, if any, are needed in the function drivers to work with 3.0 devices. That is true for power management as well; differences between USB 2.0 and 3.0 should be absorbed by the USB driver stack. Existing function drivers written for USB 2.0 devices should work as is on USB 3.0 devices, provided of course that the class protocol remains unchanged. As I pointed out earlier, USB 3.0 devices do not require any coordination between functions for power management, and hence the function drivers managing them do not need to implement the idle IRP mechanism. Of course, a lot of function drivers will be written generically to work on both USB 2.0 and 3.0 devices, so they will end up implementing the idle IRP mechanism anyway, particularly because USB 3.0 devices are required to work as 2.1 devices when there is no SuperSpeed support in the host.
In this blog, I mainly focused on selective suspend. I briefly covered LPM states U1 and U2 but did not go into details. LPM is a powerful mechanism for saving power but brings some new challenges. I will discuss that in my next blog post.