The following article has been publiched about this subject: http://support.microsoft.com/kb/931279
RDTSC Drift vs Not Synchronized
To clarify - SQL Server 2005 SP2 contains both Drift and a Non Synchronized warnings.
The Non Synchronized message is only recorded on 1 hour intervals to avoid flooding the log. At most the server will report 1 entry/per scheduler/per hour.
To detect the Non Synchronized condition SQL Server uses the current RDTSC of the base scheduler and compares the RDTSC of subsequent schedulers.
Variance > ~3ms – Consider Non Syncronized Counter Warning
Variance > ~10% of mean average - Print Drift Counter Warning
The system should avoid either warning but the Drift is more concerning.
The recent PASS (PSS Labs) actually encountered these messages so I felt it prudent to publish some basic information about them. The following messages may appear in the SQL Server error log.
The time stamp counter of CPU on scheduler id 2 is not synchronized with other CPUs.
- or -
CPU time stamp frequency has changed from 191469 to 1794177 ticks per millisecond. The new frequency will be used.
I am in the process of publishing a detailed KB article - portions of that article provided here.
The Windows operating system exposes timing mechanisms using millisecond precision values. The precision is typically 10 to 15ms but can be as large as 55ms. SQL Server queries frequently complete within single digit millisecond or microsecond time spans, requiring a higher resolution timer. Therefore, prior SQL Server versions would report the duration of some queries as 0ms. This makes it difficult to track and tune SQL Server performance.
SQL Server 2005 improves accuracy by using the high-resolution CPU counter to provide micro-second timing capabilities. Occasionally, when SpeedStep, PowerNow, Cool-n-Quiet, various Power Schemes, or similar technologies are being used the reported timing values can be incorrect. The following is a not a comprehensive list of affected objects and features.
o Command Level Events (SQL:BatchCompleted, RPC:Completed, …)
o Statistics Generation Events (Auto Stats, …)
o Transaction Events
· set statistics time
· sys. dm_io_pending_io_requests
1. Disable Frequency Changes
Use the Control Panel | Power Options program. Setting the computer to the ‘Always On’ or ‘Max Performance’ power scheme generally forces the CPUs to remain at maximum frequency.
Note: It may be insufficient to disable CPU frequency variation features at the BIOS level. There are various third party utilities that can alter CPU frequencies. Some implementations enable frequency adjustments even when under maximum power scheme settings.
2. Enable SQL Server Processor Affinity
Drift-related problems can be reduced by preventing SQL worker threads from moving between CPUs. In SQL Server 2005 it is possible to dynamically change affinity without restarting the SQL Server service.
2. Synchronize Clock Frequencies and Counters (Drift)
Some systems require utilities and drivers from the hardware manufacturer to help keep frequency changes and CPU clock counters in lock step.
Contact your hardware manufacturer for complete details about clock frequency and counter optimizations.
Senior SQL Server Escalation Engineer
May 9, 2007 Update
Microsoft has recently received numerous reports of DRIFT warnings. We continue to investigate these warnings but at this time it appears the warning may be logged prematurely by SQL Server 2005 SP2.
What appears to be happening is that the message is based on the RDTSC counter of each scheduler and the latest value is retrieved from the sys.dm_os_ring_buffers (SCHEDULER_RING_BUFFER) record. If a scheduler is in an idle state (no immediate work to do) the calculation does not account for this situation and uses the last ring buffer entry for the scheduler as the current RDTSC. Since this entry can be > 3MS the message may be inadvertently logged.
Unless you have strong reason to believe the system is experiencing scheduling or performance problems the DRIFT message may be getting logged inadvertently on your system and can often be safely ignored until other performance behaviors are coorolated.
Microsoft Support uses a utility called RDTSCTest to help track down possible RDTSC variations. Even with this utility you have to be careful because there is no call on the system to capture the current state of every CPUs RDTSC at the same time. The capture of each CPUs RDTSC is done in a very tight loop but even that can show variations. We could start a thread on each CPU and have them wait on an event then signal it but we can’t control the operating system dispatch logic so even with all our attempts we can’t get exact RDTSC timing on all CPUs at the same time.
Repeated use of the utility is good to monitor RDTSC trending or exposing very large gaps but even that is compromised. On a dual core system if you use one of the cores the other one can throttle down and the RDTSCs will drift. When the system gets enough load the second code will pick up the load and often the RDTSC comes back in line. So attempting to spin on a set of CPUs to determine RDSTC is not full proof.
We are also working to remove SQL Server reliance on the RDTSC counter for future releases so these issues are no longer of consequence.
Senior SQL Server Escalation Engineer
May 10, 2007 Update
I have repeated this several times so I felt it worth posting for all.
Prior to SQL Server 2005 Microsoft SQL Server support encountered several issues that the duration was not granular enough. Older versions used GetTickCount() which is limited to a granularity around 12ms. It was possible to provide a system and queries running 1000s of times would report 0 duration and be ignored as a primary tuning opportunity.
To combat this SQL Server 2005 moved to the RDTSC counter to provide micro-second timing. The trace file format actually contains the micro-second values and the Profiler GUI converts these into milli-second values. The change was made early in the SQL Server 2005 development cycle (year 2000 time-frame) to locations that provide performance outputs such as Trace durations, statistics profile and others as outlined in the KB article 931279. The messages of DRIFT and OUT OF SYNC were added to SQL Server 2005 SP2 to provide warnings to customers that the performance data they are looking at might be compromised. Support has been burnt a few times looking at the duration column in profiler and it is not correct because the NT thread was moved between physical CPUs and the values are not near the same area as Stop Time – Start Time from the same event. These messages are first and foremost warnings to be cautions when looking at the performance output.
· The RDTSC counter (instruction) is a lightweight, 64 bit counter of the current CPU ticks. It is reported by the local CPU and ticks at the frequency of the CPU. For example a 1.5Ghz CPU ticks at 1.5Ghz * 1000 or 1500000 ticks per milli-second.
· GetTickCount is limited in its granularity.
· QueryPerformanceCounter is a more stable counter but very costly when called lots of times like SQL Server would do. It can also fall back on the RDTSC and encounter similar problems as the current implementation.
One problem support is seeing is multiple CPUs on the same machine don’t have a sync’ed RDTSC. Some of these are known manufacture issues but many times it is expected behavior. The biggest problem with determing this is that it is difficult to obtain the RDTSC values from all CPUs as the same time. You can create threads and affinitize on each CPU and then signal them all to run at one time but you are limited by the scheduling of the operating system dispatcher. There is no easy way to achieve the RDTSC in a synchronized way across all CPUs or CPUs in a group such as a NUMA node. Because of this utilities looking for drift often report some level of drift that would be considered noise.
SQL Server uses the SchedulerMonitor thread to monitor the health of the schedulers. There is a SchedulerMonitor per NUMA Node so it is possible that the RDTSC is significantly different on different nodes but SQL Server won’t report this (take a cable connected NUMA system as an example). SQL Server only reports DRIFT states within the same node.
SchedulerMonitor is just a standard NT thread so it is susceptible to operating system scheduling issues like all other threads. One way we can currently cause false DRIFT reports is to cause the SchedulerMonitor thread to context switch during the review of the scheduler states. The NT quantum is generally around 8ms on a server and a full quantum switch to another thread can cause a false DRIFT report.
Another false report we are looking at is the algorithm for DRIFT. The current design looks at the current and mean average of the last several SCHEDULER_MONITOR ring buffers and uses that to see if the DRIFT exceeds ~3ms. What I am looking closer at is system with lost of CPUs that don’t have core SQL processes on them (checkpoint, LW, lock monitor…..). These schedulers can go idle easier than others when the workers are not active because of load or waiting on a resource. If the SchedulerMonitor looks at the schedulers when we have an idle state it appears it may be possible to skew the mean in a way that reports a false DRIFT message.
The GetDate(), WaitForDelay and other such actions are not based on the RDTSC. They remain separate from the low level performance needs and use the system time. This allows them to pick up things like system wide data changes for time zone or daylight savings and the like.
It seems therortically plausible that a CPU change could lead to strange SQL Server performance problems. For example if you start on CPU 1 and the operating system later moves the thread to CPU 2 with a RDTSC that is behind you should get a longer quantum. From a purest standpoint I can’t deny this should happen. However, we have tested various TPCC benchmarks, the RDTSC has been in use since the year 2000, it is in RTM and SP1 and we have not yet tracked a single performance problem to the RDTSC counter usage for the SQL OS schedulers. What we are finding is that the worker generally has to perform I/O to disk, network or compete for a resource that forces the worker to give up its quantum. Microsoft has yet to uncover an issue that will lead to scheduling problems because RDTSC is used and based on extensive testing we don’t believe we will.
We have looked at some of the manufacture's utilities that help keep the RDTSC in sync. The problem with these are that they are not guaranteed to move the RDTSCs forward. Some will set the RDTSC values lower. We have a test that runs dbcc on pubs and shows a true elapsed time of 8 sec but a duration of 12 days. We also used this utility when running benchmarks and did not see any benchmark changes, another reason to believe scheduling is not negatively impacted.
We are making changes in the next version of SQL Server to use a more stable counter instead of RDTSC that allows 1ms timing granularity based on system close interrupts. This is being testing in the beta builds of the SQL Server version and if all tests well Microsoft is considering a backport of this change to a build of SQL Server 2005. This will remove the performance output problems of the RDTSC but still allow queries that run in less than ~12ms to report a value other than 0ms for the DBA.
Another reason for the change is to better handle issues like hot add CPU (new CPU added starts ticks at 0) and technologies like speed step, Cool N’ Quiet. Studies show allowing CPUs to step the Mhz down when not required to run at full Mhz levels can save as much as 30% energy in the data center as well as reduce the cooling costs reducing the need for full blown, raised for deployments.
Senior SQL Server Escalation Engineer