In the last few months of the Windows Server 2008 development, a good friend of mine was discussing a problem they have been seeing with customers. The problem, lovingly titled as the "Large Time Jump" issue involves a machine in the domain (usually the PDC) making a large jump in time, either forward or backwards. Regardless of the direction of the jump, the results are equally catastrophic.
How it all happens
Lets look at how this can happen. Some of the causes are more likely than you think. Here is a quick list:
Fixing the glitch
The real problem here is that in a domain environment, domain controller completely and utterly trust the time that they get from another DC. The root PDC is a special case, but it still counts.
The solution is to do a "sanity check" on the time that any domain controller gets from anywhere. In this way, you are ensuring that if a domain controller gets out of whack, it will not spread that time to other DCs. This is done by setting the MaxPosPhaseCorrection and MaxNegPhaseCorrection values.
MaxPosPhaseCorrection and MaxNegPhaseCorrection limit the allowable offset taken from a time sample. When any instance of w32time polls another machine for the time, it will determine the offset between the time source and itself. This value is known as the "Sample Offset". Before the samples is used by the time service, it will be compared to the phase correction limits. If the sample offset is greater than the phase correction limit, then sample will be thrown out and a "TOO BIG" event will be generated. The event contains all of the information about the time sample, including who sent it. The purpose of doing this is to isolate domain controllers in the network who get into a bad time state. In this way, the other DCs will log and error about the time samples being too big rather than blindly accepting it.
Knowing your limits
The next question is: What is an acceptable limit of phase correction? After much analysis and debate, we are advising a value of 48 hours. If a domain controller receives a sample that says it is more than 48 hours off, either in the future or in the past, the domain controller will throw it out. However, every customer should evaluate their own situation to be sure.
This is advised for both limits, both forward and backwards.
A packaged solution
Here is an example of a registry entry that you can merge on-demand to apply a 48 hour limit to the phase correction:
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\Config]"MaxNegPhaseCorrection"=dword:0002a300"MaxPosPhaseCorrection"=dword:0002a300
As usual, If you have specific thoughts or questions about this post, please feel free to leave a comment. For general questions about w32time, especially if you have problems with your w32time setup, I encourage you to ask them on Directory Services section of the Microsoft Technet forums.