I was asked whether Flow Control, which existed in Database Mirroring, is still relevant for Availability Groups.

Flow Control is primarily a mechanism to gate or throttle messages to avoid excessive resource use on the primary or secondary. While we are in “Flow Control” mode, sending of log block messages from the Primary to the Secondary is paused until flow control mode ends.

 

A Flow Control gate or threshold exists at 2 places:

- Availability Group Replica/Transport – 8192 Messages

- Availability Group Replica Database – 112*16 = 1792 Messages per database, subject to the 8192 total limit at the Transport or Replica level

 

When a log block is captured, every message sent over the wire has a Sequence Number, which is a monotonically increasing number. Each packet also includes an acknowledgement number, which is the sequence number of the last message received/processed at the other side of the connection. With these two numbers, the number of outstanding messages can be calculated to see whether a large number of messages remain unprocessed. The message sequence number also ensures that messages are sent in sequence. If messages arrive out of sequence, the session is torn down and re-established.
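The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration of the idea, not SQL Server's internal code: the `Sender` and `Receiver` classes and their method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Sender:
    """Tracks sequence numbers on the sending side of one connection (hypothetical)."""
    last_sent_seq: int = 0
    last_acked_seq: int = 0

    def next_sequence(self) -> int:
        # Sequence numbers increase monotonically, one per message.
        self.last_sent_seq += 1
        return self.last_sent_seq

    def on_ack(self, ack_seq: int) -> None:
        # Each incoming packet carries the last sequence number the peer processed.
        self.last_acked_seq = max(self.last_acked_seq, ack_seq)

    def outstanding(self) -> int:
        # Messages sent but not yet acknowledged.
        return self.last_sent_seq - self.last_acked_seq


@dataclass
class Receiver:
    """Checks that messages arrive strictly in sequence (hypothetical)."""
    expected_seq: int = 1

    def accept(self, seq: int) -> bool:
        if seq != self.expected_seq:
            # Out of sequence: the real session would be torn down and re-established.
            return False
        self.expected_seq += 1
        return True


sender = Sender()
for _ in range(5):
    sender.next_sequence()
sender.on_ack(3)
print(sender.outstanding())  # 2
```

Five messages sent with the third acknowledged leaves two outstanding, which is the delta that flow control compares against its thresholds.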

 

From an Availability Replica perspective, either the Primary or the Secondary replica can signal that we are in Flow control mode.

 

On the Primary, when we send a message, we check the number of unacknowledged messages that we have sent, which is the delta between the Sequence Number of the message sent and that of the last acknowledged message. If that delta exceeds a pre-defined threshold value, that replica or database is in flow control mode, which means that no further messages are sent to the secondary until flow control mode is reset. This gives the secondary some time to process and acknowledge the messages and allows whatever resource pressure exists on the secondary to clear up.
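As a rough sketch of that check, the Python below applies the two thresholds listed earlier (8192 at the transport/replica level, 112*16 = 1792 per database). The function names, and the assumption that the database and the replica each track their own sent and acknowledged sequence numbers, are illustrative and not taken from the product.

```python
TRANSPORT_THRESHOLD = 8192      # replica/transport level limit from the post
DATABASE_THRESHOLD = 112 * 16   # 1792 messages per database


def unacknowledged(last_sent_seq: int, last_acked_seq: int) -> int:
    # Delta between the last message sent and the last one acknowledged.
    return last_sent_seq - last_acked_seq


def database_in_flow_control(db_sent: int, db_acked: int) -> bool:
    return unacknowledged(db_sent, db_acked) > DATABASE_THRESHOLD


def transport_in_flow_control(replica_sent: int, replica_acked: int) -> bool:
    return unacknowledged(replica_sent, replica_acked) > TRANSPORT_THRESHOLD


def may_send(db_sent: int, db_acked: int,
             replica_sent: int, replica_acked: int) -> bool:
    # Assumption: a message is held back if either gate is closed.
    return not (database_in_flow_control(db_sent, db_acked)
                or transport_in_flow_control(replica_sent, replica_acked))


print(DATABASE_THRESHOLD)                 # 1792
print(may_send(2000, 1000, 5000, 4000))   # True: deltas of 1000 are under both limits
print(may_send(3000, 1000, 5000, 4000))   # False: database delta of 2000 exceeds 1792
```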

 

On the Secondary, when we reach a low threshold of log caches or when we detect memory pressure, the secondary passes a message to the primary indicating it is low on resources. When a SECONDARY_FLOW_CONTROL message is sent to the primary, a bit is set on the primary for the database in question, indicating it is in Flow control mode. The primary then skips this database when doing its round-robin scan of databases to send data.
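A minimal sketch of that per-database bit and the round-robin skip might look like the following. The `DatabaseState` type and both function names are hypothetical; only the behavior (set a bit on a SECONDARY_FLOW_CONTROL message, skip flagged databases during the scan) comes from the description above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DatabaseState:
    name: str
    flow_controlled: bool = False  # the per-database bit kept on the primary


def on_secondary_flow_control(db: DatabaseState, enable: bool) -> None:
    # Called when the secondary reports (or clears) resource pressure for this database.
    db.flow_controlled = enable


def databases_to_send(databases: List[DatabaseState]) -> List[str]:
    # One round-robin pass: flow-controlled databases are skipped.
    return [db.name for db in databases if not db.flow_controlled]


dbs = [DatabaseState("Sales"), DatabaseState("Orders"), DatabaseState("Audit")]
on_secondary_flow_control(dbs[1], True)
print(databases_to_send(dbs))  # ['Sales', 'Audit']
```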

 

Once we are in “flow control” mode, we do not send messages to the secondary until that mode is reset. Instead, we check every 1000ms for a change in flow control state. On the secondary, for example, if the log caches are flushed and additional buffers are available, the secondary sends a flow control disable message indicating that it no longer needs to be flow controlled. Once the primary gets this message, the bit is cleared and messages again flow for the database in question. On the Transport or Replica side, on the other hand, flow control is reset once the number of unacknowledged messages falls below the gated threshold.
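The 1000ms re-check can be pictured as a simple polling loop. The sketch below is an assumption about the shape of that loop, not the engine's actual implementation; `wait_for_flow_control_reset` and its parameters are hypothetical, and the sleep function is injectable so the loop can be exercised without real delays.

```python
import time
from typing import Callable

POLL_INTERVAL_SECONDS = 1.0  # the 1000ms re-check interval from the post


def wait_for_flow_control_reset(is_flow_controlled: Callable[[], bool],
                                sleep: Callable[[float], None] = time.sleep) -> int:
    """Block until flow control clears; return how many times we had to wait."""
    polls = 0
    while is_flow_controlled():
        # Nothing is sent while gated; just wait and look again.
        sleep(POLL_INTERVAL_SECONDS)
        polls += 1
    return polls


# Simulated state: gated for two checks, then cleared by the secondary.
states = iter([True, True, False])
waited = []
polls = wait_for_flow_control_reset(lambda: next(states), sleep=waited.append)
print(polls)   # 2
print(waited)  # [1.0, 1.0]
```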

Perfmon counters and wait types show how much time we spend in flow control mode.

 

Wait Types:

http://msdn.microsoft.com/en-us/library/ms179984.aspx

 

HADR_DATABASE_FLOW_CONTROL

Waiting for messages to be sent to the partner when the maximum number of queued messages has been reached. Indicates that the log scans are running faster than the network sends. This is an issue only if network sends are slower than expected.

HADR_TRANSPORT_FLOW_CONTROL

Waiting when the number of outstanding unacknowledged AlwaysOn messages is over the out flow control threshold. This is on an availability replica-to-replica basis (not on a database-to-database basis).

 

Perfmon counters:

http://msdn.microsoft.com/en-us/library/ff878472.aspx

 

Flow Control Time (ms/sec)

Time in milliseconds that log stream messages waited for send flow control, in the last second.

Flow Control/sec

Number of times flow-control initiated in the last second. Flow Control Time (ms/sec) divided by Flow Control/sec is the average time per wait.
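As a worked example of that division, the snippet below computes the average wait from the two counters. The function name and the sample counter values are made up for illustration; they are not measurements.

```python
def average_flow_control_wait_ms(flow_control_time_ms_per_sec: float,
                                 flow_controls_per_sec: float) -> float:
    # Flow Control Time (ms/sec) divided by Flow Control/sec.
    if flow_controls_per_sec == 0:
        return 0.0  # no flow control events in the interval
    return flow_control_time_ms_per_sec / flow_controls_per_sec


# Hypothetical sample: 300 ms of waiting spread over 6 flow control events in one second.
print(average_flow_control_wait_ms(300, 6))  # 50.0
```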

 

Extended Events

There are 2 Extended Events which give us the relevant information while we are in Flow control mode – note that they are under the Debug channel.

The action is basically a “set=0” or “cleared=1” bit.


 

Denzil Ribeiro – Sr Premier Field Engineer