Based on several e-mail conversations, I've updated the retry code slightly and made a couple of improvements.  The code in the original post is updated and it's also included in this post for RSS purposes.  We haven't deployed this version to production yet, so I don't have any real-world results to report.

Changes
The first change I made was to fix my math in the sleep computation.
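For reference, here's the corrected computation in isolation. It squares the retry count, multiplies by the base sleep, and clamps to the min/max bounds; the constants match the ones in the full listing at the end of this post:

```csharp
// Quadratic backoff: sleep = retryCount^2 * sqlRetrySleep,
// clamped to [sqlMinSleep, sqlMaxSleep]
int sqlRetrySleep = 100;
int sqlMaxSleep = 5000;
int sqlMinSleep = 10;

// Sleeps only happen from the second retry on
for (int retryCount = 2; retryCount <= 3; retryCount++)
{
    int sleep = retryCount * retryCount * sqlRetrySleep;
    sleep = Math.Min(sqlMaxSleep, Math.Max(sqlMinSleep, sleep));
    Console.WriteLine("retry " + retryCount + ": sleep " + sleep + " ms");
}
```

With these constants, that works out to 400 ms before the second sleep-eligible retry and 900 ms before the third, well inside the 5000 ms cap.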

The second change I made was to add a SqlConnection.ClearPool(conn); call to the retry exception handler.  This should ensure that the other connections in the pool don't have to retry.  We tend to see our retries in "batches", so hopefully this one line of code fixes that.  You could use ClearAllPools() instead, but since we have a database per tenant, clearing the specific pool is a better choice for us.
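For anyone who hasn't used these APIs, both are static methods on SqlConnection, and the difference is just scope (a minimal sketch, assuming conn is an open or recently failed connection):

```csharp
// Flush only the pool this connection belongs to.  Pools are keyed
// by connection string, so with a database per tenant this only
// discards the failing tenant's pooled connections:
SqlConnection.ClearPool(conn);

// Flush every connection pool in the app domain -- a much bigger
// hammer, affecting all tenants:
SqlConnection.ClearAllPools();
```

In both cases, connections currently in use are marked for disposal and discarded when closed rather than being returned to the pool.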

The third change was to move the conn.Close() and SqlConnection.ClearPool(conn) calls outside of the if block so that they run on every failed attempt.

The fourth change was to add one final Sleep(sqlRetrySleep) call in the exception handler.  This and the ClearPool call give us our best chance to succeed when we return.

The last thing I did was to use Actipro CodeHighlighter to make the code in the blog post more readable.

Thanks to everyone for the feedback.  Here's the latest version of the code: 

public string GetSqlContextInfo(SqlConnection conn)
{
    string sqlContext = string.Empty;

    // Omitted the code to read these from configuration
    int sqlMaxRetries = 4;
    int sqlRetrySleep = 100;
    int sqlMaxSleep = 5000;
    int sqlMinSleep = 10;

    // start a timer
    TimeSpan ts;
    DateTime dt = DateTime.UtcNow;

    for (int retryCount = 1; retryCount <= sqlMaxRetries; retryCount++)
    {
        try
        {
            conn.Open();

            // get the SQL Context and validate the connection is still valid
            using (SqlCommand cmd = new SqlCommand("SELECT CONVERT(NVARCHAR(36), CONTEXT_INFO())", conn))
            {
                sqlContext = cmd.ExecuteScalar().ToString();
            }

            ts = DateTime.UtcNow - dt;

            // log opens that take too long
            if (ts.TotalMilliseconds >= 75)
            {
                SxpLog.WriteKnownEvent(8001, ts.TotalMilliseconds, "Connect", conn.DataSource, conn.Database, conn.WorkstationId, sqlContext);
            }

            break;
        }
        catch (SqlException ex)
        {
            conn.Close();

            if (retryCount == 1)
            {
                SqlConnection.ClearPool(conn);
            }

            if (retryCount < sqlMaxRetries)
            {
                SxpLog.WriteSqlRetry(5902, ex, conn.DataSource, conn.Database, conn.WorkstationId, "Connect", retryCount);

                // don't sleep on the first retry
                // Most SQL Azure retries work on the first retry with no sleep
                if (retryCount > 1)
                {
                    // wait longer between each retry
                    int sleep = retryCount * retryCount * sqlRetrySleep;

                    // limit to the min and max retry values
                    if (sleep > sqlMaxSleep)
                    {
                        sleep = sqlMaxSleep;
                    }
                    else if (sleep < sqlMinSleep)
                    {
                        sleep = sqlMinSleep;
                    }

                    // sleep
                    System.Threading.Thread.Sleep(sleep);
                }
            }
            else
            {
                // Log the exception
                SxpLog.WriteSqlException(ex, conn.DataSource, conn.Database, conn.WorkstationId, "Connect", retryCount, "Final");

                // we thought about rethrowing the exception, but chose not to
                // this will give us one more chance to execute the request - we might get lucky ...

                // sleep
                System.Threading.Thread.Sleep(sqlRetrySleep);

                conn.Open();
            }
        }
    }

    // we log this value and null might cause an issue
    if (sqlContext == null)
    {
        sqlContext = string.Empty;
    }

    return sqlContext;
}