Welcome to MSDN Blogs Sign in | Join | Help

SQL Protocols

Topics from the Microsoft SQL Server Protocols team - Netlibs, TDS, SQL Browser, etc...
Understanding the error “An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full.”

This error message, which is associated with the Winsock error WSAENOBUFS and actually comes from Windows rather than SQL Server directly, has two well-understood root causes. However, it still stumps people today, probably because  there is no single source which explains both root causes and offers solutions for both. Here is a description of the two common situations where you may see this error and quick solutions for each:

OS runs out of memory for TCP buffers

When a powerful client machine, especially one with lots of RAM, is running an x86 version of Windows, people use the /PAE switch in the c:\boot.ini file to allow applications on that machine to be able to address the full range of memory. One other switch often used to give more memory to applications is the /3gb switch in the boot.ini file. The problem comes when these two are combined: the /3gb switch gives more memory to applications by reducing the amount of memory available to the OS. When it is used on a powerful machine where the applications require many OS resources, such as by opening many TCP connections, this can cause the OS to run out of memory for resources like TCP buffers. When that happens, Winsock throws the error WSAENOBUFS.

Solution: Remove the /3gb switch from c:\boot.ini. The root problem in this case is memory pressure on the OS, so removing the /3gb switch will give more memory to the OS and will alleviate this problem.

OS runs out of available TCP “ephemeral” ports

When the client machine is opening many TCP connections and is running Windows Server 2003, Windows XP, or any earlier version of Windows, it may run out of TCP “ephemeral” ports. In Windows Server 2003, Windows XP, and earlier versions, Windows limits the number of available ephemeral ports to approximately 5000 across the machine. It is especially common to hit this problem for applications which do not use connection pooling.

Solution: To make more ephemeral ports available, follow the directions in this KB which describe how to create the MaxUserPort  registry key: http://support.microsoft.com/kb/196271

 

Dan Benediktson
SQL Server Protocols
Disclaimer: This posting is provided "AS IS" with no warranties, and confers no rights

TCP listening ports and ENDPOINT ports behavior

Endpoint configuration seems to be creating some confusion.  I hope that this will clarify some of the behavior as it applies to TCP endpoints.

With a default configuration of SQL Server, try the following:

> select name, port, is_dynamic_port, state from sys.tcp_endpoints
> go
name                        port is_dynamic_port state
--------------------------- ---- --------------- -----
Dedicated Admin Connection     0               1     0
TSQL Default TCP               0               1     0

At first, it may seem that the “port” value in this table is incorrect.  The server is listening on port 1433, not port 0.  In addition, if your instance is configured to listen on different or multiple ports, they do not appear here.  Confused?  I was too.  (Also, note that state 0 means "started" and state 1 means "stopped").

The behavior you are seeing is, in fact, by design. A zero for a port value is effectively a wildcard, matching any endpoint of that type that doesn't have a more specific rule.  The endpoint configuration does not specify what ports clients can use to connect to the server.  Rather, it specifies the rules to apply to a connection when it is made.

So, the default TCP port or any other TCP port with no specific endpoint defined will use this default rule. Here is an example of how this works:

First, issue "alter endpoint [TSQL Default TCP] state = stopped" and see that we can no longer connect via TCP to the server:

> select name, port, is_dynamic_port, state from sys.tcp_endpoints
> go
name                        port is_dynamic_port state
--------------------------- ---- --------------- -----
Dedicated Admin Connection     0               1     0
TSQL Default TCP               0               1     1

For now, re-enable that port using "alter endpoint [TSQL Default TCP] state = started" and we can connect again using TCP on the default port.

If you now use SQLServer Configuration Manager to add a new port for SQLServer to listen on, say, port 6060 and restart, you can now connect using TCP on port 1433 or port 6060.

Then issue "alter endpoint [TSQL Default TCP] state = stopped" and we can no longer connect on either port.

Then issue "
create endpoint [MyEndpoint] state = stopped as tcp (listener_port = 6060) FOR tsql ()" and we still can't connect on either port, because they are both still "stopped":

> select name, port, is_dynamic_port, state from sys.tcp_endpoints
> go
name                        port is_dynamic_port state
--------------------------- ---- --------------- -----
Dedicated Admin Connection     0               1     0
TSQL Default TCP               0               1     1
MyEndpoint                  6060               0     1


Now issue "
alter endpoint [TSQL Default TCP] state = started" and we can connect on 1433 but not on 6060.  This is because there is a specific rule for port 6060 which is applied.  On port 1433, there is no specific rule, so the default rules apply and the connection is allowed.

If we now issue "
alter endpoint [MyEndpoint] state = started" and we can connect via TCP on both port 1433 and port 6060.

Finally, if we issue "
alter endpoint [TSQL Default TCP] state = stopped" and we can connect via TCP on port 6060 but not on port 1433.

David Benoit
SQL Server Protocols
Disclaimer: This posting is provided "AS IS" with no warranties, and confers no rights

Understanding “Data Source=(local)” in SQL Server Connection Strings

Lately we have noticed many misunderstandings surrounding the usage of the Data Source keyword in connection strings, caused by people generalizing from an example demonstrating a connection string for local connectivity and creating their own connection string for a remote connection.  Here is one such example connection string for local connectivity as it would be used in VB.Net:

Public con As New System.Data.SqlClient.SqlConnection("Integrated Security=SSPI;Initial Catalog=TestDatabase;Data Source=(local);")

This connection string's options can be dissected as:

-          Integrated Security=SSPI; – This means we want to connect using Windows authentication

-          Initial Catalog=TestDatabase; – This means the database we want to first connect to is named “TestDatabase”.

-          Data Source=(local); – This means that we want to connect to the SQL Server instance located on the local machine.

The confusion occurs with this last option, since people think that “local” is a keyword referring to the local machine account, when in fact the special keyword is "(local)", including the parentheses.  As a result, if they want to use a remote connection, and if their server’s name is, for example, TestServer with IP address 10.1.1.10, they try to use the connection string option: “Data Source=(TestServer)” or “Data Source=(10.1.1.10)”.  Since the special keyword here is the whole word “(local)”, the correct connection string option would be: “Data Source=TestServer” or “Data Source=10.1.1.10”.  So, as a whole line of code, this would now read:

Public con As New System.Data.SqlClient.SqlConnection("Integrated Security=SSPI;Initial Catalog=TestDatabase;Data Source=TestServer")

or:

Public con As New System.Data.SqlClient.SqlConnection("Integrated Security=SSPI;Initial Catalog=TestDatabase;Data Source=10.1.1.10")

Dan Benediktson
SQL Server Protocols
Disclaimer: This posting is provided "AS IS" with no warranties, and confers no rights

Configuring SQL Protocols through Windows PowerShell

                Sometimes we are asked about the possibility of configuring SQL Server protocols through PowerShell.  In SQL Server 2008, the sqlps tool incorporates WMI and SMO into this powerful Windows administrator tool, making it easy to manage SQL Server protocols through PowerShell.

                To get started, run (elevated, if on Windows Vista or Windows Server 2008) sqlps.exe, which by default is located at the %ProgramFiles%\Microsoft SQL Server\100\Tools\binn\sqlps.exe; or, if your architecture is x64, it is in the same path as above, under your Program Files (x86) directory.

                Now that you have an Admin SQL PowerShell window, here are some example scripts that you can run to configure your SQL Server instance:

Get the TCP, NP, and SM objects

This script is the starting point for manipulating the protocols properties on a local default instance.  To modify this for a named instance, replace “MSSQLSERVER” with the name of your instance.

$MachineObject = new-object ('Microsoft.SqlServer.Management.Smo.WMI.ManagedComputer') .

$ProtocolUri = "ManagedComputer[@Name='" + (get-item env:\computername).Value + "']/ServerInstance[@Name='MSSQLSERVER']/ServerProtocol"

 

$tcp = $MachineObject.getsmoobject($ProtocolUri + "[@Name='Tcp']")

$np = $MachineObject.getsmoobject($ProtocolUri + "[@Name='Np']")

$sm = $MachineObject.getsmoobject($ProtocolUri + "[@Name='Sm']")

 

Enable remote connection protocols

Once you have the base protocol objects, enabling remote connections is trivial:

$np.IsEnabled = $true

$np.alter()

$tcp.IsEnabled = $true

$tcp.alter()

Calling the .alter() method commits changes you make to the registry, and you will need to restart the SQL Server instance for it to pick up these changes.

More elaborate example: Modifying an instance’s TCP Port

Once you have the TCP object, you can view the properties of the TCP Ports on the various IP Addresses your SQL Server instance is listening on.  For instance, this command will show the properties of the “IPAll” IP Address:

$MachineObject.getsmoobject($tcp.urn.value + "/IPAddress[@Name='IPAll']")

The following commands will make your server listen on the TCP port 3344, by modifying the TcpPort property of the IPAll entry and then committing those changes:

$MachineObject.getsmoobject($tcp.urn.Value + "/IPAddress[@Name='IPAll']").IPAddressProperties[1].Value = "3344"

$tcp.alter()

You can now verify in the SQL Server Configuration Manager that your IPAll setting is now set to listen on TCP Port 3344, and restarting the SQL Server service will result in it now listening on the newly-specified port.

Dan Benediktson
SQL Server Protocols
Disclaimer: This posting is provided "AS IS" with no warranties, and confers no rights

Connection failure because of mismatched TDS version

Some customers have reported connection failure because of wrong TDS version.

 

The client application gets the following error message: (or similar depends on the protocol used)

A connection was successfully established with the server, but then an error occurred during the login process. (provider: Shared Memory Provider, error: 0 - No process is on the other end of the pipe.) (Microsoft SQL Server, Error: 233)

 

 

When you check the errorlog of your SQL Server, you see the following error message:

 

The Tabular Data Stream (TDS) version 0x730a0003 of the client library used to open the connection is unsupported or unknown. The connection has been closed. [CLIENT: x.x.x.x]

Error: 17802, Severity: 20, State: 1.

 

Depends on your client version, 0x730a0003 could be 0x73080003 or 0x73090003 as well.

 

The failure only happens when connecting to SQL Server 2008 CTP releases. Usually the client involved uses SQLClient which is part of .Net Framework. The reason here is that we don't supported connection between intermediate TDS version. SNAC usually ships with SQL Server and they are managed to be in the same TDS level. If you download SNAC CTP separately or make connection across machines with different level of SNAC, you may also see similar error. As a background, for 0x730a0003 TDS version, 73 represents TDS major version 7.3, 0x03 is the minor version for TDS7.3, 0xa is the intermediate TDS version number.

 

We have shipped the following intermediate TDS version for SQL Server 2008.

 

·         08 first 7.3 TDS version in Katmai  (Shipped with Katmai CTP3 --Jun 2007--, Orcas Beta1)

·         09 TVP (Table Value Parameter) and NewDateTime (Shipped with Katmai CTP4 --build #1049, July 2007--, Orcas Beta2)

·         0A Large UDT (Shipped with Katmai CTP5 --build #1185, Nov 2007--, Orcas RTM, Vista SP1)

·         0B SparseColumn (Shipped with Katmai CTP6 --build #1321 Feb 2008-- )

 

All .Net Framework 3.5 released after Orcas RTM is on 0x0A TDS level. All SQL Server 2008 release after CTP6 (including RC0 and RTM) is on 0x0B TDS level. Connection can only be made between same TDS intermediate version, with one exception:

0B level of SQL Server can also accept 0A level of client. Thus, Orcas RTM can connect to SQL Server 2008 RTM without problem.

 

With the mechanism in place, it's fairly easy to tell which client can connect to which server. e.g. connection from Orcas RTM to Katmai CTP4 server would fail.

 

Orcas RTM contains .Net Framework 3.5. Some words worth to be quoted here: "As with .NET Framework 3.0, version 3.5 uses the CLR of version 2.0. In addition, it installs .NET Framework 2.0 SP1 and .Net Framework 3.0 SP1."

 

As a final note, you should not see the failure when there is no CTP server or beta version of client involved. Connection between RTM of client and server is always guaranteed.

 

 

Xinwei Hong, SQL Server Protocols
Disclaimer: This posting is provided "AS IS" with no warranties, and confers no rights

Troubleshooting SQL Server Cluster Installations

On a Windows Cluster it is possible to install a SQL Server Cluster, which will be a separate entity with disctinct Virtual Server Name, IP address, and pipe name. The nodes of the SQL Cluster must be able to communicate with one another. This communication may fail if the TCP port on which SQL Server is listening is not excepted from firewalls on the machines. This port can be found by using the Configuration Manager utility by viewing the properties of the TCP protocol or by checking the SQL Server errorlog, typically located at "Program Files\Microsoft SQL Server\MSSQL.x\MSSQL\LOG". Here MSSQL.x is the instance ID of the clustered instance in question.

If the Cluster instance being installed is a named instance, a service called SQL Browser is used in order to figure out the port or pipe SQL Server is listening on. For SQL Server Cluster installations, this service is running by default. If this service is not running, it will not be possible to communicate with the remote nodes. To manipulate this service, open services.msc, find SQL Browser, and make sure its state is Automatic and Running. SQL Browser listens on a UDP port, which must also be excepted from firewalls on the machines. Even if SQL Browser is running, in case any of the nodes has firewall ON, it is necessary to add exceptions for UDP port 1434.

These issues with SQL Browser (service not running or TCP and UDP port not excepted from firewall) can also block remote connectivity to non-clustered instances.

More about Windows Firewall impact on APIs for SQL Server enumeration can be read at http://blogs.msdn.com/sql_protocols/archive/2005/09/24/473502.aspx. More about Firewall impact on SQL Server Cluster can be found at http://blogs.msdn.com/sql_protocols/archive/2006/02/27/unable-to-connect-to-a-sql-server-named-instance-on-a-cluster.aspx.

Stoyko Kostov, SQL Server Protocols
Disclaimer: This posting is provided "AS IS" with no warranties, and confers no rights

Connectivity troubleshooting in SQL Server 2008 with the Connectivity Ring Buffer

SQL Server 2008 contains a new feature designed to help troubleshoot particularly tricky connectivity problems. This new feature, the Connectivity Ring Buffer, captures a record for each server-initiated connection closure, including each kill of a session or Login Failed event. To facilitate efficient troubleshooting, the ring buffer attempts to provide information to correlate between the client failure and the server’s closing action. Up to 1k records in the ring buffer are persisted for as long as the server is online, and after 1000 records, the buffer wraps around and begins replacing the oldest records. The Connectivity Ring Buffer’s records are exposed by a DMV which can be queried:

SELECT CAST(record AS XML) FROM sys.dm_os_ring_buffers
WHERE ring_buffer_type = 'RING_BUFFER_CONNECTIVITY'

This will select all records as XML types; in Management Studio, you can click the records to get a more readable version. If you want to use SQL queries on the XML records to locate your particular problem, you can also use SQL Server’s XML support to turn this into a temp table and query against the records.

A basic ring buffer entry: killed SPID

An easy way to cause a server-initiated connection closure is to open two connections to a SQL Server, find the SPID of one of the connections, and then kill that spid from the other connection:

C:\>osql -E
1> SELECT @@spid
2> go
 ------
     51
(1 row affected)

C:\>osql -E
1> kill 51
2> go
1>

If you try this and then query the ring buffer, you will get a result which will look like this:

<Record id="2" type="RING_BUFFER_CONNECTIVITY" time="110448275">

  <ConnectivityTraceRecord>

    <RecordType>ConnectionClose</RecordType>

    <RecordSource>Tds</RecordSource>

    <Spid>55</Spid>

    <SniConnectionId>B7882F3C-3BA9-45A7-8D23-3C5C05F9BDF9</SniConnectionId>

    <SniProvider>4</SniProvider>

    <RemoteHost>&lt;local machine&gt;</RemoteHost>

    <RemotePort>0</RemotePort>

    <LocalHost />

    <LocalPort>0</LocalPort>

    <RecordTime>5/6/2008 22:47:35.880</RecordTime>

    <TdsBuffersInformation>

      <TdsInputBufferError>0</TdsInputBufferError>

      <TdsOutputBufferError>0</TdsOutputBufferError>

      <TdsInputBufferBytes>60</TdsInputBufferBytes>

    </TdsBuffersInformation>

    <TdsDisconnectFlags>

      <PhysicalConnectionIsKilled>0</PhysicalConnectionIsKilled>

      <DisconnectDueToReadError>0</DisconnectDueToReadError>

      <NetworkErrorFoundInInputStream>0</NetworkErrorFoundInInputStream>

      <ErrorFoundBeforeLogin>0</ErrorFoundBeforeLogin>

      <SessionIsKilled>1</SessionIsKilled>

      <NormalDisconnect>0</NormalDisconnect>

      <NormalLogout>0</NormalLogout>

    </TdsDisconnectFlags>

  </ConnectivityTraceRecord>

  <Stack>

    <frame id="0">0X01CA0B00</frame>

    <frame id="1">0X01CA0DB1</frame>

    <frame id="2">0X01DF6162</frame>

    <frame id="3">0X02E53C98</frame>

    <frame id="4">0X02E54845</frame>

    <frame id="5">0X02E57BE9</frame>

    <frame id="6">0X02E38F57</frame>

    <frame id="7">0X02E3B2C0</frame>

    <frame id="8">0X02E3C832</frame>

    <frame id="9">0X02E3D55E</frame>

    <frame id="10">0X781329BB</frame>

    <frame id="11">0X78132A47</frame>

  </Stack>

</Record>

Different record types contain some different information. The three record types exposed by the Connectivity Ring Buffer are: ConnectionClose, Error, and LoginTimers. This record type is a ConnectionClose since it is not a connection close during Login, either caused by timeout or some other login failure scenario:
    <RecordType>ConnectionClose</RecordType>

We can see that it was the connection with SPID 55 that was closed:
    <Spid>55</Spid>
We can see that the connection was local (the text in this tag is an XML-encoded string reading “<local machine>”, which indicates a local, shared memory connection):
    <RemoteHost>&lt;local machine&gt;</RemoteHost>
More useful correlation information is available when the TCP protocol is used for connecting to SQL Server – then the local IP address and port, and remote IP address and port will be included in the record, allowing you to uniquely identify the client machine and application to which the record applies. Additionally, the ring buffer contains the timestamp and whenever possible the SPID of the client process to make the correlation seamless (since, as you probably know, SPIDs may be reused over time by different connections).

 

We can also see exactly how many bytes were in the TDS packet sent by the client, and whether there was any error found in the TDS:
      <TdsInputBufferError>0</TdsInputBufferError>
      <
TdsOutputBufferError>0</TdsOutputBufferError>
      <
TdsInputBufferBytes>60</TdsInputBufferBytes>
The most relevant, easy-to-analyze piece of information for a ConnectionClose record is in the TdsDisconnectFlags, where there are a number of values telling the state of the closed connection. Here we see that no error was found, but that it was also not a normal disconnect or a normal logout for the session. What we do see is a flag indicating that the session was killed:
      <SessionIsKilled>1</SessionIsKilled>

A more interesting example: DC connectivity troubles

Tracking killed SPIDs is fine, but the goal is to help out on tricky problems, before resorting to network traces. Here’s an example of a Login Timers record from the Connectivity Ring Buffer, for a scenario that has up until now been very hard to pinpoint without the costly process of reproducing the problem and taking and analyzing network captures:

<Record id="3" type="RING_BUFFER_CONNECTIVITY" time="112254962">

  <ConnectivityTraceRecord>

    <RecordType>LoginTimers</RecordType>

    <Spid>0</Spid>

    <SniConnectionId>B401B045-3C82-4AAC-A459-DB0520925431</SniConnectionId>

    <SniConsumerError>17830</SniConsumerError>

    <SniProvider>4</SniProvider>

    <State>102</State>

    <RemoteHost>&lt;local machine&gt;</RemoteHost>

    <RemotePort>0</RemotePort>

    <LocalHost />

    <LocalPort>0</LocalPort>

    <RecordTime>5/6/2008 23:17:42.556</RecordTime>

    <TdsBuffersInformation>

      <TdsInputBufferError>0</TdsInputBufferError>

      <TdsOutputBufferError>232</TdsOutputBufferError>

      <TdsInputBufferBytes>198</TdsInputBufferBytes>

    </TdsBuffersInformation>

    <LoginTimers>

      <TotalLoginTimeInMilliseconds>21837</TotalLoginTimeInMilliseconds>

      <LoginTaskEnqueuedInMilliseconds>0</LoginTaskEnqueuedInMilliseconds>

      <NetworkWritesInMilliseconds>2</NetworkWritesInMilliseconds>

      <NetworkReadsInMilliseconds>77</NetworkReadsInMilliseconds>

      <SslProcessingInMilliseconds>3</SslProcessingInMilliseconds>

      <SspiProcessingInMilliseconds>21756</SspiProcessingInMilliseconds>

      <LoginTriggerAndResourceGovernorProcessingInMilliseconds>0</LoginTriggerAndResourceGovernorProcessingInMilliseconds>

    </LoginTimers>

  </ConnectivityTraceRecord>

  <Stack>

    <frame id="0">0X01CA0B00</frame>

   

    <frame id="15">0X02E3C832</frame>

  </Stack>

</Record>

For this situation, on the client we see:
[SQL Server Native Client 10.0]Shared Memory Provider: Timeout error [258].
[SQL Server Native Client 10.0]Login timeout expired
[SQL Server Native Client 10.0]Unable to complete login process due to delay in login response

Getting the OS error message included doesn’t tell us anything:
C:\>net helpmsg 258
The wait operation timed out.

And in the server errorlogs, there is nothing at all. The record from the ring buffer, however, is very interesting. LoginTimers records show the overall processing time:
      <TotalLoginTimeInMilliseconds>21837</TotalLoginTimeInMilliseconds>

This time is then broken into components based on what work was happening at the time (the numbers won’t always add up because of rounding; here they’re off by 1). The interesting component, of course, is the one or ones that took a long time. In this case, the SspiProcessingInMilliseconds looks interesting, at almost 22 seconds:
      <SspiProcessingInMilliseconds>21756</SspiProcessingInMilliseconds>

SSPI, the Security Support Provider Interface, is the interface used by SQL Server for Windows Authentication. When the Windows login used is a domain account, SQL Server uses SSPI to interact with the Domain Controller to authenticate the user. The indication in this record of a very long time spent in SSPI processing indicates that there are latency issues talking to the Domain Controller, likely caused by some faulty network hardware between the SQL Server and DC, or perhaps software problems with the DC. Without even taking network traces, and without having to reproduce the problem (since the Connectivity Ring Buffer is turned on by default), the problem is narrowed down to SQL Server’s interaction with the Domain Controller.

Trace Flags

The Connectivity Ring Buffer is on by default, and by default traces all server-initiated connection closures. If you experience an error on the client but see no entry in the ring buffer, this means that the server has seen a connection closure (reset) similar to normal connection closure from the client, or an erratic one coming from outside the server; for instance, from a malfunctioning piece of networking hardware. If this is the case, you should focus on potential networking issues. If you see an entry in the ring buffer, then it will contain information about why the server decided to close this connection, and it will hopefully be enough to facilitate further troubleshooting. For example, if you see that a connection is closed due to invalid information in the TDS packet, you can investigate for faulty network cards, routers and switches which might have corrupted the data. As you will see below, by using a trace flag you can enable logging of all connection closures and thus observe the exact client-initiated closure conditions and potential errors.

There are two trace flags to change the behavior of the Connectivity Ring Buffer. 

To completely disable the Connectivity Ring Buffer, globally enable trace flag 7826:

DBCC TRACEON (7826, -1)

Client-initiated connection closures are not traced by default because they are a normal usage pattern, rather than always an error case; when a client is done with its session, it disconnects. It is recommended not to trace client-initiated connection closures since they may cause you to miss a real error case when it occurs, either by eventually writing over the error record when the buffer rolls around (which will happen faster if you have many correctly-behaving clients opening and closing connections), or just by hiding the error in a long list of normal connection closures. However, the option to trace them, in addition to the above events, is available by globally enabling trace flag 7827:

DBCC TRACEON (7827, -1)

What are all those <frame> tags?

The sys.dm_os_ring_buffers DMV gives access to a variety of internal information, beyond just the Connectivity Ring Buffer, and as part of the infrastructure of this DMV, most of these ring buffers provide the stack trace where the event occurred. The </Stack> tag indicates the beginning of a stack trace, and each <frame> gives the hexadecimal address of a function. These can be resolved into function names by taking a dump of the sqlservr.exe process, opening the dump in Windbg, and using the lm command on the function address.

Dan Benediktson & Jivko Dobrev
SQL Server Protocols

Disclaimer: This posting is provided "AS IS" with no warranties, and confers no rights

Understanding the error message: “Login failed for user ''. The user is not associated with a trusted SQL Server connection.”

Understanding the error message: “Login failed for user ''. The user is not associated with a trusted SQL Server connection.”

This exact Login Failed error, with the empty string for the user name, has two unrelated classes of causes, one of which has already been blogged about here: http://blogs.msdn.com/sql_protocols/archive/2005/09/28/474698.aspx.  In addition to an extra space in the connection string, the other class of causes for this error message is an inability to resolve the Windows account trying to connect to SQL Server.  This list is not intended to be exhaustive, but here are several known root causes for this error message. 

1)      If this error message occurs every time in an application using Windows Authentication, and the client and the SQL Server instance are on separate machines, then ensure that the account which is being used to access SQL Server is a domain account.  If the account being used is a local account on the client machine, then this error message will occur because the SQL Server machine and the Domain Controller cannot recognize a local account on a different machine.  The next step for this is to create a domain account, give it the appropriate access rights to SQL Server, and then use that domain account to run the client application.  Note that this case also includes the special accounts “NT AUTHORITY\LOCAL SERVICE” and “NT AUTHORITY\NETWORK SERVICE” trying to connect to a remote SQL Server, when authentication uses NTLM rather than Kerberos.

One very common case where this can occur is when creating web applications with SQL Server and IIS; often, the web page will work during development, then errors occur with this message after deploying the web site.  This occurs because the developer’s account has access to SQL Server, but the account IIS runs as does not have access.  To fix this specific problem, refer to this kb article about impersonating a domain user in ASP.NET: http://support.microsoft.com/kb/306158

2)      Similar to above: this error message can appear if the user logging in is a domain account from a different, untrusted domain from the SQL Server’s domain.  The next step for this is either to move the client machine into the same domain as the SQL Server and set it up to use a domain account, or to set up mutual trust between the domains.  Setting up mutual trust is a complicated procedure and should be done with a great deal of care and due security considerations.

3)      This error message can appear immediately after a password change for the user account attempting to login.  This occurs because of caching of the client user’s credentials.  The next step here is to log out the application user with the old password, and re-login with the new password before running the application.

4)      If this error message only appears sporadically in an application using Windows Authentication, it may result because the SQL Server cannot contact the Domain Controller to validate the user.  This may be caused by high network load stressing the hardware, or to a faulty piece of networking equipment.  The next step here is to troubleshoot the network hardware between the SQL Server and the Domain Controller by taking network traces and replacing network hardware as necessary.

5)      This error message can appear consistently for local connections using trusted authentication, when SQL Server’s SPN is not interpreted by SSPI as belonging to the local machine.  This can be caused either by a misconfiguration of DNS, or by a machine having multiple names.  If your machine has multiple names, try to work around the need for multiple names and give it a unique name.  If the machine just has one name, then check your DNS configuration.

Dan Benediktson
SQL Server Protocols

Disclaimer: This posting is provided "AS IS" with no warranties, and confers no rights

Steps to troubleshoot SQL connectivity issues

We have been seeing and trying to resolve SQL connectivity issue all the time. I guess it would be helpful if we can put some guidance on how to resolve connectivity issues. Here comes a proposal based on my experience.

 

Basically, when you failed to connect to your SQL Server, the issue could be:

1) Network issue,

2) SQL Server configuration issue.

3) Firewall issue,

4) Client driver issue,

5) Application configuration issue.

6) Authentication and logon issue.

 

Usually, customers see connectivity issue in their applications, but it would be great if you can follow the steps below to eliminate issues one by one and post a question on SQL Server Data Access forum if needed.

 

 

Step 1: Network issue

You might be able to make local connection without a working network, but that's a special case. For remote connection, a stable network is required. The first thing to trouble shoot SQL connectivity issues is to make sure the network we rely on is workable and stable. Please run the following commands:

 

ping -a <your_target_machine>    (use -4 and -6 for IPv4 and IPv6 specifically)

ping -a <Your_remote_IPAddress>

nslookup (type your local and remote machine name and IP address multiple times)

 

Be careful to see any mismatch on the returned results. If you are not able to ping your target machine, it has high chance that either the network is broken or the target machine is not running. It's possible the target machine is behind a firewall and the firewall blocks the packets sent by ping, though. Windows firewall does not block ping (ECHO) packet by default. The correctness of DNS configuration on the network is vital to SQL connection. Wrong DNS entry could cause of all sorts of connectivity issue later. See this link for example, "Cannot Generate SSPI Context" error message, Poisoned DNS.

 

Step 2: SQL Server configuration issue

You need to make sure the target SQL Server is running and is listening on appropriate protocols. You can use SQL Server Configuration Manager (SCM) to enable protocols on the server machine. SQL Server supports Shared Memory, Named Pipes, and TCP protocols (and VIA which needs special hardware and is rarely used). For remote connection, NP and/or TCP protocols must be enabled. Once you enabled protocols in SCM, please make sure restart the SQL Server.

 

You can open errorlog file to see if the server is successfully listening on any of the protocol. The location of errorlog file is usually under:

%ProgramFile%Microsoft SQL Server/MSSQLxx.xxx/MSSQL/Log

If the target SQL instance is a named instance, you also need to make sure SQL Browser is running on the target machine. If you are not able to access the remote SQL Server, please ask your admin to make sure all these happen.

 

Step 3: Firewall issue

A firewall on the SQL Server machine (or anywhere between client and server) could block SQL connection request. An easy way to isolate if this is a firewall issue is to turn off firewall for a short time if you can. Long term solution is to put exception for SQL Server and SQL Browser.

 

For NP protocol, please make sure file sharing is in firewall exception list. Both file sharing and NP use SMB protocol underneath.

For TCP protocol, you need put the TCP port on which the SQL Server listens on into exception.

For SQL Browser, please put UDP port 1434 into exception.

Meanwhile, you can put sqlservr.exe and sqlbrowser.exe into exception as well, but this is not recommended. IPSec between machines that we are not trusted could also block some packets. Note that firewall should never be an issue for local connections.

 

 

Step 4: Client driver issue

At this stage, you can test your connection using some tools. The tests need to be done on client machine for sure.

 

First try:

telnet <your_target_machine> <TCP_Port>

You should be able to telnet to the SQL server TCP port if TCP is enabled. Otherwise, go back to check steps 1-3. Then, use OSQL, SQLCMD, and SQL Management Studio to test sql connections. If you don't have those tools, please download SQL Express from Microsoft and you can get those tools for free.

 

OSQL (the one shipped with SQL Server 2000) uses MDAC.

OSQL (the one shipped with SQL Server 2005 & 2008) uses SNAC ODBC.

SQLCMD (shipped with SQL Server 2005 & 2008) uses SNAC OLEDB.

SQL Management Studio (shipped with SQL Server 2005 & 2008) uses SQLClient.

 

Possilbe command use be:

osql -E -SYour_target_machine\Your_instance for Windows Auth

osql -Uyour_user -SYour_target_machine\Your_instance for SQL Auth

 

SQLCMD also applies here. In addition, you can use “-Stcp:Your_target_machine, Tcp_port” for TCP,  “-Snp:Your_target_machine\Your_instance” for NP, and “-Slpc:Your_target_machine\Your_instance” for Shared Memory. You would know if it fails for all protocols or just some specific procotols.

 

At this stage, you should not see general error message such as error 26 and error 40 anymore. If you are using NP and you still see error 40 (Named Pipes Provider: Could not open a connection to SQL Server), please try the following steps:

a)       Open a file share on your server machine.

b)       Run “net view \\your_target_machine” and “net use \\your_target_machine\your_share  (You can try Map Network Drive from Windows Explorer as well)

If you get failure in b), it's very likely you have OS/Network configuration issue, which is not SQL Server specific. Please search on internet to resolve this issue first.

 

You can try connection using both Windows Authentication and SQL Authentication. If the tests with all tools failed, there is a good chance that steps 1-3 were not set correctly, unless the failure is logon-related then you can look at step 6.  

 

If you succeeds with some of the tools, but fails with other tools, it's probably a driver issue. You can post a question on our forum and give us the details.

 

You can also use “\windows\system32\odbcad32.exe” (which ships with Windows) to test connection by adding new DSN for various drivers, but that's for ODBC only.

 

 

Step 5: Application issue

If you succeed with steps 1-4 but still see failure in your application, it's likely a configuration issue in your application. Think about couple of possible issues here.

a) Is your application running under the same account with the account you did tests in step 4? If not, you might want to try testing in step 4 under that account or change to a workable service account for your application if possible.

b) Which SQL driver does your app use?

c) What's your connection string? Is the connection string compatible to your driver? Please check http://www.connectionstrings.com/ for reference.

 

 

Step 6: Authentication and logon issue

This is probably the most difficult part for sql connectivity issues. It's often related to the configuration on your network, your OS and your SQL Server database. There is no simple solution for this, and we have to solve it case by case. There are already several blogs in sql_protocols talking about some special cases and you can check them see if any of them applies to your case. Apart from that, things to keep in mind:

a) If you use SQL auth, mixed authentication must be enabled. Check this page for reference http://msdn.microsoft.com/en-us/library/ms188670.aspx

b) Make sure your login account has access permission on the database you used during login ("Initial Catalog" in OLEDB).

c) Check the eventlog on your system see if there is more information

 

At last, please post question on our forum. More people could help you over there. When you post question, you can refer to this link and indicate you see failure at which step. The most important things for us to troubleshoot are a) exact error message and b) connection string.

 

 

Xinwei Hong, SQL Server Protocols
Disclaimer: This posting is provided "AS IS" with no warranties, and confers no rights

Could not connect to X64 named instance

We see a couple of cases where customer could not connect to named instance on X64. They all have the following symptons.

1) Customer has no problem to connect X86 instance on the same server machine even if it's a named instance.
Customer can even connect to some of the X64 named instance, but always see problem when connect to some specific named X64 instance.
2) SQL Server is confirmed up and running. The server is listening on TCP and/or NP protocols.
3) SQL Browser is confirmed in running state.
4) Customer confirmed it's not a firewall issue. (either by disabling firewall on both client and server machine temprarily, or by adding needed firewall exceptions for application or port number)

If you see this happens on your X64 named instance, then the solution below may applies to you. We have confirmed a bug in SQL Server 2005 could cause this problem.

Let's say your X64 instance is INST1. You can find its instance ID from HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\Instance Names\SQL, say it's MSSQL.3. Then the configuration of this instance is under 
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL.3
Please check if you have the following entry in registry:
HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432\Microsoft\Microsoft SQL Server\MSSQL.3

If you do have it, then it's the root cause of the problem and you should remove it to solve the problem. INST1 is an X64 instance and its configuration should not present under Wow hive. This wrong registry might come from a previous instance which you have removed but the registry is not cleaned by the SQL uninstallation for some reason.

A good news is that the issue is resolved in SQL Sever 2008. Browser will response correctly even if there is wrong entry.

Xinwei Hong, SQL Server Protocols
Disclaimer: This posting is provided "AS IS" with no warranties, and confers no rights

Understanding “Connection forcibly closed by remote host” Errors Caused by TOE/Chimney

Sporadic “Connection forcibly closed by remote host” errors with SQL Server connections can be very difficult to troubleshoot and resolve. This blog post is targeted at diagnosing TOE/Chimney issues that may lead to this client error message. Chimney is a feature introduced in the Windows Server 2003 Scalable Networking Pack, which was included in Windows Server 2003 SP2. Chimney increases network performance when using a network card which implements TOE, TCP/IP Offload Engine, which is a hardware implementation of the TCP/IP stack.

The following are the symptoms to look for:

·         The client connection is sporadically failing with the message: “TCP Provider: Connection forcibly closed by remote host.” The client connection may, in addition, sometimes fail with the message: “General network error”.

·         There are no corresponding network-related error messages in the SQL Server instance’s ERRORLOGs. Normally, the “Connection forcibly closed by remote host” message on the client indicates that an error occurred on the server which is deemed severe enough to close the connection; in that case, the server would log an error message explaining why the connection was closed. An example error message for this would be Error 17828: “The prelogin packet used to open the connection is structurally invalid; the connection has been closed. Please contact the vendor of the client library.” However, if the issue is in the networking hardware, such as a TOE-related issue, there will be no message in the SQL Server instance’s ERRORLOGs for this connection closure, since the server is not intentionally closing the connection. Therefore, check the SQL Server ERRORLOG for an absence of any corresponding network-related error messages.

·         There is no other client killing the first client’s connection. In addition to potential network hardware causes, the “Connection forcibly closed” message can also appear with no corresponding server ERRORLOG message if the client’s connection is being killed by a different client. Examine the SQL Server ERRORLOG for KILL statements; if there are none, then no other client is killing SQL Server connections.

If all three of these symptoms are appearing, your problem is likely due to a faulty piece of network hardware, possibly due to TOE/Chimney.

To test if TOE/Chimney is the source of your problem, you can disable it and see if the problem goes away. You should do this for BOTH the client and server, since TOE/Chimney on either machine, or both, could be the cause of the issue. To disable Chimney, run this command (if on Windows Vista or Windows Server 2008, run it at an elevated command prompt):

netsh int ip set chimney DISABLED

This command does NOT require a reboot. If you have these symptoms and running this command doesn’t fix the problem, then you likely have an issue with network hardware and should follow up by investigating your network hardware. This kb article should give you some leads on how to begin network troubleshooting: http://support.microsoft.com/kb/325487

Dan Benediktson
SQL Server Protocols

Disclaimer: This posting is provided "AS IS" with no warranties, and confers no rights

Why Browser is installed under 90\shared directory rathar than 100\Shared?

After installing SQL Server 2008, some users might notice that sqlbrowser.exe is installed under %ProgramFiles%Microsoft SQL Server\90\Shared, rather than %ProgramFiles%Microsoft SQL Server\100\Shared as some users expect. When setup fails for some reason, some user may think this is the culprit. As matter of fact, this is by design, not a bug.

Some backgound here. SQL Server 2005 is also knowns as SQL Server 9 (code name Yukon), and SQL Server 2008 is also known as SQL Server 10 (code name Katmai).  So, usually shared component of SQL Server 2005 is installed under 90\Shared and shared component of SQL Server 2008 is installed under 100\Shared.

Then, why do we install sqlbrowser.exe under 90\shared, rather than 100\shared during SQL Server 2008 setup? The short answer is that SQL Server Browser is a shared component, not only by all instances of SQL Server, but also by different versions of SQL Server, i.e. It's shared by all SQL Server instances on a machine, no matter the instance comes from which version of SQL Server.

SQL Server Browser service used UDP port 1434 to receive request and send information about SQL Server instances to clients. Only one process can listen on that port. This basically means that there can be only one running SQL Server Browser service on a machine.

In order to support SQL Server 2005 and SQL Server 2008 Side-by-Side(SxS), the binaries (sqlbrowser.exe and its resource files) need to be kept in the same place. Otherwise, we have to uninstall SQL Server 2005 Browser before installing SQL Server 2008, but it is not feasible without some complicated redesign. More importantly, we need to support the case that customer install SQL Server 2005 after SQL Server 2008.

The installation of SQL Server Browser binaries are managed by Windows Installer. The SQL Browser component in SQL Server 2005 and SQL Server 2008 share the same GUID in term of Windows Installer. Windows installer will make sure the binaries on the box is always the newest one. The component will be ref-counted and will only be removed when the refcount decreased to 0.

SQL Serve Browser bacomes a seperate application starting from SQL Server 2005. In SQL Server 2000, Browser service is a thread in sqlservr.exe. So, users can consider Browser compoment was introduced in SQL Server 2005 and that's why it's installed under 90\Shared.

For the similar reason, SQL Server Browser group is kept the same as SQLServer2005SQLBrowserUser$<MachineName> in SQL Server 2008. SQL Server resources (registry and file folders) are ACL'ed to this group and it's needed to keep the same in order to access the resources.

Xinwei Hong, SQL Server Protocols
Disclaimer: This posting is provided "AS IS" with no warranties, and confers no rights

Service Principal Name (SPN)

Did you know that beginning with SQL Server 2008, support for service principal names (SPNs) has been extended to enable mutual authentication across all protocols. Administators can now define their own SPNs. Thus SQL Server 2008 makes secure authentication more manageable and reliable by allowing clients to directly specify the SPN to use.

Fore more information, click here:

http://msdn2.microsoft.com/en-us/library/cc280459(SQL.100).aspx

----
Tres London
SQL Server Protocols
Disclaimer: This posting is provided "AS IS" with no warranties, and confers no rights

TCP Provider: No connection could be made because the target machine actively refused it

In this post, I am trying to list out potential cause and general solution to this specific error when you are making connection to SQL Server 2005.

First, It is due to TCP connection fail, since you can tell from the title, "TCP Provider: .xx", which indicates that either your connection string explicitly use 'tcp:" prefix, say "Server=tcp:<machinename> ...." or your client protocol setting ask for first try Named Pipe connection, if fails, then fallback to TCP connection. For eg, in "Client NetworkUtility", you see the protocol order is "np on top of tcp". 

Secondly, this error is winsock error "WSAECONNREFUSED  - 10061". You'd better go through the following checklist to fix the "connection refused" problem.

1) Are you connecting to the right server? Check out whether your target machine name is valid.

2) Whether the <machinename> you specified in your connection string resolve to the correct ipaddress? do:

  ping <machinename> (if your remote server has ipv6 enabled, it might return ipv6 address). Go to the remote machine to see whether it returned the correct ipaddress.

 Then do ping -a <ipaddress>, can it be resolved to correct target machine name?

3) Are you connecting to default instance or named instance? Please go to the target machine, to check whether the instance started successfully, you can go to "Service Control Manager", find out "MSSQLSERVER" or "MSSQL$<instancename>" whether they are running?

4) Does your target SQL instance enabled TCP and listening on a valid TCP port?

a. If it is default instance, on your client machine, try : "telnet <ipaddress> 1433", remember, SQL Server 2005 default instance always use the reserved port 1433, namely, if you modify it to any other port number, the client app can not connect to the sql instance.

b. If it is named instance, on your target server machine, go to "SQL Server Configuration Manager", open "Protocols for MSSQL$<instancename>", check TCP enabled, and right click TCP, find out the tcp port number under 'IPAll". Then on your client machine, do " telnet <hostname> <portnumber>", see whether it succeeds.

5) Is your browser service running on the target server?

If you connect to named instance, then on your target server machine do "net start sqlbrowser".

6) Are your target server behind firewall?

If so, go to " SQL Server 2005 Surface Area configuration", click "Surface Area configuration for service and connection", then click the instance name, enable its remote connection.

You should be able to see the sql tcp port and udp port(1434 for browser service) is opened in Firewall, which enable you can telnet from client.

7) Does your client app set any alias that cause the connection redirect to an invalid port?

a. If you are using MDAC: on your client machine, click "cliconfig.exe", check out alias, whether it points to some other port that you can not telnet, if so, correct it or remove alias, since you should be able to connect through servername by sqlbrowser service automatcally respond to the correct port which server is listening on.

b. If you are using SNAC/SqlClient,  go to SQL Configuration Manager, check out client protocols, see alias setting.

8) Check out IPSec setting if you still face problem. See whether IPSec specification make dropping any packets.

9) If you still face problem, please use "Evenvwr" on the client machine, try to look up the remote server application log, see any info from SQL Server that indicates the sql instance reject client connection. Under this situation, it probably due to invalid client request. Open SQL Profile also can help you dig out which client data operation make server terminate the connection.

MING LU

SQL Server Protocols

Disclaimer: This posting is provided "AS IS" with no warranties, and confers no rights

Login failed for user <xx>, Msg 18456, Level 14, State 1

This post focus on a typical cause of sql login failure issue described as the title.

In real world, a customer log on to the machine as a domain\user or local machine accout, he can connec to SQL Server through SQL Server Management Studio. However, he fails to connect by using SQL Login, namely, in connection string specify User as the NT account, and the password.

An example:

He did: "Create login [domain\user] from windows", then try -

osql -S -U [domain\user] -P xx or osql -S -U[machineaccount] -P xx.

To explain the reason, first you need to understand two Authentication Modes in SQL Server 2005:

  • Windows authentication   Only authenticated windows users can gain access to the SQL instance. You need to add a windows login for each windows user or group that needs access to SQL istance. This is default and recommended mode because you can take advantage of all the centralized security policers of your Active Directory domain.
  • Mixed Mode authentication Both windows login and SQL Server logins(neither of which are mapped to an operating system users) can access the SQL instances. You use mixed mode when you need to provide access to non-Windows-users for example when users of another operating system need access to SQL.

The –U and –P options are only for SQL Authentication. The SQL Client will not try to impersonate a windows user, instead it directly send the user/password to SQL Server and try to find a matching SQL login.

The solution is: You need to runas to impersonate the windows user and connecting using -E option in "osq.exe" or specify "Integrated Security=SSPI" or "Trusted_Connection = yes" in your connection string.

MING LU

SQL Server Protocols

Disclaimer: This posting is provided "AS IS" with no warranties, and confers no rights

 

More Posts Next page »
Page view tracker