CSS SQL Server Engineers

This is the official team Web Log for Microsoft Customer Service and Support (CSS) SQL Support. Posts are provided by the CSS SQL Escalation Services

How It Works: Conversion of a Varchar, RPC Parameter to Text from a Trace (.TRC) Capture


The Senior Escalation Engineers do various training and mentoring activities.  As I do this I thought I would try to propagate some of this information on the blog.

This was an interesting issue I ran into this week.   The reported problem was that the data shown as TEXT for an RPC event did not match the data submitted by the client.  

SQL Server 2005 introduced an optimization for capturing RPC events.  During a SQL Server 2005 capture you can exclude the TEXT column from RPC events, because the BINARY column contains the data necessary to construct the TEXT column.  Capturing only the BINARY column and letting the TEXT be formatted when the trace is viewed (Profiler or fn_trace_gettable) lowers the tracing impact.

The interesting aspect comes when the BINARY data is converted to the UNICODE TEXT column.   Take the following procedure definition as an example, taking special care to notice the multi-byte character and UNICODE character definitions.

create procedure spTest  @sVarchar varchar(10), @sNVarchar nvarchar(10) ...

Note: The trace TEXT column is always a UNICODE column.

The BINARY column holds the raw parameter data values.  The process that reads the trace must therefore convert the varchar data to UNICODE in order to build the TEXT column.

  • When using fn_trace_gettable the process is the SQL Server instance.
  • When using Profiler the process is the Profiler process.

The SQL Server products use MultiByteToWideChar with the CP_ACP code page to convert the varchar data to its UNICODE representation.  CP_ACP tells the conversion to use the system default ANSI code page, which is determined by the SYSTEM DEFAULT LANGUAGE (locale).

Assume the client is Russian (Cyrillic), code page 1251 based, and submits the following query.

{call spTest('ГУРМ', N'ГУРМ')}

My machine is US English, code page 1252 based.  When I open the trace file I see the following, because the varchar to UNICODE conversion uses my system default language.

exec spTest 'ÃÓÐÌ', N'ГУРМ'

If I change my system default locale to Russian I see the following.

exec spTest 'ГУРМ', N'ГУРМ'
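
The two renderings above can be reproduced outside of SQL Server.  The following Python sketch is only an illustration: build_rpc_text is a hypothetical helper, not how Profiler or fn_trace_gettable is implemented, and Python's cp1252 and cp1251 codecs stand in for MultiByteToWideChar with CP_ACP on a US English and a Russian machine.  It also assumes the nvarchar bytes are UTF-16LE.

```python
def build_rpc_text(proc_name, varchar_bytes, nvarchar_bytes, reader_code_page):
    """Hypothetical helper: format RPC TEXT from raw parameter bytes.

    varchar bytes carry no code page of their own, so they are decoded with
    whatever code page the *reader* uses. nvarchar bytes are assumed to be
    UTF-16LE and decode the same way on every machine.
    """
    varchar_text = varchar_bytes.decode(reader_code_page)
    nvarchar_text = nvarchar_bytes.decode("utf-16-le")
    return f"exec {proc_name} '{varchar_text}', N'{nvarchar_text}'"


# Raw bytes as a Russian (1251) client would submit them.
varchar_raw = "ГУРМ".encode("cp1251")      # four single bytes: C3 D3 D0 CC
nvarchar_raw = "ГУРМ".encode("utf-16-le")  # eight bytes, two per character

# Reader on a US English (1252) machine, then on a Russian (1251) machine.
print(build_rpc_text("spTest", varchar_raw, nvarchar_raw, "cp1252"))
print(build_rpc_text("spTest", varchar_raw, nvarchar_raw, "cp1251"))
```

The same four varchar bytes come out as 'ÃÓÐÌ' under 1252 and as 'ГУРМ' under 1251, while the nvarchar parameter is identical in both lines.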

As a general rule it is unwise to store language specific data in single byte character columns.  Use nvarchar instead to avoid unexpected conversions.
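
A rough way to see the difference is to look at what each representation can hold.  In this Python sketch the cp1252 codec stands in for a varchar value on a 1252 code page and UTF-16LE stands in for nvarchar; both are assumptions made for illustration, not a description of SQL Server internals.

```python
text = "ГУРМ"

# Single byte, code page 1252: Cyrillic has no representation here, so the
# conversion has to substitute something (Python substitutes '?').
as_varchar_1252 = text.encode("cp1252", errors="replace")
print(as_varchar_1252)

# UNICODE: the bytes identify the characters on their own, so decoding does
# not depend on any machine's code page.
as_nvarchar = text.encode("utf-16-le")
print(as_nvarchar.decode("utf-16-le") == text)
```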

The solution for this issue was to make sure the trace was processed on a computer whose system language matches the code page of the data.
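
If only the already converted text is available, the original characters can sometimes be recovered by reversing the conversion.  The sketch below is an illustration under assumptions, not a feature of Profiler or ReadTrace: repair_text is a hypothetical helper, Python codecs stand in for the Windows code pages, and it only works when every original byte mapped to a character in the wrong code page so that nothing was lost.

```python
def repair_text(garbled, wrong_code_page, right_code_page):
    """Hypothetical helper: undo a varchar conversion made with the wrong code page."""
    raw = garbled.encode(wrong_code_page)  # back to the original single bytes
    return raw.decode(right_code_page)     # decode the way the client intended


print(repair_text("ÃÓÐÌ", "cp1252", "cp1251"))
```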

To check your system default language (Vista example) use the Control Panel | Regional and Language Options.

The system language is located under the Administrative tab | Change System Locale.

Don't confuse this with the Formats tab.  The Formats tab sets the default USER locale, which is not used by the CP_ACP designation in MultiByteToWideChar.
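
The code page can also be checked programmatically.  This is a minimal Python sketch, assuming it runs on the Windows machine that will read the trace; system_ansi_code_page is a hypothetical helper name, and GetACP is the Win32 call that reports the current ANSI code page.

```python
import ctypes
import sys


def system_ansi_code_page():
    """Return the ANSI code page number, or None when not running on Windows."""
    if sys.platform != "win32":
        return None
    # GetACP reports the ANSI code page, for example 1252 or 1251.
    return ctypes.windll.kernel32.GetACP()


print(system_ansi_code_page())
```

A value that differs from the code page of the client that produced the varchar data suggests the TEXT column will not match what the client submitted.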


LANGUAGE EVENTS: What about the Language (adhoc query) event TEXT?   These are UNICODE and built during the trace capture so the process that reads the trace file already has the UNICODE representation.

REPLAY: It is important to understand the possible translation issue because it could have an effect on replay attempts.

RML UTILITIES: This issue exposed a problem with ReadTrace's (9.00.0023) processing of the Varchar, RPC data.  Contact DSDBTOOL@MICROSOFT.COM if you encounter this issue.

Bob Dorr
SQL Server Senior Escalation Engineer

  • >>>As a general rule it is unwise to store language specific data in single byte character columns, nvarchar should be used instead to avoid unexpected conversions.

    I always took this as LAW, rather than a general rule. Also, if you were storing text of any kind and there was even a slight chance of it being used by a different culture, then go with nvarchar. The only thing that would ever override this Law was hard drive space considerations, because of the double byte size of nvarchar.

    Am I wrong in thinking this way? If so could you explain?

    In my case I am a developer for a global company so I do a lot of multicultural apps. So I use nvarchar for pretty much everything. But even if I wasn't building multicultural apps I would still think of this as law.

     

    [RDORR Jan 24 2008] I personally agree with you.  There are some older databases where the data came from the 4.21 or 6.x days.  If you were using SQL Server during that timeframe it was a bit of a mess with things like ANSItoOEM and auto conversion, so we still see a few of these.
