In this blog, I will give some my recommendation of SQL Server's Globalization Development.

1.       Use nchar/nvarchar instead of char/varchar type.   Nchar/nvarchar type is SQL Server's Unicode data type, which can store any characters defined by Unicode Org.  char/varchar type always associates with a code page, so the number of supported characters is limited.  I give you some fact to show why use Unicode data type is recommendation: 

o   Starting from Windows 2000, Windows have full Unicode support implemented by using UTF-16 Encoding.  Windows APIs take WCHAR* as input which represent a Unicode String. 

o   Virtual C++ has WCHAR, which is Unicode char type.

o   .Net String is Unicode String only encoded in UTF-16.

o   Java String is Unicode String only encoded in UTF-16.

o   Char type might have data corruption issue if your client locale is different with server locale.

So it is time to deprecated the old code page technique, and use the Unicode in your application, and also in your SQL Server database.   The only benefit of using char type is the space saving for single byte code page.  You can always use the Data Compression feature in SQL Server 2008 if you care about disk space, which can save more space than using char type.  

2.       Also remember to put N' for your string literal in T-SQL.   String literal N'abc'   is a Unicode nvarchar string literal.  Literal ‘ab©'  is a varchar literal, it always associates with a codepage (string literal always use current database's collation). Suppose char © is not in the code page, you will get a question mark (?) when inserting the literal into a table,  even the column is nvarchar type. 

3.       Use nvarchar instead of nchar.  The different between nvarchar and nchar is the storage.  A column with nchar(30) type always take 60 bytes to store on the disk, even the value is a single character.  The data size for a nvarchar(30) type column is not fixed,  it varies row by row or value by value.  A single character value takes 2 bytes to store, and a value with 30 characters long  takes 60 bytes to be stored.  Another different between nvarchar and nchar type is the performance.  Nchar types always stored in fixed location in every row, which can be retrieved faster than nvarchar type which is stored in different location for different row.  However, I believe the benefit of less stored space for nvarchar types usually overcomes the cost of locating the value in a row. 

4.      If possible, avoid using a column collation which is different with the database's collation.   You will have less collation conflict.   If you want a query use a special collation's sorting rule, use explicit collate clause in that query.

5.       Use Windows Collation instead of SQL Collation.   The only exception here is the default collation for en_US locale which is sql_latin1_general_cp1_ci_as.

6.       Never store UTF-8 string in varchar types, you will get data corruption.

7.       Keep in mind, string comparison always ignore trailing spaces (Unicode U+0020), no matter what collation you are using.

8.       Keep in mind, LEN function always return the number of characters exclude the trailing spaces. DataLength function returns the storage size in term of bytes.

9.       using _BIN2 collation, instead of  _BIN collation if you want binary, code page base string comparison behavior.