CLR User Defined Types (UDTs) are a fantastic new addition to SQL Server 2005 (the database formerly known as Yukon). Every Yukon UDT must specify the type of serialization format it uses so that the engine knows how to roundtrip data represented by an instance of that UDT between the TSQL (unmanaged) and managed worlds. For example, conversion of a UDT instance into an NVARCHAR instance causes the UDT serialization code to be invoked.
As of Beta 2, Yukon supports two serialization formats:
- Format.Native
- Format.UserDefined
Yukon maintains UDT instances in the unmanaged world as a special byte stream annotated with metadata information, whereas it is maintained as an instance of a CLR type in the managed runtime.
Format.Native serialization: This is the simple form of serialization where the engine automatically performs serialization and deserialization of instances of the type. Yukon UDTs employing this form of serialization are basically CLR classes or structs in the managed world, such that every single instance of that type has the same storage requirements as the others. Therefore, fields of such UDTs are all of types whose storage requirements are known at compile-time (you could think of them as “blittable” types). For example, System.Int32 and System.Byte among others are valid for native serialization; System.String is not valid since different instances of a System.String can have different lengths. The engine computes the storage requirements for this type at type registration time (i.e. during a CREATE TYPE operation) and saves this away in internal metadata tables.
Format.UserDefined serialization: This form of serialization for a UDT requires more work on the part of the UDT author who must explicitly provide the code that will serialize instances of such a UDT since the engine has no idea what parts of which fields we have decided to persist. In other words, garbage in, garbage out. In fact, the engine requires us to implement the System.Data.Sql.IBinarySerialize interface containing the Read() and Write() methods for deserialization and serialization, respectively. While this does mean that we must write additional code to make sure our own data is roundtrippable, it is in fact a rather powerful feature – we can now create UDTs containing fields of type System.Collections.Hashtable or array types as long as we take the responsibility to correctly serialize and deserialize instances of the UDT in our implementations of Read(System.IO.BinaryReader) and Write(System.IO.BinaryWriter).
Which brings me to the topic of this posting: the importance of having correct serialization code. I was playing around with a Yukon UDT to store an address using Format.UserDefined serialization. My UDT has a public field ‘m_XmlFullAddress’ of type SqlXml to contain an XML representation of the address. The Read() and Write() methods alone are shown below for convenience:
//******************************************************
// IBinarySerialize interface
//******************************************************
public void Read(BinaryReader r)
{
m_fNotNull = r.ReadBoolean();
if (!r.ReadBoolean()) {
m_firstline = r.ReadString();
}
else {
m_firstline = SqlString.Null;
}
if (!r.ReadBoolean()) {
m_secondline = r.ReadString();
}
else {
m_secondline = SqlString.Null;
}
}
public void Write(BinaryWriter w)
{
w.Write(m_fNotNull);
w.Write(m_firstline.IsNull);
if (!m_firstline.IsNull) {
w.Write(m_firstline.ToString());
}
w.Write(m_secondline.IsNull);
if (!m_secondline.IsNull) {
w.Write(m_secondline.ToString());
}
}
With my implementation of the Address UDT, when I attempted to set the value of ‘m_XmlFullAddress’ explicitly using the public field accessor syntax and then read it back immediately afterward, it would not persist. The TSQL code excerpt I used to do this is shown below:
declare @a Address
set @a = CONVERT(Address, '911 Fire Lane |Circuit City, XO 00000')
select @a. m_XmlFullAddress
set @a.m_XmlFullAddress = '<address/>'
select @a. m_XmlFullAddress
go
The second SELECT returns the same result as the first. It foxed me for a while since I was under the impression that having just set a value on the UDT variable, immediately retrieving that same value should somehow obviously work, right? I mean the engine probably maintains an instance of the Address class in the hosted CLR and would simply access the corresponding field I just set a moment ago.
Not so.
The server actually relies on the serialization code to perform any operations that modify the instance of the UDT. Setting the ‘m_XmlFullAddress’ property in the TSQL world (using the SET statement above) actually causes SQL Server to follow a series of steps (not necessarily in the exact order shown):
- infer that @a.m_XmlFullAddress is of type SqlXml
- take the NVARCHAR string supplied (the right hand side operand of the assignment), and convert it to an XML data type instance implicitly (since the XML data type maps to the managed SqlXml data type)
- now convert the XML data type instance from step 2 into a managed SqlXml instance using the internal conversion/mapping routines
- send the set of bytes representing the current value of the UDT variable @a to the managed world by invoking our deserialization method (IBinarySerialize.Read()), and instantiate this UDT in the managed runtime using its equivalent CLR type
- perform the assignment of the new SqlXml value to the SqlXml field of the managed instance
- invoke our serialization code (IBinarySerialize.Write()) to convert the modified managed UDT instance back into a byte stream representing the TSQL UDT variable world and assign it to the variable @a
Note from the steps above how our deserialization and serialization code is invoked in steps 4 & 6 respectively, even though we were performing what we thought was an innocuous and straightforward field mutation. It should now be hopefully evident that the real reason why our code did not work is because our serialization code was ignoring the SqlXml field. By leaving out m_XmlFullAddress in IBinarySerialize.Read(), its value is never read in or set during deserialization, and by leaving it out in IBinarySerialize.Write() it is never saved out during serialization.
Correcting this problem is relatively simple. The modified Read() and Write() methods are shown below for convenience:
//******************************************************
// IBinarySerialize interface
//******************************************************
public void Read(BinaryReader r)
{
m_fNotNull = r.ReadBoolean();
if (!r.ReadBoolean()) {
m_firstline = r.ReadString();
}
else {
m_firstline = SqlString.Null;
}
if (!r.ReadBoolean()) {
m_secondline = r.ReadString();
}
else {
m_secondline = SqlString.Null;
}
// get the XML value
if (!r.ReadBoolean()) {
string xml = r.ReadString();
m_XmlFullAddress = SqlXml_Parse(xml);
}
else {
m_XmlFullAddress = SQLXML_NULL;
}
}
public void Write(BinaryWriter w)
{
w.Write(m_fNotNull);
w.Write(m_firstline.IsNull);
if (!m_firstline.IsNull) {
w.Write(m_firstline.ToString());
}
w.Write(m_secondline.IsNull);
if (!m_secondline.IsNull) {
w.Write(m_secondline.ToString());
}
// write the xml
w.Write(m_XmlFullAddress.IsNull);
if (!m_XmlFullAddress.IsNull) {
string xml = m_XmlFullAddress.Value;
w.Write(xml);
}
}
Here's the full C# code for the Address UDT. This is the code containing the corrections/additions necessary for correct serialization. Note that this code is provided AS-IS so don't sue me if it causes your toaster to reboot infinitely:
#region Using directives
using System;
using System.Data.SqlTypes; // for INullable
using System.Data.Sql; // for IBinarySerialize, SqlUserDefinedTypeAttribute
using System.IO;
using System.Collections.Generic;
using System.Text;
using System.Xml;
#endregion
namespace Examples
{
/// <summary>
/// Address CLR User-defined type using UserDefined serialization
/// </summary>
[Serializable]
[SqlUserDefinedType(Format.UserDefined, IsByteOrdered = true, MaxByteSize = 4096, IsFixedLength = false)]
public class Address : INullable, IBinarySerialize
{
//******************************************************
// Address Public Members
//******************************************************
public SqlXml m_XmlFullAddress;
//******************************************************
// Address Private Members
//******************************************************
private SqlString m_firstline;
private SqlString m_secondline;
private bool m_fNotNull; //false if null, default ctor makes it null
//******************************************************
// Constructors
//******************************************************
// Constructor for a null value: Address.Null
public Address()
{
m_fNotNull = false;
m_firstline = SqlString.Null;
m_secondline = SqlString.Null;
m_XmlFullAddress = SQLXML_NULL;
}
public Address(SqlString line1, SqlString line2)
{
if (line1.IsNull && line1.IsNull) {
m_fNotNull = false;
m_firstline = SqlString.Null;
m_secondline = SqlString.Null;
m_XmlFullAddress = SQLXML_NULL;
}
else {
m_fNotNull = true;
m_firstline = line1;
m_secondline = line2;
m_XmlFullAddress = SQLXML_THIS_INSTANCE;
}
}
//******************************************************
// INullable interface
//******************************************************
// INullable
public bool IsNull
{
get
{
return !m_fNotNull;
}
}
public static Address Null
{
get
{
return new Address();
}
}
//******************************************************
// Common static methods for SQL UDTs
//******************************************************
public override string ToString()
{
if (IsNull) {
return "Null";
}
else {
return Value.ToString();
}
}
public static Address Parse(SqlString s)
{
if (s.IsNull) {
return Address.Null;
}
String str = s.ToString();
String[] lines = new String[2];
// using || to indicate separation between address lines 1 and 2
// for now assume it won't appear in any address
lines = str.Split(new char[] { '|' });
if (lines.Length == 2) {
return new Address(lines[0], lines[1]);
}
else if (lines.Length == 1) {
return new Address(lines[0], SqlString.Null);
}
else {
return Address.Null;
}
}
//******************************************************
// IBinarySerialize interface
//******************************************************
public void Read(BinaryReader r)
{
m_fNotNull = r.ReadBoolean();
if (!r.ReadBoolean()) {
m_firstline = r.ReadString();
}
else {
m_firstline = SqlString.Null;
}
if (!r.ReadBoolean()) {
m_secondline = r.ReadString();
}
else {
m_secondline = SqlString.Null;
}
// get the XML value
if (!r.ReadBoolean()) {
string xml = r.ReadString();
m_XmlFullAddress = SqlXml_Parse(xml);
}
else {
m_XmlFullAddress = SQLXML_NULL;
}
}
public void Write(BinaryWriter w)
{
w.Write(m_fNotNull);
w.Write(m_firstline.IsNull);
if (!m_firstline.IsNull) {
w.Write(m_firstline.ToString());
}
w.Write(m_secondline.IsNull);
if (!m_secondline.IsNull) {
w.Write(m_secondline.ToString());
}
// write the xml
w.Write(m_XmlFullAddress.IsNull);
if (!m_XmlFullAddress.IsNull) {
string xml = m_XmlFullAddress.Value;
w.Write(xml);
}
}
//******************************************************
// Address Specific Fields, Properties and Methods
//******************************************************
public static SqlXml SqlXml_Parse(string s)
{
// create a SqlXml instance from values[2]
MemoryStream ms = new MemoryStream();
StreamWriter sw = new StreamWriter(ms);
sw.Write(s);
sw.Flush();
return new SqlXml(ms);
}
public SqlXml SQLXML_THIS_INSTANCE
{
get
{
return this.getAddressXml();
}
}
public SqlXml getAddressXml()
{
StringBuilder sb = new StringBuilder(16384);
if (this.IsNull) {
sb.Append("<address/>");
}
else {
sb.Append("<address>");
sb.Append("<line id=\"1\">" + this.m_firstline + "</line>");
if (!this.m_secondline.IsNull) {
sb.Append("<line id=\"2\">" + this.m_secondline + "</line>");
}
sb.Append("</address>");
}
return SqlXml_Parse(sb.ToString());
}
public SqlXml SQLXML_NULL
{
get
{
return SqlXml_Parse("");
}
}
public SqlString Value
{
get
{
if (m_fNotNull) {
if (m_secondline.IsNull) {
return m_firstline;
}
else {
return new SqlString(m_firstline.Value + "|" + m_secondline.Value);
}
}
else {
return SqlString.Null;
}
}
}
}
}
Format.UserDefined is a powerful way to persist custom and/or complex data structures using Yukon UDTs. For effective use of this method of serialization however, we must be very careful with our serialization and deserialization code.