(The title of this post might seem a little inflammatory, but it’s not just a troll -- I really do think that people should seriously question whether it’s wise to use T-SQL User Defined Data Types. Mostly, though, I just like that Edwin Starr song.)
Let me start this post off by saying that I understand why user-defined data types (UDTs) in T-SQL seem alluring. At first glance, you might expect UDTs to give you some very nice benefits, such as:
Expected Benefit #1: Suppose you have a field that stores percentages values (0-100). In T-SQL you might choose the tinyint data type for such a field, but this isn’t the only data type that you could have chosen to store percentage values. If there is more than one table that contains a percentage-type field, you could end up with the same sort of value being stored using different data types. By creating a UDT called PERCENT, you can avoid the need for database developers to memorize “Percentages should be stored using tinyint.”
Reality: This is based on the idea that it’s easier for a developer to memorize “Percentages are stored using a UDT named PERCENT” than it is to memorize “Percentages are stored using system type tinyint.” The problem is that you have to know the underlying data type in order to deal with the column correctly. For example, a database developer that was retrieving a PERCENT (UDT) field in a VB.NET or C# app would need to know what .Net data type should be used to retrieve values from the field. A thoughtful dev would wonder: “Maybe I should store this PERCENT value in a byte variable, which can hold values from 0 to 255. But wait: Can a PERCENT column hold a negative percentage? If so, I’ll need to use the short data type, which can represent negative integers. Hold on a second! I wonder if this PERCENT data type can store fractional percentages? If so, I’d need to retrieve the column value into a decimal variable...”
In other words, if you don’t use a UDT, the application developer has to memorize “Percentages are stored using system type tinyint.” But if you use a T-SQL UDT the application developer has to know both “Percentages are stored using a UDT named PERCENT” and “Percentages are stored using system type tinyint.” The attempt to make the app developer’s life easier can backfire and actually made it a little harder. (It’s worth mentioning that SQLCLR UDTs don’t have this same limitation. App developers can reference an assembly that gives them a client-side data type that corresponds to the server-side SQLCLR UDT, so the developer doesn’t have to know the details of the UDT’s implementation in order to use it.)
Expected Benefit #2: A tinyint can store 0-255, but your percentage fields should only allow 0-100. You can restrict a column’s domain by defining a CHECK constraint on the column. But if there are several percentage type columns in several different tables, you’d need to define identical CHECK constraints on each column. Wouldn’t it be nice if you could attach the constraint to the PERCENT user-defined data type, so that it would apply automatically to each column that used this UDT?
Reality: This would be a real benefit of T-SQL UDTs. Unfortunately, the ability to attach a rule to a UDT has been officially deprecated, so you shouldn’t use this functionality. Since you can no longer attach custom domain constraints or other business rules to the data type, a T-SQL UDT is really nothing more than an alias for a system data type. Sorry!
Expected Benefit #3: Suppose that your requirements have changed, and now you need to be able to store fractional percentages. You decide to store percentages as decimal(3,2) instead of tinyint. If all percentage columns were defined as tinyint, you’d have to go around to all of the columns and modify each column’s data type separately. And to change a column’s data type you must first drop any indexes or constraints that reference the column. What a pain. Wouldn’t it be easier if you could just change the PERCENT UDT definition in one place?
Reality: T-SQL doesn’t have an ALTER TYPE command for UDTs. To change a UDT definition you must do a DROP TYPE followed by CREATE TYPE. But of course you can’t drop a type that is being used in a column, so you have to do an awkward little dance to switch all of the columns that use the UDT to use the underlying system data type, instead (so they no longer reference the UDT), then drop and redefine the UDT, then ALTER the columns a second time to re-reference the new UDT.
ALTER TABLE tbl1 ALTER COLUMN foo_pct TINYINT;
ALTER TABLE tbl2 ALTER COLUMN foo_pct TINYINT;
ALTER TABLE tbl3 ALTER COLUMN foo_pct TINYINT;
ALTER TABLE tbl4 ALTER COLUMN foo_pct TINYINT;
DROP TYPE PERCENT;
CREATE TYPE PERCENT FROM DECIMAL(3,2);
ALTER TABLE tbl1 ALTER COLUMN foo_pct PERCENT;
ALTER TABLE tbl2 ALTER COLUMN foo_pct PERCENT;
ALTER TABLE tbl3 ALTER COLUMN foo_pct PERCENT;
ALTER TABLE tbl4 ALTER COLUMN foo_pct PERCENT;
And if any indexes or constraints reference any of these columns, you’d still have to drop those before the first ALTER COLUMN, and recreate them afterward. If you have to write DDL that redefines the columns using the base data type that the UDT is supposed to hide, where’s the abstraction benefit? More importantly, if you didn’t use UDTs this same change can be done with less than half the code:
ALTER TABLE tbl1 ALTER COLUMN foo_pct DECIMAL(3,2);
ALTER TABLE tbl2 ALTER COLUMN foo_pct DECIMAL(3,2);
ALTER TABLE tbl3 ALTER COLUMN foo_pct DECIMAL(3,2);
ALTER TABLE tbl4 ALTER COLUMN foo_pct DECIMAL(3,2);
So there you have the reasons why I avoid using T-SQL user-defined types in my own code. It’s not that UDTs are all that bad; it’s just that in my experience they fall short of their promise, and they can make some things even more cumbersome than the same task would be without the use of UDTs. If you think I’m overlooking some concrete benefit of UDTs, I’d love to hear from you in the comments section. If you do choose to use UDTs, just be sure you aren’t basing the decision to use them on inaccurate assumptions about the benefits they’ll provide.
An important caveat: The above applies to simple T-SQL UDTs. You can also create a user-defined type in SQL Server using SQLCLR and .Net. While a T-SQL UDT is never anything more than a simple alias for one of the system data types, a SQLCLR UDT can be a truly custom data type. With SQLCLR you can implement complex types, embed custom business logic like domain constraints or validation rules into the type, and so on. In other words, with SQLCLR UDTs you can actually realize the benefit that I called “Expected Benefit #2”, above. As mentioned earlier, you also get much of Expected Benefit #1. SQLCLR UDTs still have the limitations described in #3, but the ability to extend SQL Server in the way that SQLCLR allows is very powerful. That benefit could easily compensate for the issue discussed in #3, if the task you are trying to accomplish would be much harder to implement using one of the system data types.
I disagree with every single point except #3. I'm primarily a .NET developer and not a database guy but I wish that .NET had a similar construct. It would be nice for me to be able to create a "derived" (but not really in the .NET sense) type from string that was just a string with some additional constraints. In a lot of ways this is what "M" language appeared to offer. Point #1 is very beneficial for me in the telecom industry where we have lots of codes, ID's, numbers, etc that may or may not have constraints but do have a semantic meaning. It's like saying why have a (strong) HTML tag when you can do the same thing with (span style="font-weight: bold") tag. It's all about semantics. Point #2 is still valid regardless of whether or not the ability is deprecated because if nothing else, it serves as self-documentation. Point #3, can't argue with that... changing the definition of a duck is never easy when you've got a million ducks already comfortable with who they are.
Love that song (I even had my own take a little while back: "Char: What is it good for?" http://michaeljswart.com/?p=63
@Bart, What do you think of a UDT called UTCDateTime which is just a datetime column that indicates to everyone that the value stored there is at UTC?
@Josh, Re: "a derived string type with a couple other constraints." What's wrong with a subclass? Isn't that Object-Oriented programming 101?
>> Point #1 is very beneficial for me in the telecom industry where we have lots of codes, ID's, numbers, etc <<
Don't you need to know the underlying system data type of each UDT in order to handle the types correctly in the client app? If so, what's the benefit to you as a .Net developer? If you could reference the type using the name of the type alias in your app as well as in T-SQL, then I'd see a clear benefit. But since T-SQL UDTs can only be referenced in SQL, the application developer has to know what system type the UDT is an alias for, anyway (to map it to a .Net system datatype that won't lead to data loss). That means that UDTs don't really allow you to hide the underlying system type. If you disagree let me know, but I just don't see the abstraction benefit.
>> Point #2 is still valid regardless of whether or not the ability is deprecated <<
Hmm... You're OK with increasing your dependence on a feature that is likely to be removed in a version or two? Rules bound to UDTs may provide some conveniences in the short term, but it's likely that you or one of your successors will eventually need to overhaul your type system in order to upgrade and take advantage of new engine features. Using rules bound to UDTs today creates debt that will need to be repaid in the future. Whoever inherits your application won't thank you for doing this.
@Michael, that example (UTCDataTime) is great. I think most devs would understand from the UDT name alone that it's really just a datetime (allowing them to correctly guess how to map the column to a .Net data type), while the alias would help prevent people from falling victim to the time zone ambiguity that's inherent to datetime. FWIW, for this particular situation I think a better solution would be to use datetimeoffset instead of datetime (see blogs.msdn.com/.../the-death-of-datetime.aspx). But I bet there are other cases that are similar to this one, where you can choose an alias/UDT name that doesn't obscure the base system data type but that also helps clarify something about the type of data that should be in the column. Thanks for the example! It sounds like a case where UDTs could provide a real benefit.
(Re: "wrong with a subclass?", the string type in .Net is sealed, so you can't derive a subclass from it.)
Bart...You are spot on with this i believe. And heres another issue.... Database using UDTs and they decide for consistancy we'll create UTDs in model also so that they can create temp tables with udt data types. Receipe for disaster? Yep u bet. changing a udt in model but not in the user db and hey presto... lots and lots of truncation errors, etc because of bad implicit type conversions.