The story so far:
At the end of the last instalment, we'd created a simple user-defined function in C# and used the CREATE ASSEMBLY and CREATE FUNCTION statements to catalogue the assembly and function. But so far we've really only scratched the surface of the capabilities of Yukon managed code. To reach a closer functional parity with Transact-SQL, we need to be able to access existing data in the database from managed code. In this instalment we therefore look at the new in-proc .NET Data Provider for SQL Server.
Since the inception of the .NET Framework, the recommended strategy for accessing relational data has been ADO.NET. To recap quickly, the classes pertaining to ADO.NET reside largely within the System.Data namespace, and provide a mechanism for connecting to databases and working with data either using a disconnected DataSet object or using a fire-hose forward-only cursor through a DataReader instance. To provide greater database neutrality, ADO.NET implements a provider model in which connected objects such as SqlConnection and OracleConnection implement a common interface IDbConnection. Most of the methods and properties you're likely to call are exposed via these common interfaces. (You'll find a far more exhaustive treatment of the subject in this book, which I highly recommend.)
Much of what you probably already know about ADO.NET also applies when writing managed code for Yukon. You'll find yourself working with common classes such as SqlConnection, SqlDataReader, and SqlTransaction, but there are a few differences that reflect the fact that the code now executes inside the database engine. Let's cover some of the main points of significance:
Let's have a look at a very simple stored procedure written in C#:
using System.Data.SqlServer; public class PubsSprocs { public static void GetBookInfo() { using (SqlCommand cmd = SqlContext.GetCommand()) { cmd.CommandText = "SELECT * FROM authors"; SqlPipe pipe = SqlContext.GetPipe(); pipe.Send(cmd.ExecuteReader()); } } }
To catalogue this against the pubs database, you'd use something like:
CREATE PROCEDURE dbo.usp_get_author_info AS EXTERNAL NAME YukonCLR:[PubsSprocs]::GetBookInfo GO
Calling EXEC usp_get_author_info would of course give you the same results as executing the SELECT statement directly. This procedure isn't exactly a great candidate for managed code, of course, but it demonstrates some of the key concepts I've mentioned above. Notice the use of the SqlContext class to create the SqlCommand and SqlPipe instances; in fact, SqlPipe has a private constructor to prevent instantiation in any other way. You can also see how the pipe is used to output a DataReader object without deserialising it explicitly.
Using a Command object isn't actually the most scalable way to work with data if you're going to be calling the same command multiple times or from multiple users. The preferred route is to separate out the command definition itself from any run-time parameters that will be given to that command. To do this, you create a SqlDefinition object which contains a parameterised statement, and then instantiate a SqlExecutionContext object from the connection which supplies the parameter values and executes the statement. Here's a cut down code fragment which shows this in practice:
using System.Data; using System.Data.Sql; using System.Data.SqlServer; using System.Data.SqlTypes; ... public class YukonManagedClass { static readonly SqlDefinition insertStmt; ... static YukonManagedClass() { // Create the definition SqlCommand cmd = new SqlCommand(); SqlParameter param = new SqlParameter(); // Parameterised statement goes here... cmd.CommandText = "INSERT INTO MyTable " + " VALUES (@Param1)"; param = cmd.Parameters.Add("@Param1", SqlDbType.NVarChar, 50); // Additional parameters here... insertStmt = new SqlDefinition(cmd); } public static void MyStoredProc() { SqlConnection conn = SqlContext.GetConnection(); using (SqlExecutionContext ex = conn.CreateExecutionContext(insertStmt)) { ex.SetString(0, "some value"); ex.ExecuteNonQuery(); } } }
This fragment comes from a stored procedure I wrote to create some dummy data for a sample I put together. You can see the full stored procedure here if you're interested, along with some sample SQL to catalogue the procedure and test it.
I was going to write about table-valued functions and Whidbey / Yukon integration at this point, but I've simply run out of time. I'll have to save those for Part 4, which will probably appear after the Christmas holidays. In the meantime, have a meaningful and relaxing Christmas and a happy and prosperous New Year!
In the last instalment, we discussed the support in Yukon for writing stored procedures, user-defined functions and triggers using managed code.
At this point, you might be wondering whether these new capabilities are the death knell for writing these objects using Transact-SQL. By no means! Both managed code and T-SQL will continue to play an important part in database applications; managed code support is intended to offer additional flexibility rather than replace what already exists. The received wisdom thus far is that managed code should be used for computationally-intensive activities or tasks that can take advantage of the base class libraries in the .NET Framework, whereas T-SQL should be maintained for set-based select and update operations.
But rather than being some complex design decision that will involve whole armies of architects debating the merits of each technique until the early hours, the choice in practice is reasonably intuitive. There are certain stored procedures I've written in the past which felt like they took T-SQL some place that it wasn't intended for: most of them have involved loops or similar iterative constructs. In the future, I'll be writing these same tasks in C# and I can imagine them taking half the code and a quarter of the effort. Similarly, if all you're doing in a stored procedure is wrapping a complex multi-table joined SELECT statement, you'll wind up writing twice the code if you build it in managed code, because you'll still need the same statement but you'll have to wrap it up with objects such as SqlContext and SqlCommand that will just add overhead and make debugging and deployment harder. (We'll talk about the concept of SqlContext next time.)
That's enough general discussion. Let's see how this works out in practice. I've written a very simple pair of functions that use the Triple DES cryptography class in the Framework to encrypt and decrypt strings. As you look at the code below, note that there's nothing specific to Yukon.
using System.IO; using System.Security.Cryptography; using System.Text; public class Encrypt { // This is obviously not the right place to store // cryptographic secrets as these values could be // easily retrieved. In a real-world scenario, you'd // use a Cryptographic Service Provider (CSP) or // similar to generate these values. It does however // make this demo easier to distribute! // 192-bit key and 64-bit initialization vector private static readonly byte[] ENCRYPTION_KEY = new byte[] { 0x00, 0x3A, 0x40, 0x30, 0x00, 0x3A, 0x40, 0x30, 0xFA, 0xE0, 0x01, 0x39, 0x8A, 0x21, 0x17, 0x97, 0x14, 0x85, 0xED, 0xCA, 0xFF, 0x44, 0x48, 0x9F }; private static readonly byte[] ENCRYPTION_IV = new byte[] { 0x10, 0xA9, 0x44, 0x24, 0x96, 0xD4, 0xFA, 0xFE }; public static byte[] EncryptData(string input) { byte[] byteInput = new UnicodeEncoding().GetBytes(input); // Instantiate a DES instance with random key TripleDESCryptoServiceProvider desServiceProvider = new TripleDESCryptoServiceProvider(); desServiceProvider.KeySize = 192; // Create DES Encryptor from this instance ICryptoTransform desEncryptor = desServiceProvider.CreateEncryptor( ENCRYPTION_KEY, ENCRYPTION_IV); // Create stream that transforms input using // DES encryption MemoryStream ms = new MemoryStream(); CryptoStream encryptStream = new CryptoStream(ms, desEncryptor, CryptoStreamMode.Write); // Write out DES-encrypted file encryptStream.Write(byteInput, 0, byteInput.Length); encryptStream.FlushFinalBlock(); // Now write out MemoryStream to output buffer return ms.ToArray(); } public static string DecryptData(byte[] input) { // Instantiate a DES instance TripleDESCryptoServiceProvider desServiceProvider = new TripleDESCryptoServiceProvider(); // Create DES Decryptor from our des instance ICryptoTransform desDecryptor = desServiceProvider.CreateDecryptor( ENCRYPTION_KEY, ENCRYPTION_IV); // Create stream that transforms input using // DES encryption MemoryStream msDecrypt = new MemoryStream(input); CryptoStream csDecrypt = new CryptoStream( msDecrypt, desDecryptor, CryptoStreamMode.Read); byte[] byteOutput = new byte[input.Length]; csDecrypt.Read(byteOutput, 0, byteOutput.Length); string stringOutput = new UnicodeEncoding().GetString(byteOutput); return stringOutput.TrimEnd('\0'); } };
So what do you need to do to use these functions from Yukon? The first thing of course is to compile this class into a .NET assembly: use a command like:
csc /t:library funcEncrypt.cs
Take a note of the location of the compiled DLL; you'll need it later. To use it in Yukon, there are two steps: cataloguing the assembly itself and cataloguing any entry points that you'll use. Here's the Transact-SQL commands that perform these steps:
CREATE ASSEMBLY YukonCLR FROM 'c:\code\yukon\yukonclr.dll' WITH PERMISSION_SET = EXTERNAL_ACCESS GO CREATE FUNCTION dbo.fn_encrypt_data (@input nvarchar(4000)) RETURNS varbinary(8000) AS EXTERNAL NAME YukonCLR:Encrypt::EncryptData GO CREATE FUNCTION dbo.fn_decrypt_data (@input varbinary(8000)) RETURNS nvarchar(4000) AS EXTERNAL NAME YukonCLR:Encrypt::DecryptData GO
Note the data types used - for the moment, at least, you're limited to 8000 bytes for input and output: the new data types of nvarchar(max) and varbinary(max) aren't supported by the engine in Beta 1 for interacting with managed code. Also notice the assembly:class::method syntax for describing where each function in the assembly is stored. Lastly, be aware that you can give the functions an alias in the SQL environment rather than their name as defined in managed code, but the assembly itself must be named identically to the actual assembly name.
For each CREATE statement, there is an analogue for ALTER and DROP; here's the standard way to drop the assembly and associated objects (including a check for their prior existence):
IF EXISTS (SELECT * FROM sys.assembly_modules WHERE name=N'fn_encrypt_data' AND [type]=N'FS') DROP FUNCTION dbo.fn_encrypt_data IF EXISTS (SELECT * FROM sys.assembly_modules WHERE name=N'fn_decrypt_data' AND [type]=N'FS') DROP FUNCTION dbo.fn_decrypt_data IF EXISTS (SELECT * FROM sys.assemblies WHERE name = N'YukonCLR') DROP ASSEMBLY YukonCLR GO
To test that these functions work successfully, try executing the following:
SELECT dbo.fn_encrypt_data('Yukon demo')
As a further test, let's use both functions together to prove the roundtrip works:
SELECT dbo.fn_decrypt_data( dbo.fn_encrypt_data('Hello world!'))
So far, we've only seen how you build user-defined functions. Actually, stored procedures work in exactly the same way, except that you use CREATE PROCEDURE rather than CREATE FUNCTION.
If you examined the DROP statements above closely, you'll have seen mention of a few system views. You can investigate much of the underlying storage of managed code in Yukon by browsing several system views. Firstly,
SELECT * FROM sys.assemblies
This shows the .NET assemblies that are catalogued in a particular database, with its full strong name and some other metadata (if you want to use the same assembly from two different databases, you'll have to catalogue it twice). Now try the following statement:
SELECT * FROM sys.assembly_files
You'll see here the assembly ID, along with the actual binary code of the assembly itself. Lastly, the following view shows the functions that are catalogued in the database:
SELECT * FROM sys.assembly_modules
In the next instalment, we'll look at some of the help that Visual Studio "Whidbey" gives you when building managed code for Yukon to help you with deployment and debugging. We'll also look at some more complex examples of stored procedures and functions that access data using the SqlContext object. In the meantime, feel free to add a comment if you've got a question, you don't think I've explained something very well or you've spotted a mistake.
One of my Paris-based colleagues, Nigel Watling, regularly posts insightful nuggets of information to an internal distribution list he's responsible for. I suggested he start a blog back at TechEd last year, since he's got enough material to make it really interesting. In the meantime, he pointed out a really useful sample that should be mandatory for any professional Windows Forms application, and I've shamelessly plagiarised his supporting text as it's very useful...
He points out that when handling exceptions in WinForms, there will inevitably be some (unknown) exceptions that you do not handle in your error handling code for the form. This is guaranteed due to the non-linear nature of WinForms. To handle these you should register your own handler with the Windows Forms API using the Application.ThreadException event. Doing this suppresses the informational (but less professional) debug dialog that comes up by default.
This only works, however, for the thread that is pumping GUI messages, and so you will still have to rely on the CLR unhandled exception handler for all other exceptions (finalizer thread, thread-pool thread, manual thread, etc.). You can register for this handler using the AppDomain.UnhandledException event.
Full details and a code sample are available here on GotDotNet.
Great tip - thanks Nigel.
I've had something of an epiphany about Windows Management Instrumentation today. I always used to think that WMI was one of those dreary but worthy things that only systems administrators used to be interested in. A big improvement on SNMP for remotely configuring a file server, sure, but of limited interest to developers?
I've just discovered two things that have finally caused my dull head to see the light. Firstly, I came across the TechNet script centre: a fantastic pot pourri of ready-rolled VBScript fragments that enable you to do everything from creating file shares to finding out what software is installed on a particular machine. This resource is a boon to systems administrators, but it also demonstrates what WMI is capable of in the right hands. Unfortunately there's no language filter that provides you with a C# alternative.
But then I made my second discovery - in this case a re-discovery of something I last came across nearly two years ago. On the .NET WMI page, I found an add-in for the Visual Studio Server Explorer that exposes almost every WMI object on a local or remote system as nicely wrapped .NET classes.
Here are a couple of random examples of the kind of thing you can do with these classes:
computerSystem.JoinDomainOrWorkgroup(); string manufacturer = computerSystem.Manufacturer; int numProcs = computerSystem.NumberOfProcessors; int clockSpeed = processor1.MaxClockSpeed();
I'm always being asked how to access some or other hardware or software setting from .NET managed classes. Usually the answer is some really grungy interop into Win32, but now I've got an alternative story.
Am I the only one to have missed out on all this?
I thought it might be of interest to some to put a few notes up from the talk I gave on Yukon development last week. If nothing else, it's helpful to me to have as a reminder for the next time I give this talk! I'll split this into several parts for ease of access, and post them over the course of the week.
In the current release of SQL Server, there are effectively three ways to embed custom logic into your database:
Traditionally developers have opted for a mixture of options 1 and 3. Option 1 is perfect from the point of view of a database purist - everything that impacts the integrity of the database can be kept there; nothing can bypass the code, as it's embedded in the database. However, whilst T-SQL is great for set-based data manipulation, it doesn't have the same degree of structure and elegance as most component-orientated languages, offers limited support for string handling and many other constructs, and doesn't perform awfully well when you're not dealing directly with the database.
Option 3 allows you to build access logic that abstracts data from a relational form into more business-centric idioms such as customer, purchase order etc., as well as offloading this work from the database server to other application servers. But because the data tier is separate from the database itself, it's not always easy to integrate any semantic validation from here into other data services such as replication or reporting.
Most people steer away from option 2 (extended stored procedures), because they're quite fragile due to their in-process nature - a memory leak in one of these can bring the whole database server down, for instance. They're also rather awkward to write. The guarded wording used in this security note is certainly enough to scare me off recommending them, at any rate.
The big change in SQL Server "Yukon" is that the Common Language Runtime (CLR) is hosted directly in the database engine, allowing .NET managed code to be used for stored procedures, functions or triggers. On the surface, this is the best of all worlds - your code resides in the database, but can take advantage of the full richness of the .NET Framework Base Class Libraries. It performs better than T-SQL for computationally-intensive tasks, can be secured using both the SQL security architecture and .NET Code Access Security, and allows you to use any .NET language to develop everything from the front-end user interface to the back-end database logic itself.
Here's a few examples of how you might use this capability to build a richer database:
On top of all this, you can also create custom user-defined data types and aggregates using the .NET integration; for example, you could have a custom type that represents a financial instrument and then use a custom aggregation to ensure that it was rolled up appropriately for reports.
One thing that struck me in particular about the CLR integration was how Yukon took advantage of the extended CLR hosting APIs in Whidbey. Traditionally the CLR takes care of memory management, garbage collection and thread support itself, but when it runs in Yukon these services are delegated to the database engine. The engine has a much better understanding of the current system load as a whole and can therefore manage memory and threads appropriately for the entire execution environment. This results in a more robust and scalable solution than if these services were left within the CLR itself.
In part two, I'll talk more about how you take advantage of these extensibility points using Visual Studio "Whidbey".