Welcome to MSDN Blogs Sign in | Join | Help

Dynamic Assembly Loading II: SGen and System Defined Classes

Today we’ll look at an example where sgen did not solve the problem with a dynamically loaded XmlSerializers assembly.   This issue was found by a customer and filed as a bug on the Microsoft Connect site for SQL Server: http://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=209210

 

In brief, the code was trying to write out an XmlSchema object to a MemoryStream using XmlTextWriter.  Under the covers, however, XmlSchema.Write calls the following:

 

XmlSerializer serializer = new XmlSerializer(typeof(XmlSchema));

 

This looks just like the typical Xml Serialization use case which sgen was able to solve; however, in this case the clr is looking for an XmlSerializer for the framework defined XmlSchema class, rather than a class that the customer created.  Running sgen on the user’s assembly does not create an XmlSerializer for XmlSchema because it is defined in System.Xml.Dll.

 

At first, it seems the workaround to this would be simply to sgen System.Xml.dll for the XmlSchema type and load the resultant XmlSerializers assembly into SQL Server.  However, if you try this, you’ll note that running sgen on System.Xml.dll won’t work because System.Xml.dll is signed with Microsoft’s private key, which you are unlikely to have.

 

C:\Windows\Microsoft.NET\Framework\v2.0.50727>sgen /t:System.Xml.Schema.XmlSchema System.Xml.Dll

Microsoft (R) Xml Serialization support utility

[Microsoft (R) .NET Framework, Version 2.0.50727.42]

Copyright (C) Microsoft Corporation. All rights reserved.

Error: Generated serialization assembly is not signed: System.Xml.XmlSerializers, Version=2.0.0.0, Culture=neutral, PublicKeyToken=null. Please specify strong name key file via /compiler:/keyfile:<file> switch.

 

So what are our other options?  One possibility is to incorporate the generated XmlSchema.XmlSerializer dll into the user’s assembly and reference it directly.   

After all, what the code is really doing is creating a specialized Serializer for the XmlSchema class.  If the XmlSerializer already exists, then we can use it directly and bypass the magic dynamic assembly generation step altogether.  Although sgen returned an error because the Serializer assembly isn’t signed and can’t be used automatically, we can instead use the XmlSerializer it generated manually by renaming the file and adding a reference to it in the project.

 

Now all we have to do is change the code to explicitly use our specified XmlSchemaSerializer.  In this case, it’s done by replacing the code

 

 GeneratedSchema.Write(Writer)

 

With

 

Dim GeneratedSchemaSerializer As XmlSchemaSerializer
GeneratedSchemaSerializer = New XmlSchemaSerializer()
GeneratedSchemaSerializer.Serialize(Writer, GeneratedSchema)

 

The code now works.  Although the workaround is a little messy, it accomplishes the goal of serializing a system defined class without requiring a dynamic assembly to be loaded.

Dynamic Assembly Loading

Simple question: what security permission level is required to allow dynamic loading of assemblies? Trick question: dynamic assembly loading is always disallowed under SQL CLR, even under UNSAFE. 

 

This is mentioned twice in BOL:

CLR Integration Programming Model Restrictions :

“Loading an assembly—either explicitly by calling the System.Reflection.Assembly.Load() method from a byte array, or implicitly through the use of Reflection.Emit namespace—is not permitted.”

 

CLR Hosted Environment :

“Note that the ability to generate managed code dynamically, using the Reflection.Emit API, is not supported inside the CLR-hosted environment in SQL Server. Such code would not have the CAS permissions to run and, hence, would fail at run-time.”

 

Despite the BOL warnings, it’s easy to run into this restriction as the code doing the dynamic loading might be in a dependent assembly or even in the clr itself.   The problem is also difficult to debug as many disparate types of applications might fail in completely different ways for this same underlying reason.  Over the next couple of posts, I’ll go through some of the more common cases that I’m aware of and how to solve them.

 

First, the most common case and the easiest one to fix is covered by Vineet in this blog post last year, is described in BOL: XML Serialization from CLR Database Objects, and even has its own KB article. 

 

When using Xml Serialization, you need to pre-generate the XmlSerializers assemblies to perform your serialization.  Visual Studio will do this for you if you check the “Generate Serialization Assembly” on the Project Properties/Build pane, or you can use the sgen tool from the CLR SDK.  Either way, you then need to register the XmlSerializers assembly in your database:

 

CREATE ASSEMBLY [MyAssembly.XmlSerializers] FROM ‘<path>\MyAssembly.XmlSerializers.dll’

 

If you do not do this, then you’ll see the following error when you try to use code that needs to serialize your class to xml:

Msg 6522, Level 16, State 2, Line 1
A .NET Framework error occurred during execution of user defined routine or aggregate 'MyClass':
System.InvalidOperationException: Cannot load dynamically generated serialization assembly. In some hosting environments assembly load functionality is restricted, consider using pre-generated serializer. Please see inner exception for more information. ---> System.IO.FileLoadException: LoadFrom(), LoadFile(), Load(byte[]) and LoadModule() have been disabled by the host.
System.IO.FileLoadException:
at System.Reflection.Assembly.nLoadImage(Byte[] rawAssembly, Byte[] rawSymbolStore, Evidence evidence, StackCrawlMark& stackMark, Boolean fIntrospection)
at System.Reflection.Assembly.Load(Byte[] rawAssembly, Byte[] rawSymbolStore, Evidence securityEvidence)

 

Most of the time, this error is solved by performing the sgen steps listed above. Unfortunately, there are a small number of cases where this error can occur using XmlSerialization even if you correctly sgened your assembly.  I’ll cover some of those cases and how to fix them in future posts.

 

-- Steven Hemingray

Deadlock Detection in SQL CLR

Usually, talking about the differences between running code under the CLR vs. running under SQL CLR focuses on functionality that either doesn’t work or is difficult to use in a safe and reliable manner.  However, one feature that SQL Server actually adds to the CLR environment is deadlock detection.

   Joe Duffy mentions this in his article, No More Hangs, about advanced CLR techniques to detect and resolve deadlocks.  One of his methods is to use the CLR Hosting interfaces to write a custom CLR Host to handle all the locking primitives so he can analyze them to check if deadlock has occurred.  This is essentially the same method that SQL Server uses to detect deadlocks, except rather than using a separate deadlock detection algorithm for lock requests coming from the CLR, we translate them to the standard SQL locks provided by SOS. 

   If you compile the following program as an executable and run it, not a whole lot happens.  The program deadlocks as expected and leaves you staring at the blinking cursor, wondering what to do.  However, if you create Method1 and Method2 as SQL Stored Procedures and run them at the same time from separate connections, you’ll see that SQL Server automatically detects the deadlock and kills one of them for you.

 

public class DeadlockSample

{

    public static readonly object a = new object();

    public static readonly object b = new object();

 

    [SqlProcedure]

    public static void Method1()

    {

        lock(a)

        {

            Thread.Sleep(2000);

            lock (b) { SqlContext.Pipe.Send("This means Method2 was killed!"); }

        }

    }

 

    [SqlProcedure]

    public static void Method2()

    {

        lock(b)

        {

            Thread.Sleep(2000);

            lock (a) { SqlContext.Pipe.Send("This means Method1 was killed!"); }

        }

    }

 

    public static void Main()

    {

        Thread thread1 = new Thread(new ThreadStart(Method1));

        Thread thread2 = new Thread(new ThreadStart(Method2));

        thread1.Start();

        thread2.Start();

        thread1.Join();

        thread2.Join();

    }

}

  

Having SOS handle all locking is especially useful as it allows for deadlock detection to work even for the case of inproc data access where a CLR lock might be deadlocked against a SQL lock.  In the following example, 2 methods both want to take a CLR lock and update a column in a SQL table, but requesting them in a different order leads to deadlock.

 

create table table1(c int)

insert into table1 values(1)

 

    [SqlProcedure]

    public static void LockAndUpdate()

    {

        using (SqlConnection conn = new SqlConnection("Context Connection=true"))

        {

            conn.Open();

            SqlCommand cmd = new SqlCommand("update table1 set c = 2", conn);

            cmd.Transaction = conn.BeginTransaction();

            lock (a)

            {

                Thread.Sleep(2000);

                cmd.ExecuteNonQuery();

                SqlContext.Pipe.Send("This means UpdateAndLock was killed!");

            }

            cmd.Transaction.Commit();

        }

    }

 

    [SqlProcedure]

    public static void UpdateAndLock()

    {

        using (SqlConnection conn = new SqlConnection("Context Connection=true"))

        {

            conn.Open();

            SqlCommand cmd = new SqlCommand("update table1 set c = 2", conn);

            cmd.Transaction = conn.BeginTransaction();

            cmd.ExecuteNonQuery();

            Thread.Sleep(2000);

            lock (a)

            {  SqlContext.Pipe.Send("This means LockAndUpdate was killed!");  }

            cmd.Transaction.Commit();

        }

    }

 

One important aspect to keep in mind when dealing with deadlock detection in SQL CLR is that SQL does not explicitly kill your thread with a ThreadAbortException but merely throws a regular exception so that you can catch it and deal with the problem if you are prepared to handle it.  This also means, however, that poor programming practices, such as catching all exceptions, might cause you to dismiss the exception without handling it properly.  If you catch the exception and retry without releasing the deadlocked resources then it is likely that you'll only deadlock again.

 

Here is the section from BOL, Detecting and Ending Deadlocks:

"When working with CLR, the deadlock monitor automatically detects deadlock for synchronization resources (monitors, reader/writer lock and thread join) accessed inside managed procedures. However, the deadlock is resolved by throwing an exception in the procedure that was selected to be the deadlock victim. It is important to understand that the exception does not automatically release resources currently owned by the victim; the resources must be explicitly released. Consistent with exception behavior, the exception used to identify a deadlock victim can be caught and dismissed."

 

-- Steven Hemingray

SQL Server 2005 Debugging Requirements

There have been some questions about the sysadmin requirement of SQL Server 2005 Debugging, and I’d like to explain it in some details.  When you debug T-SQL or CLR code in SQL Server 2005, there are two users involved: user running the debugger and user making the connection that is being debugged.  User running the debugger (Visual Studio 2005) has to be in the sysadmin fixed role, and there’s no requirement on the user making the connection.  Also because Visual Studio communicates with SQL Server through debugging interfaces in DCOM, the debugger user has to use Windows Authentication rather than SQL Server Authentication.

 

For CLR code debugging, sysadmin is required because CLR debugger user has total access to the memory of SQL Server process, and we don’t want anyone other than sysadmin to have it.

 

For T-SQL debugging some alternatives have been considered.  One alternative is to allow anyone to debug T-SQL procedure/function that s/he has certain permission (e.g. alter permission or ownership).  This would be much more convenient for developers, but it has some security complications, especially in cases like procedure with EXECUTE AS and signed procedures.  Also filtering of T-SQL stack frames based on permissions make implementation more complex.  We gave up on this for SQL Server 2005, and will reconsider it for future versions.  Another alternative is to make execute permission on sp_enable_sql_debug grantable and allow anyone with this permission to debug T-SQL.  After security review we found that without solving security problems that prevented us from the first alternative, it’s possible for a malicious debugging user to elevate to sysadmin privilege.  Thus debugging permission is equivalent to sysadmin privilege and we chose to signify this by only allowing sysadmin to execute sp_enable_sql_debug.

 

 

Remote Debugging Monitor (msvsmon.exe) is another requirement that people often get confused on.  Remote Debugging Monitor is required for SQL-CLR debugging, whether remotely or locally (here are the steps to set up the Remote Debugging Monitor); and it is not required for T-SQL debugging, whether remotely or locally.  Here “local” means Visual Studio 2005 and SQL Server 2005 run on the same machine.  

 

For T-SQL debugging Visual Studio doesn’t actually attach to the SQL Server process.  It communicates with SQL Server through a set of debugging interfaces in DCOM, so msvsmon.exe is not required.  For SQL-CLR debugging msvsmon.exe is required even for local debugging for robustness reasons.  In this case msvsmon.exe attaches to the SQL Server process, and Visual Studio talks to msvsmon.exe through some private channels.  In this way even if Visual Studio crashes or freezes, msvsmon.exe can detect it and detach safely from SQL Server process.  If Visual Studio attaches to the SQL Server process directly, and something bad happens to Visual Studio, then SQL Server process can be terminated, which is what we try to avoid.  Msvsmon.exe is relatively small and we can make it pretty robust; whereas Visual Studio is much more complex and has open plug-in architecture, and thus is much more susceptible to problems.

 

Posted by sqlclr | 11 Comments

Exception handling in SQLCLR

Naveen covered the TSQL exception handling in his posts on sqlprogrammability blog site. How does the TSQL exception handling mechanism interacts with SQLCLR exception handling mechanism? We will cover this topic in this post.
 
When SQL server execute a user function/procedure/trigger implemented in CLR (i.e., managed code), we will install a managed exception handler around the user code. So if the user code leaked a exception, the server will catch it and throw a TSQL exception wrapping the user exception. Here is an example:
 
Msg 6522, Level 16, State 1, Procedure UDP_OpenFile, Line 0
A .NET Framework error occurred during execution of user defined routine or aggregate 'UDP_OpenFile':
System.IO.FileNotFoundException: Could not find file 'C:\nonexisiting'.
System.IO.FileNotFoundException:
   at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
   at System.IO.FileStream.Init(String path, FileMode mode, FileAccess access, Int32 rights, Boolean useRights, FileShare share, Int32 bufferSize, FileOptions options, SECURI
 

Now, what happens if your CLR function/procedure calls back into SQL server again through InProc data access and caused a TSQL exception? For example, if a stored procedure is implemented in CLR like the following:
 
public static void AddNewBook()
{
 using(SqlConnection cnn = new SqlConnection(someConnectionString))
 {
  conn.Open();
  using(SqlCommand cmd = conn.CreateCommand())
  {
   cmd.CommandText = "insert t_Books (title, author) values('title', 'author')";
   cmd.ExecuteNonQuery();
  }
 }
}

What if the INSERT statement failed due to a duplicated key violation? SQL server will translate such normal TSQL exceptions into a CLR SqlException object. When this happens, the TSQL exception is considered as been handled. The system no longer has any pending TSQL exceptions at all, instead a managed SqlException will be thrown. Your code will see a SqlException. You can catch it through your CLR exception handler. This mechanism allows to catch TSQL exceptions in your CLR function/procedure.

However, if your CLR code doesn't catch the SqlException, SQL server will see that you leaked an unhandled CLR exception. Currently, the infrastructure doesn't treat SqlException differently from any other CLR exception (e.g. FileNotFoundException), so we will give an error 6522 with the exception message and stack.
 
Simply speaking, normal TSQL exceptions doesn't pass through CLR frames on the stack. It will get translated into appropriate SqlException. This SqlException can be handled by user code. If not, SQL server will treat it the same way as any other unhandled CLR exception.
 
I used the world "normal" TSQL exceptions in the above explaination, which suggests that there are "unusal" TSQL exceptions that behaves differently. That is indeed the case. Those are server TSQL exceptions that will either stop the execution of the batch or terminate the connection. Such TSQL exceptions are server enough to warrant special treatment. They will be translated into System.Threading.ThreadAbortException. Since ThreadAbortException can't be handled (well, it can be caught, but it will be rethrow immediately after catch block unless you explicitly reset it, which will be considered as bug in user code), the CLR frame will unwind and when the control goes back to TSQL, the original server TSQL exception will be thrown.
 
If this sounds a little complicated, you are right. Exception handling is tricky. We are looking into ways to make the mechanism more intuitive and friendly. Your experience anc comments about the current mechanism will help us to understand how to make things easier for you in the future.
 
[Cross Posted from http://blogs.msdn.com/sqlprogrammability]
Posted by sqlclr | 2 Comments

Overview of T-SQL and CLR debugging in SQL Server 2005

SQL Server 2005 ships with a new debugging engine that supports debugging of any T-SQL and CLR code running in the server, including batches, stored procedures, user defined functions/aggregates/triggers, etc.  You can use Visual Studio 2005 to debug against SQL Server 2005 or SQL Server 2000, but you can not use Visual Studio 2003 or earlier to debug against SQL Server 2005 because the debugging engine is not compatible.

 

There are some improvements in debugging in SQL Server 2005:

. Its much easier to setup than debugging in SQL Server 2000.  You can enable debugging SQL Server 2005 from Visual Studio 2005 by following these simple steps (please note that Remote Debugging Monitor mentioned in the steps is not required for T-SQL debugging).

. Integration of T-SQL debugging and CLR debugging.  You can step in/out from T-SQL code to CLR code, or vise versa.  You will get mixed call stack consisting of T-SQL frames and CLR frames, and inspect T-SQL and CLR variables on any frame.

. Full functionality of CLR debugging.

. Isolation of database connections in T-SQL debugging.  When you break in T-SQL code in one connection, other connections are not suspended and can continue to run normally until they require resource locked by the connection being debugged.  In another word, the impact of debugging a connection to other connections on the server is similar to have a long-running connection.

 

The easiest way to develop and debug CLR code running in SQL Server is to use a C# or VB SQL Server project in Visual Studio.  When you deploy a SQL Server project, Visual Studio deploys the CLR assembly and its symbol file (.pdb) and source code files to the database so that the assembly is ready for execution and debugging.  

 

When you have a CLR assembly built in other ways, you can also use CREATE ASSEMBLY statement to deploy it to the database.  In this case you need to use ALTER ASSEMBLY ADD FILE to attach the symbol file (.pdb) and source code files to the assembly, and then you can use Visual Studio to debug the assembly.  If you forget to add a symbol file or source file, step-in or breakpoint in the corresponding CLR code will be skipped by Visual Studio.

 

In Visual Studio there are 3 ways that you can start debugging of T-SQL or CLR code in the database:

. Direct Database Debugging (DDD).  You can open Server Explorer and add a Data Connection to a SQL Server database.  Then you can browse to any T-SQL or CLR object in the database, such as a stored procedure or function, right click on it, and select Step Into to begin DDD.

. Debugging from a SQL Server project.  When you start debugging from a C# or VB SQL Server project (e.g. by pressing F5), the assembly will be built and deployed, and the default test script will be run in debugging mode.  Breakpoints in the default test script or in any T-SQL or CLR code called by the default test script will be hit.  You can also right click on any test script in the project and select Debug Script.

. Application debugging.  You can use Visual Studio to attach to any client application that opens a database connection to SQL Server, e.g. SQL Management Studio, and then you can debug T-SQL or CLR code executed on this connection.  When you attach to the client process, you need to make sure the debugging type includes T-SQL code, along with any other desired types such as native code and/or managed code.  There are some notable limitations in application debugging:

. You have to attach to the client process before the database connection is opened.  Any code run on the connections opened before attaching will be ignored.

. A Data Connection to the server being debugged must present in the Server Explorer.  Connections made by the attached client to databases not listed in Data Connections in Server Explorer will not be debugged.

. You can not step into a T-SQL or CLR store procedure from client code (managed or native).  Usually you need to set a breakpoint in the T-SQL or CLR code that you want to debug.  To do this browse to the desired object in Server Explorer, double click on the object to open its source code, and set a breakpoint.

 

The following are some imitations of T-SQL and CLR debugging in SQL Server 2005:

. T-SQL debugging is on the statement level.  You can not step into the execution of a select statement.

. Visual Studio 2005 is needed to debug T-SQL.  Microsoft SQL Server Management Studio doesnt support debugging.  Some third party tools that can be used to debug SQL Server 2005 may be released in the near future (or may have already been released).

. Break into CLR code in one connection freezes all connections that are running CLR code.  CLR debugging doesnt have connection isolation that is available in T-SQL debugging, because of limitations in CLR debugging architecture.

. Debugging doesnt work with parallel compiled plan execution.  When a batch or stored procedure being debugged is compiled to a parallel plan, debugging may not work.  In this case you need to disable parallel plan generation to debug it.

 

[ Cross posted from http://blogs.msdn.com/sqlprogrammability ]

Posted by sqlclr | 15 Comments

Strong named assemblies and AllowPartiallyTrustedCallers

 

Very often, you may wish to factor out your code into separate assemblies in your application. For example, you separate your type in one assembly because it gets used by multiple areas in your application. And you save your proc that uses that type in another assembly.

So Let's say you have two CLR assemblies. Assembly A contains a UDT. Assembly B has functions which use the UDT in their function signatures and/or in code inside the function. Assembly B references A.

A problem occurs when they are strong-named. When a function in B is called this error is thrown:

Msg 6522, Level 16, State 1, Line 1
A .NET Framework error occurred during execution of user defined routine or aggregate 'MyProc':
System.Security.SecurityException: That assembly does not allow partially trusted callers.
System.Security.SecurityException:
at System.Security.CodeAccessSecurityEngine.ThrowSecurityException(Assembly asm, PermissionSet granted, PermissionSet refused, RuntimeMethodHandle rmh, SecurityAction action, Object demand, IPermission permThatFailed)

What's going on ?

You tried writing your own executable with the same callers that use the UDT in A and it works! but it doesn't within the server and you are wondering whether its a bug in the server ?

Well this is actually a CLR design. MSDN says that Types in strong-named assemblies can be called by partially trusted code only when the assemblies have been marked with the AllowPartiallyTrustedCallers attribute (APTCA). It adds that this attribute essentially removes the implicit LinkDemand for the FullTrust permission set that is otherwise automatically placed on each publicly accessible method in each type. Please note the word "FullTrust" in CLR refers to unrestricted access or 'unsafe' permission set by sqlclr definition.

So,

When assembly B calls strong-named assembly A,

Either A must have AllowPartiallyTrustedCallers attribute Or B must have unsafe (FullTrust) permission set.

What does this mean ?

It means that CLR enforces the callers of a strong named assembly to be fully trusted unless that assembly has the APTCA attribute. If you add another assembly C that calls the functions in B, then B should have APTCA attribute or C should be unsafe. We recommend that if your assemblies in the database are SAFE/EXTERNAL_ACCESS (not FullTrust / partial trust / restricted), you need to mark them with APTCA to support cross assembly calls.

The reason your executable worked is, by default every assembly on you local drive gets FullTrust. If you copy your assemblies to a network share and try to run it from your local machine, you will see the same error, because default CLR policy will not give FullTrust to assemblies on remote shares.

Here's the history on this requirement for sqlclr assemblies. CLR introduced the APTCA model for providing protection for the assemblies that are shared across multiple applications. Typically such assemblies were put in the GAC by applications. Assemblies that are NOT shared are usually in the Application Path. Any application that did not have unrestricted access(FullTrust/unsafe) could only load only those assemblies that are in GAC or those found by probing in the location specified by the AppDomain's APPBASE property (ApplicationPath / Database).CLR loader guaranteed that and provided protection against restricted assemblies from loading arbitrary assemblies. In the case of SQL Server, the hosting hooks guaranteed similar control on what gets loaded in its appdomain. But there was no corresponding mechanism for protecting shared assemblies ( such as in the GAC ) from being called by arbitrary assemblies. APTCA enforced that by either requiring the caller to have unrestricted access or requiring that those shared assemblies opt to allow arbitrary caller. Using the attribute implied that the shared assembly would be audited / reviewed and then deemed to be safe for use from any calleer.

The problem with APTCA is that CLR checks for it in *every* assembly instead of restricting the check to shared assemblies. Further, CLR could have skipped checking this attribute for assemblies in the ApplicationPath / Database which can be considered as NOT shared. For example, this relaxation makes perfect sense for SAFE assemblies since have only execute permissions. They could have been considered implicitly APTCA and they could have been allowed to call each other. CLR allows the host to dictate permissions for the assemblies but does not allow the host to control APTCA behaviour.

The workaround we suggest therefore is to mark your SAFE/EXTERNAL_ACCESS assemblies with APTCA attribute in order to allow them to talk to each other. Please note that specifying this attribute in sqlclr assemblies does not require any audit/review and can be considered the norm to allow cross assembly calls. So we recommend that. We are not changing the rules here but specifying its relevance in SQLCLR. The directive for audit / review continues to remain valid for assemblies in the GAC that are shared between multiple other applications. In addition, we would not recommend granting unrestricted access to the callers of sqlclr assemblies. That will be an extreme measure.

[Posted by RaviR]

Posted by sqlclr | 1 Comments

Creating a CLR user define aggregate (part 2). Use multiple columns in the aggregation function

In part 1 we created a nice user defined aggregate. Now we are going to make it more sophisticated and let its value depend on two parameters ShipCountry and ShipShipCity. You might try having two parameters in Accumulate function of the aggregate but you will get an error

The Accumulate method in user defined aggregate "Bonus" must have exactly one parameter.

We are definitely looking into adding “multi-column aggregates” feature in the future versions of SQL Server. For now you can use a workaround. The idea is to create a worker UDT that contain all the fields required for the aggregation. So if you want to take Orders.ShipCountry and Orders.ShipCity into account the UDT should have two corresponding fields. You also need to create a user defined function that takes a number of parameters and returns an instance of the worker UDT. And finally you create an aggregate that takes the worker UDT as a parameter in its aggregation function.

Let’s say XYZ wants to consider German sales that has been shipped to Berlin as regular sales. To take ShipCity this into account you first need to create a UDT. I won’t implement several methods to keep the sample short.

[Serializable]

[Microsoft.SqlServer.Server.SqlUserDefinedType(Format.UserDefined, MaxByteSize=8000)]

public struct OrderData : INullable, IBinarySerialize

{

    public override string ToString()

    {

        throw new Exception("The method or operation is not implemented.");

    }

 

    public bool IsNull

    {

        get

        {

            return false;

        }

    }

 

    public static OrderData Null

    {

        get

        {

            throw new Exception("The method or operation is not implemented.");

        }

    }

 

    public static OrderData Parse(SqlString s)

    {

        throw new Exception("The method or operation is not implemented.");

    }

 

    public string ShipCountry;

    public string ShipCity;

 

    #region IBinarySerialize Members

 

    void IBinarySerialize.Read(System.IO.BinaryReader r)

    {

        ShipCountry = r.ReadString();

        ShipCity = r.ReadString();

    }

 

    void IBinarySerialize.Write(System.IO.BinaryWriter w)

    {

        w.Write(ShipCountry);

        w.Write(ShipCity);

    }

 

    #endregion

}

Then you need a function that takes two strings and returns a UDT

public partial class UserDefinedFunctions

{

    [Microsoft.SqlServer.Server.SqlFunction]

    public static OrderData MakeOrder(string ShipCountry, string ShipCity)

    {

        OrderData o = new OrderData();

        o.ShipCountry = ShipCountry;

        o.ShipCity = ShipCity;

        return o;

    }

};

And finally a new aggregate

[Serializable]

[Microsoft.SqlServer.Server.SqlUserDefinedAggregate(Format.Native)]

public struct BonusEx

{

    private int m_nRegularSales;

    private int m_nGermanSales;

 

    public void Init()

    {

        m_nRegularSales = 0;

        m_nGermanSales = 0;

    }

 

    public void Accumulate(OrderData o)

    {

        if (o.ShipCountry == "Germany" && o.ShipCity != "Berlin")

        {

            ++m_nGermanSales;

        }

        else

        {

            ++m_nRegularSales;

        }

    }

 

    public void Merge(BonusEx Group)

    {

        m_nRegularSales += Group.m_nRegularSales;

        m_nGermanSales += Group.m_nGermanSales;

    }

 

    public int Terminate()

    {

        return Math.Min(200, (m_nRegularSales + 3 * m_nGermanSales));

    }

}

So now you can get the bonus for each employee with this query

select

      Employees.FirstName, Employees.LastName, dbo.BonusEx(dbo.MakeOrder(Orders.ShipCountry, Orders.ShipCity))

from

      Employees join Orders on Employees.EmployeeId = Orders.EmployeeId

group by

      Employees.EmployeeId, Employees.FirstName, Employees.LastName

Bottom Line

User defined aggregates gives you more flexibility in terms of where you can implement complex business rules. Now you have all the option: client, middle tier and server. You cannot pass several columns into the aggregation function but you can work it around using UDTs.

[posted by NikitaS]

Posted by sqlclr | 1 Comments

Attachment(s): ClrAggregates.zip

Creating a CLR user define aggregate (part 1). Simple CLR aggregate

SQL Server 2005 allows creating of User Defined Aggregate in any of the .NET  languages such as C# or VB. For simple cases like SUM or MAX you probably want  to use built-in aggregates, however there are cases  where build-ins are insufficient. In such cases people used to put the business logic on a client on a middle tier. With the new version of SQL Server you can have this logic on a server.

Let’s say company XYZ wants to come up with a way of calculating a bonus for their employees. XYZ uses NWIND database (NWIND database can be downloaded from http://msdn2.microsoft.com/en-us/library/ms143221.aspx). XYZ wants to have a business rule such that the bonus is never greater than 200% of the salary and each regular sale adds 1% to the bonus and each sale to Germany adds 3% to the bonus.

With the new Sql Server 2005 you can write your own aggregates in C# (or any .NET compatible language). Here is the aggregate.

[Serializable]

[Microsoft.SqlServer.Server.SqlUserDefinedAggregate(Format.Native)]

public struct Bonus

{

    private int m_nRegularSales;

    private int m_nGermanSales;

 

    public void Init()

    {

        m_nRegularSales = 0;

        m_nGermanSales = 0;

    }

 

    public void Accumulate(SqlString Country)

    {

        if (Country == "Germany")

        {

            ++m_nGermanSales;

        }

        else

        {

            ++m_nRegularSales;

        }

    }

 

    public void Merge(Bonus Group)

    {

        m_nRegularSales += Group.m_nRegularSales;

        m_nGermanSales += Group.m_nGermanSales;

    }

 

    public int Terminate()

    {

        return Math.Min(200, (m_nRegularSales + 3 * m_nGermanSales));

    }

}

And here is a T-SQL query that uses this aggregate to calculate bonus for each employee.

select

      Employees.FirstName, Employees.LastName, dbo.Bonus(Orders.ShipCountry)

from

      Employees join Orders on Employees.EmployeeId = Orders.EmployeeId

group by

      Employees.EmployeeId, Employees.FirstName, Employees.LastName

 

[Posted by NikitaS]

Posted by sqlclr | 7 Comments

Impersonation in SQL CLR

When the user code will try to access a resource outside of the SQL server it will run in the security context under which SQL Server is running. But this is not the ideal situation since it could become a security issue when the logged user that calls the code has less privileges than account under which SQL Server is running. It is also possible that SQL Server will run under a low privilege account and the user cannot access the desired resource. So it is highly recommended that the user code will impersonate before accessing resources outside of the server.

 

There are 2 possible impersonation types in SQL CLR. I will call them implicit and explicit impersonation. For the implicit impersonation the executing code will try to impersonate the calling user. For this type of impersonation there is no need for the user name and password. In the explicit impersonation the code will try to impersonate any NT user using name and password.

 

The implicit impersonation is done by obtaining the user impersonation token from the SQL Server calling SqlContext.WindowsIdentity and calling Impersonate method that will change the security token of the thread. At the end of accessing resource the user should revert thread token by calling Undo method of the WindowsImpersonationContext returned by Impersonate method. SQL Server will check if the user did revert the thread token and it will throw an error message if it was not reverted. This means that every in-proc data access call should be made after reverting back. If the managed code throws an exception that is not handled, SQL Server will not show the exception but it will show the error message the user thread token was not reverted. This is the reason for which the Undo should be in a finally block. See the SQL Server Books Online for the help.

SqlContext.WindowsIdentity will be null for the sql logins. If the user will log in as sa, the SqlContext.WindowsIdentity will return the identity under which the server process is running. For windows logins it will return the corresponding windows identity even if the login is sysadmin. If the execution is happening in the context of an Execute As Login (statement directly or some module marked with it ) SqlContext.WindowsIdentity will be null unless the login is sysadmin. In this case SqlContext.WindowsIdentity will return the identity under which the server process is running.

 

The explicit impersonation is done by calling LogonUser with the name and password constructing WindowsIdentity with the token obtained. This impersonation is working inside of the SQL CLR but is not recommended. The same restriction will apply for reverting before in-proc data access or finishing execution. This means that it will be possible for the user to do an explicit impersonation after an implicit one but it is not possible the other way around. With the explicit impersonation it is possible to leak token handles in the user code (since they are opened in the unmanaged code).

 

Regards,

Mihail Frintu

 

This posting is provided "AS IS" with no warranties, and confers no rights.

Posted by sqlclr | 5 Comments

Assembly redirection and CLR Integration

Assembly redirection is not supported in SQL CLR. Assembly redirection can be done using publisher, application configuration and machine configuration files.  SQL Server 2005 uses application configuration to redirect assembly call after Alter Assembly. So let’s suppose there is an assembly A calling assembly B and the user will run Alter Assembly command for B (modifying the version). In this case, SQL Server will write an application configuration in memory to redirect the calls of A to the new version of B overwriting any other application configuration filer ?. This is the reason for what why SQL Server does not take in consideration the user specified redirection in application configuration file.

 

The story is different for machine and publisher configuration file. If there is a redirection of the of the assembly B from one version (let’ s say 1.1.1) to other(1.1.5) and there were not Alter Assembly statements inside of the database to redirect the version 1.1.1 to 1.1.3 the attempt to load assembly B in the memory will fail. This can cause Create Assembly, Alter Assembly or executing the code inside of the assembly to fail. If there are Alter Assembly statements to match the redirection, the assembly can be load in the memory so it can be altered or executed. I mention Alter Assembly statements because final assembly redirection can come after multiple assembly steps: Alter Assembly from 1.1.1 to 1.1.2 and Alter it again from 1.1.2 to 1.1.3.

 

The entry in the application configuration from the memory for an assembly will be deleted when the assembly is dropped from the database.  

 

The redirection resulted from an Alter Assembly will also have the following effect: if a user registers Assembly A version 1.0 and alter it to version 2.0 he will not be able to register again assembly A version 1.0 until he drops Assembly A version 2.0. After the drop he will be able to register both of the versions of the assembly since the application policy was updated.

 

Also note that SQL Server will distinguish between multiple versions of the same assembly only for strong named (or signed) assemblies

 

In my tests I used Fuslogvw.exe to see the redirection and loads attempts.

 

Regards,

Mihail Frintu

This posting is provided "AS IS" with no warranties, and confers no rights.

Posted by sqlclr | 5 Comments

Satellite assemblies in SQL Server 2005

This is a sample on how to register satellite assemblies in SQL Server 2005.

 

Based on the CultureInfo on the executing thread, CLR will try to load the respective resource assembly. It should not be a difference between satellite assemblies inside and outside of the SQL CLR other than location: if for a normal application the satellite assemblies need to be located in a special name subdirectory or in GAC for SQL CLR, the satellite assemblies should be registered inside of the database using Create Assembly command. Remember that the SQL Server will only load the assemblies registered in the database.

 

The naming convention is mandatory:  for assembly A, resources assembly file should be named as A.resource.dll. The sql name given at the registration time is not important.

Also I observed from my tests that it is mandatory to have a match between the versions of the root assembly and resources assembly (but this is not specific to SQL CLR).

 

In my bellow test I am changing the CultureInfo of the current thread in order to check that the right resource assembly is loaded.

I have already created 2 resource files Test.resources and Test.fr.resource that contain an entry 'test', with values ‘default’ and 'fr' with the following code.

IResourceWriter rw = new ResourceWriter(strFileName);

rw.AddResource(strResName, strResValue);

rw.Close();                

 

This is the assembly code that tries to consume the resource and change the CultureInfo. My assembly is strong named using TestKeyPair.key so you will need to use your own key pair there.

 

using System.Reflection;

using System.Threading;

using System.Globalization;

using System.Resources;

 

 

[assembly:AssemblyVersion("1.3.0.0")]

 

public class cTest

{

        public static string  MainMethod(string strCul)

        {      

                 if (strCul!="" && strCul!="def")

                 {

                         Thread.CurrentThread.CurrentUICulture = new CultureInfo (strCul);    

                 }

                 ResourceManager rm = new ResourceManager ("Test", Assembly.GetExecutingAssembly());

                 return rm.GetString("test");

        }

}

 

This is the command to compile assembly:

csc.exe /target:library /out:Test.dll Test.cs  /r:system.dll /res:Test.resources /keyfile:TestKeyPair.key

 

This is the command that I used to create the resource assembly (note the version, culture and key used for signing):

al.exe /out:Test.resources.dll /v:1.3.0.0 /c:fr /embed:Test.fr.resources /t:lib /keyf:TestKeyPair.key

 

This is the TSQL code used to register assemblies and clr user defined function in the database:

 

CREATE ASSEMBLY Test FROM 'C:\temp\meta\Test.dll' WITH PERMISSION_SET=UNSAFE

go

 

CREATE FUNCTION dbo.f_SatTest (@p_culture nvarchar(400))

RETURNS nvarchar(400)

AS

EXTERNAL NAME Test.cTest.MainMethod

go

 

CREATE ASSEMBLY Test_FR FROM 'C:\temp\meta\Test.resources.dll'

go

 

SELECT name FROM sys.assemblies WHERE name LIKE 'Test%'

go

 

SELECT dbo.f_SatTest('')

go

 

SELECT dbo.f_SatTest('fr')

 

 

This posting is provided "AS IS" with no warranties, and confers no rights. Use of included script samples are subject to the terms specified at http://www.microsoft.com/info/cpyright.htm

 

Posted by sqlclr | 1 Comments

How to tell if your code should be yielding

As we saw in the last post, cooperative clr code that explicitly yields during long-running computations outperforms clr code that gets forcibly rescheduled.  How should one discover sql/clr code that is considered 'runaway' by the server and could be improved?

There are a number of informative dmvs that contain helpful information in cases like this, such as  sys.dm_clr_tasks, sys.dm_os_waiting_tasks, and sys.dm_os_workers.

For this particular problem, I want to know what state my clr tasks are in and what they have been waiting on recently as well as the number of times that the SQL Server Scheduler has forced them to yield:

select os.task_address, os.state, os.last_wait_type, clr.state, clr.forced_yield_count from sys.dm_os_workers os join sys.dm_clr_tasks clr on (os.task_address = clr.sos_task_address) where clr.type = 'E_TYPE_USER'

When my greedy count query from the last post is running, I notice a number of rows like the following in the output:

0x006D9180

RUNNABLE

SQLCLR_QUANTUM_PUNISHMENT

E_TASK_ATTACHED_TO_CLR

2

0x006D89C0

RUNNABLE

SQLCLR_QUANTUM_PUNISHMENT

E_TASK_ATTACHED_TO_CLR

4

0x008CCCA8

RUNNABLE

SQLCLR_QUANTUM_PUNISHMENT

E_TASK_ATTACHED_TO_CLR

1

0x006D8E98

RUNNABLE

SQLCLR_QUANTUM_PUNISHMENT

E_TASK_ATTACHED_TO_CLR

3

SQLCLR_QUANTUM_PUNISHMENT indicates that the task previously exceeded its allowed quantum, causing the scheduler to intervene and reschedule it at the end of the queue while forced_yield_count shows the number of times that this has happened.

In contrast, when the cooperative version is running, I see rows like the following where the last_wait_type is SOS_SCHEDULER_YIELD and forced_yield_count is 0.

0x008CCAB8

RUNNING

SOS_SCHEDULER_YIELD

E_TASK_ATTACHED_TO_CLR

0

0x006D9278

RUNNABLE

SOS_SCHEDULER_YIELD

E_TASK_ATTACHED_TO_CLR

0

0x006D9088

RUNNABLE

SOS_SCHEDULER_YIELD

E_TASK_ATTACHED_TO_CLR

0

This post is late as I was trying to include more information about all the underlying activity, but it quickly proved too much for a blog post on this simple example.  Hopefully, I and others on the team will be able to be able to go into more detail in future posts. 

Who says tough guys never yield?

Perhaps you've seen the following note in BOL advising SQLCLR users to explicitly yield in their long-running CLR functions:

 

"Long-running managed code that accesses data or allocates enough memory to trigger garbage collection will yield automatically. Long-running managed code that does not access data or allocate enough managed memory to trigger garbage collection should explicitly yield by calling System.Thread.Sleep() function of the .NET Framework."

 

I'm not sure about you, but reading notes like this brings out the worst in me.  Why should I listen to BOL's recommendation to yield in my code if I don't want to? Surely my code is more important than any other code running on the system? If I want maximum performance out of my code (and only my code matters) why would I voluntarily sleep and give up the rest of my quantum to someone else?

 

Sure, BOL claims that the SQL Server scheduler can detect runaway threads and suspend them, but does this really work? Is it effective enough to convince otherwise greedy developers like me that sharing and playing well with others is a better approach to life?

 

To test this, I came up with some quick CLR code that spawns a bunch of threads to do nothing but count to a big number in a loop.  A parameter flag specifies whether the function would try to be greedy and never yield, or cooperative and call sleep(0) periodically in the outer loop.  Note that this code needs to run in UNSAFE permission set because of its use of threads, but that's ok since this is just for demonstration purposes and isn't something you would ever run in production.

 

using System;
using System.Threading;

public partial class GreedyOrCoop
{
    
public static double StartCounting(bool isGreedy, int numThreads, int countTo)
    {
        
DateTime startTime = DateTime.Now;
        
Thread[] threadPool = new Thread[numThreads];
        Worker workerCount =
new Worker(isGreedy, countTo);
        
for (int i = 0; i < numThreads; i++)
        {
            threadPool[i] =
new Thread(new ThreadStart(workerCount.Count));
            threadPool[i].Start();
        }
        
for(int i = 0; i < numThreads; i++)
            threadPool[i].Join();
        
return (DateTime.Now - startTime).TotalMilliseconds;
    }

    
public class Worker
    {
        
int countTo;
        
bool isGreedy;

        
public Worker(bool IsGreedy, int CountTo)
        {
            isGreedy = IsGreedy;
            countTo = CountTo;
        }

        
public void Count()
        {
            
for (int j = 0; j < 10; j++)
            {
                
int i = 0;
                
while (i < countTo)
                    i++;
                
if (!isGreedy)
                    
Thread.Sleep(0);
            }
        }
    }
}

I wanted the code to run on multiple threads at the same time to ensure there was some contention taking place.  I tweaked the numbers I used a little, but settled on using 10 threads that counted to 500 million 10 times per thread because on my system that would take about 3 minutes to run, which is the perfect amount of time to get a cup of coffee.

select dbo.StartCounting(1, 10, 500000000)
select dbo.StartCounting(0, 10, 500000000)

And the results? Averaged over 100 iterations, the Cooperative version beat the Greedy version by over 4 seconds, 84.31 seconds vs. 88.79 seconds.  Maybe it's time for me to pay more attention to BOL's recommendations.

Tomorrow, I'll talk about some DMVs that are useful to monitor this behavior and how to tell if your threads are being forcible preempted by the SQL Server Scheduler.

Memory Usage in SQL CLR

There are two important memory considerations you may want to track when using SQL CLR functionality: 1) How much memory is SQL CLR using? And 2) How much memory is SQL CLR allowed to use?

 

The answer to the first question is pretty easy to answer thanks to the dmv sys.dm_os_memory_clerks.  The field single_pages_kb is for memory allocated in the SQL Buffer Pool, multi_pages_kb is for memory allocated by the SQL CLR Host that is outside the SQL Buffer pool, and virtual_memory_committed_kb is the amount of memory allocated by the CLR directly through bulk allocation interface (instead of heap allocation) through SQL server. The memory is mostly used for the managed GC heap and the JIT compiler heap, and it is also stored outside of the SQL Buffer Pool.  So, to get the total memory used by SQL CLR, you would run the following query:

 

select single_pages_kb + multi_pages_kb + virtual_memory_committed_kb from sys.dm_os_memory_clerks where type = 'MEMORYCLERK_SQLCLR'

 

Now that we know how much memory SQL CLR is using on the server, it would be nice to know how much memory SQL CLR is allowed to use.  You may be aware that when there is memory pressure on the server, SQL CLR will try to release memory by explicitly running garbage collection and, if necessary, unloading appdomains.

 

There are two types of memory pressure to be aware of:

-         Physical memory pressure based on the amount of available system memory

-         Virtual Address Space memory pressure based on the number of available virtual addresses

 

Physical memory pressure is pretty clear; if your server is under load and running low on available memory, then Windows issues a LowMemoryResourceNotification which SQL Server listens for and handles as Slava explains in two posts on his blog.  Understandably, SQL CLR can’t use so much memory that it causes external physical memory pressure.

 

Virtual Address Space memory pressure is more interesting and frequently more limiting from the SQL CLR perspective because it might cause memory pressure even when there is enough physical memory available.  This might happen because as was noted above most SQL CLR memory allocations occur outside of the SQL Buffer Pool in what is sometimes called the MemToLeave section.  The size of this area of memory is set by the –g flag on SQL Server start-up, but by default it is at least 256 MB.  I say “at least” because the value is not explicitly defined, it is simply the amount of VAS not reserved already by the SQL Buffer Pool.  Since the SQL Buffer Pool will not reserve more than the amount of physical ram, this would result in the case where a machine with less ram would actually have more VAS available in MemToLeave.

 

For an example of how this might affect SQL CLR, in a discussion with MVP Adam Machanic, it was noted than on his machine with 1 GB of ram, he was able to use more memory in SQL CLR than I was on my machine with 2 GB of ram.  Adam’s machine would have 1 GB reserved for the buffer pool and 1 GB left for MemToLeave, whereas my machine had 1792 MB reserved for the buffer pool and therefore SQL CLR was limited to the 256 MB left in MemToLeave.

 

Thankfully, Virtual Address Space memory pressure is primarily only an issue for x86 because on 64-bit machines, as Ken Henderson mentions, the user-mode VAS is 8 TB, so there is always plenty of VAS space left for SQL CLR.

 

-- Steven Hemingray

More Posts Next page »
 
Page view tracker