The blog of the F# team at Microsoft
This week's post is a guest article writtten by Dmitry Sevastianov with help from Dmitry Morozov, two members of the F# community, about FSharp.Data.SqlClient, a fascinatingly simple and elegant technique to integrate T-SQL and F# in new ways, using the ever-awesome power of F# type providers.
First, be warned - this post is NOT about LINQ! It is about how to do LINQ-like things, but using a technique that is much simpler yet better in certain ways, though with pros and cons. It is also an industry-relevant example of a technique for compile-time checking of embedded DSLs that has applicability well beyond SQL-data programming.
Usual disclaimer: Note this is a guest article about a community-provided library, and the content below doesn't represent the views of Microsoft or the Visual F# Tools team.
Don Syme, for the Visual F# Tools Team
In this post, we will explore the FSharp.Data.SqlClient library, and in particiular the SqlCommandProvider, which allows a new way of using both SQL syntax and F# syntax in data scripting.
Nearly all readers will be familiar with LINQ from either C# or F#. LINQ queries were integrated in F# in F# 3.0, expecially through Query Expressions. These are a lovely and useful technology. LINQ queries make writing data access code a pleasant exercise. LINQ in F# builds upon both query expressions and the existing F# type providers for databses, used to expose structured data in F# code. There are several different implementations of LINQ for SQL data available for F#, for example SQLProvider (a recent powerful F# community effort, based on direct ADO.NET, avoiding dependencies on Entity Framework, and usable with many different database technologies), SqlDataConnection (based on the Linq2Sql framework) and SqlEntityConnection (based on Entity Framework).
However, here we would like to point out some of the reasons why the alternative approach to query processing offered by the new FSharp.Data.SqlClient is in some circumstances a better solution to the woes of .Net developers.
First, a simple observation: StackOverflow and MSDN has hundreds of issues like this one or this one for all kinds of ORM frameworks from NHibernate to LINQ-to-SQL to LINQ-to-Entities. What seems like perfectly valid code fails atrun-time because of an unsupported C#-to-SQL translation semantics. To quote one response to one of these queries:
LINQ to SQL does not know how to translate a call your method 'IsValid' into SQL in order to execute the where clause on the server. Your method does not exist on the server.
or the following runtime error message:
LINQ to Entities does not recognize the method 'Double Sqrt(Double)' method, and this method cannot be translated into a store expression
This is a great example of "leaky abstraction". Further, we all know intuitively that runtime errors like these should be avoided in strongly-typed languages: these conditions are quite "obviously" checkable at compile-time. The lack of control and opaqueness of the C#-to-SQL conversion spells performance problems as well, like infamous "N+1" issue where performance of database queries can be affected based on small changes to code.
Readers are no doubt familiar with a popular blog post The Vietnam of Computer Science by Ted Neward going deep into so-called object-relational impedance mismatch, which is at the core of these issues. So far one industry answer to this was a number of so-called micro-ORMs with a mission of making conversion from database types to .NET objects as simple as possible while refraining from making any assumptions about the actual mapping.
What all of them lack, however, is an ability to verify correctness of SQL queries at compile-time.
And that's where SqlCommandProvider, the core element of the FSharp.Data.SqlClient library, really shines. Essentially, it offers "What You See Is What You Get" for SQL queries. Once F# code involving SqlCommandProvider passes the compilation stage you are guaranteed to have valid executable code for both F# and T-SQL.
Here is a typical snippet of SqlCommandProvider-enabled code:
let connectionString = @"Data Source=(LocalDb)\v11.0;Initial Catalog=AdventureWorks2012;Integrated Security=True"
let query = "
SELECT TOP(@TopN) FirstName, LastName, SalesYTD
WHERE CountryRegionName = @regionName AND SalesYTD > @salesMoreThan
ORDER BY SalesYTD
type SalesPersonQuery = SqlCommandProvider<query, connectionString>
let cmd = new SalesPersonQuery()
cmd.AsyncExecute(TopN = 3L, regionName = "United States", salesMoreThan = 1000000M)
[("Pamela", "Ansman-Wolfe", 1352577.1325M);
("David", "Campbell", 1573012.9383M);
("Tete", "Mensa-Annan", 1576562.1966M)]
Now, if there is a typo of syntax, or column names, or if table names are incorrect anywhere in the query, the F# compiler notifies developer immediately. If the query is not legitimate, and would fail basic semantic checking on the database, the F# compiler notifies developer immediately. Likewise, if the database schema changes in ways that make the query incorrect, the F# compiler notifies developer immediately.
That is, the F# compiler is checking the well-formedness of the SQL syntax at compile-time, through the use of the FSharp.Data.SqlClient type providers. More specifically, FSharp.Data.SqlClient uses features available in MS SQL Server 2012 and SQL Azure to check and compile the SQL query at compile time. As you program in F#, your query is being sent to the database and checked for validity, even as you type! Please see FSharp.Data.SqlClient documentation for more details about how this works.
Note that F# type providers are addins to the F# compiler and the F# development environments, used to provide this checking functionality.
Here is an example of an error being reported because of a problem in an SQL query:
In this section, we take a quick look at performance. This is "just a taster" and shouldn't be taken as a full and rigorous analysis.
As we've mentioned, performance is another potential issue for language-integrated queries which rely on query translation via meta-programming (the technique used by LINQ). We'll see this below. However, in order to contrast with "lighter" approaches to the same problem, one comparison we make is with Dapper, an excellent micro-ORM by StackOverflow with a main goal of being extremely fast. It doesn't use the meta-programming techniques of LINQ to the same extent. Here is a description from StackOverflow itself:
Dapper is a micro-ORM, offering core parameterization and materialization services, but (by design) not the full breadth of services that you might expect in a full ORM such as LINQ-to-SQL or Entity Framework. Instead, it focuses on making the materialization as fast as possible, with no overheads from things like identity managers - just "run this query and give me the (typed) data".
Dapper comes with an excellent benchmark. The focus of the test is on deserialization.
Ideally we'd like the "direct" SQL queries above to perform as well as a micro-ORM like Dapper. And indeed, it seems this is the case, at least for simple benchmarks. Here is a summary of how FSharp.Data.SqlClient compares, on one benchmark, based on our tests:
Tests were executed against SqlExpress 2012 so the numbers are a bit higher than what you can see on the Dapper page. The test retrieves a single record with 13 properties by random id 500 times and deserializes it. Trials for different ORMs are mixed up randomly. All executions are synchronous.
Note that we didn't put any specific effort into improving FSharp.Data.SqlClient performance for this test. The very nature of using directly written SQL means FSharp.Data.SqlClient uses a very simple run-time implementation, and is hence almost as close as possible to hand-coded ADO.NET code.
The lesson here is that performance of code that uses direct authoring of compile-time checked SQL syntax to express SQL queries can be very predictable and involve low overhead and translation complexity. FSharp.Data.SqlClient both checks your SQL queries at compile time, in a way that is much more rigorous than other SQL query translation techniques, and gives you a simple and clear path to understanding query performance, basically by avoiding the query translation process altogether.
We must keep in mind that FSharp.Data.SqlClient is not strictly an ORM in commonly understood sense of the term. Because of this, it is important to undertstand some of the pros and cons of the techniques we are using. Feel free to list fruther pros and cons in the comment section below!
The following FSharp.Data.SqlClient features are also worht highlighting:
F# 3.0 Type Providers, which are, in essence, light-weight plugins for F# compiler, dramatically improve developer experience through access to various external data sources with Intellisense and checking at design time.
Combined with the latest features of MS SQL Server, the FSharp.Data.SqlClient library empowers users to write compile time-verified F# and SQL code, with live rechecking in reaction to SQL schema changes, leaving no space for boilerplate while promising performance comparable with the best-of-breed traditional solutions.
This can be viewed as a Domain Specific Language seamlessly integrating SQL and F# in the same codebase.
Please use and contribute to this great F# community library effort at http://fsprojects.github.io/FSharp.Data.SqlClient/.
I have been using FSharp.Data.SqlClient in our production system virtually every day of 2014. It has directly contributed to the success of our company (we are an all F# shop). Having full native T-SQL inline with F# code is tremendously powerful. I also cannot say enough about the responsiveness of Dmitry and Dmitry to the very few issues we have had. They have made improvements over the course of my experience with the type provider which I would not have thought of. Lots of other capabilities not covered in the article, like "with" method for copy and update capabilities over the generated records. fsprojects.github.io/.../output.html
This is among the best type providers. Highly recommended!
I have been working on a project (github.com/.../canopy.TestPen) using this type provider and I love it. It lets me write SQL by hand, which I prefer, but without having to write results types or boring mapping code. Compile time checks for query correctness is just icing on the cake! Keep up the great work!
What about putting TSQL in a separate file and then have Visual Studio give full intellisense, syntax highlighting, etc?