This is the first in a series of posts on C# and LINQ. These posts will describe a natural, easy-to-understand technique for querying data. When using LINQ, you simply declare the question you want to ask, and then sit back and wait while the computer analyzes your query and finds an optimal way to retrieve your answer.

The development of LINQ involved input from many of the best minds at Microsoft. Anders Hejlsberg, researchers at the Microsoft research centers in Redmond and in Cambridge, England, the C# developer and test teams, the C# program managers, and others all contributed to the development of this language. As a result, it represents one of the major technical achievements to have come out of the Microsoft Developer Division in recent years.

When you hear talk about querying data, it is easy to imagine the kind of data that resides in a database. And indeed, you can use LINQ to query data in a database. However, LINQ does more than just run SQL queries. It can be used with almost any kind of data.

The acronym LINQ stands for Language Integrated Query. This means that LINQ is a technique for querying data that has been integrated into the C# language. For database programmers, this means that you no longer have to embed SQL strings in your code; instead, a query language is built into C#. For XML programmers, it means that you don't have to learn XPath; the same built-in query language applies. You can, of course, continue to use any technology that you prefer, or any combination of technologies that seems optimal in a particular case. But LINQ is now built into the C# language and offers a native solution.

A wide variety of data sources will now be accessible through LINQ. You can use it to retrieve data from an array, or from a collection, or from any data source that supports IEnumerable or IQueryable. You can even query your own program and ask how many of its methods are public, or which classes reside in a particular namespace.
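As a rough sketch of what "querying your own program" might look like, the code below uses reflection as the data source and asks for the public methods of a class. It rests on a few assumptions: the Sample class is hypothetical and exists only to be queried, and the code imports System.Linq, the namespace used in the released .NET Framework, rather than the System.Query namespace of the May CTP.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

namespace LinksToLinq
{
    // Hypothetical class, used only as something to query.
    class Sample
    {
        public void Alpha() { }
        public static void Beta() { }
        private void Gamma() { }
    }

    static class ReflectionQuery
    {
        // Returns the sorted names of the public methods declared directly on a type.
        public static string[] PublicMethodNames(Type type)
        {
            BindingFlags flags = BindingFlags.Public | BindingFlags.Instance |
                                 BindingFlags.Static | BindingFlags.DeclaredOnly;

            IEnumerable<string> names =
                from m in type.GetMethods(flags)
                orderby m.Name
                select m.Name;

            return names.ToArray();
        }

        static void Main()
        {
            // Prints Alpha and Beta; the private method Gamma is not reported.
            foreach (string name in PublicMethodNames(typeof(Sample)))
            {
                Console.WriteLine(name);
            }
        }
    }
}
```

The array returned by GetMethods is an ordinary in-memory sequence, so the query is written exactly as it would be for any other array.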

LINQ can, at least in theory, be part of any .NET language. For now, that means primarily C# and VB, but there has been talk of integrating LINQ into other languages, such as PHP and Python. LINQ will be released as part of the Orcas version of Visual Studio. At the time of this writing it is available in various Microsoft CTPs, and of course it will be part of the Orcas beta when it ships.

LINQ in Action

It's time to take a peek at the LINQ syntax as it appears in the May CTP. Note that LINQ is still under development and the syntax is still undergoing minor modifications. But for now, I strongly recommend installing the May CTP and taking LINQ out for a spin.

The code shown in Listing Two demonstrates a somewhat unorthodox way to use LINQ to query an array of strings. I say it is unorthodox because I sidestep a few common LINQ idioms in order to write code that will look as familiar to you as possible. In future posts, I will talk more about the preferred way to write LINQ queries, but for now I want to limit the number of new ideas I introduce in this first pass over the subject.

Listing One: Simple test harness for running the code in Listing Two.

    using System;

    namespace LinksToLinq
    {
        class Program
        {
            static void Main(string[] args)
            {
                // Run the query from Listing Two, then wait for Enter
                // so the console window stays open.
                WordQuery.Run();
                Console.ReadLine();
            }
        }
    }

Listing Two: Querying an array of strings.

   1:  using System;
   2:  using System.Collections.Generic;
   3:  using System.Query;
   4:   
   5:  namespace LinksToLinq
   6:  {
   7:      class WordQuery
   8:      {
   9:          public static void Run()
  10:          {
  11:              string[] EinsteinQuote = new string[] {"space", 
  12:                  "detached", "from", "any", "physical", 
  13:                  "content", "does", "not", "exist" }; 
  14:   
  15:              IEnumerable<string> selectedWords =
  16:                  from p in EinsteinQuote
  17:                  where p.Equals("any") != true
  18:                  select p;
  19:   
  20:              foreach (string word in selectedWords)
  21:              {
  22:                  Console.WriteLine(word);
  23:              }         
  24:          }
  25:      }
  26:  }

You can compile and run the code in Listing Two by first creating a C# console application, and then placing a call to WordQuery.Run() in the Main method of your program, as shown in Listing One. A run of the application should generate the following output:

space 
detached 
from 
physical 
content 
does 
not 
exist

As you can see, the code prints out all the words from our string array except for the word "any." The LINQ syntax is close enough to English that this output should not come as a surprise.

Look at the code in lines 15 through 18. You can fairly readily translate them into the following English sentence: "Select all the words from the string array that are not equal to the word 'any.'"

If you prefer, you can make the English translation match the code a little more precisely: "Select all the words from the string array where the words you choose are not equal to the word 'any.'" In this version, we see all the major elements of the syntax found on lines 15 through 18.

Simply Declare What You Want Done

If you have read much about LINQ, then you have probably heard that it is a declarative language. I'll explain exactly what that means in a future post. For now, however, all you need to know is that LINQ allows you to "declare" what you want done, and then sit back and let the computer figure out how to do it.

You don't have to write a series of imperative sequential statements that laboriously describe how to complete your task. All you have to do is say what you want to do, and then let the computer figure out how to best go about doing it. This is the advantage of declarative code: We simply declare what we want done, and then advanced logic built into the language discovers the best way of doing it.

Before LINQ, we would have accomplished the task shown in Listing Two by iterating over all the words in the string array, and testing each one to see whether it was equal to the word "any." That's a very laborious, sequential way of accomplishing a task. With LINQ, we simply say: "Hey, how about you give me all the words from this array that aren't equal to the word 'any.'" You just declare what you want done, and then let the computer figure out how best to do it. In a simple case like this, we might not need much help, but when writing a complex SQL query, it is good to have a little help optimizing your code.
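For comparison, here is a minimal sketch of that pre-LINQ, imperative approach. The class name WordLoop and the method name WordsExcept are my own inventions for this illustration; the array is the one from Listing Two.

```csharp
using System;
using System.Collections.Generic;

namespace LinksToLinq
{
    static class WordLoop
    {
        // Walks the array one element at a time and keeps every word
        // that is not equal to the excluded word.
        public static List<string> WordsExcept(string[] words, string excluded)
        {
            List<string> kept = new List<string>();
            foreach (string word in words)
            {
                if (!word.Equals(excluded))
                {
                    kept.Add(word);
                }
            }
            return kept;
        }

        static void Main()
        {
            string[] einsteinQuote = new string[] {"space",
                "detached", "from", "any", "physical",
                "content", "does", "not", "exist" };

            // Prints the same eight words as Listing Two.
            foreach (string word in WordsExcept(einsteinQuote, "any"))
            {
                Console.WriteLine(word);
            }
        }
    }
}
```

The loop spells out how to build the result step by step, while the query in Listing Two states only which words are wanted.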

I've said that LINQ can be used in several different domains. It can be used to query arrays, databases, XML files, class hierarchies, and numerous other types of data. In each case, you are going to be able to use either the same declarative syntax you've seen so far, or a similar syntax crafted for a particular domain. You might, for instance, use LINQ to ask questions like these:

  • Select all the rows from this database where the country field is equal to 'USA.'
  • Find all the methods from this class that are declared to be public.
  • Return all the numbers in this array that are evenly divisible by 25.
  • Group together all the strings in this array that begin with the letter 'E', 'F' or 'G.'
  • Return all the elements from this XML file where the country attribute is equal to 'Lithuania.'
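Three of the questions above can be sketched against in-memory data, as shown below. Treat this as an illustration under assumptions rather than a definitive version: the class and method names, the sample values, and the customer element with its name and country attributes are all hypothetical, and the code uses the System.Linq and System.Xml.Linq namespaces of the released .NET Framework, not the May CTP namespaces. The database and reflection questions are left out here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

namespace LinksToLinq
{
    static class SampleQueries
    {
        // Returns the numbers that are evenly divisible by 25.
        public static int[] DivisibleBy25(int[] numbers)
        {
            return (from n in numbers
                    where n % 25 == 0
                    select n).ToArray();
        }

        // Keeps the strings that begin with E, F, or G and groups them
        // by their first letter.
        public static Dictionary<char, string[]> GroupByFirstLetter(string[] words)
        {
            var groups =
                from w in words
                where w.Length > 0 && "EFG".IndexOf(w[0]) >= 0
                group w by w[0];

            return groups.ToDictionary(g => g.Key, g => g.ToArray());
        }

        // Returns the name attribute of each (hypothetical) customer element
        // whose country attribute matches the requested country.
        public static string[] NamesInCountry(XElement root, string country)
        {
            return (from e in root.Elements("customer")
                    where (string)e.Attribute("country") == country
                    select (string)e.Attribute("name")).ToArray();
        }

        static void Main()
        {
            int[] numbers = { 10, 25, 40, 50, 75, 90 };
            Console.WriteLine(string.Join(" ", DivisibleBy25(numbers))); // 25 50 75

            string[] words = { "Echo", "Fox", "Golf", "Hotel", "Easy" };
            foreach (KeyValuePair<char, string[]> pair in GroupByFirstLetter(words))
            {
                Console.WriteLine(pair.Key + ": " + string.Join(", ", pair.Value));
            }

            XElement root = XElement.Parse(
                "<customers>" +
                "<customer name='Ona' country='Lithuania'/>" +
                "<customer name='Bob' country='USA'/>" +
                "</customers>");
            Console.WriteLine(string.Join(" ", NamesInCountry(root, "Lithuania"))); // Ona
        }
    }
}
```

All three methods share the same from, where, and select shape; only the data source and the condition change.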

The select and where keywords lie at the heart of the query expressions you create in LINQ. There are other common keywords, such as join and group by, and third parties can supply their own implementations of the operations that stand behind these keywords. The only real limitation here is that you must query things that support the IEnumerable or IQueryable interface. Exactly what that means to us as programmers is a subject to be explored in later posts.
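To give a feel for how the behavior behind a keyword might be supplied by someone other than Microsoft, here is a small sketch. It assumes the behavior of the released C# compiler, which translates a where clause into a call to a method named Where and prefers an instance method on the source type over the standard extension method. The WordList class and its WhereCalls counter are hypothetical.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

namespace LinksToLinq
{
    // Hypothetical collection that supplies its own Where method.
    class WordList : IEnumerable<string>
    {
        private readonly List<string> words;

        // Counts how often the custom Where method runs on this instance.
        public int WhereCalls { get; private set; }

        public WordList(IEnumerable<string> words)
        {
            this.words = new List<string>(words);
        }

        // The where clause of a query over a WordList binds to this method.
        public WordList Where(Func<string, bool> predicate)
        {
            WhereCalls++;
            return new WordList(words.FindAll(w => predicate(w)));
        }

        public IEnumerator<string> GetEnumerator()
        {
            return words.GetEnumerator();
        }

        IEnumerator IEnumerable.GetEnumerator()
        {
            return GetEnumerator();
        }
    }

    static class CustomWhereDemo
    {
        static void Main()
        {
            WordList source = new WordList(new string[] { "space", "any", "exist" });

            // Same query shape as Listing Two, but it runs WordList.Where.
            WordList selected =
                from p in source
                where p != "any"
                select p;

            Console.WriteLine("Where was called " + source.WhereCalls + " time(s)");
            foreach (string word in selected)
            {
                Console.WriteLine(word);
            }
        }
    }
}
```

The query text is unchanged from the earlier examples; only the method that carries out the filtering is different.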

Summary

In this post I've outlined a few of the basics you need to know to get started with LINQ. As often happens when exploring a new topic, I've ended up raising nearly as many questions as I've answered. That's the way things should be though, and in future posts I'll dig into each of these questions, and explore LINQ in considerable depth.

In particular, I'll talk more about the new keywords select and where, and about that mysterious variable p that appears on lines 16, 17 and 18 of Listing Two. Querying XML and querying your own code are also subjects ripe for discussion.

LINQ is the big new feature in C#, and those of us who want to be ready for the future need to begin exploring this subject in earnest. We are on the verge of a huge change in the way we think about writing C# programs, and we can look forward to many exciting new developments that will open up lots of new territory.

