This is the sixth in a series of articles on LINQ. In this post the focus will be on the LINQ Set operators. Near the end of the post I include a short section on the importance of choosing the best operator for a particular task. Please see the links at the bottom of this post to retrieve the code.

### Set Operators

LINQ provides users with four Set operations:

- Distinct
- Union
- Intersect
- Except

We have already worked with the Distinct operator. As a result, in this post I will focus on **Union**, **Intersect**, and **Except**.

### Unions

Listing 1 shows an example of working with the **Union** operator. This code queries for all the operators of type **Aggregate** and all the operators of type **Conversion**. It then joins the two results in a simple union containing a list of all the **Aggregate** and **Conversion** operators.

**Listing 1: Performing a simple union with LINQ.**

```csharp
 1: private IEnumerable<string> GetOperatorTypeMembers(string operatorType)
 2: {
 3:     return from p in operatorList
 4:            where p.OperatorType == operatorType
 5:            select p.OperatorName;
 6: }
 7: 
 8: public void SimpleUnion()
 9: {
10:     var aggregateOperators = GetOperatorTypeMembers("Aggregate");
11:     var conversionOperators = GetOperatorTypeMembers("Conversion");
12:     var aggregatePlusConversion = aggregateOperators.Union(conversionOperators);
13: 
14:     foreach (var item in aggregatePlusConversion)
15:     {
16:         listBox.Add(item);
17:     }
18: }
```

In Listing 1, your focus should be on the second method, called **SimpleUnion**. The **GetOperatorTypeMembers** method is a simple helper function designed to retrieve data from the database with a simple query. In particular, it asks the question "select from the database the names of all the operators of a particular type." The type of operator to retrieve is passed in as a parameter. This type of query was discussed in more detail in earlier posts in this series.

On lines 10 and 11 you can see the code that retrieves all the operators of type **Aggregate** and of type **Conversion**. Line 12 has the simple LINQ code for performing a union between two sets.

Earlier in this series, in the post entitled "Using Distinct and Avoiding Lambdas," I explained that most LINQ operations should be performed with query expressions, such as the one found in the **GetOperatorTypeMembers** method. I then went on to explain that query expressions help us avoid having to compose lambdas. There is nothing innately wrong with lambdas, but they can be hard to write. As a result, most users will prefer using query expressions and avoiding the difficult task of composing their own lambdas. When calling the set operators, however, there is no need to use a lambda. As a result, we call them directly, as shown on line 12.
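To make the behavior of **Union** concrete without the database-backed **operatorList**, here is a minimal sketch. The two arrays and their contents are hypothetical stand-ins for the query results in Listing 1 (the overlap on "Min" is artificial), and the code is written as top-level statements, which assumes C# 9 or later.

```csharp
using System;
using System.Linq;

// Hypothetical stand-ins for the two query results in Listing 1.
// "Min" appears in both arrays on purpose, to show how duplicates are handled.
string[] aggregateOperators = { "Count", "Sum", "Min" };
string[] conversionOperators = { "Min", "ToArray", "ToList" };

// Union yields the elements of the first sequence, then those of the second,
// skipping any element it has already yielded.
var union = aggregateOperators.Union(conversionOperators).ToList();

Console.WriteLine(string.Join(", ", union));
// prints: Count, Sum, Min, ToArray, ToList
```

Note that the result contains "Min" only once: a union is a set, so it behaves differently from a plain concatenation of the two sequences.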

### Intersect

The next Set operator that I want to cover is called **Intersect**. Needless to say, this operator retrieves the *intersection* between two sets.

In Figure 2 you can see two sets. The first set shows all the operators that contain the letters "Wh", and the second set shows all the operators that contain the letter "k". The intersection between the two sets consists of the operators called **TakeWhile** and **SkipWhile**.

**Figure 2: Here you can see two sets, and below the second dotted line you can see their intersection.**

**Listing 4: Finding the Intersection between two sets.**

```csharp
 1: private IEnumerable<string> GetContains(string searchTerm)
 2: {
 3:     return from p in operatorList
 4:            where p.OperatorName.Contains(searchTerm)
 5:            select p.OperatorName;
 6: }
 7: 
 8: public void GroupPatterns()
 9: {
10:     var containsWh = GetContains("Wh");
11:     var containsK = GetContains("k");
12: 
13:     Utilities.Display(listBox, containsWh);
14:     listBox.Add("===============");
15:     Utilities.Display(listBox, containsK);
16:     listBox.Add("===============");
17:     var intersectData = containsK.Intersect(containsWh);
18: 
19:     foreach (var data in intersectData)
20:     {
21:         listBox.Add(data);
22:     }
23: }
```

The code in Listing 4 produces the output shown in Figure 2. At the top of the listing is a helper method whose **where** clause calls **Contains** on each operator name. The **GetContains** method is used on lines 10 and 11 to retrieve the operators that contain the letters "Wh" and the operators that contain the letter "k."

On line 17 you can see the call to the **Intersect** operator. It retrieves the intersection between the set called **containsK** and the set called **containsWh**. It's all pretty simple and straightforward.
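The same intersection can be reproduced in a self-contained sketch. The **operatorNames** array below is a hypothetical subset of operator names, not the article's actual **operatorList**, and the top-level statements assume C# 9 or later.

```csharp
using System;
using System.Linq;

// Hypothetical subset of operator names; the article's operatorList is not shown.
string[] operatorNames = { "Where", "Take", "Skip", "TakeWhile", "SkipWhile", "Select" };

// string.Contains is case-sensitive, so "k" does not match an uppercase "K".
var containsWh = from name in operatorNames
                 where name.Contains("Wh")
                 select name;
var containsK = from name in operatorNames
                where name.Contains("k")
                select name;

// Intersect keeps only the elements that appear in both sequences,
// in the order in which they occur in the first one.
var both = containsK.Intersect(containsWh).ToList();

Console.WriteLine(string.Join(", ", both));
// prints: TakeWhile, SkipWhile
```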

### The Except Operator

The **Except** operator is the mirror image of the **Union** operator. Instead of joining two sets together, the **Except** operator removes members from a set. More precisely, you can use it to subtract one set from an existing set.

In Listing 2 the code creates the union of three sets, and then uses the **Except** operator to show how to remove elements belonging to one of the three sets.

**Listing 2: A simple except statement.**

```csharp
 1: public void SimpleExcept()
 2: {
 3:     var aggregateOperators = GetOperatorTypeMembers("Aggregate");
 4:     var conversionOperators = GetOperatorTypeMembers("Conversion");
 5:     var setOperators = GetOperatorTypeMembers("Set");
 6:     var aggregatePlusConversionPlusSet =
 7:         aggregateOperators.Union(conversionOperators).Union(setOperators);
 8: 
 9:     Utilities.Display(listBox, aggregatePlusConversionPlusSet);
10: 
11:     listBox.Add("===============");
12: 
13:     var exceptData = aggregatePlusConversionPlusSet.Except(setOperators);
14: 
15:     Utilities.Display(listBox, exceptData);
16: }
```

In Listing 2 notice how calls to the **Union** operator are chained together on line 7 to create the union of three distinct sets. On line 13 the **Except** operator is used to return a set equal to the big union created on line 7 minus the items in **setOperators**.

Figure 1 first shows the union of all three sets. Then, after the dotted line you can see what the set looks like after you subtract the set operators. In particular, notice that **Distinct**, **Union**, **Intersect** and **Except** are missing from the second group of operators.

**Figure 1: The union of three sets is shown first. After the dotted line you see the same set minus the Set operators: Distinct, Union, Intersect and Except.**
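Here is a minimal sketch of the same subtraction using in-memory data. The three arrays are hypothetical stand-ins for the query results in Listing 2 (the aggregate and conversion lists are deliberately shortened), and the top-level statements assume C# 9 or later. The sketch also shows that, unlike **Union**, the order of the operands matters.

```csharp
using System;
using System.Linq;

// Hypothetical stand-ins for the three query results in Listing 2.
string[] aggregateOperators = { "Count", "Sum" };
string[] conversionOperators = { "ToArray", "ToList" };
string[] setOperators = { "Distinct", "Union", "Intersect", "Except" };

var all = aggregateOperators.Union(conversionOperators).Union(setOperators);

// Except yields the elements of the first sequence that are absent from the second.
var withoutSet = all.Except(setOperators).ToList();

// Swapping the operands asks a different question: which set operators
// are missing from the big union? None are, so the result is empty.
var reversed = setOperators.Except(all).ToList();

Console.WriteLine(string.Join(", ", withoutSet)); // prints: Count, Sum, ToArray, ToList
Console.WriteLine(reversed.Count);                // prints: 0
```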

### Choosing the Right Operator

I've now said all that I wanted to say about set operators. However, this post has been a little too simple, so why don't I close by talking a bit about the importance of choosing the right tools to help you compose the right query.

If you have an array of numbers from 1 to 9, then you might think it would be easy to use a mathematical formula in combination with the **Except** operator to remove the even numbers. You would then be left with the set of odd numbers between 1 and 9. You want to subtract one set from another set, so obviously you should use the **Except** operator. Right?

In practice things aren't always that simple. The LINQ Set operators work with sets. In particular, the **Except** operator expects to be passed a set.

You have the set of numbers between 1 and 9, but to subtract the even numbers with the **Except** operator you have to first create the set of even numbers. To create the set of even numbers, you would typically write a query expression that looks like this:

```csharp
int[] numbers = new int[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 };

// Select only the even numbers.
var evens = from p in numbers
            where (p % 2 == 0)
            select p;
```

You could now use **Except** to subtract the even numbers from the original set of numbers:

```csharp
var odds = numbers.Except(evens);
```

But if you can create the set of even numbers, then you could just as easily create the set of odd numbers and just skip calling **Except**:

```csharp
// Select the odd numbers directly.
var odds = from p in numbers
           where (p % 2 != 0)
           select p;
```

By changing the == operator to an != operator, we got the result we wanted directly without calling **Except**.
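The two approaches can be compared side by side in a short, self-contained sketch. The variable names **oddsViaExcept** and **oddsDirect** are mine, chosen for illustration, and the top-level statements assume C# 9 or later.

```csharp
using System;
using System.Linq;

int[] numbers = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };

// Two-step approach: build the set of evens, then subtract it.
var evens = from p in numbers
            where p % 2 == 0
            select p;
var oddsViaExcept = numbers.Except(evens);

// Direct approach: filter for the odd numbers in a single query.
var oddsDirect = from p in numbers
                 where p % 2 != 0
                 select p;

Console.WriteLine(string.Join(", ", oddsViaExcept));        // prints: 1, 3, 5, 7, 9
Console.WriteLine(string.Join(", ", oddsDirect));           // prints: 1, 3, 5, 7, 9
Console.WriteLine(oddsViaExcept.SequenceEqual(oddsDirect)); // prints: True
```

For this input both queries return the same sequence, but the direct version needs one pass over the data and one query instead of two.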

In this simple example, it is fairly obvious why calling the **Except** operator is unnecessary. But in more complex code, it is going to be easy for LINQ developers to start using the wrong operator to accomplish a particular task. By calling the "wrong" operator, we won't necessarily get the wrong answer, but we will be creating code that is both slower and more complex than necessary.

Consider the code in Listing 3. This combines the two chunks of code we were looking at earlier to get the set of odd numbers between 1 and 9.

**Listing 3: The query expression embedded in this Except statement returns a set expressed as an IEnumerable<T>.**

```csharp
// The embedded query produces the evens, which Except then subtracts.
var odds = numbers.Except(from p in numbers
                          where (p % 2 == 0)
                          select p);
```

In LINQ, a set is usually expressed as an **IEnumerable<T>**. In fact, the **Except** operator expects the set to subtract to be passed as an **IEnumerable<T>**.

We know that query expressions produce an **IEnumerable<T>**. Here I use a query expression to produce the set of even numbers, which is then subtracted from our existing set of numbers to create the set of odd numbers. This is what happens in the code seen in Listing 3.
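Because the argument only needs to be an **IEnumerable<T>**, it does not have to come from a query expression. The sketch below passes a **List<int>** instead. It also uses an input with repeated values, which is my own variation on the article's example, to show one way in which **Except** and a plain filter can differ. The top-level statements assume C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Unlike the article's 1..9 array, this input contains repeated values.
int[] numbers = { 1, 1, 2, 3, 3, 4 };

// Any IEnumerable<int> can be the argument: here a List<int> rather than a query.
List<int> evens = new List<int> { 2, 4 };

// Except has set semantics, so repeated values in the source collapse.
var viaExcept = numbers.Except(evens).ToList();

// A plain filter keeps every matching element, duplicates included.
var viaWhere = (from p in numbers
                where p % 2 != 0
                select p).ToList();

Console.WriteLine(string.Join(", ", viaExcept)); // prints: 1, 3
Console.WriteLine(string.Join(", ", viaWhere));  // prints: 1, 1, 3, 3
```

So the two forms are interchangeable only when the source has no duplicates, or when collapsing them is what you want.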

In some ways, this code is fairly compelling. It is relatively concise, and produces the results that we want. However, it is not optimal. In fact, the query expression passed to the **Except** operator would, if taken on its own, be sufficient to produce our results. All we would need to do is change the == operator to !=.

The approximately 50 LINQ operators represent a language that can be used to query data. There are lots of ways to combine these operators together to create the correct results. The best LINQ developers, however, will be the ones who can quickly understand which of these 50 operators they want to use, and what is the best way to combine them.

### Summary

This relatively simple post demonstrates how to use the Set operators to perform simple set operations on your data. The interesting thing about the Set operators is that they usually do not take lambdas, and as a result the designers of LINQ allow you to call them directly rather than asking you to call them through a query expression.

At the end of this post I took a little side trip which focused on the importance of selecting the right operators for a particular task. There is an art to writing LINQ queries, and the developers who are most adept at composing queries will be the ones who excel on this new frontier.

If you want to succeed you will almost certainly have to begin by becoming familiar with the LINQ query expression "language." To make the right choices, you need to know the various operators, and you need to have a fairly intuitive sense of how to optimally combine them. For most people, this is going to take a certain amount of practice. If you take the time to master query expressions, however, you will have gained a powerful skill that will significantly improve both the maintainability and the readability of your code.