Welcome to MSDN Blogs Sign in | Join | Help

Mitsu's blog

Discussing topics related to .Net, WPF, C# and Linq
Playing with Linq grouping: GroupByMany ?

Linq is a great technology to manage data directly from your .Net language.

One of its features is grouping. Many people understand grouping like it is defined in Sql. Linq is implementing grouping quite the same way. Let's discover this syntax and how to make consecutive groups easier.

Then we will show how to use WPF HierarchicalDataTemplate to expose the results in just a few lines.

Let's assume we have a collection of customer: 'customers'. You just have to use 'group by' to define groups among your data.

var q = from c in db.Customers group c by c.Country;

q then becomes an enumeration of groups (IQueryable<IGrouping<string, Customer>>).
Each item of this enumeration defines a group (IGrouping<string, Cucstomer>).

As we can see in its definition, IGrouping is just adding a few things:

- the key of the group (country in our sample).
- the items grouped by this common key. To retrieve these items, you have to browse the group which is an enumeration itself.

foreach (var g in q) { var city = g.Key; foreach (var c in g) Console.WriteLine(c.CompanyName); }

IGrouping<,> definition:
// Summary: // Represents a collection of objects that have a common key. // // Type parameters: // TKey: // The type of the key of the System.Linq.IGrouping<TKey,TElement>. // // TElement: // The type of the values in the System.Linq.IGrouping<TKey,TElement>. public interface IGrouping<TKey, TElement> : IEnumerable<TElement>, IEnumerable { // Summary: // Gets the key of the System.Linq.IGrouping<TKey,TElement>. // // Returns: // The key of the System.Linq.IGrouping<TKey,TElement>. TKey Key { get; } }

Most of the time, we are using groups to retrieve aggregations like sum or count.
To do this using Linq you just have to build a new Linq query on our first group query.
var q = from g in (from c in db.Customers group c by c.Country) select new { g.Key, Count = g.Count() };

To simplify this syntax, you can use the 'into' keyword and then make disappear the nested query. Do not forget the first syntax that makes more visible why 'c' is not reachable after the group statement. Like in Sql, once the data are grouped, you can only select properties from the group.

var q = from c in db.Customers group c by c.Country into g select new { g.Key, Count = g.Count() };

Now, let's try to create child groups inside this query. The goal is simple: I would like to group customers by Countries then by Cities inside each group. This is quite the same scenario than what we are doing when using pivot table in excel.

We can write it 'manually' nesting a second Linq query into the result of our first query:

var q = from c in db.Customers group c by c.Country into g select new { g.Key, Count = g.Count(), SubGroups = from c in g group c by c.City into g2 select g2};

The result is a tree of items grouped in a first level of countries groups and then each country group has a SubGroups property that stores the group of cities contained in each country.

Writing this will become less and less readable when the number of child groups will grow.
Moreover, it's quite hard to factorize this code as we have to insert a new query inside the last projection.
I wanted to make this scenario more simple and more generic. Here is the idea.

The first thing I have done is to create a fixed type to define a group.
This allows me to have a returnable type (anonymous types are not), so I can isolate my code in a method. Moreover, as I will use my method recursively, actually I had no choice!
Another reason was having the fixed non generic type GroupResult makes it easier for me to use WPF data binding (xaml does not support generic types).

public class GroupResult { public object Key { get; set; } public int Count { get; set; } public IEnumerable Items { get; set; } public IEnumerable<GroupResult> SubGroups { get; set; } public override string ToString() { return string.Format("{0} ({1})", Key, Count); } }

Ok, now let's write the main work. The GroupByMany method is extending IEnumerable<T> just like GroupBy does but you can add an undefined number of group selectors (params Func<TElement, TKey>[] groupSelectors).

If the number of group selectors is zero, then the method returns null. It's also what will stop the recursivity in the case of multiple selectors.

If the number of group selectors is greater than zero, then I isolate the first one and build a simple Linq GroupBy query using it. Each returned item is a GroupResult and I am calling recursively the GroupByMany method on the results of the group (g) to fill the SubGroups property. When calling this method, I am using the remaining unused group selectors, which will finish by being empty.

public static class MyEnumerableExtensions { public static IEnumerable<GroupResult> GroupByMany<TElement>( this IEnumerable<TElement> elements, params Func<TElement, object>[] groupSelectors) { if (groupSelectors.Length > 0) { var selector = groupSelectors.First(); //reduce the list recursively until zero var nextSelectors = groupSelectors.Skip(1).ToArray(); return elements.GroupBy(selector).Select( g => new GroupResult { Key = g.Key, Count = g.Count(), Items = g, SubGroups = g.GroupByMany(nextSelectors) }); } else return null; } }

Now, let's use it:
var result = customers.GroupByMany(c => c.Country, c => c.City);

The calling code is short and easy to read and the GroupByMany method is factorized enough to use it in many other cases.

Last step, let's try to display the result. WPF has a feature that I love: the possibility to associate a template to a data type. Usually, a template is stored in the resources and is indexed with a key, then the controls reference this template. Using the 'DataType' syntax without key, the template is automatically associated to any content control when the type of the content is corresponding to the DataType of the template.

The HierarchicalDataTemplate is a special template that allows you to define a collection of children (ItemsSource property) in addition to the regular DataTemplate definition.

Some hierarchical controls like the TreeView are using this template to build their structure recursively. So we have nothing more to do to display our multiple groupby results than connecting them to the treeview:

DataContext = customers.GroupByMany(c => c.Country, c => c.City);
<Window x:Class="WPFGroupingTemplate.Window1" xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation" xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml" xmlns:local="clr-namespace:WPFGroupingTemplate" Title="Window1" Height="600" Width="800" Loaded="Window_Loaded"> <Window.Resources> <HierarchicalDataTemplate DataType="{x:Type local:GroupResult}" ItemsSource="{Binding SubGroups}"> <TextBlock Text="{Binding}" /> </HierarchicalDataTemplate> </Window.Resources> <DockPanel> <TreeView x:Name="groups" Width="200" DockPanel.Dock="Left" ItemsSource="{Binding}" /> <ListView ItemsSource="{Binding ElementName=groups, Path=SelectedItem.Items}"> <ListView.View> <GridView> <GridViewColumn Header="Company name" DisplayMemberBinding="{Binding CompanyName}" /> <GridViewColumn Header="Contact name" DisplayMemberBinding="{Binding ContactName}" /> <GridViewColumn Header="Country" DisplayMemberBinding="{Binding Country}" /> <GridViewColumn Header="City" DisplayMemberBinding="{Binding City}" /> </GridView> </ListView.View> </ListView> </DockPanel> </Window>

In addition to the treeview, I have added a simple ListView to display the customers belonging to the current selected group.

image 

I let you evaluate the size of the code if you had to write the same program using Windows Forms and ADO.Net...

The source code attached is for Visual Studio 2008. Even if this sample is using Linq to object grouping, I have used a local Northwind database to populate my collection. You just need to modify the connection string in the app.config file to run the sample (or use any of your data sources of course).

Posted: Saturday, December 22, 2007 1:34 AM by mitsu
Filed under: , ,

Attachment(s): WPFGroupingTemplateDynamic.zip

Comments

Frans Bouma said:

I ran the linq query posted above using RTM linq to sql, but it results in a LOT of queries. This is logical: the main GROUP BY query ran on the DB can't contain 'City' in the projection list unless it's in the group by clause as well. So this isn't fetched. The nested group by requires 'City' so this is fetched PER CUSTOMER from the db.

Instead, you should group the customers on Country AND City using an anonymous type. Though the nested group by then becomes more difficult at first glance.

Linq to Sql can't merge parent/child resultsets inside a projection at runtime, so you won't get this efficiently using Linq to Sql in its current state.

# January 7, 2008 9:21 AM

Mitsu said:

I do not understand.

The whole grouping stuff is not running against Linq to Sql but Linq to object. This sample is only working in memory but I am using a Linq to Sql query to populate my collection.

I have just checked again but only one sql query is executed.

After 'q.ToList()', all the grouping stuff is not running Linq to Sql anymore.

Regarding your comment, you are right, the group can not contain city but I am not working on the group ! I am using the group to populate the children.

It's a Linq option that allows you to easilly retrieve the items of the group.

g.Key contains the result of one item of the 'group by' (not containing city or any other field anymore).

But g is a new IEnumerable<Customer> and g.ToList() gives you back all the customers of the group (you must consider it as a new Linq query).

Mitsu

# January 9, 2008 7:08 AM

Michael L. Harges said:

Wow, this is a very timely and informative post.  I work on a product that was built pre-Linq that has a GetTree method that builds a hierarchy of TreeGroup objects that look very similar to your GroupResult class.  I was wondering this past weekend how I could accomplish the same thing in Linq and your post showed me exactly what I need to do.  Thanks!

# January 9, 2008 7:21 PM

J said:

Can anyone convert this to VB?  I'm having trouble converting the lambda function correctly (I know VB uses the Function keyword, but I'm not sure how to use it in the context of this function.  Any ideas?

# January 15, 2008 9:55 AM

Jeremy Holt said:

Hi

How would you write this extremely simple SQL as LINQ?

Surely I don't have to write inner loops etc.

SELECT Commodity, Grade, SUM(Quantity) Quantity

FROM Contacts

GROUP BY Commodity, Grade

RESULTS:

Commodity Grade Quantity

Arg 2002 40/50 -352.20

Arg 2002 40/80 -3.00

Arg 2002 50/80 100.00

Arg 2004 38/50 -100.00

Arg 2004 60/70 0.00

Many thanks

Jeremy

# January 25, 2008 1:12 PM

mitsu said:

Hi Jeremy,

Try something like:

var q =

 from c in Commodity

 group c by new {c.Commodity, c.Grade } into g

 select new {

   g.Key.Commodity,

   g.Key.Grade,

   Quantity = g.Sum(c => c.Quantity) }

# January 26, 2008 12:34 PM

Jeremy Holt said:

Hi Mitsu,

That was great - many thanks

Jeremy

# January 27, 2008 7:31 AM

Jeremy Holt said:

Mitsu,

I just changed it into VB and have come up with an odd problem:

     Dim db As New AtrbeDataContext

       Dim funcQuantity = Function(x As ContractDetail) If(x.Contract.SellerID = 653, -x.Quantity, x.Quantity)

       Dim qry = From c In db.ContractDetails _

                 Where (c.Contract.CommodityGroupID = 12 And c.ShippedStatus = 0 And c.Contract.Status = 0 And c.Contract.SellerID = 653) Or _

                   (c.Contract.CommodityGroupID = 12 And c.ShippedStatus = 0 And c.Contract.Status = 0 And c.Contract.BuyerID = 653) _

                 Group c By Commodity = c.Contract.Commodity, Grade = c.Grade Into g = Group _

                 Select Commodity, _

                           Grade, _

                           Quantity = g.Sum(Function(x As ContractDetail) If(x.Contract.SellerID = 653, -x.Quantity, x.Quantity)) _

                 Order By Commodity

This works fine and gives the expected results. However, if I replace the g.Sum(....) with funcQuantity (which of course is identical) the query fails with a NotSupportedException - "Unsupported overload used for query operator 'Sum'.

Any idea why this would be?

Jeremy

# January 27, 2008 8:06 AM

mitsu said:

Ok,

You must understand that Linq to Sql is analyzing the code (an expression tree) to translate it to a sql query. So Linq to Sql is only recognizing what Linq defines (actually, Linq to sql recognizes a subset of Linq methods). Your query fails here because funcQuantity can not be translated into slq by Linq to Sql.

# January 27, 2008 12:28 PM

Ryan Versaw said:

Depending on what you're using this for, it can be quite useful to strongly type your results. Keep in mind that this only works for a single Key type (just as the solution above does). It only takes a few small changes to do so: (hopefully this is displayed well...)

public class GroupResult<TKey, TItem>

{

public TKey Key { get; set; }

public int Count { get; set; }

public IEnumerable<TItem> Items { get; set; }

public IEnumerable<GroupResult<TKey, TItem>> SubGroups { get; set; }

public override string ToString()

{

 return string.Format("{0} ({1})", Key, Count);

}

}

public static class MyEnumerableExtensions

{

public static IEnumerable<GroupResult<TKey, TElement>> GroupByMany<TKey, TElement>(

 this IEnumerable<TElement> elements, params Func<TElement, TKey>[] groupSelectors)

{

 if (groupSelectors.Length > 0)

 {

  var selector = groupSelectors.First();

  //reduce the list recursively until zero

  var nextSelectors = groupSelectors.Skip(1).ToArray();

  return

   elements.GroupBy(selector).Select(

    g => new GroupResult<TKey, TElement>

    {

     Key = g.Key,

     Count = g.Count(),

     Items = g,

     SubGroups = g.GroupByMany(nextSelectors)

    });

 }

 else

  return null;

}

}

# January 28, 2008 11:08 AM

Ryan Versaw said:

After re-reading a little of your post, I see you mentioned that xaml does not support generic types. I suppose my previous code probably won't work for anyone using this for WPF, but for others it might be useful.

# January 28, 2008 12:40 PM

mitsu said:

Hi Ryan,

Yes, xaml does not support generic types. Of course WPF does !! What you just cannot do is using the type name inside the xaml syntax.

Regarding your interesting proposition, it seems that there is a problem. In your solution TKey is inferred from 'params Func<TElement, TKey>[] groupSelectors'. This means that all the group definitions are returning the same type. So you could write:

q.GroupByMany(c => c.Country, c => c.City) because Country and City are both strings but you could not write

q.GroupByMany(c => c.Country, c => c.DateProperty)

# January 29, 2008 4:43 AM

Shawn said:

I've been impressed by this solution but I am trying to understand how it could be made to allow a user to select one of the sorting fields using something like a combobox. Any ideas?

var result = customers.GroupByMany(c => c.GetProperty(ComboBox1.SelectedValue), c => "City");

# February 6, 2008 11:48 AM

Mitsu's blog said:

Shawn asked me in my last post about GroupByMany how to use it dynamically. The answer is not easy. So

# February 6, 2008 6:04 PM

mitsu said:

# February 6, 2008 6:12 PM

Noticias externas said:

Shawn asked me in my last post about GroupByMany how to use it dynamically. The answer is not easy. So

# February 6, 2008 6:14 PM

Shawn said:

Wow Mitsu, Thank you so much for your post. I've found both to be very enlightening and informative. As I learn more about LINQ and Lambda expressions,  articles like these really showcase the power behind these new features and what is now possible. Thanks again.

Shawn

# February 7, 2008 10:21 AM

Dharmender said:

Can u opls let me know the nested grouping of multiple level with this no tree view involved instead an expeneder showing the groupin n subgrouping

# February 8, 2008 4:40 AM

墙头草 said:

很大程度上来讲,LINQ的工作方式很像SQL。查询源定义、join、select和where子句都是相当标准的,而Group/By/Into子句

则正是我们这里要仔细说明的。SQL总是返回矩阵式数据...

# April 2, 2008 6:22 AM

Soyka said:

Thanks for the article! It helped me resolve a problem I was stuck with for a whole day! Keep up the good work..

# August 7, 2008 6:52 AM

mitsu said:

You're welcome. I am happy to see people using it.

# August 7, 2008 9:32 AM

Leonardo said:

Hi

I am trying to do dynamic subgroups on a xceed query with linq.

All is working fine,my only problem is like to find a way to dynamic to build the linq query.

Code:

private void UserControl_DataContextChanged(object sender, DependencyPropertyChangedEventArgs e)

       {

           if (e.NewValue is COP.GUI.Model.VMFund)

           {

               var activeFund = e.NewValue as COP.GUI.Model.VMFund;

               //Populate the FundDetails collection with all the funds from activeFund.

               var NewData = activeFund.FundDetails.ToList();

               //GroupByMany allows you to define a list of subgroups in a single call.

               //Dynamic solution

               string dealer = "DealerName";

               string portfolio = "Portfolio";

               string ticker = "Ticker";

               var resultDealerGroup = NewData.GroupByMany(dealer, portfolio, ticker);

               var resultDealer = resultDealerGroup.Select(t =>

                   new

                   {

                       DealerName = t.Key.ToString(),

                       IM = (from d in t.Items as IEnumerable<COP.GUI.MT.Proxy.FundDealer.FundDealerDetailRow>

                             select d.IM).Sum(),

                       VM = (from d in t.Items as IEnumerable<COP.GUI.MT.Proxy.FundDealer.FundDealerDetailRow>

                             select d.VM).Sum(),

                       Notional = (from d in t.Items as IEnumerable<COP.GUI.MT.Proxy.FundDealer.FundDealerDetailRow>

                                   select d.Notional).Sum(),

                       Trades = t.Count,

                       Portfolio = from d in t.SubGroups as IEnumerable<GroupResult>

                                   select new

                                   {

                                       Portfolio = d.Key.ToString(),

                                       IM = (d.Items as

                                       IEnumerable<COP.GUI.MT.Proxy.FundDealer.FundDealerDetailRow>).Select(g => g.IM).Sum(),

                                       VM = (d.Items as

                                       IEnumerable<COP.GUI.MT.Proxy.FundDealer.FundDealerDetailRow>).Select(g => g.VM).Sum(),

                                       Notional = (d.Items as

                                       IEnumerable<COP.GUI.MT.Proxy.FundDealer.FundDealerDetailRow>).Select(g => g.VM).Sum(),

                                       Trades = d.Count,

                                       Ticker = from s in d.SubGroups as IEnumerable<GroupResult>

                                                select new

                                                {

                                                    Ticker = d.Key.ToString(),

                                                    IM = (s.Items as

                                                    IEnumerable<COP.GUI.MT.Proxy.FundDealer.FundDealerDetailRow>).Select(g => g.IM).Sum(),

                                                    VM = (s.Items as

                                                    IEnumerable<COP.GUI.MT.Proxy.FundDealer.FundDealerDetailRow>).Select(g => g.VM).Sum(),

                                                    Notional = (s.Items as

                                                    IEnumerable<COP.GUI.MT.Proxy.FundDealer.FundDealerDetailRow>).Select(g => g.VM).Sum(),

                                                    Trades = s.Count

                                                }

                                   }

                   });

               var dgCollView = this.FindResource("dgViewSource") as DataGridCollectionViewSource;

               dgCollView.Source = resultDealer;

               tradesGrid.ItemsSource = dgCollView.View;      

           }

       }

Is working correctly but the user must to select from a menu with checkboxes the subgroups like dealer and portfolio and Ticker

An iodea about like do do dynamic te code of the linq query depending of user chooise.???!!!

Thanks in advance

Leonardo

# December 19, 2008 4:29 PM

Steve said:

Thanks for the post!

Very informative

# April 2, 2009 3:43 PM

jhigaki said:

Hey mitsu and all the other folks... a question regarding this expression

/****************************************/

var q =

from c in Commodity

group c by new {c.Commodity, c.Grade } into g

select new {

  g.Key.Commodity,

  g.Key.Grade,

  Quantity = g.Sum(c => c.Quantity) }

/****************************************/

Here the grouping criteria is an anonymous type {c.Commodity, c.Grade }

I need it to be a DEFINED TYPE as

public class GroupingCriteria

{

 public string commodity;

 public string grade;

}

I've tried so far coding the Equals and CompareTo from implementing

IEquatable<GroupingCriteria> and IComparable<GroupingCriteria> but it doesn't seem to work.

I need this because Dotfuscator CE won't work with anonymous types expressions.

Any ideas?

# May 15, 2009 11:13 AM

Mitsu said:

Hi,

Did you try using properties instead of just fields ?

public class GroupingCriteria

{

public string commodity {get; set;};

public string grade {get; set;};

}

# May 16, 2009 3:03 PM

Polo H said:

Can you help me understand or point me in the right direction to solve my issue? I’m trying to subgroup a Datatable(all I am given) using LINQ. I am trying to go several layers deep(5).

I have tried several things but everything falls apart after a few levels. I’m sure I just don’t understand it that well yet. I have tried:

var query = from nRow in newDT2.AsEnumerable()

group nRow by nRow.Field(”agency_group”) into agency_group

select new

{

ag = agency_group.AsEnumerable(),

agK = agency_group.Key,

agency_names =

from agency_names in agency_group

group agency_names by agency_names.Field(”agency_name”) into agency_names_group

select new

{

an = agency_names_group.AsEnumerable(),

anK = agency_names_group.Key,

div_depts =

from div_depts in agency_names_group

group div_depts by div_depts.Field(”div_dept”) into div_depts_group

select new

{

dd = div_depts_group.AsEnumerable(),

ddK = div_depts_group.Key,

unit_sections =

from unit_section in div_depts_group

group unit_section by unit_section.Field(”unit_section”) into unit_section_groups

select new

{

us = unit_section_groups.AsEnumerable(),

usK = unit_section_groups.Key,

group_unit_req =

from gur in unit_section_groups

group gur by gur.Field(”group_unit_requirement”)

}

}

}

};

and

var query = from nRow in newDT.AsEnumerable()

group nRow by nRow.Field(”agency_group”) into agency_group

from agency_name in

(from nRow in agency_group

group nRow by nRow.Field(”agency_name”) into agency_name_group

from division_name in

(from nRow in agency_name_group

group nRow by nRow.Field(”div_dept”))

group division_name by agency_name_group)

group agency_name by agency_group;

But something just isn’t working for me. I’ve looked all over and can’t find anything that goes that deep. Any help you can provide is greatly appreciate.

Thanks for your time.

# June 30, 2009 2:59 PM

Mitsu said:

Hi Polo,

Hard to say, could you send me your sample at mitsufu@microsoft.com ?

# June 30, 2009 4:37 PM

Steven White said:

The code worked perfectly and is very useful. Thanks for publishing it. :-)

# August 17, 2009 2:57 AM

nmducit said:

I want use Group by Linq in Mvc.

# September 16, 2009 9:24 PM

Alex said:

Here is the VB.NET translation for whoever is wondering:

   Public Class GroupResult

       Public Key As Object

       Public Count As Integer

       Public Items As IEnumerable

       Public SubGroups As IEnumerable(Of GroupResult)

       Public Overrides Function ToString() As String

           Return String.Format("{0} ({1})", Key, Count)

       End Function

   End Class

   Module Extensions

       <Extension()> _

       Public Function GroupByMany(Of TElement)(ByRef elements As IEnumerable(Of TElement), ByVal ParamArray groupSelectors As Func(Of TElement, Object)()) As IEnumerable(Of GroupResult)

           If (groupSelectors.Length > 0) Then

               Dim selector = groupSelectors.First()

               Dim nextSelectors = groupSelectors.Skip(1).ToArray()

               Return elements.GroupBy(selector).Select(Function(g) New GroupResult With {.Key = g.Key, .Count = g.Count(), .Items = g, .SubGroups = g.GroupByMany(nextSelectors)})

           Else

               Return Nothing

           End If

       End Function

   End Module

# September 23, 2009 9:07 AM
Leave a Comment

(required) 

(required) 

(optional)

(required) 

  
Enter Code Here: Required

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Page view tracker