August, 2008

  • Eric White's Blog

    Find Duplicates using LINQ

    Sometimes you need to find the duplicates in a list. I’m currently developing a small utility that tests code in a word-processing document (now posted). Each code snippet in the document has an identifier, and one of the rules that I’m imposing on this utility is that there must be no duplicate identifiers in the set of documents that contain code snippets to be tested. An easy way to find duplicates is to write a query that groups by the identifier and then filters for groups that have more than one member. In the following example, we want to find out that 4 and 3 are duplicates:

    int[] listOfItems = new[] { 4, 2, 3, 1, 6, 4, 3 };
    var duplicates = listOfItems
        .GroupBy(i => i)
        .Where(g => g.Count() > 1)
        .Select(g => g.Key);
    foreach (var d in duplicates)
        Console.WriteLine(d);
     

    This produces the following:

    4
    3
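
    The same query shape carries over to the snippet-identifier rule described at the top of the post. The following is only a sketch under assumptions: the Snippet class, the FindDuplicateIds method, and the sample identifiers and document names are hypothetical and are not taken from the actual utility. It groups snippets by identifier, keeps the groups with more than one member, and also reports which documents contain each duplicate:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model of a code snippet: its identifier and the document it came from.
public class Snippet
{
    public string Id { get; set; }
    public string Document { get; set; }
}

public static class DuplicateFinder
{
    // Returns one entry per duplicated identifier, paired with the
    // documents in which that identifier appears.
    public static List<KeyValuePair<string, string[]>> FindDuplicateIds(
        IEnumerable<Snippet> snippets)
    {
        return snippets
            .GroupBy(s => s.Id)                 // one group per identifier
            .Where(g => g.Count() > 1)          // keep identifiers used more than once
            .Select(g => new KeyValuePair<string, string[]>(
                g.Key, g.Select(s => s.Document).ToArray()))
            .ToList();
    }

    public static void Main()
    {
        // Made-up sample data for illustration only.
        var snippets = new[]
        {
            new Snippet { Id = "S001", Document = "Intro.docx" },
            new Snippet { Id = "S002", Document = "Intro.docx" },
            new Snippet { Id = "S001", Document = "Advanced.docx" },
        };

        foreach (var d in FindDuplicateIds(snippets))
            Console.WriteLine("{0}: {1}", d.Key, string.Join(", ", d.Value));
        // prints "S001: Intro.docx, Advanced.docx"
    }
}
```

    Returning the documents along with the key makes the error message actionable: it says where each duplicate lives, not just that one exists.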
