LINQ join Clause - Group Join

With the join clause, you can also perform group joins. A group join groups the items from the inner data source by their corresponding item from the outer data source. For example, all the books written by John Smith will be grouped together, and all the books written by Harry Gold will form a separate group.

The diagram below shows how group join works.

Figure 1

All the inner items that share a common key and match a key in the outer data source are grouped together to form one group. As you can see, the result of the group join is a collection of groups, one for each key. Again, for example, the key could be the author of the book. You can group a collection of books by author, and the result will be a collection of books grouped by author. Any outer item that has no matching inner items produces an empty group, which is still included in the result.
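To make the grouping behavior concrete, here is a minimal sketch that mimics a group join by hand. It is only an illustration of the semantics described above, not a description of how LINQ implements the join internally. The names GroupJoinSketch and ManualGroupJoin and the tuple-based sample data are hypothetical and were made up for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupJoinSketch
{
    // Hypothetical helper: pairs every outer item with the list of inner
    // items that share its key.
    public static IEnumerable<(string Outer, List<string> Group)> ManualGroupJoin(
        IEnumerable<(int Key, string Name)> outer,
        IEnumerable<(int Key, string Title)> inner)
    {
        // Index the inner items by key once.
        ILookup<int, string> byKey = inner.ToLookup(i => i.Key, i => i.Title);

        foreach (var o in outer)
        {
            // A lookup returns an empty sequence for a missing key, so an
            // outer item with no matches still yields an (empty) group.
            yield return (o.Name, byKey[o.Key].ToList());
        }
    }

    public static void Main()
    {
        var authors = new[] { (Key: 1, Name: "Ann"), (Key: 2, Name: "Bob") };
        var books = new[] { (Key: 1, Title: "First"), (Key: 1, Title: "Second") };

        foreach (var (name, group) in ManualGroupJoin(authors, books))
        {
            Console.WriteLine("{0}: {1} item(s)", name, group.Count);
        }
        // Prints:
        // Ann: 2 item(s)
        // Bob: 0 item(s)
    }
}
```

Note that Bob appears in the output with an empty group, which is the behavior that distinguishes a group join from an ordinary inner join.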

Let’s take a look at an example of a group join. We will define two classes named Author and Book.

class Author
{
    public int AuthorId { get; set; }
    public string Name { get; set; }
}

class Book
{
    public int AuthorId { get; set; }
    public string Title { get; set; }
}

The following code contains a query expression that performs a group join using the join clause.

Author[] authors = new Author[]
{
    new Author() { AuthorId = 1, Name = "John Smith" },
    new Author() { AuthorId = 2, Name = "Harry Gold" },
    new Author() { AuthorId = 3, Name = "Ronald Schwimmer" },
    new Author() { AuthorId = 4, Name = "Jerry Mawler" }
};

Book[] books = new Book[]
{
    new Book() { AuthorId = 1, Title = "Little Blue Riding Hood" },
    new Book() { AuthorId = 3, Title = "The Three Little Piggy Banks" },
    new Book() { AuthorId = 1, Title = "Snow Black" },
    new Book() { AuthorId = 2, Title = "My Rubber Duckie" },
    new Book() { AuthorId = 2, Title = "He Who Doesn't Know His Name" },
    new Book() { AuthorId = 1, Title = "Hanzel and Brittle" }
};

// Group join: collect each author's books into booksByAuthor.
var result = from a in authors
             join b in books on a.AuthorId equals b.AuthorId into booksByAuthor
             select new { Author = a.Name, Books = booksByAuthor };

foreach (var r in result)
{
    Console.WriteLine("Books written by {0}:", r.Author);

    foreach (var b in r.Books)
    {
        Console.WriteLine("---{0}", b.Title);
    }
}

Example 1

Books written by John Smith:
---Little Blue Riding Hood
---Snow Black
---Hanzel and Brittle
Books written by Harry Gold:
---My Rubber Duckie
---He Who Doesn't Know His Name
Books written by Ronald Schwimmer:
---The Three Little Piggy Banks
Books written by Jerry Mawler:

Take a look at the join clause in the query expression. It matches each book from the books data source to the author in the authors data source whose AuthorId equals the AuthorId of the book. The into keyword, followed by a grouping variable, is what makes this a group join. All the inner items whose key matches an outer item’s key are grouped together and stored in the grouping variable. The select clause projects each result into a new object with an Author property, set to the author's name, and a Books property, set to the grouping variable. The result of the query expression is therefore a collection with one element per author, each holding that author's group of books.

The nested foreach loops after the query display the results. The outer foreach loop prints the name of each author. An inner foreach loop then iterates through each Book in that result's Books property. Remember that this property contains the collection of books grouped together for a particular author. As you can see in the output, Jerry Mawler has written no books, so his Books property is empty and no titles are shown.
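Because the grouping variable is an ordinary sequence, it can also be aggregated inside the select clause instead of being iterated later. The following is a small sketch, assuming the same Author and Book classes as above, that counts each author's books and handles the empty group explicitly. The BookCounts class and the Summarize method are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Author
{
    public int AuthorId { get; set; }
    public string Name { get; set; }
}

public class Book
{
    public int AuthorId { get; set; }
    public string Title { get; set; }
}

public static class BookCounts
{
    // Hypothetical helper: one summary line per author, including
    // authors whose group of books is empty.
    public static List<string> Summarize(IEnumerable<Author> authors, IEnumerable<Book> books)
    {
        var query = from a in authors
                    join b in books on a.AuthorId equals b.AuthorId into booksByAuthor
                    select new { a.Name, Count = booksByAuthor.Count() };

        return query
            .Select(r => r.Count == 0
                ? r.Name + " has no books"
                : r.Name + " wrote " + r.Count + " book(s)")
            .ToList();
    }

    public static void Main()
    {
        var authors = new[]
        {
            new Author { AuthorId = 1, Name = "John Smith" },
            new Author { AuthorId = 4, Name = "Jerry Mawler" }
        };
        var books = new[]
        {
            new Book { AuthorId = 1, Title = "Snow Black" },
            new Book { AuthorId = 1, Title = "Hanzel and Brittle" }
        };

        foreach (var line in Summarize(authors, books))
        {
            Console.WriteLine(line);
        }
        // Prints:
        // John Smith wrote 2 book(s)
        // Jerry Mawler has no books
    }
}
```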

The GroupJoin method is the method-syntax equivalent of a join clause that uses the into keyword. The equivalent query using the GroupJoin method is shown below:

var result = authors.GroupJoin(books,
                               author => author.AuthorId,
                               book => book.AuthorId,
                               (author, booksByAuthor) => 
                                new { Author = author.Name, Books = booksByAuthor });

The first argument is the inner data source that will be joined to the outer data source, which is the collection the method is called on (authors). The second argument is a lambda expression that selects the outer key to use for joining. The third argument selects the inner key. The final argument receives each outer item together with its group of matching inner items and projects the result.
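One way to keep the roles of these arguments straight is to pass them by name. The sketch below assumes the same Author and Book classes as above and uses the parameter names declared by Enumerable.GroupJoin (inner, outerKeySelector, innerKeySelector, resultSelector). The NamedGroupJoin class and the TitlesByAuthor method are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Author
{
    public int AuthorId { get; set; }
    public string Name { get; set; }
}

public class Book
{
    public int AuthorId { get; set; }
    public string Title { get; set; }
}

public static class NamedGroupJoin
{
    // Hypothetical helper: returns one "Name: title, title" line per author.
    public static List<string> TitlesByAuthor(IEnumerable<Author> authors, IEnumerable<Book> books)
    {
        return authors.GroupJoin(
            inner: books,                                   // inner data source
            outerKeySelector: author => author.AuthorId,    // key taken from each author
            innerKeySelector: book => book.AuthorId,        // key taken from each book
            resultSelector: (author, booksByAuthor) =>      // author plus its group of books
                author.Name + ": " + string.Join(", ", booksByAuthor.Select(b => b.Title)))
            .ToList();
    }

    public static void Main()
    {
        var authors = new[]
        {
            new Author { AuthorId = 1, Name = "John Smith" },
            new Author { AuthorId = 3, Name = "Ronald Schwimmer" }
        };
        var books = new[]
        {
            new Book { AuthorId = 1, Title = "Snow Black" },
            new Book { AuthorId = 3, Title = "The Three Little Piggy Banks" },
            new Book { AuthorId = 1, Title = "Hanzel and Brittle" }
        };

        foreach (var line in TitlesByAuthor(authors, books))
        {
            Console.WriteLine(line);
        }
    }
}
```

Named arguments are optional here; the positional form shown earlier behaves the same way.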