How to: Use the Group Clause in LINQ
Posted Wednesday, March 26, 2008 9:29 AM by CoreyRoth
As you know, LINQ is similar in a lot of ways to T-SQL, but once you start grouping or joining, you will find that there are some syntactical differences. There are two ways to use the group clause in LINQ, and which one you use depends on your particular needs. I'll start by explaining the way I typically use it. In this scenario, I want to group rows by a particular column and also get a count of how many rows have each value. Basically, we are creating the equivalent of the following T-SQL statement.
SELECT ProductName, COUNT(*) AS ProductCount FROM MyTable GROUP BY ProductName
A query expression has to end with either a select clause or a group clause, so when you use the group clause on its own it has to be the last clause in the query. Basically, it takes the place of the select clause. I'll talk about that form in a minute. To support a scenario more similar to the T-SQL syntax, you use the into keyword, which lets the query continue after the grouping. Here is what the LINQ query would look like. In this case we are using LINQ to DataSet as the source.
var productGroups = from row in myDateTable.AsEnumerable()
                    group row by row.Field<string>("ProductName") into rowGroup
                    select new
                    {
                        Name = rowGroup.Key,
                        Count = rowGroup.Count()
                    };
The syntax of the group clause specifies what to group (row), how to group it (row.Field<string>("ProductName")) and where to put that grouping (rowGroup) so that you can use it. The difference between this and T-SQL is the into clause, which is required in order to do anything further with the grouping, such as projecting it into an anonymous type. The variable rowGroup is of type IGrouping<string, DataRow>, with string being the type of the key we grouped on. Once the data is grouped into rowGroup, this variable can be used to store the name and count in a new anonymous type. The Key property contains the value we grouped on and the Count() method gives us the number of rows in that group. So far when I have grouped in LINQ, I have found this to be the most common scenario. Typically I want an IEnumerable<> of some sort of anonymous type that I can iterate over or bind to.
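To try the group ... into query end to end, here is a minimal sketch, assuming a small in-memory DataTable with made-up rows. The class name and the BuildSampleTable and CountByProduct methods are hypothetical names used only for illustration, and copying the result into a Dictionary is just one convenient way to hand the counts back. On the .NET Framework, AsEnumerable() and Field<T>() need a reference to System.Data.DataSetExtensions.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class GroupIntoExample
{
    // Hypothetical helper: builds a table with a ProductName column
    // and a few made-up sample rows.
    public static DataTable BuildSampleTable()
    {
        var table = new DataTable("MyTable");
        table.Columns.Add("ProductName", typeof(string));
        table.Rows.Add("Widget");
        table.Rows.Add("Gadget");
        table.Rows.Add("Widget");
        return table;
    }

    // Groups the rows by ProductName and returns the row count per name.
    public static Dictionary<string, int> CountByProduct(DataTable table)
    {
        var productGroups = from row in table.AsEnumerable()
                            group row by row.Field<string>("ProductName") into rowGroup
                            select new
                            {
                                Name = rowGroup.Key,      // the value we grouped on
                                Count = rowGroup.Count()  // rows in this group
                            };

        return productGroups.ToDictionary(g => g.Name, g => g.Count);
    }

    public static void Main()
    {
        foreach (var pair in CountByProduct(BuildSampleTable()))
        {
            Console.WriteLine("{0}: {1}", pair.Key, pair.Value);
        }
    }
}
```

With the sample rows above, Widget should come back with a count of 2 and Gadget with a count of 1, which is what the T-SQL GROUP BY would give for the same data.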
If you don't want an anonymous type at this point, you can use the other form of the group clause, without the into keyword. In that case the LINQ query returns a sequence of groups, an IEnumerable<IGrouping<string, DataRow>>. You can then either query it again with LINQ or use a nested loop to work with the data.
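var productGroups = from row in myDateTable.AsEnumerable()
                    group row by row.Field<string>("ProductName");

// Outer loop: one IGrouping per distinct ProductName (available as productGroup.Key)
foreach (IGrouping<string, DataRow> productGroup in productGroups)
{
    // Inner loop: the DataRows that share that ProductName
    foreach (var row in productGroup)
    {
        // do something with row here
    }
}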
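The other option mentioned above, querying the groups again with LINQ, could look something like the following. This is only a sketch under my own assumptions: the class name, the BuildSampleTable and NamesWithDuplicates methods, and the sample rows are all hypothetical, and the "more than one row" filter is just an example of something you might ask of the groups.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class RequeryGroupsExample
{
    // Hypothetical helper: a table with a ProductName column and made-up rows.
    public static DataTable BuildSampleTable()
    {
        var table = new DataTable("MyTable");
        table.Columns.Add("ProductName", typeof(string));
        foreach (var name in new[] { "Widget", "Gadget", "Widget", "Sprocket", "Sprocket" })
        {
            table.Rows.Add(name);
        }
        return table;
    }

    // Returns the product names that appear on more than one row, sorted by name.
    public static List<string> NamesWithDuplicates(DataTable table)
    {
        // First query: ends with the group clause, so there is no into or select.
        IEnumerable<IGrouping<string, DataRow>> productGroups =
            from row in table.AsEnumerable()
            group row by row.Field<string>("ProductName");

        // Second query: each IGrouping is treated as an ordinary sequence element.
        var repeated = from productGroup in productGroups
                       where productGroup.Count() > 1
                       orderby productGroup.Key
                       select productGroup.Key;

        return repeated.ToList();
    }

    public static void Main()
    {
        foreach (var name in NamesWithDuplicates(BuildSampleTable()))
        {
            Console.WriteLine(name);
        }
    }
}
```

Splitting the work into two queries like this is mostly useful when the groups themselves get reused in more than one place; if all you need is one projection, the into form is shorter.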
I don't find this form nearly as useful, but some may prefer it. Hopefully this will help the next time you go to group something with LINQ. It's not really that complicated, but I thought it would be worth pointing out some of the differences. I'll probably cover left outer joins pretty soon, because I have done them a few times now and the syntax still trips me up every time.