My 500th Post! Left Outer Joins with LINQ
Posted
Wednesday, October 15, 2008 2:43 PM
by
CoreyRoth
Today, this blog reaches an important milestone. This is my 500th post. It's just a number, but I thought I would use it as an opportunity to talk about the history of the blog and where it is heading. On December 1st, 2004, I wrote my first .NET Tip of the Day about a great new method on primitives called TryParse in the upcoming release of ASP.NET 2.0. This blog started out as a simple announcements list built on a SharePoint intranet site so that I could educate a team of developers on emerging technology (at the time it was .NET 2.0 and SQL Server 2005). When I left that company, I wanted to keep the concept going so I exported all of the content and created a home-grown blogging engine at dotnettipoftheday.com. Since then, I have been committed to giving back to the community by providing useful information that will help developers do their every day jobs. My blog has always focused on how to do something not to give my opinion on something that I may or may not know anything about. Admittedly, I haven't been able to keep up with the daily thing of posting (although I did a pretty good job back then), but I still make an effort to post a few things a week.
Sometime in 2007, the term DotNetMafia was coined by the godfather himself Kyle Kelin. Although the original idea for the mafia has evolved, it turned out to be a great way to get blogs from several of us onto one web site, dotnetmafia.com, which ran on DotNetNuke. The site worked ok for us for a while, but it was lacking. Later that year, I imported everything (well almost everything) into a new Community Server site where we are today. In case you haven't seen the rest of the DotNetMafia bloggers, be sure and check out Tony Kilhoffer, James Ashley, Cory Robinson, and Kevin Williams. They post on a variety of topics (including some outside of .NET).
In nearly 4 years, I have brought you 500 posts. 112 of those posts have been on SharePoint. Obviously you can see that I have shifted focus some. The main reason is I blog about the things I am working with on a daily basis. Right now, that has been SharePoint. Although I still find time to blog about LINQ and other things from time to time. Let's face it though, the SharePoint community needs all the help it can get. Sites like SharePointBlogs.com (which this feed is syndicated too), have made a huge difference in helping people find the right information. This is why I am committed to posting every nuance I find in the product, so that the next SharePoint developer coming along doesn't have to spend time trying to figure it out.
I am pleased with how the site has grown to date. In four years, we have gone from having a couple of internal users a month, to 200 - 300 unique visitors a day. 65% of that traffic each day comes from search. The rest are from RSS subscriptions, referrals, and my CodePlex project. For now, I plan to keep the content coming and encourage others to blog too. I am committed to helping the community even more and hope to continue to grow the site and organize meetups. I want to thank all of my friends, all two of my readers, Copy Source as HTML, and everyone that has left a comment. I really appreciate it.
Enough fluff for now though, let's get to some content. I am going to step away from SharePoint for today's post and try to hit a broader audience talking about LINQ.
Left Outer Joins with LINQ
The topic I am talking about today is doing left outer joins with LINQ. Doing a join in LINQ is pretty simple, however doing an outer join is a bit more tricky because the way you do it doesn't follow the way you would think of it using T-SQL. The documentation on LINQ covers how to do this, but it doesn't do a great job explaining what you are doing and why. For this example let's say we have two classes Product and Inventory. The Product class has the following properties Name, ProductId, and Price. The Inventory class has the properties ProductId and Count. We want to get a list of all products and know how many we have in inventory but unfortunately the Inventory table isn't complete and it doesn't have data on all products. We still want the product to return in a query though.
First, for the purpose of example, I populate two lists. This could just as easily come from an XML document or SQL. In this case we have three products and two items containing inventory, but info is missing for one particular product.
List<Product> productList = new List<Product>();
productList.Add(new Product() { Name = "Product 1", Price = 10.99f, ProductId = 2 });
productList.Add(new Product() { Name = "Product 2", Price = 1.99f, ProductId = 5 });
productList.Add(new Product() { Name = "Product 3", Price = 3.99f, ProductId = 9 });
List<Inventory> inventoryList = new List<Inventory>();
inventoryList.Add(new Inventory() { ProductId = 2, Count = 54 });
inventoryList.Add(new Inventory() { ProductId = 9, Count = 31 });
If this was a T-SQL query it would probably look something like this:
SELECT p.Name, i.Count FROM Products p LEFT OUTER JOIN Inventory i ON p.ProductId = i.ProductId
In LINQ it is going to look a bit different though.
var productInventory = from product in productList
join inventory in inventoryList
on product.ProductId equals inventory.ProductId into productInventoryGroup
from item in productInventoryGroup.DefaultIfEmpty(new Inventory() { ProductId = product.ProductId, Count = 0 })
select new
{
Name = product.Name,
ProductId = product.ProductId,
Count = item.Count
};
The first thing we do is get a reference to an instance of the class (product) out of the productList with the from clause. We then can join that to the inventoryList. Remember in LINQ, you always have to have a reference to the class itself and not the collection when performing join and where operations, so we also get a reference to an instance of the class (inventory). Once we have that we can use the on clause to relate the two lists using the ProductId field. When doing a join you must use the keyword equals to relate the two collections.
Here is where things really differ from what you may be used to. The first thing that is different is that the result of the join is stored in a new variable using the into clause (in this case productInventoryGroup). This is because in LINQ you use the DefaultIfEmpty() method to specify default values to return when there are no matching values on the right side of the join. You can specify no parameter on this method to use the default when it is empty (usually null or 0 for an int), or you can pass a value for it to use. In this case I created a new instance of Inventory and set Count to 0. You then use another from clause to reference an item from that join. Once you get to this point, it is just a matter of creating a new anonymous type, with product.Name and item.Count as values. Remember, item in this case is of type Inventory and will contain the value from that list or the default if there is no match.
You can then iterate through the result of the join like usual.
foreach (var item in productInventory)
{
Console.WriteLine(string.Format("{0}: {1}<br />", item.Name, item.Count));
}
Which would return the following results.
Product 1: 54
Product 2: 0
Product 3: 31
It really isn't that bad, but it's just different than what you might be used to coming from the T-SQL world. Hopefully, this explanation helps and will be of use to you.
Of course, don't forget that tomorrow night is the SharePint meetup at CrawPappy's at 6:00pm. Thanks again.