Group By Queries for MS SQL Server

Oct24
Missing Image
By Ken Foubert

Back in the day, we worked on developing a specialty application that was basically a lead generation system with a database that stored over 200,000 records, with a potential for a lot more. The web application displayed numerous reports that calculated totals from disparate sources. We discovered that once our client began adding all their data that those reports were running slower and slower and slower.

The problem was that we had one primary query to pull the records out, then, as the code looped through each record, several other queries were needed to calculate the disparate totals. That resulted in numerous database calls that slowed the entire web application. That's when Mike discovered MySQL Wizardry, that used the SUM(IF()) and the GROUP BY clause, problem solved.


One way to improve the speed and efficiency of a dynamic page is to pull all the needed information that is to be displayed in one query, that's where the MySQL Wizardry comes into play. This wizardry performs cross tabulations, which are great for statistical analysis. If you're planning on creating reports, I strongly suggest you read this article.

Ever since we discovered this methodology, I've used it on numerous specialty applications, from calculating amounts and costs for a shopping list to determining totals for outstanding invoices.

Below is an example of the MySQL Wizardry query.

MySQL Query


SELECT SUM( IF( company_size = "large", 1, 0) ) as vLargeCount,
SUM( IF(company_size = "medium", 1, 0) ) as vMediumCount,
SUM( IF(company_size = "small", 1, 0) ) as vSmallCount,
SUM( IF( company_age < = 4, 1, 0) ) as vYoungCompaniesCount,
SUM( IF( company_age > 4, 1, 0) ) as vOlderComaniesCount,
COUNT(companies.id) AS vTotalCompanyCount
FROM companies



The key here is the SUM(IF()) statements, this simple little piece of query magic aggregated a number of different queries into one query and turned out to be extremely efficient. As a side note, it's also important to index fields that are are used for the LEFT JOIN, in this case campaign.id and contacts_mail.campaign.

Now, the Pierce County Library was developed in ASP.NET 2.0 and MS SQL Server 2005, but I wanted to apply the MySQL Wizardry magic to MS SQL server. The problem was that MS SQL Server does not have an IF statement, but this was easily remedied by using the CASE statement. Below is an example of the MySQL Wizardry for MS SQL Server.

MS SQL Server Query


SELECT movies.title,
movies.date,
COUNT(movie_reviews.review_id) AS vReviewCount,
SUM(CASE movie_reviews.thumbs WHEN "UP" THEN 1 ELSE 0 END) AS vThumbsUp,
SUM(CASE movie_reviews.thumbs WHEN "DOWN" THEN 1 ELSE 0 END) AS vThumbsDown,
FROM  movies
LEFT OUTER JOIN movie_reviews  ON movies.code = movie_reviews.movie_code
GROUP BY movies.title, movies.date
ORDER BY movies.date DESC


Voil�, SUM(IF()) is replaced by SUM(CASE) and we now have MS SQL Server Wizardry.

There are two things that are important to note about this query. First is that MS SQL Server GROUP BY queries are much more restrictive than MySQL's GROUP BY functionality. Each data field (NOT expressions) in the SELECT statement MUST BE in the GROUP BY clause. The second item is that MS SQL Server does not allow ntext, text, and xml data types in the SELECT statement when performing a GROUP BY. There may be a way to include these data types, but since I was short on time and didn't really need those fields and decided to tackle this problem at a later date.



Dev

Back To Feed