Writing a Count(Distinct) Query in Access
We received a question from a reader named Eric week before last about how to write a Count(Distinct) query in Access. In other database applications such as SQL Server, Oracle, and MySQL, the Count(Distinct) aggregate is used to count the number of unique items in a given rowset. For example, consider a table on SQL Server named tblColors, such as:
| Color |
Value |
| Red |
5 |
| Green |
2 |
| Blue |
8 |
| Orange |
1 |
| Red |
8 |
| Green |
6 |
| Blue |
2 |
To get a count of the number of unique colors in the table, you could write a query such as:
SELECT Count(Distinct Color) AS N FROM tblColors
This would return the value 4 as there are four unique colors in the Color field in the table. Unfortunately, the Access Database Engine does not support the Count(Distinct) aggregate. To return this value from an Access table, you would need to use a subquery such as:
SELECT Count(*) AS N
FROM
(SELECT DISTINCT Color FROM tblColors) AS T;
Now let's say that you also want to include another aggregate value such as a Sum, and want to group by some value, in this case, Color. On SQL Server, you could write this query as:
SELECT Color, Sum(Value) AS Total, Count(Distinct Color) AS N
FROM tblColors
GROUP BY Color
This provides the following results:
| Color |
Total |
N |
| Blue |
10 |
1 |
| Green |
8 |
1 |
| Orange |
1 |
1 |
| Red |
13 |
1 |
Now, if you're asking whether or not this should return the value of '1', the answer is yes. As I understand it, the Count(Distinct) here can be used as a test to verify the results of a given query.
If your data is on a server that supports Count(Distinct), you might be able to use a pass-through query to retrieve the results. If you are working with Access data, this becomes a bit more challenging.
Since we used subqueries for the previous query, we'll need to do the same here. The trick however is that we need to use two subqueries as shown in the following SQL:
SELECT C.Color, Sum(C.Value) AS Total, T2.N
FROM
(SELECT T.Color, Count(T.Color) AS N
FROM
(SELECT DISTINCT Color, Count(*) AS N
FROM tblColors GROUP BY Color) AS T
GROUP BY T.Color) AS T2
INNER JOIN tblColors AS C
ON T2.Color = C.Color
GROUP BY C.Color, T2.N;
You'll recognize the inner-most subquery as the subquery we used in the initial problem. This is then rolled into the outer-most subquery to return the number of colors per group.
Thanks to Eric for the question and to Michael Tucker on the Access test team for the help!
Rob Cooper is a Test Lead at Microsoft working on Microsoft Access. He started at Microsoft in 1998 working in Access product support in Charlotte, NC and then moved to Redmond to join the test team in 2001. For Access 2007, Rob worked on the new Grouping Pane for Reports, and security features such as Database Encryption and Disabled Mode. He is also a co-author on Expert Access 2007 Programming and Access 2007 VBA Programmer's Reference, both published by Wrox.
Rob also spends time on www.utteraccess.com reading and answering questions.