Conditional Distinct Count
Conditional Distinct Count
(OP)
I need to do a conditional count on a number of assessments. It needs to be a distinct count over person and date. My raw data looks something like this:
Name Date Assessment Type Assessment Time
Chuck Smith 6/7/2010 1 6/7/10 1:00
Chuck Smith 6/7/2010 2 6/7/10 1:00
Chuck Smith 6/7/2010 3 6/7/10 1:00
Chuck Smith 6/8/2010 1 6/8/10 1:00
Chuck Smith 6/8/2010 1 6/8/10 2:00
Chuck Smith 6/8/2010 1 6/8/10 3:00
Chuck Smith 6/8/2010 2 6/8/10 1:00
Chuck Smith 6/8/2010 2 6/8/10 2:00
Chuck Smith 6/9/2010 2 6/9/10 3:00
Chuck Smith 6/9/2010 3 6/9/10 14:00
Larry Jones 6/7/2010 1 6/7/10 1:00
Larry Jones 6/7/2010 2 6/7/10 1:00
Larry Jones 6/8/2010 1 6/8/10 1:00
Larry Jones 6/8/2010 1 6/8/10 2:00
Larry Jones 6/8/2010 2 6/8/10 1:00
Larry Jones 6/9/2010 1 6/9/10 1:00
Larry Jones 6/9/2010 1 6/9/10 2:00
Larry Jones 6/9/2010 2 6/9/10 1:00
Larry Jones 6/9/2010 2 6/9/10 2:00
Larry Jones 6/10/2010 1 6/10/10 1:00
Larry Jones 6/11/2010 2 6/11/10 1:00
The result set should look something like the following:
Name Assessment Type 1 Assessment Type 2 Assessment Type 3
Chuck Smith 2 3 2
Larry Jones 4 4 0
If it didn't need to be a distinct count I could just use sum with a case statement, but this won't be distinct. I would like to use OLAP aggregates because I still need to bring back the detail data.
Thanks,
Kevin
Name Date Assessment Type Assessment Time
Chuck Smith 6/7/2010 1 6/7/10 1:00
Chuck Smith 6/7/2010 2 6/7/10 1:00
Chuck Smith 6/7/2010 3 6/7/10 1:00
Chuck Smith 6/8/2010 1 6/8/10 1:00
Chuck Smith 6/8/2010 1 6/8/10 2:00
Chuck Smith 6/8/2010 1 6/8/10 3:00
Chuck Smith 6/8/2010 2 6/8/10 1:00
Chuck Smith 6/8/2010 2 6/8/10 2:00
Chuck Smith 6/9/2010 2 6/9/10 3:00
Chuck Smith 6/9/2010 3 6/9/10 14:00
Larry Jones 6/7/2010 1 6/7/10 1:00
Larry Jones 6/7/2010 2 6/7/10 1:00
Larry Jones 6/8/2010 1 6/8/10 1:00
Larry Jones 6/8/2010 1 6/8/10 2:00
Larry Jones 6/8/2010 2 6/8/10 1:00
Larry Jones 6/9/2010 1 6/9/10 1:00
Larry Jones 6/9/2010 1 6/9/10 2:00
Larry Jones 6/9/2010 2 6/9/10 1:00
Larry Jones 6/9/2010 2 6/9/10 2:00
Larry Jones 6/10/2010 1 6/10/10 1:00
Larry Jones 6/11/2010 2 6/11/10 1:00
The result set should look something like the following:
Name Assessment Type 1 Assessment Type 2 Assessment Type 3
Chuck Smith 2 3 2
Larry Jones 4 4 0
If it didn't need to be a distinct count I could just use sum with a case statement, but this won't be distinct. I would like to use OLAP aggregates because I still need to bring back the detail data.
Thanks,
Kevin
RE: Conditional Distinct Count
CODE
, SUM(CASE AssessmentType WHEN 1 THEN 1 ELSE 0 END) AS Type1
, SUM(CASE AssessmentType WHEN 2 THEN 1 ELSE 0 END) AS Type2
, SUM(CASE AssessmentType WHEN 3 THEN 1 ELSE 0 END) AS Type3
FROM (SELECT DISTINCT Name, Date, AssessmentType FROM tblAssessments) D
GROUP BY Name
Hope This Helps, PH.
FAQ219-2884: How Do I Get Great Answers To my Tek-Tips Questions?
FAQ181-2886: How can I maximize my chances of getting an answer?
RE: Conditional Distinct Count
I had to do this in a couple of steps. First I create the unit by which I need to count. In this case it is a concatenation of Name, Assessment Date, and Assessment Type. Then I rank these values based on the assessment time. Then I can do a sum for each of the assessment types, but only of the records whose rank is 1. This is what makes the count distinct. The code looks like that below. It may not be exactly right because I was doing it on a different set of data with some different wrinkles, but the logic works. Note that I changed the column headings in the code to make it more coding friendly. This code returns all the detail data in addition to the aggregates.
-Kevin
CODE
SUM(CASE WHEN Assessment_Type = 1 THEN 1 ELSE 0 END) OVER () AS Assessment_1_Count,
SUM(CASE WHEN Assessment_Type = 2 THEN 1 ELSE 0 END) OVER () AS Assessment_2_Count,
SUM(CASE WHEN Assessment_Type = 3 THEN 1 ELSE 0 END) OVER () AS Assessment_3_Count,
q1.name, q1.assessment_date, q1.assessment_type, q1.assessment_time
FROM (
SELECT
Name || CAST(assessment_date AS CHAR(10)) || CAST(assessment_type AS CHAR(2)) AS counting_unit,
RANK() OVER (PARTITION BY counting_unit ORDER BY assessment_time) AS assessment_rank,
table1.name, table1.assessment_date, table1.assessment_type, table1.assessment_time
FROM Table1
) q1