Overview
We all probably know wtf is median value, but anyway...
Imagine any ordered set of values, like this one:
1 3 7 8 17 23 42
Median represents "middle" value in set, or in above case 4th element - 8. For even count of elements median becomes mean (average) of two middle values. So if we remove 8, median becomes (7+17)/2 = 12.
Purpose of median value is statistical - it gives "representative" value from set, and usually does it better than average value - assuming data distribution is not too f**ed up. Visit wikipedia for more information.
Objective
The goal of this puzzle is to calculate median per group. Sorta like custom aggregate, though even good SQL2005/CLR implementation is somewhat of tricky. Here is sample gen script:
To make things simpler, let's assume numValue cannot be NULL. Expected result set is:
Rules & restrictions
No OLAP/Analysis services allowed
You can do it with SQL2000, Yukon, whatever. Personally I'd still value 2000 code more - simply because it is harder to do.
There are no restrictions about T-SQL features used - as long as code is written set-based.
The most elegant answers (yup, subjective) get some stars...
-----
P.S. to George: what happened with points in polygon thing? Shall we continue testing?
------
Theory: everybody knows everything, nothing works
Practice: everything works, nobody knows why
![[banghead] [banghead] [banghead]](/data/assets/smilies/banghead.gif)
We all probably know wtf is median value, but anyway...
Imagine any ordered set of values, like this one:
1 3 7 8 17 23 42
Median represents "middle" value in set, or in above case 4th element - 8. For even count of elements median becomes mean (average) of two middle values. So if we remove 8, median becomes (7+17)/2 = 12.
Purpose of median value is statistical - it gives "representative" value from set, and usually does it better than average value - assuming data distribution is not too f**ed up. Visit wikipedia for more information.
Objective
The goal of this puzzle is to calculate median per group. Sorta like custom aggregate, though even good SQL2005/CLR implementation is somewhat of tricky. Here is sample gen script:
Code:
create table sampleData ( groupID int, numValue int )
insert into sampleData values ( 1, 1 )
insert into sampleData values ( 1, 2 )
insert into sampleData values ( 1, 6 )
insert into sampleData values ( 1, 16 )
insert into sampleData values ( 1, 7 )
insert into sampleData values ( 2, 3 )
insert into sampleData values ( 2, 9 )
insert into sampleData values ( 2, 23 )
insert into sampleData values ( 2, 31 )
insert into sampleData values ( 3, 10 )
insert into sampleData values ( 3, 17)
insert into sampleData values ( 3, 52 )
insert into sampleData values ( 3, 66 )
insert into sampleData values ( 4, 18 )
Code:
groupID medianValue
-------.-----------
1 6.0
2 16.0
3 34.5
4 18.0
No OLAP/Analysis services allowed
You can do it with SQL2000, Yukon, whatever. Personally I'd still value 2000 code more - simply because it is harder to do.
There are no restrictions about T-SQL features used - as long as code is written set-based.
The most elegant answers (yup, subjective) get some stars...
-----
P.S. to George: what happened with points in polygon thing? Shall we continue testing?
------
Theory: everybody knows everything, nothing works
Practice: everything works, nobody knows why
![[banghead] [banghead] [banghead]](/data/assets/smilies/banghead.gif)