Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chriss Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

How do I isolate the max record in a group to use only

Status
Not open for further replies.

eo

MIS
Apr 3, 2003
809
I have a set of data on which I perform fuzzy grouping, but only need to use the most recent of each group.

The group is based on a Business Key (SSK). I will show bwo example data
Code:
DWSK SSK  CompanyGroupName   CompanyName
001  100  A.H.L              A.H.L London
002  100  A.H.L              A.H.L London
003  100  A.H.L              A.H.L London
004  101  AHL                AHL London
005  102  HL                 AHL London

If I run a fuzzy grouping over this set, I will get a set that looks very similar as there are 3 instances of SSK 100. But I want to only use that once in the query, so would need to remove 2 of the SSK 100 records, and this is the question, how do I do this.

of course a query such as the one below would do it, but note that the DWSK is actually a uniqueidentifier, so cannot use MAX()

Any ideas?

Code:
select Max(DWSK)
,SSK 
,CompanyGroupName
,CompanyName
from dbo.DirectCustomer
group by 
SSK 
,CompanyGroupName
,CompanyName

EO
Hertfordshire, England
 
If I understand the problem your desired results would be something like

003 100 A.H.L A.H.L London
004 101 AHL AHL London
005 102 HL AHL London

and if I continue to understand... you don't necessarily have to have dwsk to be the max but unique?

If your using 2005 and greater... you could use CTE.

add your fuzzy query then from that select distict CompanyGroupName

Simi

 
If the DWSK is uniqueidentifier how do you know which record is most recent one?

Borislav Borissov
VFP9 SP2, SQL Server 2000,2005 & 2008.
 
with dc as
(
select * from DirectCustomer
),
mymax as
(
select ssk, max(dwsk) as maxdwsk
from DirectCustomer
group by ssk
)
Select mymax.ssk, mymax.maxdwsk, dc.CompanyGroupName, dc.CompanyName
from dc
right join mymax
on dc.ssk=mymax.ssk and mymax.maxdwsk=dc.dwsk
 
Sorry take out the right join, just make it a join.

with dc as
(
select * from DirectCustomer
),
mymax as
(
select ssk, max(dwsk) as maxdwsk
from DirectCustomer
group by ssk
)
Select mymax.ssk, mymax.maxdwsk, dc.CompanyGroupName, dc.CompanyName
from dc
join mymax
on dc.ssk=mymax.ssk and mymax.maxdwsk=dc.dwsk

Simi
 
I made that more complicated then it needed to be...

with
mymax as
(
select ssk, max(dwsk) as maxdwsk
from DirectCustomer
group by ssk
)
Select mymax.ssk, mymax.maxdwsk, dc.CompanyGroupName, dc.CompanyName
from DirectCustomer dc
join mymax
on dc.ssk=mymax.ssk and mymax.maxdwsk=dc.dwsk
order by ssk


Simi
 
Simi:

of course a query such as the one below would do it, but note that the DWSK is actually a uniqueidentifier, so cannot use MAX()
So I still want to know what defines a record as "most recent"?

Borislav Borissov
VFP9 SP2, SQL Server 2000,2005 & 2008.
 
my assumption is the compbination of

SSK + DWSK

ssk being the company identifer and DWSK being some type record number or transaction id.

Simi
 
DWSK is uniqueidentifier in other words GUID.
That is random generated string with function NewId().
There is NO way to tell which GUID is first generated. That is not IDENTITY field.

Borislav Borissov
VFP9 SP2, SQL Server 2000,2005 & 2008.
 
Borislav is correct, DWSK is a GUID and serves as the surrogate key in a DW, whereas SSK is a business key. You cannot apply MAX() to the DWSK so this is why I am looking for some sort of a solution. My Fuzzy grouping is currently shwoing WAY more records than it should.

EO
Hertfordshire, England
 
OK, but HOW do you know which record is most recent one?


Borislav Borissov
VFP9 SP2, SQL Server 2000,2005 & 2008.
 
As Borislav says, you can't perform any kind of ordering on a UNIQUEIDENTIFIER so cannot ascertain the most recent record based on this field. The snippet below can demonstrate this when run in query analyzer as you can see by watching the returned identity field of the top ranked row of duplicate values in the resultset;
Code:
DECLARE @Test TABLE
(
	ID					INT					IDENTITY(1,1)
	,DWSK				UNIQUEIDENTIFIER	NOT NULL 
	,SSK				INT					NOT NULL
	,CompanyGroupName	VARCHAR(32)			NOT NULL
	,CompanyName		VARCHAR(32)			NOT NULL
)

INSERT INTO @Test
SELECT
	NewID()
	,100
	,'A.H.L'
	,'A.H.L London';
INSERT INTO @Test
SELECT
	NewID()
	,100
	,'A.H.L'
	,'A.H.L London';
INSERT INTO @Test
SELECT
	NewID()
	,100
	,'A.H.L'
	,'A.H.L London';
INSERT INTO @Test
SELECT
	NewID()
	,101
	,'AHL'
	,'AHL London';
INSERT INTO @Test
SELECT
	NewID()
	,102
	,'HL'
	,'AHL London';

SELECT
	ID
	,DWSK
	,SSK
	,CompanyGroupName
	,CompanyName
FROM
(
	SELECT
		ID
		,DWSK
		,SSK
		,CompanyGroupName
		,CompanyName
		,RANK() OVER 
		(
			PARTITION BY 
				SSK
				,CompanyGroupName
				,CompanyName
			ORDER BY 
				DWSK DESC
		) AS 'Ranking'
	FROM
		@Test
)xxx
WHERE
	Ranking = 1;

I have to second the question, HOW do you know which record is most recent one?

Rhys

"The trouble with having an open mind, of course, is that people will insist on coming along and trying to put things in it"
Terry Pratchett
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top