Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations derfloh on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

SQL Select most recent of multiple groups

Status
Not open for further replies.

stapleman527

Programmer
Joined
Nov 16, 2009
Messages
8
Location
US
I am working on a data migration project in which I have account data that I am massaging into a new format. I am also filtering out data that is not needed in the new system. For each account I can have 1 to n records that represent updates to the account and I want to create a query that will return the most recent record for each account number. See below:

Start Data:
Account_Num | First_Name | Last_Name | Open_Date
12345 Joe Blow 09/10/2004
12345 Joe Blew 09/11/2004
12346 Jane Smith 10/12/2007
12347 Sandy Back 02/24/2007
12347 Cindy Black 02/22/2007
12347 Sandy Black 02/23/2007

End Data:
Account_Num | First_Name | Last_Name | Open_Date
12345 Joe Blew 09/11/2004
12346 Jane Smith 10/12/2007
12347 Sandy Back 02/24/2007

I am using MS SQL Server, but will be putting this in a script along with other transformations to be run in a 3rd party tool so I would like to stick to as much basic SQL as possible.

Thanks,
Jonathan
 
Code:
SELECT accounts.*
FROM YourTable accounts

INNER JOIN

(SELECT Account_Num, MAX(Open_Date) AS MaxDate
FROM YourTable
GROUP BY Account_Num) recent

ON accounts.Account_Num = recent.Account_Num
AND accounts.Open_Date = recent.MaxDate
 
Thanks. Both solutions work great. I like the simplicity of the code from RiverGuy, which was also very similar to the first few examples on that blog, but it is possible to have accounts where there are multiple rows for with the same date, it doesn't really matter which one of these I pick but I need to end up with one. From what I was reading in that blog and just looking and playing with the statement I can't think of a way to make that only return one row per account. Am I missing something or should I go the route of the windowing function mentioned in the blog?

Thanks.
 
Using RiverGuy's approach I don't see a way to get only one record per multiple max dates. Windowing approach is the simplest to remember (IMHO), but it is not SQL Server 2000 and less compatible (and other Databases have their own implementation).

Also, with many records I believe the performance of row_number() approach decreases. Check the last article (T-SQL challenge) from
Look at the test cases (the second part of the blog).

If you don't have many records, then you can use this approach
Code:
select T.* from myTable T where ID = (select top 1 ID from  myTable T1 where T1.GroupField = T.GroupField ORDER BY Date DESC)

This approach, as far as I know, is cross-database compatible and in some situations may yield good results.
 
Here is what I do when I want to do a quick and dirty solution and you have no Identity column:

Code:
--Get an Identity Column
SELECT * INTO #TempYourTable FROM YourTable
ALTER TABLE #TempYourTable ADD IDCol INT IDENTITY

--Delete where not max id for each account/date combo
DELETE a
FROM #TempYourTable a
WHERE IDCol <> (SELECT MAX(IDCol) FROM #TempYourTable b WHERE a.Account_Num = b.Account_Num AND a.Open_Date = b.Open_Date)

SELECT accounts.*
FROM #TempYourTable accounts

INNER JOIN

(SELECT Account_Num, MAX(Open_Date) AS MaxDate
FROM #TempYourTable 
GROUP BY Account_Num) recent

ON accounts.Account_Num = recent.Account_Num
AND accounts.Open_Date = recent.MaxDate

DROP TABLE #TempYourTable
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top