INTELLIGENT WORK FORUMS
FOR COMPUTER PROFESSIONALS

Log In

Come Join Us!

Are you a
Computer / IT professional?
Join Tek-Tips Forums!
  • Talk With Other Members
  • Be Notified Of Responses
    To Your Posts
  • Keyword Search
  • One-Click Access To Your
    Favorite Forums
  • Automated Signatures
    On Your Posts
  • Best Of All, It's Free!

*Tek-Tips's functionality depends on members receiving e-mail. By joining you are opting in to receive e-mail.

Posting Guidelines

Promoting, selling, recruiting, coursework and thesis posting is forbidden.

Jobs

simplify partition query

simplify partition query

(OP)
Hi all,

I have historical data as per below:

CODE

Date	      Name
20160201	A
20160301	A
20160501	A
20160801	B
20161201	B 

I would like to pick up the latest name and since when the latest name changed.
so the expected result would be

Date Name
20160801 B

Can I please have the simplest codes to get the expected result?

Thank you all,

RE: simplify partition query

Since you want to know when the name changes, the only way that I can think of it so match each row with it's subsequent row. To accomplish this, I would assign a row number to each row, ordered by date so that your data would look like this.

CODE

Date	      Name    RowId
20160201	A       1
20160301	A       2
20160501	A       3
20160801	B       4
20161201	B       5 

I'm going to show some code. Please bear in mind that I am create a table variable so that I can dummy up your data. Obviously, in your actual solution, you wouldn't want to use the table variable.

CODE

Declare @Temp Table(Date Int, Name VarChar(10));

Insert Into @Temp Values(20160201,'A')
Insert Into @Temp Values(20160301,'A')
Insert Into @Temp Values(20160501,'A')
Insert Into @Temp Values(20160801,'B')
Insert Into @Temp Values(20161201,'B')

Select  *,
        Row_Number() Over (Order By Date) As RowId
From    @Temp 

When you run the code above, you will see that it generates the data with an incrementing RowId. This is important because now we can join the table back to itself based on the row id, like this....

CODE

Declare @Temp Table(Date Int, Name VarChar(10));

Insert Into @Temp Values(20160201,'A')
Insert Into @Temp Values(20160301,'A')
Insert Into @Temp Values(20160501,'A')
Insert Into @Temp Values(20160801,'B')
Insert Into @Temp Values(20161201,'B')

; With Data As
(
  Select *,
         Row_Number() Over (Order By Date) As RowId
  From   @Temp 
)
Select *
From   Data As Before
       Inner Join Data As After
         On Before.RowId = After.RowId - 1 

When you run the code above, you'll get results like this...

CODE

Date        Name       RowId                Date        Name       RowId
----------- ---------- -------------------- ----------- ---------- --------------------
20160201    A          1                    20160301    A          2
20160301    A          2                    20160501    A          3
20160501    A          3                    20160801    B          4
20160801    B          4                    20161201    B          5 

Note that the first 3 columns are from the "before" table and the last 3 rows are from the "After" table.

The data you want would be from the 3rd row where a.name <> b.name. Specifically, you want the date and name from the after table.

Now all we have to do is to add a where clause and restrict the columns returned. Like this:

CODE

Declare @Temp Table(Date Int, Name VarChar(10));

Insert Into @Temp Values(20160201,'A')
Insert Into @Temp Values(20160301,'A')
Insert Into @Temp Values(20160501,'A')
Insert Into @Temp Values(20160801,'B')
Insert Into @Temp Values(20161201,'B')

; With Data As
(
  Select *,
         Row_Number() Over (Order By Date) As RowId
  From   @Temp 
)
Select After.Date,
       After.Name
From   Data As Before
       Inner Join Data As After
         On Before.RowId = After.RowId - 1
Where  Before.Name <> After.Name 

When you run the code above, you'll see that you get exactly what you asked for in your original question.

Please note that this code will only work if the name only changes 1 time. If the name changes multiple times, you will get the date and new name for each time the name is changed. If you want to return only the last time the name is changed, then you could do this.

CODE

Declare @Temp Table(Date Int, Name VarChar(10));

Insert Into @Temp Values(20160201,'A')
Insert Into @Temp Values(20160301,'A')
Insert Into @Temp Values(20160501,'A')
Insert Into @Temp Values(20160801,'B')
Insert Into @Temp Values(20161201,'B')
Insert Into @Temp Values(20161202,'C')
Insert Into @Temp Values(20170101,'C')

; With Data As
(
  Select *,
         Row_Number() Over (Order By Date) As RowId
  From   @Temp 
)
Select Top 1 After.Date,
       After.Name
From   Data As Before
       Inner Join Data As After
         On Before.RowId = After.RowId - 1
Where  Before.Name <> After.Name
Order By Date DESC 

-George
Microsoft SQL Server MVP
My Blogs
SQLCop
twitter
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom

RE: simplify partition query

As this is some type of history of data table, I wonder if you only present part of the data, because it makes we wonder why the history is so detailed, if the name only changed once. I assume other things changed in the records, othe things you don't show. Therefore the concept is not showing.

If all these dates are seen as valid from dates you want to know the eaerliest validfrom date for the latest valid name B. This is a two step thing, first determining latest date and its name as latest name, then determine earliest date for that latest name. And that can be done much simpler:

CODE

with Latest as
(Select Name From History
 Where Date = (Select MAX(Date) From History)
)

Select Min(Date) as Firstdate, Min(Latest.Name) as Latestname 
From History
inner join Latest on History.Name=Latest.Name 

Bye, Olaf.

RE: simplify partition query

With sample data:

CODE

Declare @History Table(Date Int, Name VarChar(10));

Insert Into @History Values(20160201,'A'), (20160301,'A'),(20160501,'A'),(20160801,'B'),(20161201,'B'),(20161202,'C'),(20170101,'C');

with Latest as
(Select Name From @History
 Where Date = (Select MAX(Date) From @History)
)

Select Min(Date) as Firstdate, Min(Latest.Name) as Latestname 
From @History
inner join Latest on @History.Name=Latest.Name; 

The only problem with that is a name changes back to something it was earlier, eg name changes from A to B then A again, you get the earliest first A valid from date. To solve that, data from history would need to be limited to the latest two different name values at max. This plays no major role, it'll only become important, once you have such a situation, and it will become a very likely problem, if you watch over the history of a boolean (bit) column, which switches more than once.

Another totally independent solution would of course be to store histories of single columns, so your name history only had 2 records, the first A and the first B record.

Edit: An idea I just had: One major information about which columns changed in an update is within COLUMNS_UPDATED(). If you store that value in each history record, you can later limit data to be with a certain bit of the column of interest set, which would give you a pure history of these column changes only. That's the best way of determining single column history without needing ot have a history table for any single column. That won't help right now, unless you find that info in your current history table. It has to be changed wherever you now create such history data, eg in triggers.

Bye, Olaf.

RE: simplify partition query

(OP)
Thank you all for the helps and supports...

However, one more request if I may, from the raw data as per below:

CODE

Date	      Name
20160201	A
20160301	A
20160501	A
20160801	B
20161201	B 

we would like to extend the functional by having the historical data based on the name changed by generating expiry date by the next record minus 1 day then the last record would be 99991231.
so the result as per below:

CODE

Date	   Expiry_Date   Name  
20160201    20160731       A
20160801    99991231	   B 

Please let me know how to achieve this.
Thank you in advance,

RE: simplify partition query

Continuing on from my previous example....

CODE

Declare @Temp Table(Date int, Name VarChar(10));

Insert Into @Temp Values(20160201,'A')
Insert Into @Temp Values(20160301,'A')
Insert Into @Temp Values(20160501,'A')
Insert Into @Temp Values(20160801,'B')
Insert Into @Temp Values(20161201,'B')

; With Data As
(
  Select *,
         Row_Number() Over (Partition By Name Order By Date) As RowId
  From   @Temp
), Rows As
(
  Select Date, Name, Row_Number() Over (Order By Date) As RowId
  From   Data
  Where  RowId = 1
)
Select A.Date,
       Coalesce(DateAdd(Day, -1, Convert(VarChar(10), B.Date)), '99991231') As Expiry_Date, 
       A.Name
From   Rows As A
       Left Join Rows As B
         On A.RowId = B.RowId - 1 

-George
Microsoft SQL Server MVP
My Blogs
SQLCop
twitter
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom

RE: simplify partition query

Sorry, I didn't catch you having a follow up question. You already have a good answer, of course. My turn on this would need 2012 or later, well, that's almost 5 years old legacy SQL :)

CODE

Declare @History Table(Date date, Name VarChar(10));

Insert Into @History Values('20160201','A'), ('20160301','A'), ('20160501','A'), ('20160801','B'), ('20161201','B');

With Fromdates as
( Select Min(Date) as ValidFrom, Name 
  From @History 
  Group by Name)

Select *, 
   Coalesce( 
      Lead(DateAdd(Day, -1, Fromdates.ValidFrom)) over (order by ValidFrom),
	  '99991231'
	       ) As ValidTo
from Fromdates 


First part of it is much easier, you could also compute the end date client side without T-SQL at all, just taking the simple Min(Date), Name query.

In terms of datetime I would actually not subtract a day, the point in time of the expiration is exactly the same datetime as the next validfrom, even if you are pedantic about those being two different terms on the validity, two time spans are divided at one point in time.

So in datetime nomenclature:

CODE

Declare @History Table(Date datetime, Name VarChar(10));

Insert Into @History Values('20160201','A'), ('20160301','A'), ('20160501','A'), ('20160801','B'), ('20161201','B');

With Fromdatetimes as
( Select Min(Date) as ValidFrom, Name 
  From @History 
  Group by Name)

Select *, 
   Coalesce( 
      Lead(Fromdatetimes.ValidFrom) over (order by ValidFrom),
	  '99991231'
	       ) As ValidTo 
from Fromdatetimes 


The difference is in the point of view over the different date types, a date (precision day) stands for the whole day, so validto date is 1 less than the next validfrom, while looking at datetimes the point in time of the change is both the end of the previous time span and the begin of the next span.

Bye, Olaf.

Red Flag This Post

Please let us know here why this post is inappropriate. Reasons such as off-topic, duplicates, flames, illegal, vulgar, or students posting their homework.

Red Flag Submitted

Thank you for helping keep Tek-Tips Forums free from inappropriate posts.
The Tek-Tips staff will check this out and take appropriate action.

Reply To This Thread

Posting in the Tek-Tips forums is a member-only feature.

Click Here to join Tek-Tips and talk with other members!

Resources

Close Box

Join Tek-Tips® Today!

Join your peers on the Internet's largest technical computer professional community.
It's easy to join and it's free.

Here's Why Members Love Tek-Tips Forums:

Register now while it's still free!

Already a member? Close this window and log in.

Join Us             Close