Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chriss Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

MAX() and MIN() issue

Status
Not open for further replies.

SQLBill

MIS
May 29, 2001
7,777
US
I've run into a curiosity. While creating a script to correlate some data, I was trying to use MIN() and MAX() on a identity column based on values in a datetime column. Using MAX(), I got the results back in six seconds (00:00:06). Using MIN(), I got the results back in 8 minutes 43 seconds (00:08:43). Using the Estimated Execution Plan, I found this:

The MAX() query first does a Clustered Index SEEK (98%), then a TOP (2%), then a Stream Aggregate (0%).

The MIN() query first does a Clustered Index SCAN (100%), followed by Parallelism/Gather Stream (0%), then a TOP (2%), then a Stream Aggregate (0%).

The queries are:
Code:
SELECT MAX(RowID)
FROM mytable
WHERE dt_column = '2006-09-06 11:00'

SELECT MIN(RowID)
FROM mytable
WHERE dt_column = '2006-09-06 11:00'

The RowID column is a clustered index and the dt_column is a non-clustered index.

I can make the second query faster by changing it to a MAX() and using 2006-09-06 10:59:59 (acceptable for the need), but that doesn't explain why one uses a scan and parallelism/gather stream and the other uses a seek and no parallelism.

Any thoughts on this?

(I can't change the schema and that's not the point of this, I'm more interested in why the difference execution between the two.)

-SQLBill


Posting advice: FAQ481-4875
 
I don't really have a good answer to your question, but just for giggles, try this...

Code:
Declare @TheDate DateTime
Set @TheDate = '2006-09-06 11:00'

SELECT MAX(RowID)
FROM mytable
WHERE dt_column = @TheDate

SELECT MIN(RowID)
FROM mytable
WHERE dt_column = @TheDate

Maybe, since you are comparing a date field to a string, it chooses to use a less efficient index. Grasping at straws here, but hope it helps.

-George

Strong and bitter words indicate a weak cause. - Fortune cookie wisdom
 
George,

The real script accepts a variable @BeginDate which I want to subtract one hour from and then find stuff between those two times:
Code:
WHERE mydtcolumn > (SELECT DATEADD(HOUR,-1,@BeginDate))
  AND mydtcolumn <= @BeginDate
This was taking forever (1 1/2 hours to return less than 20,000 rows). I added 'timing checks' to the code and confirmed the problem was where the script checked the datetime field.
Code:
SELECT GETDATE(), ' step name' --timing code
Several of those spaced in the script let me see how long each part was taking. I then confirmed it by running the individual parts by themselves. Lastly, I used the code in my initial post to confirm the issue. I wanted to use ROWID to speed up the search, but I still needed to use the datetime info.

I finally got a solution, but I still don't know why MIN() is giving me the problem. My solution is to subtract one hour and then one second from the inputted datetime. Then find the MAX(ROWID) based on that value (i.e. Input 2006-09-08 11:00:00 - find the MAX(ROWID) for that value. Then subtract one hour and one second to get 2006-09-08 09:59:59 and find the MAX(ROWID) for that value. Then make my WHERE clause > than the earliest value and <= the latest value. The new version ran in 27 seconds.

So, 'fudging' it so I can use MAX() for both values made the query run MUCH faster. But I don't know why. At least I know why some of my other queries have run slow (just not this slow).

-SQLBill

Posting advice: FAQ481-4875
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top