Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Wanet Telecoms Ltd on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

SQL and Star Schema 1

Status
Not open for further replies.

talenx

Programmer
Aug 7, 2001
157
US
Hi all,
I was curious if anyone has heard of SQL server 2000 not optimizing it's self for a star schema.
Well... kind of...
I have a situation where the optimizer tends to display some odd results in execution plan. It wills attempt to first PULL ALL of the data from the fact table (which is huge by the way) something like 367 million last time I checked.
Anyways then it will join to the dimensional table to retrieval the filtered indexed value. It then rolls through a nested loop utilizing both indexes and finally spitting out a result. (10 min later) it’s almost like the optimizer is not take the table weight in to factor prior to execution

I have checked the index to confirm that they were valid and free of major fragmentation and also confirmed the index statistics were up to date.

Here is a dumbed down version of the SQL code

Code:
SELECT count(SP_Key) 
FROM FACT_TABLE with (nolock)
INNER JOIN DIMENSIONAL_TABLE
ON
DIMENSIONAL_TABLE.time_dimension_key = FACT_TABLE .Submitted_Date_Key
WHERE DIMENSIONAL_TABLE.date_number BETWEEN '1/1/2006 00:00:00.000' AND '3/31/2006 00:00:00.000'

Any thoughts

Thanks
Talenx


 
Update stats on the table as well as the indexes.
Rebuild the index on the fact table.
Is it doing a table scan, an index scan or an index seek?

Denny
MCSA (2003) / MCDBA (SQL 2000) / MCTS (SQL 2005)

--Anything is possible. All it takes is a little research. (Me)
[noevil]
 
Denny,

Can you tell me whats the difference between Index Scan and Index Seek and which is better and how can we create it.
 
An index seek is better than an index scan.

An index seek happens when the data in the index is all correctly ordered. An index scan happens when SQL needs to read the entire index to find the rows.

With the index seek SQL reads through the index, get's to the correct data, then stops when it's found what it's looking for.

To get SQL to do a seek instead of a scan you'll want to look to rebuild the index, and make sure that your query is using the index correctly.

Denny
MCSA (2003) / MCDBA (SQL 2000) / MCTS (SQL 2005)

--Anything is possible. All it takes is a little research. (Me)
[noevil]
 
Thanks Denny. That was a great piece of information. Thx a lot.. and yes ofcourse here is your star. Keep it up :)
 
mrdenny,

ya, that’s the strange thing... the stats are updated on both indexes and I have goes as far as to rebuild both indexes yet it still attempts to pull all of the data from the fact table. Then apply the filter via a nest loop.

Like I said... I haven't seen it before.... WEIRD!

Thanks

TalenX
 
Is this the estimated execution plan or the actual execution plan?

Denny
MCSA (2003) / MCDBA (SQL 2000) / MCTS (SQL 2005)

--Anything is possible. All it takes is a little research. (Me)
[noevil]
 
Oddly I thought there would be a difference as well... unfortunately the plans had the same result.

Thanks
TalenX
 
Is the index a single column index?

Denny
MCSA (2003) / MCDBA (SQL 2000) / MCTS (SQL 2005)

--Anything is possible. All it takes is a little research. (Me)
[noevil]
 
Yep, the index is on a Date_Key which is an int... OH!! One other thing I noticed about my code that I posted…
The dates are not hardcode as above. Rather they are pulling from another table. So it should look like this.

Code:
SELECT count(SP_Key) 
FROM FACT_TABLE with (nolock)
INNER JOIN DIMENSIONAL_TABLE
ON
DIMENSIONAL_TABLE.time_dimension_key = FACT_TABLE.Submitted_Date_Key
INNER JOIN DIMENSIONAL_AUTOPARAMETER_TABLE
ON
FACT_TABLE.Submitted_Date_Key.Date_Number >= DIMENSIONAL_AUTOPARAMETER_TABLE.From_Date AND
FACT_TABLE.Submitted_Date_Key.Date_Number >= DIMENSIONAL_AUTOPARAMETER_TABLE.To_Date 
WHERE DIMENSIONAL_AUTOPARAMETER_TABLE.ID = '1111'

When the values are hard coded the optimizer work fine.

hmmm...


 
I'm assuming that your second join should read:
Code:
FACT_TABLE.Submitted_Date_Key.Date_Number >= DIMENSIONAL_AUTOPARAMETER_TABLE.From_Date AND
FACT_TABLE.Submitted_Date_Key.Date_Number [red]<[/red]= DIMENSIONAL_AUTOPARAMETER_TABLE.To_Date

What if you change to using a between statement on the join?

Is the index clustered or not? (I don't see it in the thread.)



Denny
MCSA (2003) / MCDBA (SQL 2000) / MCTS (SQL 2005)

--Anything is possible. All it takes is a little research. (Me)
[noevil]
 

MrDenny,
Oh, sorry bout that...
Currently the index is not clustered, unfortunately the plan did not change when using the BETWEEN operator in the join clause.

I find it very odd that when a hard coded value is applied it uses the index and pulls from the dimension and pulls only those records that match the filter from the fact. I have even went as far as to code the hard coded dates as variables based on today’s date example ( odd thing here it also doesn't use the index like the hard coded date do.

Code:
SELECT count(SP_Key) 
FROM FACT_TABLE with (nolock)
INNER JOIN DIMENSIONAL_TABLE
ON
DIMENSIONAL_TABLE.time_dimension_key = FACT_TABLE.Submitted_Date_Key
INNER JOIN DIMENSIONAL_AUTOPARAMETER_TABLE
ON
FACT_TABLE.Submitted_Date_Key.Date_Number  BETWEEN dateadd(m, -1, cast(cast(month(getdate()) as varchar)+'/01/'+cast(year(getdate()) as varchar)  as datetime)) AND dateadd(d, -1, cast(cast(month(getdate()) as varchar)+'/01/'+cast(year(getdate()) as varchar)  as datetime))  
WHERE DIMENSIONAL_AUTOPARAMETER_TABLE.ID = '1111'

 
Well, it seems like the optimizer is looking at the date format...
Example
Will NOT use the index
Code:
select * from Time_Dimension where date_number between (dateadd (m, -1, cast (cast (month (getdate ())as varchar)+'/01/' +cast (year (getdate ())as varchar)as datetime))) and (dateadd (d, -1, cast (cast (month (getdate ())as varchar)+'/01/' +cast (year (getdate ())as varchar)as datetime)))
WILL used the index
Code:
select * from Time_Dimension where date_number between getdate() and getdate()

In addtion when appliing the the Date_Number from the Time_Dimension equal to the
Code:
(dateadd (m, -1, cast (cast (month (getdate ())as varchar)+'/01/' +cast (year (getdate ())as varchar)as datetime)))
(Which is the first set of code I had tried. the optimizer uses the index.... ugg I’m at a loss.)

Any thoughts

Thanks
TalenX
 
That was something I was going to recommend looking at this morning was data type issues. I know that if you have a function or a convert on the data then it won't use an index.

It may be that SQL is doing an implicit converion where it shouldn't be for some reason.

I assume that all your data types match up correctly?

Denny
MCSA (2003) / MCDBA (SQL 2000) / MCTS (SQL 2005)

--Anything is possible. All it takes is a little research. (Me)
[noevil]
 
I double checked the columns that are being used in the join and where clauses and they are compliant. We run pretty much all of our joining object off a segregate integer based key.

Very odd!! Even when I use an index hint the optimizer still performs a fact table scan.

hmm...
 
I know that if you have a function or a convert on the data then it won't use an index.
Do you happen to know where you found that documentation... that may be the issue here.

Thanks
TalenX
 
I don't think that I saw it anywhere. I think I got it from personal experience.

Denny
MCSA (2003) / MCDBA (SQL 2000) / MCTS (SQL 2005)

--Anything is possible. All it takes is a little research. (Me)
[noevil]
 

Ahh… I see. I think I’m going to have to concur with you ... it’s seem once a function is applied with some time of conversion the index becomes invalid to the optimizer.

I wonder if there is any way to force the optimizer to use the index... I have attempted to use an index hint... But with no avail.
 
If SQL is going an implicit conversion for some reason it won't use the index (even if you try and force it to).

How's the index on the dimention table look? Maybe it needs rebuilding and it's blocking the server from using the index on the fact table?

Denny
MCSA (2003) / MCDBA (SQL 2000) / MCTS (SQL 2005)

--Anything is possible. All it takes is a little research. (Me)
[noevil]
 
Thats the odd thing... everything looking good. here are the stats:

DBCC SHOW_STATISTICS
Statistics for FACT INDEX
Updated Rows Rows Sampled Steps Density Average key length
-------------------- -------------------- -------------------- ------ ------------------------ ------------------------
Mar 6 2006 2:40PM 71680256 304424 197 160.55275 4.0

Statistics for TIME DIMENSION INDEX .
Updated Rows Rows Sampled Steps Density Average key length
-------------------- -------------------- -------------------- ------ ------------------------ ------------------------
Feb 28 2006 6:46PM 7306 7306 4 1.368738E-4 4.0

DBCC SHOWCONTIG

DBCC SHOWCONTIG scanning Fact_TABLE' table...
Table: 'Pharmacy_Data_Fact_TABLE' (421576540); index ID: 4, database ID: 29
LEAF level scan performed.
- Pages Scanned................................: 155003
- Extents Scanned..............................: 24942
- Extent Switches..............................: 90108
- Avg. Pages per Extent........................: 6.2
- Scan Density [Best Count:Actual Count].......: 21.50% [19376:90109]
- Logical Scan Fragmentation ..................: 13.55%
- Extent Scan Fragmentation ...................: 11.16%
- Avg. Bytes Free per Page.....................: 1143.1
- Avg. Page Density (full).....................: 85.88%
DBCC execution completed. If DBCC printed error messages, contact your system administrator.


DBCC SHOWCONTIG scanning 'Time_Dimension' table...
Table: 'Time_Dimension' (597577167); index ID: 27, database ID: 29
LEAF level scan performed.
- Pages Scanned................................: 14
- Extents Scanned..............................: 2
- Extent Switches..............................: 1
- Avg. Pages per Extent........................: 7.0
- Scan Density [Best Count:Actual Count].......: 100.00% [2:2]
- Logical Scan Fragmentation ..................: 0.00%
- Extent Scan Fragmentation ...................: 0.00%
- Avg. Bytes Free per Page.....................: 268.1
- Avg. Page Density (full).....................: 96.69%
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top