Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations TouchToneTommy on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

How can I speed up queries to this table?

Status
Not open for further replies.

Jasen

IS-IT--Management
May 1, 2002
435
US
I have a table storing hourly production counts from multiple heads on multiple machines on multiple lines. So it gets huge very fast. 10-25 million records.
The primary key for the table consists of 7 columns which define exactly which head/machine/line/etc and the timestamp. There 3 more columns that hold the production and defect counts.

Due to this table being ported from an Oracle instance, all the field datatypes are numeric(precision 10), except the timestamp. I can *probably* change this if needed.
The PK is clustered. Is there anything glaringly wrong with the way the PK is setup? Is there something I should look at adjusting with SQL server itself?

The queries used are sometimes as simple as this:

Select * from table where timestamp >= '4/18/2008 6:30:00 am'

A query like that can take 3-5 minutes. If I try filtering on other fields it only gets worse. Is there a potential problem using a text string to compare the datetime timestamp against maybe?

I feel like SQL server should be able to handle this amount of data and have similar performance as an Oracle instance with the same table structure and queries.
I have maintenance jobs setup to update statistics and index pages once a week.
 
That's the only index.
Should there be additional indexes, if the fields are already contained in the PK index? I hadn't thought of that.
 
>>update statistics and index pages once a week.

that is not frequent enough when doing that many inserts per hour, by the end of the week the stats are completely off

create a nonclustered index on timestamp

and change your where clause to

Select * from table where timestamp >= '2008-04-18 06:00:00.000'


>>all the field datatypes are numeric(precision 10),
show some sample data, can you use integers instead?

Denis The SQL Menace
--------------------
SQL Server Code,Tips and Tricks, Performance Tuning
SQLBlog.com, Google Interview Questions
 
Thanks guys.

I can update statistics nightly, that's easy enough.

The actual data in those numeric fields is definitely integer, the problem is that the app that's collecting and storing the data is an old C++ app originally written to talk to Oracle. We originally set all these fields to integer and it broke the app because it was looking for the precision. I can't change this until I have a resource to fix the source code of that app.

Instead of using this: '2008-04-18 06:00:00.000', could I stick the value into a variable of type datetime and use that in the where clause?

I will post the plan soon. Waiting for it to finish creating the new index.
In the meantime I'm studying the "Query tuning" chapter of Inside SQL Server. I'm determined to learn something today.
 
Execution plan, after having added the suggested index:

|--Compute Scalar(DEFINE:([Expr1008]=[Expr1003]-[Expr1004], [Expr1009]=[Expr1006]-[Expr1007]))

|--Parallelism(Gather Streams)

|--Stream Aggregate(GROUP BY:([Expr1002]) DEFINE:([Expr1003]=MAX([partialagg1010]), [Expr1004]=MIN([partialagg1011]), [Expr1005]=MAX([partialagg1012]), [Expr1006]=MAX([partialagg1013]), [Expr1007]=MIN([partialagg1014])))

|--Sort(ORDER BY:([Expr1002] ASC))

|--Parallelism(Repartition Streams, PARTITION COLUMNS:([Expr1002]))

|--Hash Match(Partial Aggregate, HASH:([Expr1002]), RESIDUAL:([Expr1002]=[Expr1002]) DEFINE:([partialagg1010]=MAX([HD_PRDC_EVNT_HIST].[qty_dfct_accm]), [partialagg1011]=MIN([HD_PRDC_EVNT_HIST].[qty_dfct_accm]), [partialagg1012]=M

|--Compute Scalar(DEFINE:([Expr1002]=datepart(hour, [HD_PRDC_EVNT_HIST].[dt_prdc])))

|--Bookmark Lookup(BOOKMARK:([Bmk1000]), OBJECT:([TEST].[dbo].[HD_PRDC_EVNT_HIST]) WITH PREFETCH)

|--Index Seek(OBJECT:([TEST].[dbo].[HD_PRDC_EVNT_HIST].[IX_HD_PRDC_EVNT_HIST]), SEEK:([HD_PRDC_EVNT_HIST].[dt_prdc] <= [@te] AND [HD_PRDC_EVNT_HIST].[dt_prdc] >= [@ts]), WHERE:([HD_PRDC_EVNT_HIST].[num_mch]=10) O

I will say, the queries already run an order of magnitude faster.
I made the mistake of thinking that since the fields were already included in the clustered PK that they were indexed appropriately.
Should I add an index to the other fields that are often in the where clause? I know we want to strike a balance between indexes and slowing down updates/inserts.
 
>>I made the mistake of thinking that since the fields were already included in the clustered PK that they were indexed appropriately.


if you have a composite index and you are just interested in one column in you WHERE clause then if it is not the first column in the composite index SQL still has to do an index scan

>>Should I add an index to the other fields that are often >>in the where clause?

you have to test this to find the optimal balance or maybe shuffle the columns in the index around




Denis The SQL Menace
--------------------
SQL Server Code,Tips and Tricks, Performance Tuning
SQLBlog.com, Google Interview Questions
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top