Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations TouchToneTommy on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Data Warehouse and indexes

Status
Not open for further replies.

UHNSTrust

Technical User
Dec 2, 2003
262
GB
I have just moved to a new company who have set up a data warehouse for a lot of their aggregated data.

They are all flat tables with no relationships. As an information dept we are continuously interrogating numerous single tables. Some of the tables are quite big (5.5 million records with 60 fields). When we are doing basic select queries we can wait 2-2.5 minutes for our results (some of the delay could be down to network reasons).

I have noticed that none of the tables have any indexes at all.

As I am new to Data Warehousing scenarios, could somebody advise me if this is the correct way to do it. The data warehouse system was purchased as a package. In the past I have only dealt with relational data (indexes being priorities).

Sitting and waiting for select queries to return data on a LAN is a real pain when you can do an internet search with a site like google that has a reported 1 Billion links and get a returned result in a second.

Thanks for any advice

Jonathan
 
Index would definitively improve the performance. But it would harder to decide which fields to Index; especially you are new to the data schema.
The way I would start is take one table with lot of fields and records. Start Index the primary key, and work your way to where conditions.


Dr.Sql
Good Luck.
 
Thanks for the replies Denny and Dr. SQL.

Looks like a battle with our SQL administrator who says it is best without indexes but without any reasoning.

Jonathan
 
To justify your point you can try to Exececute a a simple statement towards a big table before and after creating the index watch the time. That should be a good start, but you will in your DBAs "$#!^ list'
'I will get to you issue when I have time..'.
So good luck

Dr.Sql
Good Luck.
 
In the DW enviroment Clustered indexes are going to give you the most increase in performance with nonclustered assisting in rare occasions.

Dimension tables: should have a Clustered PK or the Surrogate Key should be clustured.

Fact Tables: The Key column that tie to your dimension tables.

Example:
Code:
dimDate

DateKey(PK)     Date   Year   Month
-------         ----   ----   -----
123            1/1/05 2005   Jan
124            1/2/05 2005   Jan

dimCust
CustKey(PK)     CustID    FName     LName
-------         ------    -----     -----
1               9871      Bob       Smith
2               8921      Jane      Doe

FactSales

DateKey     CustKey   ProdKey   QTY    Price    Cost
-------     -------   -------   ---    ------   ----
123         1          98       1      10.00    3.23
123         1          922      1      1.00     .33
In the Above the Clstered on the dimXXX tables would be the xxxKey Columns. On the fact table it would be the xxxKey columns as well. The thing to be careful in the fact table is to order your coumns correctly. I shoot for a pyramid type structure of least possibilities to greatest posibilities.

Example (1 Years data)

Dates 365
Products 500
Customers 2000


The thing to remember in the DW is that you optimize for read performance and not necessarily write performance.




Shoot Me! Shoot Me NOW!!!
- Daffy Duck
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top