Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chriss Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

How to get rid of HASH

Status
Not open for further replies.

DavidJA

Programmer
Joined
Jan 10, 2002
Messages
58
Location
AU
Hey all,

I've only been developing on SQL server for a couple of years, but in those years I have learnt that HASH joins are BAD for performance. This is because they seem to always select EVERY record from table one and then pass it to the HASH joiner which does the filtering/matching.

I've come across a situation where I simply can't get rid of the has joins.

I'm basicly creating some quick lookup tables, which are kind of like indexed views, they only contain 1 PK and FKs for about 5 tables.

The application is a Job Management System for a manufacturing company, and OrderItems has another table called WorkflowProcesses, and for each manufacturing process that an order item goes through, there is a correspending row in WorkflowProcesses. If the order is current one row in WorkflowProcesses will have it's "ActiveProcess" bit set.

With around 1,000,000 rows in WorkFlowProcesses, and around 200,000 orders, queries are getting a little slow (around 2-3 seconds).

So what I've done is created a CurrentWorkflowProcesses. I have SPs to insert and delete from CurrentWorkflowProcesses as needed. CurrentWorkFlowProcesses contains a PK, and FKs for OrderItemID, OrderID, ClientID AND JobID. This way queries on a particalar client benifit also.

The problem is that every table that is joined to CurrentWorkflowProcesses is being HASH JOINED.

This is resulting in query plans with around 2000 rows comming from CurrentWorkFlowProcesses, 200,000 rows comming from tblOrderItems and being HASH joined to the 2000 rows, then 50,000 rows comming from tblOrders and being HASH joined.

If I use a INNER LOOP JOIN hint, the performance is a lot better, so how do I get SQL to choose a LOOP join everytime for these queries? (without using a hint) What should be indexed?

 
You need a combination of clutered and nonclustered indexes on all tables. On SQL the sequence of serching is Clustered indexes then Nonclustered indexes and then Table Scan, the decision on change this sequence or how many row the Clustered index would return before going to Table Scan is based on the current version of the statistics on the tables compared to the state of the server when the query plan is generated.
The way you could help to make this decision process to be more accurate as far as performance is to folow some guide lines.

- Keep the Statistics on the Server allways current
- Keep consistency on the where clause of all queries like
where Office = 'abc' and Datetime = Getdate()
where Office = 'abc' and OrderId = 'a123'
this helps the statistics to identify the most important columns on a table as far as the number of times you request it.
- Keep consistency on the join sequency as well
- Also Target you indexes to single columns avoinding multicolumns indexes. AL Almeida
NT/DB Admin
"May all those that come behind us, find us faithfull"
 
Thanks for that;

I've got a few couple of questions based on what you said: (I've also been wordering about these for a long time, but I have never found the information).

Does join order matter on INNER joins; EG, is:
SELECT * FROM tblOrders INNER JOIN tblOrderItems on tblOrders.OrderID = tblOrderItems.OrderID the same as:
SELECT * FROM tblOrderItems INNER JOIN tblOrders on tblOrderItems.OrderID = tblOrders.OrderID

and is FROM tblClients INNER JOIN tblOrders ON ... INNER JOIN tblOrderItems ON ... the same as FROM tblOrderItems INNER JOIN tblOrders ON ... INNER JOIN tblClients ON ...

and When ever I 'hand code' my FROM statments, they are always flat, eg FROM tblA INNER JOIN tblB on tblA.PK = tblB.PK INNER JOIN tblC on tblB.PK = tblC.PK. If I produce the same query in MSAccess's query designer, the SQL that comes out dosn't look like this, it has a lot of brackets everywhere, eg tblC INNER JOIN tblA on (INNER JOIN tblB))

Is my way of joining tables OK?
 
The order does not matter as far as the result set but it does that all the queries that join the same table would keep the same order to optimize the Server Statistics, that will give more handle on Over all Server Performance.

- The Brackets are optional it you follow the object naming recomendations. AL Almeida
NT/DB Admin
"May all those that come behind us, find us faithfull"
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top