Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chriss Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Does this subquery rerun for every row? 1

Status
Not open for further replies.

VBAjedi

Programmer
Dec 12, 2002
1,197
KH
I'm writing a query to compare two copies of a table for changes (each copy contains about 10,000 records). The unique key that I'm matching on is a combination of two fields. So...

I wrote a query for each table that simply selects all fields from the table and adds a concatenated field to create the unique key. Then I used those two queries as my FROM sources (which is essentially the same thing as a subquery, right?).

It works but it's running extremely slow, making me wonder if it's rerunning those two queries for each of the 10,000 rows?!?! I wouldn't think it needs to, but if it is that would be processing 100 million rows (twice, actually), which would certainly explain the slowness!!! :)

Here's the actual query. "CurrCReportMOD" and "PastCReportMod" are the two queries I use to create the concatenated field, which is "POSNOandEIN":
Code:
SELECT 
CurrCReportMOD.POSNOandEIN, 
PastCReportMOD.POSNOandEIN, 
CurrCReportMOD.EIN, 
CurrCReportMOD.[Employee Name],  
'New Employee placement' AS Status, 
CurrCReportMOD.Posno AS PositionNo 
FROM 
CurrCReportMOD LEFT JOIN 
PastCReportMOD ON CurrCReportMOD.POSNOandEIN= PastCReportMOD.POSNOandEIN
WHERE 
PastCReportMOD.EIN Is Null And 
CurrCReportMOD.EIN Is Not Null



VBAjedi [swords]
 
What is the SQL code of CurrCReportMOD and PastCReportMOD ?

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
 
The SQL in CurrCReportMOD and PastCReportMOD is identical other than that it points to the two different source tables. All it does is SELECT every field from the source tables, and add a single calculated field ("POSNOandEIN") as the first field that uses IIF to return a null if EIN is null (POSNO is never null) and the two values concatenated if EIN is not null. The actual SQL is (sorry, not formatted pretty):

Code:
SELECT IIf([EIN] Is Null,Null,[Posno] & [EIN]) AS POSNOandEIN, CurrCReport.ID, CurrCReport.RDC, CurrCReport.PDC, CurrCReport.OrgStr, CurrCReport.Authno, CurrCReport.Posno, CurrCReport.EIN, CurrCReport.[Employee Name], CurrCReport.PosCls, CurrCReport.PosRng, CurrCReport.[PICS Class], CurrCReport.[PICS Rng], CurrCReport.Repr, CurrCReport.EClass, CurrCReport.ERng, CurrCReport.[Employee Class Title], CurrCReport.[Work Class], CurrCReport.[Pay Opt], CurrCReport.[Rng Opt], CurrCReport.RSD, CurrCReport.[Layoff Svc Date], CurrCReport.[Sal Elig Date], CurrCReport.[Perf Appraisal], CurrCReport.[Trial Svc End], CurrCReport.[Appt Date], CurrCReport.[Sep Date], CurrCReport.[Base Rate], CurrCReport.[Pay Basis], CurrCReport.[Pay Rate], CurrCReport.[Dif Pay1], CurrCReport.[Dif Fmla1], CurrCReport.DifAmt1, CurrCReport.[Dif Pay2], CurrCReport.[Dif Fmla2], CurrCReport.DifAmt2, CurrCReport.[Dif Pay3], CurrCReport.[Dif Fmla3], CurrCReport.DifAmt3, CurrCReport.Step, CurrCReport.[Off Step], CurrCReport.[At Max], CurrCReport.[E Stat], CurrCReport.[F Stat], CurrCReport.[Appt Type], CurrCReport.[FP Code], CurrCReport.[FP%], CurrCReport.PICS, CurrCReport.[DF Code], CurrCReport.FLSA, CurrCReport.OT, CurrCReport.Div, CurrCReport.Section, CurrCReport.[Unit Name], CurrCReport.SDA, CurrCReport.Program, CurrCReport.LCC1, CurrCReport.[LCC1%], CurrCReport.LCC2, CurrCReport.[LCC2%], CurrCReport.LCC3, CurrCReport.[LCC3%], CurrCReport.LCC4, CurrCReport.[LCC4%], CurrCReport.FTE, CurrCReport.Months, CurrCReport.OF, CurrCReport.FF, CurrCReport.GF, CurrCReport.[Phase in], CurrCReport.[Phase out]
FROM CurrCReport;

VBAjedi [swords]
 
So, why not simply this ?
Code:
SELECT 
CurrCReport.POSNO + CurrCReport.EIN AS CurrPOSNOandEIN, 
Null AS PastPOSNOandEIN, 
CurrCReport.EIN, 
CurrCReport.[Employee Name],  
'New Employee placement' AS Status, 
CurrCReport.Posno AS PositionNo 
FROM 
CurrCReport LEFT JOIN 
PastCReport ON (CurrCReport.POSNO + CurrCReport.EIN)=(PastCReport.POSNO + PastCReport.EIN)
WHERE 
PastCReport.EIN Is Null And 
CurrCReport.EIN Is Not Null

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
 
Son of a GUN. Because I didn't know you can concatenate like that in the ON section of a query? Have a star for taking me back to Access elementary school. :-D

Strange though... that query returns the full result recordset (about 200 rows) in about 20 seconds on my machine, but even after I can see the results on screen the core of my CPU that's running Access stays at 100% and Access basically freezes for about 10 minutes. My machine isn't the latest greatest, but that still seems inordinately long for running this on two 10,000-row tables. Does it not?

VBAjedi [swords]
 
Ok, conceptually... if I LEFT join two tables, then for each row in the left table the SQL engine has to scan every row in the right table for matches, right? And it has to keep scanning the rest of the right table even once it finds a match because their might be more matches, right?

If that's true then LEFT joining two 10,000 row tables means that it's processing 10,000 x 10,000 = 100 million rows no matter what I do. So these 10-minute query run times probably aren't going away, huh?

VBAjedi [swords]
 
What are the primary keys of CurrCReport and PastCReport ?

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
 
These "tables" are simply an import of a flat file personnel report I get monthly (and I don't have access to the database the report is generated from). As such, the only primary key in the reports columns is that combination between Position Number (Posno) and Employee Number (EIN). I can't just use Position number because our badly designed system allows multiple employees to share the same Position number (and also allows a single employee to occupy multiple position numbers at once).

I do slap an autonumbered key column (creatively named "ID") on the report when I import it as a new table each month, but I don't think that's factoring into this query at all.

Not sure the background is necessary in this situation, but better too much info than too little. Basically I'm tasked with creating lists of new employees, employees that left, employees that were promoted, etc by comparing these monthly reports.

Importing each month's report as a new table seemed like the simplest way to answer many of these questions. It allows me to quickly compare the current report to any past month. The biggest thing I haven't figured out how to do yet is to answer questions like "What's the average number of employees hired each month over the last year?" but that's probably a question for another post/day. :)

VBAjedi [swords]
 
So am I correct in my understanding that LEFT joining two 10,000 row tables means that it's processing 10,000 x 10,000 = 100 million rows?

VBAjedi [swords]
 
So am I correct in my understanding that LEFT joining two 10,000 row tables means that it's processing 10,000 x 10,000 = 100 million rows?

It should not be processing 100 million rows.

There could be other reasons that it takes so long. Are these tables actually in your database or are they linked out on a network somewhere?

Greg
People demand freedom of speech as a compensation for the freedom of thought which they seldom use. Kierkegaard
 
The tables are in my database, not linked.

I guess I don't understand the logic of how SQL works then... and I'm having trouble finding any high level descriptions on the web.

As I understand it, for a LEFT join the SQL engine steps through every row in the "left" table, and for each row scans the "right" table looking for rows that match the WHERE criteria of the join. If that's true, then it seems to me that when two 10,000 row tables are involved, it's scanning 10,000 x 10,000 rows = 100 million rows.

VBAjedi [swords]
 
Create a composite key (POSNO,EIN) in both tables.
Then try the following:
Code:
SELECT 
CurrCReport.POSNO + CurrCReport.EIN AS CurrPOSNOandEIN, 
Null AS PastPOSNOandEIN, 
CurrCReport.EIN, 
CurrCReport.[Employee Name],  
'New Employee placement' AS Status, 
CurrCReport.Posno AS PositionNo 
FROM 
CurrCReport LEFT JOIN 
PastCReport ON CurrCReport.POSNO=PastCReport.POSNO AND CurrCReport.EIN=PastCReport.EIN
WHERE 
PastCReport.EIN Is Null And 
CurrCReport.EIN Is Not Null

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top