Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations bkrike on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Why does query take either 1 minute or 2 hours to run?

Status
Not open for further replies.

PeterBickford

Programmer
Sep 1, 2005
9
US
Hi folks,

Working with a MS Access 2002-format database, I'm firing off the following query (VB format):

Code:
SQL = "SELECT UI FROM Issues I RIGHT JOIN Update_Issues UI ON (I.FullIssue = UI.FullIssue) AND (I.Title = UI.Title)" & vbCrLf & _
    "WHERE I.FullIssue IS NULL" & vbCrLf & _
    "ORDER BY UI.Title, UI.FullIssue;"
  newRS.Open SQL, myConn, adOpenForwardOnly, adLockReadOnly
The tables in question contain about 250,000 issues, indexed on both FullIssue and Title. In most cases, the query runs in under a minute. Every so often, however, it takes about an hour or more to run. During that time, the CPU is running at about 50%, memory is about 75% free, and the disk is nearly silent.

What's going on? What system resource is being exhausted which is causing the slowdown, and what can I do to prevent it?

Thanks for any help folks can offer. A bottle of bubbly goes to the first person who help me get to the bottom of this one!

-Pete
 
Just a few questions <g>.

Are you running across a network or on a standalone PC?

Are there other users on the database?

Can you get exclusive use of the tables before running the query?

Do you have anything running on a scheduler? I had a virus scanner which would slow the PC to a crawl whenever it decided to run a full system scan.

How many Null records are there? If there are only a few you might find it faster to pull these out into a temporary cursor and use this smaller cursor in the Join.

One final thought, the query obviously works but I wonder how Access evaluates I.FullIssue = UI.FullIssue when I.FullIssue is Null? Normally Null isn't equal to anything.

Geoff Franklin
 
It's on a standalone PC, with exclusive access to the tables, and only a single user. No other processor-intensive activities are running (the task in question is accounting for about 48% of the 52% CPU load throughout).

There are likely only a handful (< 200 null records) <I.e., records which are in the one set, but not in the other>. I'm not sure how you would efficiently use the other you were referring to--can you elaborate?

I.FullIssue is Null is basically just evaluating which records are in the Update set (UI) but not in the original set (I), in the Right Join.

-Pete
 
SQL = "SELECT UI FROM Issues I RIGHT JOIN Update_Issues UI ON (I.FullIssue = UI.FullIssue) AND (I.Title = UI.Title)" & vbCrLf & _
"WHERE I.FullIssue IS NULL" & vbCrLf & _
"ORDER BY UI.Title, UI.FullIssue;"
newRS.Open SQL, myConn, adOpenForwardOnly, adLockReadOnly

I am perplexed by the speed issue unless there is a pattern such as the first run is slow but the second run is fast.

If it was always slow, then indexes is where I would look. Perhaps there is an issue with the indexes being corrupt or something. Seems pretty weird that it's all local and such a speed difference.

Your database file size isn't at 2GB is it? I know that gives weird behaviour sometimes.
 
I am perplexed by the speed issue unless there is a pattern such as the first run is slow but the second run is fast

Me too. If you're the only user and the PC's not on a network and there are no hungry processes running and there are no major changes to the data between runs then I can't see what can be causing it. You would see a slow run if the index were corrupt but that wouldn't get better and return to normal running of its own accord.

That's a thought, are you compacting the database and defragmenting the disk between these slow runs?

Geoff Franklin
 
There's really no pattern that I can discern, although Compacting just before the slow run >seems< to prevent the problem (although this also involves a database close/reopen, so it's hard to tell whether the underlying condition is system resource-based, whether simply reducing the size of the database gets it under the "death join" threshold. Since the program is used by some thousands of users (who have reported the problem sporadically), it'd be key for me to know what's behind the problem, instead of just chewing up 3 minutes on a Compact, if at all possible.

Does anyone have any deep knowledge of what Jet is doing during that JOIN which would involve such a processor load, while at the same time not apparently doing much to burn memory or disk?

As a possibly useful, or possibly distracting data point: I have done similar work with a related database, where one user's database exhibited the same sort of behavior when there were > 210,000 rows of data to join (It would always burn CPU, but not disk or memory, with the Join taking over an hour). When I throttled back the Join set to < 210,000 rows, the Join happened in seconds. This was repeatable 100% of the time. Database size was approximately 189 MB.

-Pete

 
Hmmm I have a database (inheirted and just finishing a miogration to Oracle thankfully) that has 2 tables of 1GB each for every month! They are in separate database files and the main app links to these tables for any given month as required. Even querying across non indexed text fields joining one of these beasts (over 3 million records) to a clients table (200 records) takes less than an hour and that's on a P3-800.

Let's throw the virus/trojan/spyware bat at this and see if you can duplicate the issue on another machine? Probably a bit hard to do since if you have an undetected nasty on one machine then it's likely to spread anyhow.

Things I would be doing are to remove the indexes on the two fields in the join and run the query. This may take so long that it's worthless, but if it happens to take around 90 minutes then I think it's more likely to point at indexes.

As I wrote this I thought about memo fields and OLE Object fields. Do your tables happen to have any of these in them? If so, could you try creating two subqueries to present only the fields that are relevant to your main query and run that? Probably just another straw to grasp at.

Have you tried starting a new database and importing everything across? That's also a last resort idea that I've done in the past when I lost the ability to view any of my modules after a GPF.
 
I am able to recreate this (sporadically) on numerous machines, both in-house and via reports from customers. The in-house machines all undergo weekly virus checks, etc., and I feel confident they're clean.

The problem (particularly with the latter database I described, assuming it's the same problem) does occur after both dropping/recreating the relevant indexes, and converting to a different format of Access (e.g. moving to 2000-format vs. 2002)

Memo fields are involved in the table, but not as part of the JOIN expression. The JOINed fields are both VarChar (255 for Title, 20 for Issue)

-Pete
 
Do the Memo fields have large chunks of data in them?

You've stated that the database is local to the machine so are you definitely linking to the database using Jet? I ask this because I know for an app that I inherited I had big issues with memo fields involved in a query (granted not what you have) when the tables were linked via ODBC. I read a little about the Memo fields and page locking previously (might be an earlier version of access).

Sinc eyou're probably willing to go through hoops to see this resolved, can you remove the memo fields completely form the two tables and re-test?

Since you have the problem on many machines and not only ones you or your organisation built (i.e. not all built with same patches/Windows updates etc) it really has to be something that any of us could reproduce, given the same table structure and populating it with suitable rubbish. I'm willing to try that if you fire over the DDL or equivalent to allow me to create the same tables. Then I can populate the tables with psuedo-random data to see the behaviour too (maybe).
 
I appreciate the idea on messing with the memo fields, but remember, the query I'm currently worried about (as opposed to the one I *suspect*, but can't verify may be related) exhibits this behavior only sporadically. It appears to be resource-related, and compacting *seems* to prevent it from occurring. As such, running a single before/after test isn't likely to be instructive as to determining whether or not the thing we changed affected the result.

(I'm also reluctant to pursue this particular path without good reason, as the query is actually worthless without the memo fields; It would require us to then loop through the result set, firing off Select's to match the rest of the fields, a process which stands a fair chance of taking 10-30 minutes to run every time, due to the number of potential records.)

Returning to the original question: does anyone have deep knowledge of what Jet (and yes,it is OLEDB/Jet 4.0 as the provider) is actually doing with the CPU during the lengthy join? If the disk is silent and there is extensive RAM, what's the CPU actually doing all that time (on the times the query runs an hour vs. the times it runs in seconds).

-Pete
 
so are there memo fields in the original query then?

Som links that you probably already have seen. Nothing to tell much about the inner workings. It might be time to use a support call to Microsoft.

If any of those fields in your original query are Memo then you're doing something I have never doen and would never try in any database. Even Oracle and MSSQL don't recommend BLOB to be used in join criteria due to performance reasons.


Also here..

 
To PCLewis:

I realize the post is getting long, so you may have missed it: The Join is along Title and FullIssue, both of which are indexed varchar fields. There are memo fields in the query result set, but they aren't part of the join criteria.
 
To PHV:

Thanks for the article on ShowPlan. Here's the output of that join:

- Inputs to Query -
Table 'Issues'
Table 'Update_Issues'
- End inputs to Query -

01) Sort table 'Update_Issues'
02) Sort table 'Issues'
03) Outer Join result of '01)' to result of '02)'
using Merge Join
join expression "UI.FullIssue=I.FullIssue And UI.Title=I.Title"
04) Restrict rows of result of 03)
by scanning
testing expression "I.FullIssue Is Null"
05) Sort result of '04)'



I'll admit it all looks normal to me. Does it tell you anything?

P.S.: I initiated a close DB/Compact/Reopen DB in the code before the problematic join, and it seems to prevent the problem, but does anyone know what the problem happens in the first place? (Uncompacted DB size runs about 430 MB, compacted size is just under 200 MB).
 
indexed on both FullIssue and Title
Hopefully no two index but one composite index (on both tables).

Hope This Helps, PH.
Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884 or FAQ181-2886
 
I'm afraid it's separate indexes. Is Jet not able to take performance advantage of it in this manner?

If so, does anyone know the SQL to create and drop a composite index? (Since I'd have to do patch up on the fly for existing customers).

-Pete
 
CurrentDB.Execute "CREATE INDEX uiTitleIssueIdx ON Update_Issues (Title, FullIssue)"
CurrentDB.Execute "CREATE INDEX iTitleIssueIdx ON Issues (Title, FullIssue)"

Hope This Helps, PH.
Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884 or FAQ181-2886
 
Sigh. I've added the joint index, and it seems like it helped some folks, but not others. Anyone have any other ideas as to what might be happening?
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top