
Professional input needed


nice95gle (Programmer)
Not sure if this should be placed here or in the 'General Database Discussion'. So let's start here.

I've been pulled in to analyze our production database and help look for potential problems. I've been able to come up with some very interesting findings (much different from the analysis side of the fence), but now I'm faced with a little problem. As I'm looking at queries running against large tables (1 mil rows and above), I see a lot of table and index scans. I was making notes to present to management when one of the developers came by and said, "if I were you I wouldn't put too much time into those scans". He stated that 99% of these queries run in under 2 seconds, which they do. I know that when I send this report to management it might mean extra work for the development team, so I can see where his concern is coming from. But I've always been trained to believe that a scan on a large table is a bad thing for performance, regardless of execution time. So my question is: should I also report the queries that run quickly, or only focus on the queries that take longer to execute?

Thanks

Well Done is better than well said
- Ben Franklin
 
If the audience is non-technical, I would not report them. If the audience is technical, I would report them, but also note that they are currently running quickly and may be something to look at in the future.
 
Over a million rows and still completing in under 2 seconds seems pretty good to me.

Even if you make the change to force index seeks, you can gain at most 2 seconds.

However, if the table is being queried every 5 seconds, then a saving of 2 seconds is HUGE...

I would focus on improving overall performance from the users' point of view, even if that means making some things less efficient or slower, as long as the overall experience is better.

While it is always good to aim for perfection, there aren't always the resources to get there, so try to do the best you can with what you have, and IF there's time / resources left over, then you can fix the little things...

--------------------
Procrastinate Now!
 
I would look at all the queries.

Let's take your 2 second query as an example. Suppose users are running queries that take 2 seconds at the same time another user is running a query that takes 20 seconds. The 20-second query is likely to take longer because the 2-second queries are also running and competing for resources. This problem can be alleviated somewhat by having a server with multiple processors.

Multiple processors will not completely solve the problem either. Suppose you have 4 CPUs and there are 10 users running 2-second queries. It all adds up.

My approach would be....

Find the queries that are performing index scans. Fix those first. Then, concentrate on the longer running queries. As you fix your longer running queries, eventually the 2 second queries will be the longest running, and you'll fix those too. What I'm saying is, all queries should be fast, but you need to prioritize your time. By fixing the table scans first, some of your longest running queries may be affected simply because it can do an index scan/seek instead of a table scan.
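If you're on SQL Server 2005 or later, the plan-cache DMVs give you a decent starting point for finding those. A rough sketch (not production-ready, and the numbers reset whenever a plan falls out of cache):

[code]
-- Cached plans whose showplan XML contains a scan operator,
-- ordered by total accumulated elapsed time.
SELECT TOP (20)
       qs.execution_count,
       qs.total_elapsed_time,
       st.text       AS query_text,
       qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle)    AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
WHERE CAST(qp.query_plan AS nvarchar(max)) LIKE '%Table Scan%'
   OR CAST(qp.query_plan AS nvarchar(max)) LIKE '%Index Scan%'
ORDER BY qs.total_elapsed_time DESC;
[/code]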

While you are going through this process, do NOT go willy nilly adding indexes to your database. Indexes will slow down inserts, updates, and deletes. This is usually tolerable if there are only a couple of indexes on a table. However, if you have a large number of indexes on a table with millions of rows, you will begin to notice the time it takes to do the insert/update/delete.

Hope this helps.

-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Thanks to everyone who has responded thus far.

If the audience is non-technical, I would not report them.
Yes, the audience is technical.

over a million rows and still completing in under 2 seconds seems pretty good to me.
That's the same thing I said. But I believe there's another process (or processes) I haven't seen yet that takes 20-30 minutes to run. I think they are trying to fix a few problems by dissecting the stored procs.

Find the queries that are performing index scans. Fix those first.
That's what I would like to do; I just wanted to make sure I wasn't wasting time by also focusing on the faster queries.

While you are going through this process, do NOT go willy nilly adding indexes to your database.
Part of my job is to also find indexes no longer in use (that's a whole other thread), so I wouldn't be recommending adding any more indexes until my analysis is complete.
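For anyone curious, the usual starting point on 2005 and up is sys.dm_db_index_usage_stats. Something roughly like this flags nonclustered indexes with no recorded reads (a sketch only, and the counters reset on every restart, so I wouldn't drop anything based on one snapshot):

[code]
-- Nonclustered indexes in the current database with no seeks, scans, or lookups
-- since the last restart.
SELECT OBJECT_NAME(i.object_id)  AS table_name,
       i.name                    AS index_name,
       us.user_seeks, us.user_scans, us.user_lookups, us.user_updates
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS us
       ON  us.object_id   = i.object_id
       AND us.index_id    = i.index_id
       AND us.database_id = DB_ID()
WHERE OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
  AND i.index_id > 1   -- skip heaps and clustered indexes
  AND ISNULL(us.user_seeks, 0) + ISNULL(us.user_scans, 0) + ISNULL(us.user_lookups, 0) = 0
ORDER BY OBJECT_NAME(i.object_id), i.name;
[/code]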




Well Done is better than well said
- Ben Franklin
 
I meant to say....

Find the queries that are performing [!]table[/!] scans. Fix those first.

-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
I see it from a slightly different perspective.

1 - Ignore long-running queries that run when users are not in the system, if there is such a time. If something is scheduled to run each night at midnight and no one else is using the system then, ignore it for now.

2 - Look at total accumulated execution time for each query, not single execution time. If one query takes 30 seconds, it's a good candidate for speeding up, of course. But if that query is run no more than 10 times a day, and another query which takes 2 seconds is run 10,000 times a day, cutting the 2-second query in half will save 10,000 seconds of execution time. Cutting the 30-second query in half will save only 150 seconds. (A rough query sketch for pulling these accumulated totals out of the plan cache follows point 4.)

3 - If switching from a table scan to an index scan or seek is easy, go ahead and do that. But place your effort where the greatest benefit will be realized the most quickly. Saving even 5% on the 2-second query will save 1,000 seconds of execution time. That's still more than six times the payback of halving the 30-second query.

4 - Decide whether you are trying to improve individual user experience, or optimize the server. Optimizing the server sometimes involves making each user wait a little longer for the benefit of the server being able to service more users at once. This tradeoff depends on the load and usage on your servers. If you have tens of thousands of users, forget individual user execution time and focus as much as possible on total server load.
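To put some numbers behind point 2: on 2005 and later, the cached-plan stats can rank statements by accumulated time instead of single-run time. A rough cut (the offset math is the standard trick for carving the statement out of the batch text; stats vanish when a plan leaves the cache):

[code]
-- Top statements by total accumulated elapsed time (microseconds in the DMV,
-- shown here as milliseconds).
SELECT TOP (25)
       qs.execution_count,
       qs.total_elapsed_time / 1000                        AS total_elapsed_ms,
       qs.total_elapsed_time / qs.execution_count / 1000   AS avg_elapsed_ms,
       SUBSTRING(st.text,
                 qs.statement_start_offset / 2 + 1,
                 (CASE qs.statement_end_offset
                      WHEN -1 THEN DATALENGTH(st.text)
                      ELSE qs.statement_end_offset
                  END - qs.statement_start_offset) / 2 + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_elapsed_time DESC;
[/code]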

For SQL and technical ideas, visit my blog, Squared Thoughts.

The best part about anything that has cheese is the cheese.
 
This is good stuff. But it does seem that everyone has a different perspective on how to prioritize optimization.

Different points of view are exactly what I was looking for.

Thanks

Well Done is better than well said
- Ben Franklin
 
Also, think about batches and locking.

Batches:

I had a query I needed to perform on a table with almost 3 million rows. It was taking too long and I was getting an index scan.

I batched the query into smaller chunks to get the optimizer to go with an index seek (you can research this with a search engine), and the series of small queries executed about 10 times faster than the one long unbatched query. I experimented until I found the maximum chunk size that would still get an index seek instead of a scan.
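The shape of it is roughly this; the table and column names are made up for illustration, and you'd tune @ChunkSize until the plan shows a seek on the clustering key instead of a scan:

[code]
-- Hypothetical table/columns: walk the key range in fixed-size chunks.
DECLARE @ChunkSize int, @StartId int, @MaxId int;
SET @ChunkSize = 50000;

SELECT @StartId = MIN(OrderId), @MaxId = MAX(OrderId) FROM dbo.BigTable;

WHILE @StartId <= @MaxId
BEGIN
    UPDATE dbo.BigTable
    SET    Processed = 1
    WHERE  OrderId >= @StartId
      AND  OrderId <  @StartId + @ChunkSize
      AND  Processed = 0;

    SET @StartId = @StartId + @ChunkSize;
END
[/code]

Each chunk also holds its locks for a much shorter time, which ties into the locking point below.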

Locking:

Locking is important because it can deny other users the service they require. If your query runs for a long time but others can still do their work, it's not so bad. If you block others for even 5 seconds, this could be a problem, especially if you block others who have locks. Perhaps their lock would have been active for .01 seconds, not really affecting the rest of the users. But your single long-held lock could block a dozen users, all of whom have other locks that will be held. I have seen a single long-held block in one "unimportant" table cascade until dozens of processes were blocked in a tree-like effect.
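If you want to catch blocking while it's happening, on 2005 and up a quick snapshot looks something like this (on SQL 2000 you'd poke at master..sysprocesses or sp_who2 instead):

[code]
-- Sessions currently blocked, who is blocking them, and what they're running.
SELECT r.session_id,
       r.blocking_session_id,
       r.wait_type,
       r.wait_time,
       st.text AS running_statement
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS st
WHERE r.blocking_session_id <> 0;
[/code]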

So, for one thing, batching locks fewer rows at once, allowing other work to slip in between your chunks.

Another way to help with blocking (and deadlocks, too) is to perform your operations in similar orders. Don't read from table A then B in one procedure, then B then A in another. Choose an order of INSERT, UPDATE, and DELETE, and use that same order in all procedures. Don't insert first then update in one and update first then insert in another.
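A toy illustration, with hypothetical tables and columns: both procedures touch TableA first and TableB second, so neither can end up waiting on a lock the other already holds.

[code]
CREATE PROCEDURE dbo.usp_PostAdjustment @Id int
AS
BEGIN
    BEGIN TRAN;
        UPDATE dbo.TableA SET Balance = Balance - 1 WHERE Id = @Id;  -- A first
        UPDATE dbo.TableB SET Applied = 1           WHERE Id = @Id;  -- then B
    COMMIT;
END;
GO

CREATE PROCEDURE dbo.usp_ReverseAdjustment @Id int
AS
BEGIN
    BEGIN TRAN;
        UPDATE dbo.TableA SET Balance = Balance + 1 WHERE Id = @Id;  -- same order: A first
        UPDATE dbo.TableB SET Applied = 0           WHERE Id = @Id;  -- then B
    COMMIT;
END;
[/code]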

The more research you do into all aspects of SQL Server, the better you'll be able to identify possible ways to improve performance. It doesn't always involve changing individual queries.

Learn to read execution plans. Learn the effects different join types have. What is a left anti semi join and why does it matter if I know that?
 