
Update schema stats and its frequency


dbalearner (Technical User)
I have recently been given responsibility for a back-office Oracle 9i system. Users were complaining that putting through 1,000 trades in one schema takes roughly an hour. On further investigation I noticed that out of 350 tables, only three big tables have had their stats updated, back in May this year!

In actual fact, if I do a row count on these tables, as shown below, the number of rows is much higher:

Tables with stats on = 3

Tables with no stats on = 347


TABLE_NAME                          BYTES   NUM_ROWS Last Analyzed
------------------------------ ---------- ---------- --------------------
ACCOUNT_POSTING                 981467136     378723 2005/05/18
AUDIT_MVT                       578813952    1050945 2005/05/18
HISTOMVTS                       134217728     201394 2005/05/18

Total count of rows in ACCOUNT_POSTING = 7538240

Total count of rows in AUDIT_MVT = 2901447

Total count of rows in HISTOMVTS = 573802

I have already backed up the existing stats for this schema using the following:

EXEC DBMS_STATS.CREATE_STAT_TABLE ('SOPHISNY','SOPHIS_STATISTICS','USERS');
EXEC DBMS_STATS.EXPORT_SCHEMA_STATS ('SOPHISNY','SOPHIS_STATISTICS','SOPHISNY','SOPHISNY');
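
For the record, my understanding is that if the new stats ever misbehave, this backup could be restored with IMPORT_SCHEMA_STATS (untested sketch, using the same stat table and statid as the export above):

EXEC DBMS_STATS.IMPORT_SCHEMA_STATS ('SOPHISNY','SOPHIS_STATISTICS','SOPHISNY','SOPHISNY');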

Now I would like to gather fresh stats on all of this schema's tables using the following:

EXEC DBMS_STATS.gather_schema_stats(
     ownname => 'SOPHISNY',
     cascade => TRUE);

Since the tables are fairly small, I have opted for a full COMPUTE (as opposed to the ESTIMATE option). Believe it or not, we do not have a dev environment for this application, so I have to update stats in production without testing it first. However, I am convinced that it will only do good. I will then finish the task by:

alter system flush shared_pool;

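As a final sanity check, I suppose I could then confirm that everything was analyzed with something like:

SELECT table_name, num_rows, last_analyzed
FROM   dba_tables
WHERE  owner = 'SOPHISNY'
ORDER  BY last_analyzed;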

The questions that I have are:

1) Does this approach make sense, and will there be any side effects?
2) With full schema stats, what sort of locking is applied to tables and indexes, and will there be any blocking locks? Obviously I will be doing this task during a maintenance window.
3) I have been led to believe that updating stats is only required when more than 10% of the rows in the underlying table are added or removed. I can easily run the update nightly; if so, will there be any detrimental effect, or should I opt for weekly updates instead?

Many thanks,

Learner
 
Learner,

1) Your strategy is harmless.

2) Remember, gathering stats is a READ activity with no locking either implicit or explicit. The Oracle world is famous for the fact that no one is ever blocked from reading. And in this case, since there are no locks, your stats gathering will also not block writers.

3) I would be interested to see the reference on the "10%" threshold. The bottom line is, if stale stats negatively impact performance, then you should gather fresh stats. We re-gather stats by schema on a weekly basis. You may need nightly if your numbers swing more drastically.

ESTIMATE is less invasive than COMPUTE. We use a 10% ESTIMATE and have found that the results are remarkably accurate without negatively impacting performance.

[santa]Mufasa
(aka Dave of Sandy, Utah, USA)
[ Providing low-cost remote Database Admin services]
Click here to join Utah Oracle Users Group on Tek-Tips if you use Oracle in Utah USA.
 
Thanks for your feedback.

The other option (in place of COMPUTE) is for me to use the following, which generates statistics for all tables and indexes owned by the schema by analyzing 10% of the data in each object. It will also create histograms (with 10 buckets) on all columns that are part of any index. However, no column statistics will be collected for the remaining (non-indexed) columns:
EXEC DBMS_STATS.gather_schema_stats(
     ownname => 'SOPHISNY',
     estimate_percent => 10,
     method_opt => 'for all indexed columns size 10',
     cascade => TRUE);

I believe this approach, as you correctly suggested, is good (I would appreciate any comments/suggestions). However, as a matter of interest, if I use ESTIMATE, will USER_TABLES.NUM_ROWS display the correct number of rows, or does it not matter?
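
A simple way to check afterwards, I suppose, would be to compare the two for one table, e.g.:

SELECT num_rows, sample_size, last_analyzed
FROM   user_tables
WHERE  table_name = 'ACCOUNT_POSTING';

SELECT COUNT(*) FROM account_posting;

My understanding is that with ESTIMATE, NUM_ROWS is extrapolated from the sample, so the two should be close but not necessarily equal.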

Thanks
 
Hi Learner,

Why flush the shared pool?

We use the compute method, but only on stats that need refresh:
Once:
Code:
exec DBMS_STATS.ALTER_SCHEMA_TAB_MONITORING (
       ownname    => 'SOPHISNY',
       monitoring => TRUE);
And every night:
Code:
exec DBMS_STATS.gather_schema_stats(
       ownname => 'SOPHISNY',
       estimate_percent => NULL,
       method_opt => 'for all indexed columns size 10',
       options => 'GATHER STALE',
       cascade => TRUE);

The view sys.dba_tab_modifications shows you the "traffic" on your tables. Based on these values and the size of the table, the system decides whether the stats have to be updated or not.
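
For example (an illustrative query, filtered on your schema):
Code:
SELECT table_name, inserts, updates, deletes, timestamp
FROM   sys.dba_tab_modifications
WHERE  table_owner = 'SOPHISNY'
ORDER  BY table_name;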

Stefan
 
Hi Stefan,

I flush the shared pool because I want to get rid of the old, and potentially wrong, plans based on the old stats.

With regard to your advice on self-tuning: in your experience, does the system get it right most of the time (as a matter of interest)?

Thanks

 
When you gather statistics, all references to the analyzed object in the shared pool are marked invalid and the statements are hard-parsed anyway, so there is no need to flush.
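
You can watch the effect in v$sql if you like (illustrative query, with one of your tables as an example); the INVALIDATIONS counter goes up after a gather:
Code:
SELECT sql_text, loads, invalidations, parse_calls
FROM   v$sql
WHERE  sql_text LIKE '%ACCOUNT_POSTING%';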

We use GATHER STALE for live databases and have found it to be quite accurate. You still have the option to exclude a table from monitoring with
Code:
ALTER TABLE <table_name> NOMONITORING;
and analyze it more frequently, but so far I have never had to.
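
Such a table would then be gathered on its own, along these lines (sketch, with one of your tables as an example):
Code:
exec DBMS_STATS.gather_table_stats(
       ownname => 'SOPHISNY',
       tabname => 'ACCOUNT_POSTING',
       estimate_percent => NULL,
       method_opt => 'for all indexed columns size 10',
       cascade => TRUE);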

Stefan
 
Stefan,

Thanks again. What are the drawbacks of flushing the shared pool? I assume that immediately after flushing it, applications will run slowly.

I think flushing the shared pool is useful in benchmark performance testing, where you do not want to bounce the instance but do want to go back to a clean state. Is this correct, or are there other valid reasons for flushing the shared pool?

Thanks
 
Thanks guys. I applied the schema stats update this morning. Everything went through OK, with trades now being processed at 1,000 every 10 minutes as opposed to 1,000 every hour before, so effectively a six-fold performance gain.

I am planning to do the full COMPUTE stats every night for now; it takes just over half an hour. I know it is overkill, but it should not do any harm either. I will shift it to a weekly task later.
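
In case it is useful to anyone, I am thinking of scheduling the nightly run with DBMS_JOB, something like this (untested sketch; the 02:00 start time is just an example):

VARIABLE jobno NUMBER
BEGIN
  DBMS_JOB.SUBMIT(
    job       => :jobno,
    what      => 'DBMS_STATS.GATHER_SCHEMA_STATS(ownname => ''SOPHISNY'', cascade => TRUE);',
    next_date => TRUNC(SYSDATE) + 1 + 2/24,     -- tomorrow at 02:00
    interval  => 'TRUNC(SYSDATE) + 1 + 2/24');  -- re-evaluated after each run: every night at 02:00
  COMMIT;
END;
/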

Learner
 
Santa,

In response to your query:

3) I would be interested to see the reference on the "10%" threshold. The bottom line is, if stale stats negatively impact performance, then you should gather fresh stats. We re-gather stats by schema on a weekly basis. You may need nightly if your numbers swing more drastically.

The following might give an indication:
"Objects are considered stale when 10% of the total rows have been changed. When you issue GATHER_TABLE_STATS with GATHER STALE, the procedure checks the USER_TAB_MODIFICATIONS view. If a monitored table has been modified more than 10%, then statistics are gathered again. The information about changes of tables, as shown in the USER_TAB_MODIFICATIONS view, can be flushed from the SGA into the data dictionary with the DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO procedure."

Hope this helps
 
Santa,

Also, the statement below indicates that analyzing an index applies shared locks to the table, so there is a potential point of contention, as I thought. However, this doc may be incorrect.

Statistics locks?
"A table that is in the process of being analyzed cannot undergo DDL changes, but DML activities can be carried on. Analyzing an index puts a shared lock on the related table; hence, neither DML nor DDL can be performed. Preferably avoid all activities during the statistics generation phase."
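
If anyone wants to verify this, I suppose one could run the gather in one session and check for table-level DML locks from a second session (sketch using dba_dml_locks):

SELECT session_id, owner, name, mode_held, mode_requested
FROM   dba_dml_locks
WHERE  owner = 'SOPHISNY';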
 