Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chriss Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Need help finding table space used/unused

Status
Not open for further replies.

e5c4p3artist

Programmer
Sep 11, 2003
13
US
I inherited a project from someone and have relatively little knowledge of Oracle8 and PL/SQL. So you'll have to treat me like a dummy.

What I need to do is calculate the actual space used in several tables. I found a script that appears to be written to do exactly that here:


However, I cannot get it to tell me what I need to know. I have been using Oracle SQL Worksheet (because I'm a n00b that needs the GUI to understand what I'm doing). What I have done is copied and pasted the script into the SQL Worksheet window and tried to run it.

It gives me no critical errors (it does say that 'set verify on' is an invalid SET option), but does not print out any numbers. It simply says "Statement Processed."

I have toyed around with it a bit (for example hard-coding in an owner name and table name for the variables) and have gotten the lines "Displaying Actual Space Used for schema T170 ...." to print, but still no table space information.

What am I missing here? Probably a lot. Can anyone take the time to enlighten me please? I'm a fast learner, I just don't know what I'm doing yet. =)
 
"Art" (short for "e5c4p3artist"),

I'm certain we can help, but tell us more specifically what you want. Do you want to see space consumption at the tablespace, file, or object level? (A small set of contrived sample output would be nice.)

Dave
 
Thanks for the reply. I'm having trouble getting my arms around what I'm being asked for myself, but here is what I'm after:

Say I have one data table and only one. It has only two columns in it: a record ID, and a name column. It is a list of boys first names (or whatever).

When I create the table, it has some amount of space allocated to it, correct? I'm not completely up on my Oracle terminology (even after reading the whole of "Teach Yourself Oracle8 in 21 Days in the last two hours), but I think that would be it's "extent", correct?

In any case, I ultimately want to find out the following values about the table:

1. The number of rows in the table.

2. The amount of storage space allocated to the table.

3. The amount of storage space actually taken up by data stored in the table.

4. Out of the amount of storage space allocated to the table, the amout that is unused - either because nothing has been written there yet or because data has been deleted from it.

Number 1 is easy:

SELECT count(*) FROM <tablename>

Number 2 was done before I took over the project. I did not write the SQL myself, and although it is simple and I understand it fairly well now, I would never have been able to write it before this morning when I learned myself a thing or two:

SELECT sum(bytes)
FROM dba_extents, dba_tables
WHERE dba_extents.segment_name = dba_tables.table_name
AND dba_extents.owner = dba_tables.owner
AND dba_tables.table_name = '<tablename>'

Number 3 is what I'm after in this thread. Now keep in mind that I'm still getting comfortable with writing individual SQL statements, let alone scripts. So when I found the script I mentioned above, I was a bit overwhelmed by it. I have a baground in programming so I definitely understand the basic construct, but not enough to feel comfortable modifying or troubleshooting the script.

Number 4 should be simple once number 3 is calculated if I'm thinking correctly. It should simply be (number 2 minus number 3).

So as an example, say the table gets allocated 500 bytes of storage space. One record gets put into it that takes up 20 bytes. My above answers would be:

1. 1 row
2. 500 bytes allocated
3. 20 bytes actually used
4. 480 bytes unused

Let me know if you need more info or if I'm totally whacked out here in how I'm approaching this. And thanks for the help.
 
Art,

Below is an alternate query that gives precisely what you requested.

Note: before running the query, you need to ensure that statistics (in dba_tables or user_tables) exist for the table. You can gather statistics in a variety of ways (from the SQL*Plus prompt):

For an individual table: &quot;SQL> ANALYZE TABLE <tablename> COMPUTE STATISTICS;&quot;

For an Oracle user: &quot;SQL> exec DBMS_UTILITY.ANALYZE_SCHEMA('<schemaname>','<method>')&quot; [<method> can be either 'COMPUTE' or 'ESTIMATE'

For the entire database: &quot;SQL> exec DBMS_UTILITY.ANALYZE_DATABASE('<method>')&quot;

Here is the code for the query you requested:

col a heading &quot;Rows&quot; format 999,999,999,999
col b heading &quot;Bytes|Allocated&quot; format 999,999,999,999
col c heading &quot;Bytes|Used&quot; format 999,999,999,999
col d heading &quot;Bytes|Unused&quot; format 999,999,999,999
set verify off
accept owner prompt &quot;Enter the owner of the table to check: &quot;
accept tabname prompt &quot;Enter the name of the table to check: &quot;
select num_rows a, blocks*block_size b, avg_row_len*num_rows c
, (blocks*block_size)-(avg_row_len*num_rows) d
from dba_tables, (select value block_size from v$parameter where name = 'db_block_size')
where owner = upper('&owner') and table_name = upper('&tabname')
/
Enter the owner of the table to check: dhunt
Enter the name of the table to check: emp

Bytes Bytes Bytes
Rows Allocated Used Unused
---------------- ---------------- ---------------- ----------------
51,200 3,735,552 3,174,400 561,152

Cheers,

Dave

 
Thanks Dave. I'm *almost* getting it to work. I'm logged in as SYS which is not the owner of the tablespace in which the table resides. The table is called T170 (yep, it's a Remedy Action Request System table) and the owner is ARADMIN. I did try entering the table when prompted as both &quot;T170&quot; and as &quot;ARADMIN.T170&quot; but am getting no rows selected. I can see that the table exists though. Should the script be prompting me for owner? Because it is not as you can see below. So close....... =)

----------------------------

SQL> ANALYZE TABLE ARADMIN.T170 COMPUTE STATISTICS;

Table analyzed.

SQL>
SQL> col a heading &quot;Rows&quot; format 999,999,999,999
SQL> col b heading &quot;Bytes|Allocated&quot; format 999,999,999,999
SQL> col c heading &quot;Bytes|Used&quot; format 999,999,999,999
SQL> col d heading &quot;Bytes|Unused&quot; format 999,999,999,999
SQL> set verify off
SQL> accept owner prompt &quot;Enter the owner of the table to check: &quot;
Enter the owner of the table to check: accept tabname prompt &quot;Enter the name of the table to check:
&quot;
SQL> select num_rows a, blocks*block_size b, avg_row_len*num_rows c
2 , (blocks*block_size)-(avg_row_len*num_rows) d
3 from dba_tables, (select value block_size from v$parameter where name = 'db_block_size')
4 where owner = upper('&owner') and table_name = upper('&tabname')
5 /
Enter value for tabname: T170

no rows selected
 
Art,

Sorry I didn't explain...Since the code includes SQL*Plus &quot;ACCEPT&quot; prompts, you cannot just copy-and-paste (because a copy-and-paste results in the inadvertent &quot;entry&quot; of a SQL*Plus command where your keyboard response for OWNER should be.) So, copy and paste the code into an empty file that you can run as a script from the SQL*Plus prompt as &quot;@filename&quot;. Then it should work properly.

Let me know,

Dave
 
Woohoo! Thanks, Dave. That did it. I modified your script as follows to allow for dynamic statistics gathering on the specified table. It worked okay in my tests, will it be okay in the long run?

Also, as excited as I am by getting this to work, will it only work on systems that have statistics turned on? If so, I'll need to look into writing a script that can get this information on systems that do not have statistics turned on as well. A prospect I won't look forward to unless you are willing to help with that as well. =)

Here's your script with one line modified by me. Does it still look okay? =)

----------------------------------------

col a heading &quot;Rows&quot; format 999,999,999,999
col b heading &quot;Bytes|Allocated&quot; format 999,999,999,999
col c heading &quot;Bytes|Used&quot; format 999,999,999,999
col d heading &quot;Bytes|Unused&quot; format 999,999,999,999
set verify off
accept owner prompt &quot;Enter the owner of the table to check: &quot;
accept tabname prompt &quot;Enter the name of the table to check: &quot;
ANALYZE TABLE &owner..&tabname COMPUTE STATISTICS;
select num_rows a, blocks*block_size b, avg_row_len*num_rows c
, (blocks*block_size)-(avg_row_len*num_rows) d
from dba_tables, (select value block_size from v$parameter where name = 'db_block_size')
where owner = upper('&owner') and table_name = upper('&tabname')
/
----------------------------------------
 
Art,

This particular query relies upon gathered statistics for whatever table you are researching. But if you NEED this information (for some business purpose) then the alternative (not using Oracle-gathered statistics) could be even costlier. The alternative is doing your own &quot;count(*)&quot;, which does a full table scan anyway; plus, there is no easy query to determine ACTUAL space used without doing way tedious LENGTH() calculations. So, I believe Oracle-gathered statistics are probably your best bet.

Now, to the task of gathering statistics: Your method (of gathering statistics within the script itself) is fine, so long as you don't run the script so frequently that re-gathering the statistics becomes costly. The happy medium is to gather statistics when significant changes occur, which we assume is less frequently than running your script.

If you plan to analyze lots, or all, of your tables (which I infer from your earlier post is a job you do not anticipate with glee), then you can gather statistics either on a schema-by-schema basis [&quot;exec dbms_utility.ANALYZE_SCHEMA(<schema>,'COMPUTE'|'ESTIMATE'|'DELETE')&quot; -- notice the options] or a full-database basis. The method for analyzing the full database depends upon whether you are running Oracle 8i or Oracle 9i:

Oracle 8i:
1) exec dbms_utility.analyze_database('COMPUTE')
2) exec dbms_utility.analyze_schema('SYS','DELETE')
...since you do not want statistics for the SYS schema.

Oracle 9i:
exec dbms_stats.gather_database_stats(cascade=>true)

Cheers,

Dave
 
Dave,

I really appreciate all the help you've been so far! Let me give you a little insight into what I'm working on.

The software development company for which I work is about to release a product written to do database analysis specific to Remedy's Action Request System (ARS). I don't know if you've had exposure to ARS, but the typical ARS administrator is not a DBA. Combine this with the cryptic nature of ARS database objects (with generously descriptive table names like T77, H56, and B44), and the database becomes a nightmare to analyze for the ARS admin and has been the cause for more than one communication breakdown between such admins and DBAs over the years.

The product I'm developing allows an ARS admin to define groups of tables and run an analysis on these groups on a defined schedule. Usually there would not be more than one analysis run per day, and most users will probably only run it once a week.

The product is intended to work cross-platform and be backwards compatible for both older versions of Oracle (at least back to 8, not sure about support for 7 yet) and SQL Server on the database side and Remedy on the front-end. Since I'm trying to make it work as broadly as possible, I cannot assume that every potential customer who is running an Oracle dB has statistics turned on. I'd like to, but I can't.

Therefore, my ultimate goal will be to write a SQL query to find the information I seek that does not rely on gathering statistics from an Oracle dB.

I can tell that you are very knowledgeable about Oracle. And perhaps you can give me your opinion about this whole situation and how possible it is to obtain the information I seek without statistics. And any assistance in figuring out and writing the actual SQL scripts (such as the type you've already provided) is very much appreciated. Thanks!
 
Art,

I'm certain we can come up with a solution that does not depend upon Oracle statistics. (In fact, I was a bit concerned about the statistics factor since we had not discussed their implication upon the Oracle Cost-based optimizer.)

&quot;Carp&quot;, a colleague also on Tek-Tips, and I were compelled by another thread to come up with a standalone Oracle module to analyze and defrag objects once the objects cross a user-defined threshold from the analysis. Carp is putting the finishing touches on that module (which does not rely upon statistics). Since Carp reads these posts, as well, this will be a &quot;spear-in-the-rear&quot; for us to finish asap that opus magnum.

In the meantime, what I propose is that you clarify and re-state your &quot;business&quot; objectives that you must meet with your results. That way, we can ensure that we are meeting your needs.

How does that sound?

Dave
 
Dave,
1. count(*) doesn't necessary imply full scan: it may use primary key.
2. dbms_stats exists in 8i also.

As for original question: that script is for SQL*Plus, not for SQL Worksheet. Its estimation is not too precise, because the results may depend on table storage parameters; it also doesn't take into account chained rows. You may test it on a table with LOB field: 1 record table will show the same size regardless on its real content.

As for portability, I suppose that it can not be done at all, because different vendors use completely different techniques to store data. As well as different terminology.

Regards, Dima
 
The actual business objectives of this project are to develop a product that gives ARS administrators a view into their ARS database from ARS itself. The assumption is that they are not DBAs and do not have access (let alone the knowledge) to get to a SQL prompt and start writing queries to analyze their database. Therefore, we have ARS write the queries for them and present it in a nice little GUI package wrapped up with a neat little bow. =P

Specifically, the ARS admin will choose a table or group of tables and tell ARS to &quot;analyze&quot; them. ARS will run several embedded SQL commands and return a table such as this:

dB tablename ARS tablename Rows Allocated Space
------------ ------------- ---- --------------
T170 Main_Helpdesk 255 2048KB

Used Unused Row Average Index Size
---- ------ ----------- ----------
270KB 1778KB 8.03 72KB


The table will be GUI and look all nice. You'll be able to press the column header and sort by that column, etc. - standard Windows fare. But you get the idea.

The main items I'm looking at getting are the above 8 items:

1. Oracle true tablename
2. ARS alias tablename
3. # Rows in the table
4. Allocated (or reserved) space in KB/MB
5. Actual space used in KB/MB
6. Unused space in KB/MB
7. Row Average (Allocated space divided by # Rows)
8. Size of the table's associated Index

The person who handled this project before me wrote the SQL to get all of this out of SQL Server, but only wrote the SQL to get items 1,2,3,4,7, and 8 for Oracle. With this post, I'm trying to get a grip on the SQL need to generate items 5 and 6.

So, in the end, ARS will actually be running the SQL statements - I will not be writing them at a SQL prompt myself. This imposes some constraints, but if I can get a handle on the SQL that is needed to generate the info I seek, I can deal with ARS limitations.

Once again, thanks for the help.
 
I would recommend gathering and maintaining statistics - if you are using cost-based optimizer (CBO) and your code is not optimized for rule-based optimizer (RBO) then your code will probably be more efficient. Statistics will ALSO give you a better idea about your data storage.

Once you have gathered statistics on a table, you can go to dba_tables (or all_tables if you don't have access to dba_tables) and see things like:
BLOCKS - number of blocks below the table's High Water Mark (HWM)
EMPTY_BLOCKS - number of blocks above the table's HWM
AVG_ROW_LEN - Average Row Length
NUM_ROWS - Number of rows in the table
CHAIN_CNT
Now, how to use this to get what you want:
(But first, a couple of caveats:
1. The statistics are a snapshot of the data at the time they were gathered. If you gather statistics on Sunday PM and insert 100 rows a day, your NUM_ROWS will be off by about 200 rows by Tuesday.
2. If you estimate statistics instead of compute them, you will save time, but your AVG_ROW_LEN will probably be off. How far off will be determined by things like the percentage of rows you use and the variance in row lengths).

The total amount of data stored is going to be approximately AVG_ROW_LEN * NUM_ROWS.

The total number of blocks allocated to the table will be BLOCKS + EMPTY_BLOCKS. However, if you are interested in how many blocks actually have data in them, you can find that with the following query:

SELECT count(distinct dbms_rowid.rowid_block_number(rowid)) occupied_blocks FROM my_table;

Once you know how many blocks actually have data, you can then tell how many DON'T have data by subtracting the results of that query from the number of blocks allocated to the table.

Finally, you will probably want to keep an eye on your chained rows. However, knowing you have 120 chained rows doesn't tell you much, so I would recommend watching the percentage that are chained:
SELECT (chain_cnt/num_rows)*100 FROM dba_tables WHERE table_name = 'MY_TABLE';

Hope this helps.

 
It does look like I'm going to need to figure out an option to compute these values without using statistics.

Bummer, I know. But can you offer any help?

What about the script I allude to in my original post that can be found here: ?

I did get it to run, but ARS can't run SQL scripts. It can only execute one individual SQL instruction at a time. Therefore, I'll have to issue a SELECT statement, capture the value, issue the next SELECT statement that includes the previously captured value, and so on. If you guys can verify that the above script works correctly (and works without using statistics), then I need to figure out how to run it one statement at a time.

I can figure out how to run the script you came up with one step at a time, Dave, but this script is losing me.

It looks like my final answer should be calculated by:

(block_count * l_db_block_size)/1024

I can see how the variable l_db_block_size is being calculated, but block_count is totally escaping me once the script enters the loop.

Question 1: Is the script really going to give me what I'm after?

Question 2: If so, can you help me understand how the value for block_count is being calculated and how to duplicate it more simply?

If this script gives me the data I need, I just really need to understand the SQL lines that deal with the calculations (not the the stuff that helps me loop through multiple tables, etc.)
 
Within the loop, it's building a sql query and then executing it dynamically.

What this script appears to be doing is giving you the number of blocks that actually have data in them (note the similarity to my previous query: SELECT count(distinct dbms_rowid.rowid_block_number(rowid)) occupied_blocks FROM my_table;).

This is PART of what you want, but not all.

Let's back up a moment - what is the reason you can't use statistics? I'm not being critical here; I'd just like to understand why you can't have the benefits that statistics would provide you.

As to your problem with just one query at a time, you might be able to (a) create a view that gathers the information for you (enabling you to do something like: SELECT * FROM stat_view WHERE table_name = 'MY_TABLE';) and/or (b) a series of stored functions that could be called from a simple query.

But I would certainly have one more try at getting the statistics generated!
 
Thanks for the replies, carp! Are you familiar with Remedy's ARS at all? Basically, you build a bunch of GUI forms and add fields to them (just like you can do in MS Access or with JavaBeans, etc.). Then you add workflow behind the forms using a built-in 4GL. Everything you build can be packaged up in a neat little export file. When it is exported, the file is automatically cleaned of server references so that it can be imported into any server running ARS.

While it is possible that we will provide different versions depending on the version of ARS or brand of database that a customer is using, right now the application is written to cover everything in one package.

Right now, you could import the app into your ARS system and run it with no glitches running:

1. ARS version 4.0x, 4.5x, or 5.x
2. SQL Server 7 or 2000
3. Oracle 8 or 9 (unsure about 7)

This is a product we will be selling to ARS admins within a month or two and we want to be able to market to everyone using ARS that meet the above qualifications. The reason to write it using SQL routines that don't use statistics is simply that there may be potential users who run their Oracle databases with statistics turned off.

Being very new to the Oracle world, I'm not sure what percentage of Oracle DBAs run without statistics turned on and why they would decide to do so, but the idea is we want to be able to market to them without requiring them to turn on statistics if they have made a conscious business decision to turn them off.

In other words, if we do not use statistics, we have a larger possible user base because anyone running Oracle 8 or 9 can use this application, statistics or no. Using statistics potentially eliminates a portion of that user base.
 
Is there any way that u delete duplicate rows from a table without using rowid?
 
BZC,

Probably a good question for your own thread. But I'll take a shot here. First, what would prevent you from using/wanting to use rowid? Absent using rowid, I'll bet you could use &quot;rownum&quot;, but I'd still prefer using rowid in all cases. You need to use something (such as &quot;rowid&quot;) that uniquely distinguishes the otherwise identical rows, else, being identical, they all either get deleted or not deleted. You want to delete all but one row, right?

Dave
 


e5c4p3artist -
Well, that explains why you want to avoid relying on statistics - and that IS problematic. I'm not sure there is any real efficient way to do what you want to do without statistics!

BZCtJ292 - Concur w/ SantaMufasa - this really SHOULD be a separate thread! With that said, you CAN delete duplicates without using rowid, but if you don't want to sit at your keyboard forever, it's going to take a PL/SQL script. Basically, what you would do is something along these lines (mind you, this is on the fly and not at all optimized!):

DECLARE
CURSOR dupe_cursor IS
SELECT suspected_dupe_column_list
FROM my_table
GROUP BY suspected_dupe_column_list
HAVING count(*) > 1;
dupe_rec dupe_cursor%ROWTYPE;
BEGIN
LOOP
OPEN dupe_cursor;
LOOP
FETCH dupe_cursor INTO dupe_rec;
dbms_output.put_line('value: '||dupe_rec.suspected_dupe_column_list);
EXIT WHEN dupe_cursor%NOTFOUND;
DELETE FROM my_table
WHERE suspected_dupe_column_list =
dupe_rec.suspected_dupe_column_list
AND rownum < 2;
dbms_output.put_line(dupe_cursor%rowcount||': '||dupe_rec.suspected_dupe_column_list);
END LOOP;
dbms_output.put_line('ROWCOUNT IS: '||dupe_cursor%ROWCOUNT);
EXIT WHEN dupe_cursor%ROWCOUNT = 0;
CLOSE dupe_cursor;
END LOOP;
CLOSE dupe_cursor;
COMMIT;
END;

All things considered, I'd use ROWID if I could!
 
Sorry - remove all lines involving dbms_output in the previous post - that was just to make sure things were working as I thought they should be!
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top