Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chriss Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

URGENT Help on large duplication removal

Status
Not open for further replies.

MikeJones

Programmer
Nov 1, 2000
259
GB
Hi,

I have a problem where I need to remove a large amount of duplicate records from a large table.

There are around 300 - 400K duplicate records in the table. The table containts about twice that in terms of total records (800K)

Not all records are duplicated and some are duplicated many times.

In order to spot the differeneces I have to group by most of the columns.

My problems are this...

It will take a long time to run a query with group by to spot the duplicates.
I don't want to run into the snapshot to old problem when I start deleteing rows that my query is trying to hold a consistent read on

This is running on HP-UX 64 bit on 8.1.7

Thanks in advance.

Mike.
 
delete from dup_tab d1
where rowid <
(select max(rowid) from dup_tab d2
where d1.key1 = d2.key1
and d1.key2 = d2.key2...)

Speed this up by putting an index on (key1,key2...).

 
Hi,
delete from emp a
where rowid not in(select max(rowid) from emp b
where a.empno=b.empno) and
rownum < 2;
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top