Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chriss Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

How to remove duplicate records

Status
Not open for further replies.

TheRambler

Programmer
Jan 23, 2003
523
BO
I have seen many posts about removing duplicate records when a single field (or concatenated fields) is the same in two or more records.

But what about if you want to delete records that are exact duplicates of each other?

When every field is the same (replica) in more than one record, the easiest way to remove them is:

SELECT * FROM table UNION ;
SELECT * FROM table

 
mlv1055

I'm not sure what this code will accomplish. Can you explain?

Mike Gagnon

If you want to get the best response to a question, please check out FAQ184-2483 first.
 
Hi,

I thought the previous explanation was good enough, but seems I needed an example.

This text is from the help file:
Include the UNION clause to combine the final results of one SELECT with the final results of another SELECT. By default, UNION checks the combined results and eliminates duplicate rows.

For example:

CREATE cursor tmp (fld1 n(1), fld2 c(3), fld3 d(8))
INSERT INTO tmp VALUES (1, "AAA", DATE())
INSERT INTO tmp VALUES (2, "BBB", DATE())
INSERT INTO tmp VALUES (3, "CCC", DATE())
INSERT INTO tmp VALUES (1, "AAA", DATE())
INSERT INTO tmp VALUES (1, "AAA", DATE())
? "Table with duplicate records"
LIST
SELECT * from tmp UNION ;
SELECT * from tmp INTO CURSOR tmp2
? "Duplicate records deleted"
LIST

That's it.
I found this thread where they were trying to accomplish the same thing, but with a more complicated approach.

Generic delete of duplicate Recs
thread184-336195
 
Just to clarify a little, I would like to change this ambiguous line:

? "Duplicate records deleted"


It was meant to say "After deletion of duplicate records"
so the last LIST command shows the remaining records, not the deleted records.
 
uhhhmm... i thin what sir mike was trying to ask is "in your code, where's the actual deletion talking place?" seems like the code just displayed duplicte records. sorry if am a bit slow on this. [ponder]

kilroy [trooper]
 
"in your code, where's the actual deletion taking place?"

That is the tip, there is no need for a DELETE command when combining the results of two SELECTs with the UNION clause.

Try this code and see for yourself:

CREATE cursor tmp (fld1 n(1), fld2 c(3), fld3 d(8))
INSERT INTO tmp VALUES (1, "AAA", DATE())
INSERT INTO tmp VALUES (2, "BBB", DATE())
INSERT INTO tmp VALUES (3, "CCC", DATE())
INSERT INTO tmp VALUES (1, "AAA", DATE())
INSERT INTO tmp VALUES (1, "AAA", DATE())
INSERT INTO tmp VALUES (3, "CCC", DATE())
? "Table with duplicate records before deletion"
LIST
SELECT * from tmp UNION ;
SELECT * from tmp INTO CURSOR tmp2
? "Resulting table after deletion of duplicate records"
LIST


A new table without duplicates is created, maybe that was confusing because the original table keeps the duplicates.

Anyway, I would use this tip only for situations like described in Thread184-336195

For the most common cases when you want to delete records which have a duplicate key field, use the other methods.

sorry if am a bit slow on this
don't worry, but bear me for not being good at explaining ideas... I will get better at it, though. Gime me time.
 
Is this the same as DISTINCT clause?
wouldn't it be more readable to use (select * from table distinct)

from the help file (DISTINCT
Excludes duplicates of any rows from the query results.)

Am I missing something?
 
Hi Beauty lover,

1. DISTINCT cannot be used with MEMO or GENERAL fields.

UNION can be used.


2. The different MEMO or general field conetents are not checked and will be treated as duplicates when the UNION is used as the method of duplicate checking. SO a diffrent remark in a memo field, could still see only one of the records to be picked up in the result. SO one has to be very careful with this. DISTINCT is not suitable either.

:)



____________________________________________
ramani - (Subramanian.G) :)
When you ask VFP questions, please add VFP version.
 
Thanks ramani for the clarification.

Best regards
 
BeautyLover,
wouldn't it be more readable to use (select * from table distinct)...Am I missing something?

I guess I am missing something. After reading Ramani's post and yours, I think this is not a good tip.

Next time I will phrase my tips as questions.

Ramani,
Your observations apply to VFP 8 only, when using memo or general fields. Good thing you helped us to notice that.

The error is Operation is invalid for a Memo, General or Picture field

If someone is interested, look at this article
for more information.

Thanks to both of you
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top