Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations TouchToneTommy on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Delete Duplicate Rows 1

Status
Not open for further replies.

TitleistDBA

IS-IT--Management
Apr 22, 2002
162
US
I have identified my duplicate rows.

SELECT *
FROM customer
WHERE (emailaddress IN
(SELECT emailaddress
FROM customer
GROUP BY emailaddress
HAVING (COUNT(emailaddress) > 1)))

Now I need to delete the duplicate rows with the min(id).

That would be the oldest record.
I thought this might do the trick but it didn't.

DELETE customer
WHERE (emailaddress IN
(SELECT emailaddress
FROM customer
GROUP BY emailaddress
HAVING (COUNT(emailaddress) > 1)))
AND ID = (SELECT MIN(ID) FROM customer)

Can anyone help with this?
Thanks


 
Try

DELETE customer
WHERE ID IN
(SELECT MIN(ID),emailaddress
FROM customer
GROUP BY emailaddress
HAVING (COUNT(emailaddress) > 1))

DBomrrsm
 
That won't work for the following reason.

Only one expression can be specified in the select list when the subquery is not introduced with EXISTS.

And if I change the query to

SELECT * FROM customer
WHERE EXISTS
(SELECT MIN(ID),emailaddress
FROM customer
GROUP BY emailaddress
HAVING (COUNT(emailaddress) > 1))

it will get every row in the table.


 
Sorry should have noticed that - there is a load of threads on this topic - go to Keyword Search in this forum and put in delete duplicates.

Good luck.

DBomrrsm
 
I came up with this solution to my problem.
It works but it is slow. Does anybody know a way that would be faster or to do it in one statement?
Thanks

DECLARE @id int
SELECT @id = count(*)
FROM customer c1, customer c2
WHERE c1.id != c2.id
AND c1.emailaddress = c2.emailaddress
WHILE @id > 1
BEGIN

DELETE customer
WHERE ID = ( SELECT MIN(c1.ID)
FROM customer c1, customer c2
WHERE c1.id != c2.id
AND c1.emailaddress = c2.emailaddress)

SELECT @id = count(*)
FROM customer c1, customer c2
WHERE c1.id != c2.id
AND c1.emailaddress = c2.emailaddress
END


 
Code:
DELETE C1
   FROM Customer C1 INNER JOIN
      (SELECT MinID = Min(ID)
         FROM Customer
         GROUP BY EmailAddress
         HAVING Count(ID) > 1
      ) C2 ON C1.ID = C1.MinID

Although if the maximum number of duplicates is n, you'll have to run that n-1 times.

So fine... here's a version to delete them all in one swell foop. I should have thought of it right away. :)

Code:
DELETE C1
   FROM Customer C1 LEFT JOIN
      (SELECT MaxID = Max(ID)
         FROM Customer
         GROUP BY EmailAddress
      ) C2 ON C1.MaxID = C2.ID
   WHERE C2.MaxID IS NULL
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top