Delete Duplicate Rows 1

TitleistDBA · Jul 15, 2004

I have identified my duplicate rows.

SELECT *
FROM customer
WHERE (emailaddress IN
(SELECT emailaddress
FROM customer
GROUP BY emailaddress
HAVING (COUNT(emailaddress) > 1)))

Now I need to delete the duplicate rows with the min(id).

That would be the oldest record.
I thought this might do the trick but it didn't.

DELETE customer
WHERE (emailaddress IN
(SELECT emailaddress
FROM customer
GROUP BY emailaddress
HAVING (COUNT(emailaddress) > 1)))
AND ID = (SELECT MIN(ID) FROM customer)

Can anyone help with this?
Thanks

http://www.acushnetcompany.com

dbomrrsm · Jul 15, 2004

Try

DELETE customer
WHERE ID IN
(SELECT MIN(ID),emailaddress
FROM customer
GROUP BY emailaddress
HAVING (COUNT(emailaddress) > 1))

DBomrrsm

TitleistDBA · Jul 15, 2004

That won't work for the following reason.

Only one expression can be specified in the select list when the subquery is not introduced with EXISTS.

And if I change the query to

SELECT * FROM customer
WHERE EXISTS
(SELECT MIN(ID),emailaddress
FROM customer
GROUP BY emailaddress
HAVING (COUNT(emailaddress) > 1))

it will get every row in the table.

http://www.acushnetcompany.com

dbomrrsm · Jul 15, 2004

Sorry should have noticed that - there is a load of threads on this topic - go to Keyword Search in this forum and put in delete duplicates.

Good luck.

DBomrrsm

TitleistDBA · Jul 15, 2004

I came up with this solution to my problem.
It works but it is slow. Does anybody know a way that would be faster or to do it in one statement?
Thanks

DECLARE @id int
SELECT @id = count(*)
FROM customer c1, customer c2
WHERE c1.id != c2.id
AND c1.emailaddress = c2.emailaddress
WHILE @id > 1
BEGIN

DELETE customer
WHERE ID = ( SELECT MIN(c1.ID)
FROM customer c1, customer c2
WHERE c1.id != c2.id
AND c1.emailaddress = c2.emailaddress)

SELECT @id = count(*)
FROM customer c1, customer c2
WHERE c1.id != c2.id
AND c1.emailaddress = c2.emailaddress
END

http://www.acushnetcompany.com

ESquared · Jul 15, 2004

Code:

DELETE C1
   FROM Customer C1 INNER JOIN
      (SELECT MinID = Min(ID)
         FROM Customer
         GROUP BY EmailAddress
         HAVING Count(ID) > 1
      ) C2 ON C1.ID = C1.MinID

Although if the maximum number of duplicates is n, you'll have to run that n-1 times.

So fine... here's a version to delete them all in one swell foop. I should have thought of it right away.

Code:

DELETE C1
   FROM Customer C1 LEFT JOIN
      (SELECT MaxID = Max(ID)
         FROM Customer
         GROUP BY EmailAddress
      ) C2 ON C1.MaxID = C2.ID
   WHERE C2.MaxID IS NULL

ESquared · Jul 15, 2004

Oh, btw, you might be interested in the following FAQs.

Should I use JOIN or WHERE to join tables?
JOIN Fundamentals

TitleistDBA · Jul 15, 2004

Thanks

http://www.acushnetcompany.com

ESquared · Jul 15, 2004

You're welcome.

Where's my star? [wink]

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

Delete Duplicate Rows 1

TitleistDBA

IS-IT--Management

dbomrrsm

Programmer

TitleistDBA

IS-IT--Management

dbomrrsm

Programmer

TitleistDBA

IS-IT--Management

ESquared

Programmer

ESquared

Programmer

TitleistDBA

IS-IT--Management

ESquared

Programmer

Similar threads

Part and Inventory Search

Sponsor