Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chriss Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

duplicate records

Status
Not open for further replies.

mickywall

Programmer
Joined
Sep 2, 2002
Messages
88
Location
GB
i have a table with the following structure

CREATE TABLE [dbo].[dbtable] (
[DBkey] [int] NOT NULL ,
[nvarchar] (50) ,
[Name] [nvarchar] (100) ,
[Keywords] [nvarchar] (255) ,
[About] [nvarchar] (255) ,
[Url] [nvarchar] (100) ,
[Region] [int] NULL ,
[Done] [bit] NULL
) ON [PRIMARY]
GO

DBKey is a primary field.

i was wondering how i would get rid of duplicates. i figured that it would be to hard to get rid of duplicates in the 'keywords' field so the 'url' would be the most appropriate

i hope you can help!
 
Create a unique constraint on url field. RT
 
This select should give a list where email, Name, url are duplicates (you can use more or less fields...).

SELECT email, Name, url, count(*) FROM dbTable
GROUP BY email, Name, url HAVING count(*) > 1

I'm sure there's a more elegant/efficient way, but I would put these results into a temp table and use write a stored proc that removed all of the values where dbKey > the first dbKey you find for each value.

If you are interested in this solution (and no one has a more elegant approach), I can provide more info/code.

-- Just trying to help...
[wolf]<--- This is a wolf? We need a new icon.......
mikewolf@tst-us.com
 
i have found this code, i was wondering how this would apply to my code...


DELETE
FROM Foo1
WHERE Foo1.ID IN

-- List 1 - all rows that have duplicates
(SELECT F.ID
FROM Foo1 AS F
WHERE Exists (SELECT Field1, Field2, Count(ID)
FROM Foo1
WHERE Foo1.Field1 = F.Field1
AND Foo1.Field2 = F.Field2
GROUP BY Foo1.Field1, Foo1.Field2
HAVING Count(Foo1.ID) > 1))
AND Foo1.ID NOT IN

-- List 2 - one row from each set of duplicate
(SELECT Min(ID)
FROM Foo1 AS F
WHERE Exists (SELECT Field1, Field2, Count(ID)
FROM Foo1
WHERE Foo1.Field1 = F.Field1
AND Foo1.Field2 = F.Field2
GROUP BY Foo1.Field1, Foo1.Field2
HAVING Count(Foo1.ID) > 1)
GROUP BY Field1, Field2);
 
That looks like it should work! Good find..... -- Just trying to help...
[wolf]<--- This is a wolf? We need a new icon.......
mikewolf@tst-us.com
 
mike how do i change it so it works with my table field names?

 
I used textpad's find/replace - I haven't checked to make sure that there were no errors.... Good Luck!


DELETE
FROM dbTable
WHERE dbTable.dbKey IN

-- List 1 - all rows that have duplicates
(SELECT F.dbKey
FROM dbTable AS F
WHERE Exists
(SELECT Email, URL, Count(dbKey) FROM dbTable
WHERE dbTable.Email = F.Email
AND dbTable.URL = F.URL
GROUP BY dbTable.Email, dbTable.URL
HAVING Count(dbTable.dbKey) > 1))

AND dbTable.dbKey NOT IN
-- List 2 - one row from each set of duplicate
(SELECT Min(dbKey) FROM dbTable AS F
WHERE Exists
(SELECT Email, URL, Count(dbKey)
FROM dbTable
WHERE dbTable.Email = F.Email
AND dbTable.URL = F.URL
GROUP BY dbTable.Email, dbTable.URL
HAVING Count(dbTable.dbKey) > 1)
GROUP BY Email, URL); -- Just trying to help...
[wolf]<--- This is a wolf? We need a new icon.......
mikewolf@tst-us.com
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top