Removing Duplicate Records 1

MrMystery · Mar 13, 2003

Hi.
I have a table containing project numbers which are numbered as follows
1000N
1000S
1000T
1000X
through to
9999N
9999S
9999T
9999X

The suffix letter indicates a department of the company.
We have changed our reporting requirements to dispense with individual departments, and instead to report under one heading 'H'
I have therefore created a whole new list of projects
1000H
through to
9999H
based on existing project numbers. Obviously this has left me with duplicated numbers e.g. 4 occurrences of 1000H etc.
Can anyone help with a SQL script which will delete the duplicates, leaving me with just 1 occurrence of each project?
Any help appreciated!

JamesLean · Mar 13, 2003

Are all the other columns for each duplicate project number the same? If so, use SELECT DISTINCT * to copy each unique project to a new table, truncate the original table and copy the data back in.

If not, how do you know which project to keep out of the four? --James

MrMystery · Mar 13, 2003

All of the columns except one are the same.
The one that may be different is the project manager assigned to the project.
However, we are taking the view that once we are left with only 1 occurrence of each project, these will be printed out for checking by each project manager, and any errors will be corrected then.
Therefore, I am not concerned with which record to keep of the 4, only with deleting 3 and keeping 1.
Hope this makes sense!

JamesLean · Mar 13, 2003

OK, if it doesn't matter which manager you keep, use this query for getting the distinct projects:

Code:

SELECT projectnumber, <other columns>, MAX(manager)
FROM table
GROUP BY projectnumber, <other columns>

--James

TheBugSlayer · Mar 13, 2003

Here is a tip from a book I have.
The technique is to create a table with a similiar structure
to the result set, put an index on it with the ignoredups clause like this:

CREATE TABLE [dbo].[ProviderNoDups] (
[ID] [numeric](18, 0) IDENTITY (1, 1) NOT NULL ,
[PhyID] [char] (9) NULL ,
[PhySSNEIN] [char] (1) NULL ,
[PhyLast] [char] (30) NULL ,
[PhyFirst] [char] (20) NULL ,
[PhyMI] [char] (1) NULL ,
[PhyComp] [char] (35) NULL ,
[PhyAddress] [char] (40) NULL ,
[PhyCity] [char] (25) NULL ,
[PhyState] [char] (2) NULL ,
[PhyZip] [char] (5) NULL ,
[Phy9Zip] [char] (4) NULL ,
[PhyPhone] [char] (10) NULL ,
[PhySpec] [char] (4) NULL ,
[msrepl_synctran_ts] [timestamp] NOT NULL
) ON [PRIMARY]

-- make this table ignore duplicate value --
CREATE UNIQUE INDEX removedups ON ProviderNoDups
(PhyID,PhySSNEIN,PhyLast,PhyFirst,PhyMI,PhyComp,

PhyAddress,PhyCity,PhyState,PhyZip,Phy9Zip,PhyPhone,PhySpec)
WITH IGNORE_DUP_KEY

Then insert your values into the new table. Duplicates will be left out.

Sorry for the lengthy code: I am using real info.

MrMystery · Mar 18, 2003

Brilliant! This worked a treat!
Many thanks.

Mike

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

Removing Duplicate Records 1

MrMystery

Technical User

JamesLean

Programmer

MrMystery

Technical User

JamesLean

Programmer

TheBugSlayer

Programmer

MrMystery

Technical User

Similar threads

Part and Inventory Search

Sponsor