Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations MikeeOK on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Delete Similar Records ~ Redux 1

Status
Not open for further replies.

pmccue

Instructor
Jul 29, 2006
18
First off, special thanks to IT4EVR and PHV and Vic for their aide. Problem is, I'm not sure I stating my problem correctly. Here goes again:

I have ~165K records with some records similar. I want to delete the similar records and retain the first instance of the record. As mentioned, one thing that's been a pain about this effort is that some of the repeating records in the original table are not duplicates, where all field values match, but are 'similar' where some field values match, like LastName, FirstName, DOB, but the other field values in the similar records may not match.

I don't see how an Append can work, since the original table and the append table would have different structures. I've worked through such as strategy but get seem to get invalid answers. There must be a script method but I will need a sample as I am not formally trained in sql scripting. If automation fails, then maybe I could use a manual approach as there appear to be about 1500 similar records that require deleting.

Thanks,
Phil....
 
First I would resolve this by working in a new table rather than trying to change the original. That way any mistake is not fatal.

Secondly, the question is rather vague about what is meant by similar.

I will assume that similar means having the same LastName, FirstName, DOB.

Copy the structure of the original table to the new table. You can do this by selecting the table in database view, copy, paste and select structure.

Now populate the new table with the distinct FirstName, LastName, DOB. You can do this by creating a select query on the original table for those 3 fields, set the property of the query to Unique Values and then turn the query into an append query pointing at your new table.

Now create an update query on the new table to populate the other fields based on the new table joined to the old table with the join being on your 3 fields.

With an update in this way if there is more than one instance in the original table of FirstName, LastName, DOB then only one of those instances will be used in the update. You may be able to influence which one gets picked by using a sort.

 
A similar answer already given 29 Jul 06 20:00 here:
thread701-1260448
 
Cherrio,

...you are correct in what is similar: LastName, FirstName, DOB (ref your paragraph 2).

I've got everything done but need help in what you describe in paragraph 6, the one that starts with "Now create an update..."

I can create an update query but what needs to be done to join the new table to the old table based on the 3 fields, LastName, FirstName, DOB? Is this done in the query grid, or in a script? Guess I need more detail here on joining.

Thanks,
Phil....
 
First, not a criticism, but you should really keep this to one post, since it is really the same issue.

The problem that if you do this update, how do you know you are getting the correct data for the non-repeating fields?

Let's say you have this:
Code:
LastName   FirstName  DOB     ST  Supervisor
Smith       John      1/1/64  MD   Jackson
Smith       John      1/1/64  VA   Thompson

Maybe I am wrong but it appears to me that just doing an update on the similar fields would update all the records that matched on LastName, FirstName, DOB.
 
IT4EVR,

My appologies about multiple posts. Can you answer the question in my last entry above regarding Cherrio's method?

Thanks,
Phil....
 
Well, a question I have is that if you have records with exactly the same FirstName, LastName and DOB, WHY do you want to retain records that have all three of these fields the same but other fields different?

I think the core issue here is the structure of your table. Until that is fixed you are going to have these problems.
 
But to answer your question, you would add this created temp table to your query designer, design view,add all fields to the designer except for the DOB, LastName, FirstName.

Then add the original table (right click designer, add table).

Join the 3 fields from both tables by dragging one field of one table to the other.

In the update to property of each field, then preface the name with temptablename.fieldname...

or an example of a sql statement
Code:
UPDATE temp AS a INNER JOIN tbl_Original AS b ON a.LastName = b.LastName AND a.FirstName = b.FirstName AND a.DOB = b.DOB SET a.State = b.State...etc
 
Well, as the table sits now, I have records containing the same Name & DOB that were entered multiple times for different reasons, in this case, different offenses. So some fields defining the various offense data would be different for each record. What's needed is a listing so each Name & DOB occurs one time, a sort of listing of individual offenders without regard to number of offenses each committed.

Hope this helps.
 
SELECT DISTINCT Name, DOB From TableName

will return each combination of Name and DOB ONCE.

is that what you are looking to do?

leslie
 
IT4EVR,

Thanks for the clarification. In the Update query grid, the 'Table' row and the 'Update to' row should each refer to different tables, right? So I have in 'Table' row, reference to the original table, and in the 'Update to' row, the TempTable. Is this correct? The query runs but the original table still contains duplicate Name and DOB fields.

Phil....
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top