Hi,
I use some software to identify duplicate records based on initial and surname for people who live at the same address.
There is a field in the database called dup_code which stores the identifying value, e.g. if two records are identified as duplicates they are both given a dup_code of 1. The next set of duplicate records is then given a dup_code of 2, and so on.
The problem is this: the software identifies, for example, S Andrews, Sarah Andrews and Sharon Andrews as duplicates because they all have the same initial. In this scenario I want to say that none of them are duplicates, as we cannot determine whether S Andrews is the same person as Sarah or Sharon. On the other hand, if only S Andrews and Sarah Andrews were marked as duplicates, you could assume that they are the same person and the records can be left as they are. In the first scenario I just want to update the dup_code to 0, as 0 means that the records are not duplicates.
The software already detects that Sharon Andrews and Sharon Andrews are duplicates, so there is nothing more I need to do with scenarios such as this.
The table which stores the customers' names and addresses is called customers; the field names concerned are:
Customers.first_name
Customers.last_name
Customers.dup_code
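To make the rule concrete, here is a rough sketch of the kind of update I think I am after. It is only an illustration under assumptions: it uses Python's built-in sqlite3 with an in-memory table and made-up sample rows, it assumes an initial is stored as a single character in first_name, and it treats a group as ambiguous only when it contains at least one initial-only record plus more than one distinct full first name. The SQL may need adjusting for a different database.

```python
import sqlite3

# In-memory stand-in for the real customers table (sample rows are made up).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (first_name TEXT, last_name TEXT, dup_code INTEGER)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        ("S", "Andrews", 1), ("Sarah", "Andrews", 1), ("Sharon", "Andrews", 1),
        ("S", "Brown", 2), ("Sarah", "Brown", 2),
        ("Sharon", "Clark", 3), ("Sharon", "Clark", 3),
    ],
)

# Reset dup_code to 0 for every group that has an initial-only record
# AND more than one distinct full first name (assumed ambiguity rule).
conn.execute(
    """
    UPDATE customers
    SET dup_code = 0
    WHERE dup_code IN (
        SELECT dup_code
        FROM customers
        WHERE dup_code <> 0
        GROUP BY dup_code
        HAVING SUM(CASE WHEN LENGTH(TRIM(first_name)) = 1 THEN 1 ELSE 0 END) > 0
           AND COUNT(DISTINCT CASE WHEN LENGTH(TRIM(first_name)) > 1
                                   THEN UPPER(TRIM(first_name)) END) > 1
    )
    """
)

rows = conn.execute(
    "SELECT first_name, last_name, dup_code FROM customers ORDER BY rowid"
).fetchall()
for row in rows:
    print(row)
```

With these sample rows, the three Andrews records should end up with a dup_code of 0, while the S/Sarah Brown pair and the two Sharon Clark records keep their original codes.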
Can anyone shed any light on this?
Cheers
Paul