I need help figuring out how to do the following:
I have a table that I want to remove duplicate records. However, my definition of a duplicate is any record that has the same Agent_ID and ANI values, but also have a Call_Date value that is within 1 hour of each other. (Call_Date is a date/time field. Agent_ID and ANI are character fields.) When there is a duplicate, I would like to keep the record with the last Call_Date only.
Here's sample data (where Rec_ID is a unique identifier):
Rec_ID Agent_ID ANI Call_Date
1 JOHN 123 5/9/07 09:23:05 AM
2 JOHN 123 5/9/07 10:09:05 AM
3 JOHN 123 5/9/07 10:57:24 AM
4 JILL 123 5/9/07 02:59:16 PM
5 JESS 456 5/9/07 04:50:22 PM
6 JESS 456 5/9/07 10:03:64 PM
My desired deduped result will be to keep records 3 through 6 but not 1. Note: The 1st record is dropped because it is within 1 hour of the 2nd record. The 2nd record is then compared to the 3rd record. The 2nd record is then dropped because it is within 1 hour of the 3rd record. (So the process needs to order the records by Call_Date first before comparing.)
Feel free to add flags to the existing table (to flag duplicates or non-duplicate records) or create an entirely new table with the desired results. However, I don't want to delete any records from the original table.
Thanks!
I have a table that I want to remove duplicate records. However, my definition of a duplicate is any record that has the same Agent_ID and ANI values, but also have a Call_Date value that is within 1 hour of each other. (Call_Date is a date/time field. Agent_ID and ANI are character fields.) When there is a duplicate, I would like to keep the record with the last Call_Date only.
Here's sample data (where Rec_ID is a unique identifier):
Rec_ID Agent_ID ANI Call_Date
1 JOHN 123 5/9/07 09:23:05 AM
2 JOHN 123 5/9/07 10:09:05 AM
3 JOHN 123 5/9/07 10:57:24 AM
4 JILL 123 5/9/07 02:59:16 PM
5 JESS 456 5/9/07 04:50:22 PM
6 JESS 456 5/9/07 10:03:64 PM
My desired deduped result will be to keep records 3 through 6 but not 1. Note: The 1st record is dropped because it is within 1 hour of the 2nd record. The 2nd record is then compared to the 3rd record. The 2nd record is then dropped because it is within 1 hour of the 3rd record. (So the process needs to order the records by Call_Date first before comparing.)
Feel free to add flags to the existing table (to flag duplicates or non-duplicate records) or create an entirely new table with the desired results. However, I don't want to delete any records from the original table.
Thanks!