Felix18807
Programmer
I'm trying to remove duplicates from a table through use of a distance algorithm (Fuzzy matching). The query I'm running inserts possible duplicate IDs side by side into a table so I can link these up with details later for somebody to review if it is a duplicate or not. The query I'm using looks like this:-
INSERT INTO tblDeDuplication ( Account_ID, Match_ID, Criteria )
SELECT Account.Id, Account_1.Id, Account_1.ParsedName
FROM Account AS Account_1 INNER JOIN Account ON Account_1.[AddressPostcode/Zip] = Account.[AddressPostcode/Zip]
WHERE (((LD(Nz([Account].[ParsedName]),Nz([account_1].[ParsedName])))<2) AND (([account].[ID]<>[account_1].[id])=True) AND (([account].[ID]>[account_1].[id])=True));
Where LD is a distance function.
The query itself runs very well given the quantity of records it is handling (40000+) and completes happily within 30 minutes. What I'm finding annoying is the fact that the interface locks up everytime (NOT RESPONDING) so I can't see the progress bar. Has anyone else experienced this? Or maybe you can offer some possible fixes? I've tried creating suitable indexs with the same result.
INSERT INTO tblDeDuplication ( Account_ID, Match_ID, Criteria )
SELECT Account.Id, Account_1.Id, Account_1.ParsedName
FROM Account AS Account_1 INNER JOIN Account ON Account_1.[AddressPostcode/Zip] = Account.[AddressPostcode/Zip]
WHERE (((LD(Nz([Account].[ParsedName]),Nz([account_1].[ParsedName])))<2) AND (([account].[ID]<>[account_1].[id])=True) AND (([account].[ID]>[account_1].[id])=True));
Where LD is a distance function.
The query itself runs very well given the quantity of records it is handling (40000+) and completes happily within 30 minutes. What I'm finding annoying is the fact that the interface locks up everytime (NOT RESPONDING) so I can't see the progress bar. Has anyone else experienced this? Or maybe you can offer some possible fixes? I've tried creating suitable indexs with the same result.