Hello-
Using Access2000...
I have a table of transactions that are imported from an external source. Here's the table structure:
Transactions
-transaction_id (PK)
-cust_number
-group_id
-individual_id
-individual_name
-transaction_date
In certain situations, there will be duplicate transactions listed in the external file. These are imported into the Transactions table and are identical in every way except for the 'transaction_id' field. I'm trying to write a DELETE query to get rid of the duplicate lines. I wrote this:
DELETE *
FROM Transactions
WHERE transaction_id IN
(SELECT MAX(transaction_id) AS dup_id
FROM Transactions
GROUP BY cust_number, group_id, individual_id,
individual_name, transaction_date
HAVING COUNT(transaction_id) > 1);
The subquery that finds the dup_id runs just fine.. it returns 272 out of about 25,000 transaction_ids in about 3 seconds. But when I try to run the query as written, it hangs. I let it run for about an hour and then gave up.
If I use the subquery to load the 272 dup_ids into a temporary table, and run:
DELETE *
FROM Transactions
WHERE transaction_id IN (SELECT dup_id FROM tmpTable);
..it works just fine.
I was able to work around the problem fairly easily, but I was hoping someone could help me understand why this doesn't work. It seems to me like if the stand-alone subquery returns the list of transaction_ids with ease, then the outer DELETE query should have no problem.
Thanks for your time.
Using Access2000...
I have a table of transactions that are imported from an external source. Here's the table structure:
Transactions
-transaction_id (PK)
-cust_number
-group_id
-individual_id
-individual_name
-transaction_date
In certain situations, there will be duplicate transactions listed in the external file. These are imported into the Transactions table and are identical in every way except for the 'transaction_id' field. I'm trying to write a DELETE query to get rid of the duplicate lines. I wrote this:
DELETE *
FROM Transactions
WHERE transaction_id IN
(SELECT MAX(transaction_id) AS dup_id
FROM Transactions
GROUP BY cust_number, group_id, individual_id,
individual_name, transaction_date
HAVING COUNT(transaction_id) > 1);
The subquery that finds the dup_id runs just fine.. it returns 272 out of about 25,000 transaction_ids in about 3 seconds. But when I try to run the query as written, it hangs. I let it run for about an hour and then gave up.
If I use the subquery to load the 272 dup_ids into a temporary table, and run:
DELETE *
FROM Transactions
WHERE transaction_id IN (SELECT dup_id FROM tmpTable);
..it works just fine.
I was able to work around the problem fairly easily, but I was hoping someone could help me understand why this doesn't work. It seems to me like if the stand-alone subquery returns the list of transaction_ids with ease, then the outer DELETE query should have no problem.
Thanks for your time.