Hi George,
Thanks for your suggestion!
The purpose of this SP, which is scheduled using a job to run every 3 hours currently, is to delete records older than 15 days. (Spam e-mails)
Roughly daily it gains about a 100,000 spam e-mails in this table.
Currently got around 1.5 million records.
tblQuarantine has the main fields of the e-mail, and a msgID field links to another table that has the messages.
The application adding records is smart enough to recognize if a msg body matches another e-mail's msg body, so one msgID may be used for multiple quarantine records.
So my SP actually runs three loops:
1) Mark the Expire bit as 1 for all that are over 15 days old.
2) Delete all records that have the Expire set as 1
3) look for orphan msgid records in the msg table.
All three go through 500 records at a time because the process can take a while at midnight when all of a sudden a 100,000 records need to be deleted, and we want to make sure the tables are available for the application to add records in the mean time.. between two 500 batches.
I also limited the process to 30,000, because I don't want the load to run so long to possibly cause issues.
I have it grab datetime before and after the SP and email it to me.
So currently, when no records are found to be deleted at all, it takes 37 seconds to run. Too long!
With 60 records deleted, it takes 61 seconds.
At midnight, 30,000 quarantine and 30,000 msgID deletion took 27 minutes!!
So for the sp_helpindex on tblQuarantine I get:
Code:
EmailFrom nonclustered located on PRIMARY EmailFrom
EmailTo nonclustered located on PRIMARY EmailTo
MsgDate nonclustered located on PRIMARY MsgDate
MsgID nonclustered located on PRIMARY MsgID
QuarID clustered, unique, primary key located on PRIMARY QuarID
RejectID nonclustered located on PRIMARY RejectID
ServerID nonclustered located on PRIMARY ServerID
Subject nonclustered located on PRIMARY Subject
So for marking expired, I have:
Code:
SET ROWCOUNT 500
SET @RowCountHolder = 0
delete_more1:
-- PRINT 'delete1 processing marking expired ones' + '[marked:'+CAST(@CountDeleted AS VARCHAR(10)) + ']'
SET @msg = @msg + 'D1:'+CAST(@CountDeleted AS VARCHAR(10))
UPDATE tblQuarantine SET Expire = 1 WHERE ((DATEDIFF(day, MsgDate, GETDATE()) > 15) AND (Expire <> 1))
SET @RowCountHolder = @@ROWCOUNT
SET @msg = @msg + '-Prcsd:' + CAST(@RowCountHolder AS VARCHAR(10)) + '<br>'
SET @CountDeleted = @CountDeleted + 500
IF @RowCountHolder < 1 GOTO delete_more1_skip
IF ((@RowCountHolder > 0) AND (@CountDeleted < 30001)) GOTO delete_more1
delete_more1_skip:
SET ROWCOUNT 0
SET @CountDeleted = 0
SET @RowCountHolder = 0
SET @msg = @msg + 'T1:' + CAST((SELECT CONVERT(VARCHAR,GETDATE(),8)) AS VARCHAR(100)) + '<BR>'
Then to delete actual records:
Code:
-- PRINT 'delete_more2 is next'
SET ROWCOUNT 500
delete_more2:
-- PRINT 'delete2 processing actually deleting from quarantine' + '[marked:'+CAST(@CountDeleted AS VARCHAR(10)) + ']'
SET @msg = @msg + 'D2:'+CAST(@CountDeleted AS VARCHAR(10))
DELETE FROM tblQuarantine WHERE tblQuarantine.Expire <> 0
SET @RowCountHolder = @@ROWCOUNT
SET @msg = @msg + '-Prcsd:' + CAST(@RowCountHolder AS VARCHAR(10)) + '<br>'
SET @CountDeleted = @CountDeleted + 500
IF @RowCountHolder < 1 GOTO delete_more2_skip
IF ((@RowCountHolder > 0) AND (@CountDeleted < 30001)) GOTO delete_more2
delete_more2_skip:
SET ROWCOUNT 0
SET @CountDeleted = 0
SET @RowCountHolder = 0
SET @msg = @msg + 'T2:' + CAST((SELECT CONVERT(VARCHAR,GETDATE(),8)) AS VARCHAR(100)) + '<BR>'
Lastly to delete the Orphan msgs:
Code:
--===CLEANUP Orphaned Msgs in tblMsgs=====
-- PRINT 'delete_more3 is next - CleanUp Time'
SET ROWCOUNT 500
delete_more3:
-- PRINT 'delete3 processing deleting msgs' + '[marked:'+CAST(@CountDeleted AS VARCHAR(10)) + ']'
SET @msg = @msg + 'D3:'+CAST(@CountDeleted AS VARCHAR(10))
DELETE tblMsgs FROM tblMsgs LEFT JOIN tblQuarantine
ON tblMsgs.MsgID = tblQuarantine.MsgID WHERE (tblQuarantine.MsgID IS NULL)
SET @RowCountHolder = @@ROWCOUNT
SET @msg = @msg + '-Prcsd:' + CAST(@RowCountHolder AS VARCHAR(10)) + '<br>'
SET @CountDeleted = @CountDeleted + 500
IF @RowCountHolder < 1 GOTO delete_more3_skip
IF ((@RowCountHolder > 0) AND (@CountDeleted < 20001)) GOTO delete_more3
delete_more3_skip:
SET ROWCOUNT 0
SET @CountDeleted = 0
SET @RowCountHolder = 0
Perhaps with all three in view you may have an even better more efficient suggestion.
For the 30,000 record delete:
Marking Expired took: 19 seconds
Deleting Expired Quarantine records: 6 min 20 sec
Deleting Orphans took: 29 minutes!!
So the very last part is my biggest concern.
I'll mess with the first part that you suggested, and see if I can get significant improvement.