Need to restore 200 million rows from 400-million-row archive table, fast!

(OP)
The audit table on a SQL Server 2008 instance contains more than 400 million records. The table is heavily used and has become a drag on the database. Over the weekend it was renamed and a brand-new audit table was created.

Now we need to copy a year’s worth of data back to the live audit table, the one that was recreated. However, no single column uniquely identifies rows in the audit table. I need to find a way to add a year’s worth of records back to the live audit table while keeping track of the records that have already been added back, to avoid duplication. There are about 200 million records to add back; we cannot add them all at once, but rather a certain number at a time so as not to tax the database too much.

I am thinking of duplicating the whole archive audit table, the one with the 400 million rows, and adding a new Boolean column to it. That column will indicate whether the row has already been copied over to the live audit table. The rows to be added back will come from the result set of a query that filters by the transaction date, returning only a year’s worth of data.
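
Roughly what I have in mind, as a sketch (IsCopied would be the new flag column; the dates are only an example of a one-year range):

CODE --> T-SQL

--Sketch only: flag column on the archive table (or a copy of it) to mark rows already moved back.
--On SQL Server 2008 adding a NULLable column with no default is a metadata-only change;
--a NOT NULL column with a default would have to touch all 400 million rows.
ALTER TABLE [AUDITTABLE-Archive] ADD IsCopied bit NULL

--Candidate rows: a year's worth that has not been flagged as copied yet.
SELECT *
FROM [AUDITTABLE-Archive]
WHERE TRANSACTION_TIME >= '2012-10-01'
  AND TRANSACTION_TIME <  '2013-10-01'
  AND ISNULL(IsCopied, 0) = 0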

If anyone has a different idea, a more efficient one, I would like to hear about it. I will be tinkering with my idea in the meantime…

Thank you.

MCP SQL Server 2000, MCTS SQL Server 2005, MCTS SQL Server 2008 (DBD, DBA)

RE: Need to restore 200 million rows from 400-million-row archive table, fast!

You could do it by date, say one week at a time.

djj
The Lord is my shepherd (Psalm 23) - I need someone to lead me!

RE: Need to restore 200 million rows from 400-million-row archive table, fast!

(OP)
True djj55. That's a great idea. The challenge is to keep track of which records have already been copied back. If the process fails in the middle of processing a week, when I resume I only need to copy over the rows within that week that have not been copied yet... As I type that, I am thinking: process the week in a transaction so that either all of its rows are copied or none are, and insert the successful week's value into a table to keep track... What do you think?
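
As a first sketch of that bookkeeping (CopiedWeeks is just a made-up name for the tracking table; the week start date is an example):

CODE --> T-SQL

--Small control table: one row per week successfully copied back.
CREATE TABLE CopiedWeeks
(
    WeekStart date NOT NULL PRIMARY KEY,
    WeekEnd   date NOT NULL,
    CopiedOn  datetime NOT NULL DEFAULT GETDATE()
)

DECLARE @WeekStart date = '2012-10-01'

BEGIN TRAN
    --Copy one week's worth of rows from the archive to the live audit table...
    INSERT INTO AUDITTABLE
    SELECT * FROM [AUDITTABLE-Archive]
    WHERE TRANSACTION_TIME >= @WeekStart
      AND TRANSACTION_TIME <  DATEADD(day, 7, @WeekStart)

    --...and record the week in the same transaction, so either both happen or neither does.
    INSERT INTO CopiedWeeks (WeekStart, WeekEnd)
    VALUES (@WeekStart, DATEADD(day, 6, @WeekStart))
COMMIT TRAN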

MCP SQL Server 2000, MCTS SQL Server 2005, MCTS SQL Server 2008 (DBD, DBA)

RE: Need to restore 200 million rows from 400-million-row archive table, fast!

(OP)
Ah, but then the same week number may occur in more than one year... I guess I could save the first and last day of each week. Start from 01-01-2012 and do +7...

MCP SQL Server 2000, MCTS SQL Server 2005, MCTS SQL Server 2008 (DBD, DBA)

RE: Need to restore 200 million rows from 400-million-row archive table, fast!

Absolutely. You should use transactions.

Doing the copy a week at a time will almost certainly make the overall process faster. I would also suggest that you make sure instant file initialization is enabled on your server.

When SQL Server needs to grow a database, the process can be a little slow. Since you need to copy so many rows, your database will certainly get larger.

Without instant file initialization, SQL Server will write a bunch of zeros to the hard drive to initialize the new space. With instant file initialization enabled, SQL Server will not do this; it will simply allocate the space and make it available to you.

Instant file initialization only applies to data files (mdf/ndf), not log files. By copying a week at a time within a transaction, your transaction log file should not grow too much if you are using the simple recovery model.
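
If you want to verify whether instant file initialization is actually in effect, one quick unofficial test is to trace the zeroing messages while a file grows. The database name below is a throwaway placeholder, and sp_readerrorlog is undocumented but widely used:

CODE --> T-SQL

--Trace flags 3004/3605 write file-zeroing messages to the error log during file growth.
DBCC TRACEON(3004, 3605, -1)

CREATE DATABASE IFI_Test                                             --throwaway test database
ALTER DATABASE IFI_Test MODIFY FILE (NAME = IFI_Test, SIZE = 500MB)  --force a data-file growth

EXEC sp_readerrorlog 0, 1, 'Zeroing'
--"Zeroing ... IFI_Test.mdf" entries mean instant file initialization is NOT enabled;
--zeroing messages only for the .ldf mean it is (log files are always zeroed).

DBCC TRACEOFF(3004, 3605, -1)
DROP DATABASE IFI_Test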

-George
Microsoft SQL Server MVP
My Blogs
SQLCop
twitter
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom

RE: Need to restore 200 million rows from 400-million-row archive table, fast!

(OP)
Thanks George! I did not know about the instant file initialization feature. The database uses the full recovery model... I doubt I'll be able to change the recovery model in the middle of production as it requires restarting the service. I am still copying the archive audit table to a dev database to test my solution... 52 of 400 million rows so far... it's been almost two hours.

This is what my script looks like so far:

CODE --> T-SQL

DECLARE @FirstDayOfWeek AS CHAR(10) --First date of audit rows to copy over to live from archive.
DECLARE @LastDayOfWeek AS CHAR(10)  --First day plus 6, inclusive.
DECLARE @StopDate AS CHAR(10)       --Last date to copy from the archive table to live audit.

SELECT @FirstDayOfWeek = '2012-10-01',
       @LastDayOfWeek  = CONVERT(char(10), DATEADD(day, 6, @FirstDayOfWeek), 121),
       @StopDate       = '2013-10-03'

WHILE (@FirstDayOfWeek <= @StopDate)
BEGIN
    /*Last day of the week is first day + 6, inclusive, if the range is within a week;
      otherwise it's the stop date.
    */
    IF DATEDIFF(dd, @FirstDayOfWeek, @StopDate) <= 6
        SELECT @LastDayOfWeek = @StopDate
    ELSE
        SELECT @LastDayOfWeek = CONVERT(char(10), DATEADD(day, 6, @FirstDayOfWeek), 121)

    BEGIN TRY
        BEGIN TRAN T1

        --Insert into production audit table.
        --Half-open range: TRANSACTION_TIME has a time component, so BETWEEN two midnight
        --values would skip everything after 00:00 on the last day of each week.
        INSERT INTO AUDITTABLE
        SELECT * FROM [AUDITTABLE-Archive]
        WHERE TRANSACTION_TIME >= @FirstDayOfWeek
          AND TRANSACTION_TIME <  DATEADD(day, 1, @LastDayOfWeek) --AND IsCopied = 0

        --Update archive audit records transferred over
        --UPDATE [AUDITTABLE-Archive] SET IsCopied = 1 WHERE TRANSACTION_TIME >= @FirstDayOfWeek AND TRANSACTION_TIME < DATEADD(day, 1, @LastDayOfWeek) --AND IsCopied = 0

        COMMIT TRAN

        --The next week starts at last day of current week + 1.
        SELECT @FirstDayOfWeek = CONVERT(char(10), DATEADD(day, 1, @LastDayOfWeek), 121)
        --SELECT @FirstDayOfWeek, @LastDayOfWeek, DATEDIFF(dd, @FirstDayOfWeek, @StopDate)

        --Pause between weeks, outside the transaction so locks are not held while waiting. *Seems too long.
        WAITFOR DELAY '00:03:00'
    END TRY
    BEGIN CATCH
        IF @@TRANCOUNT > 0 ROLLBACK TRAN T1
        --Surface the error and stop, rather than silently retrying the same week forever.
        DECLARE @ErrMsg nvarchar(4000) = ERROR_MESSAGE()
        RAISERROR(@ErrMsg, 16, 1)
        BREAK
    END CATCH
END

MCP SQL Server 2000, MCTS SQL Server 2005, MCTS SQL Server 2008 (DBD, DBA)

RE: Need to restore 200 million rows from 400-million-row archive table, fast!

For those volumes I would consider setting up the main destination table as a partitioned table, loading each individual week onto its own partition in parallel, and then, if required, merging all the partitions back into a single table.

The following article has info not only on how to set up partitioning, but also on how to change the final table back to non-partitioned if that is required at the end.
http://www.patrickkeisler.com/2013/01/how-to-remov...
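
A very rough illustration of the kind of setup I mean - the names and boundary dates below are just examples, the article has the proper walkthrough:

CODE --> T-SQL

--one boundary per week up to the stop date; RANGE RIGHT so each boundary starts a new week
CREATE PARTITION FUNCTION pf_AuditWeek (datetime)
AS RANGE RIGHT FOR VALUES ('2012-10-01', '2012-10-08', '2012-10-15')

CREATE PARTITION SCHEME ps_AuditWeek
AS PARTITION pf_AuditWeek ALL TO ([PRIMARY])

--destination table created on the scheme, partitioned by the transaction date;
--each weekly load then only touches its own partition, so the loads can run in parallel.
CREATE TABLE AuditTable_Partitioned
(
    TRANSACTION_TIME datetime NOT NULL,
    AuditData        varchar(max) NULL   --stand-in for the real audit columns
) ON ps_AuditWeek (TRANSACTION_TIME)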


Regardless... having the new 'IsCopied' field and doing an update of it is really not required - you can easily keep track of which weeks have been copied in another small, purpose-built table to control that.
That update is definitely going to kill your performance if done the way you are doing it.

Regards

Frederico Fonseca
SysSoft Integrated Ltd
www.syssoft-int.com

FAQ219-2884: How Do I Get Great Answers To my Tek-Tips Questions?
FAQ181-2886: How can I maximize my chances of getting an answer?

RE: Need to restore 200 million rows from 400-million-row archive table, fast!

(OP)
Thanks Frederico. If only the people I am dealing with would actually listen to the advice of those more versed in the technology... The update on the same table was not a problem at all. I am the only one using that table, unlike the live one.

MCP SQL Server 2000, MCTS SQL Server 2005, MCTS SQL Server 2008 (DBD, DBA)
