Slow Update == HELP?


Variation180

Programmer
Apr 18, 2003
6
US
I'm using 4.7 to aggregate data from a table of 24 million rows into an aggregate table of approximately 11 million rows. The first mapping does the aggregation, and a second mapping writes the aggregate_id from the aggregate table back to all the source records that make up each aggregate. This process is extremely slow, updating at 22 rows/sec (roughly a 5-day runtime). Does anyone have a better solution to this?
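
To make the shape of that write-back concrete, here is a minimal sketch of what the second mapping amounts to in relational terms. The table and column names (src_fact, agg_level1, grp_key) are made up, and sqlite3 only stands in for the real warehouse database; the actual process runs as 4.7 mappings, not hand-written SQL.

```python
# Hypothetical sketch: stamp every source row with the id of the aggregate
# it rolls up into, as one set-based UPDATE rather than row-at-a-time.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Toy schema: source rows carry a grouping key; the aggregate table holds
# one row (and one aggregate_id) per distinct grouping key.
cur.execute("CREATE TABLE src_fact (row_id INTEGER, grp_key TEXT, aggregate_id INTEGER)")
cur.execute("CREATE TABLE agg_level1 (aggregate_id INTEGER, grp_key TEXT, total REAL)")
cur.executemany("INSERT INTO src_fact (row_id, grp_key) VALUES (?, ?)",
                [(1, "A"), (2, "A"), (3, "B")])
cur.executemany("INSERT INTO agg_level1 VALUES (?, ?, ?)",
                [(10, "A", 99.0), (11, "B", 42.0)])

# Correlated UPDATE: many source rows receive the id of the one aggregate
# they belong to (the many-to-one relationship described in the thread).
cur.execute("""
    UPDATE src_fact
       SET aggregate_id = (SELECT a.aggregate_id
                             FROM agg_level1 a
                            WHERE a.grp_key = src_fact.grp_key)
""")
conn.commit()
print(cur.execute("SELECT * FROM src_fact ORDER BY row_id").fetchall())
# -> [(1, 'A', 10), (2, 'A', 10), (3, 'B', 11)]
```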
 
????????????????????????????????????????

I am racking my brain trying to figure out what you are doing. What possible use is there in writing an id from the aggregate back to the source?

Why create the aggregate at all if it is still nearly 50% of the size of the source table?

Why still use version 4.7?
Version 5.x will inherently be quite a bit faster, and version 6 even more efficient.

I am sure there is a story behind all of this.........

T. Blom
Information analyst
tbl@shimano-eu.com
 
Our company is currently testing an upgrade to version 6 (or 7, now that it has been released). It may be 6 months before we can begin to use it.

The initial load of 2 years of data is 300 million rows into the source table. I am writing the aggregate IDs back to the source because this is going to be an incremental load (receiving approximately 2 million records per month). New records will be validated by a Vitria process: valid records are inserted into the source, while corrected error records are inserted into or updated in the source and also posted to a redo table. The redo table contains key values from the source table to identify the updated records, along with the first-level aggregate id. The aggregate IDs let the process find which aggregate the source records apply to through a many-to-one join.
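
To illustrate why those stored IDs matter for the incremental load, here is a small sketch of that many-to-one join, the one that picks out just the aggregates affected by a month's corrections. The names (redo_tbl, agg_level1, src_key) are hypothetical and sqlite3 is again only a stand-in.

```python
# Hypothetical sketch: the redo table records which source keys changed and
# which first-level aggregate each one belongs to, so one DISTINCT join
# yields only the aggregates that need recomputing.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE redo_tbl (src_key INTEGER, aggregate_id INTEGER)")
cur.execute("CREATE TABLE agg_level1 (aggregate_id INTEGER, grp_key TEXT, total REAL)")
cur.executemany("INSERT INTO redo_tbl VALUES (?, ?)",
                [(101, 10), (102, 10), (203, 11)])   # many redo rows -> one aggregate
cur.executemany("INSERT INTO agg_level1 VALUES (?, ?, ?)",
                [(10, "A", 99.0), (11, "B", 42.0), (12, "C", 7.0)])

affected = cur.execute("""
    SELECT DISTINCT a.aggregate_id, a.grp_key
      FROM redo_tbl r
      JOIN agg_level1 a ON a.aggregate_id = r.aggregate_id
     ORDER BY a.aggregate_id
""").fetchall()
print(affected)   # -> [(10, 'A'), (11, 'B')]; aggregate 12 is left alone
```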

The actual process flow is as follows:

The 300 million rows are aggregated to 24 million.

The 24-million-row aggregate table is split and aggregated into 2 aggregate tables by category. One is approximately 11 million rows; the other is 1.5 million.
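
As a rough illustration of that split (again with hypothetical names: a category column on the first-level aggregate, rolled up into two second-level tables), the second step is essentially one GROUP BY per category:

```python
# Hypothetical sketch of the second-level split: the first-level aggregate
# carries a category flag, and each category is rolled up into its own table.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE agg_level1 (aggregate_id INTEGER, category TEXT, grp_key TEXT, total REAL)")
cur.executemany("INSERT INTO agg_level1 VALUES (?, ?, ?, ?)", [
    (10, "CAT1", "A", 99.0), (11, "CAT1", "B", 42.0),
    (12, "CAT2", "A", 7.0),  (13, "CAT2", "C", 3.0),
])

# One GROUP BY per category feeds the two second-level aggregate tables.
for cat, tbl in (("CAT1", "agg_cat1"), ("CAT2", "agg_cat2")):
    cur.execute(f"CREATE TABLE {tbl} (grp_key TEXT, total REAL)")
    cur.execute(f"INSERT INTO {tbl} "
                "SELECT grp_key, SUM(total) FROM agg_level1 "
                "WHERE category = ? GROUP BY grp_key", (cat,))

print(cur.execute("SELECT * FROM agg_cat1 ORDER BY grp_key").fetchall())  # [('A', 99.0), ('B', 42.0)]
print(cur.execute("SELECT * FROM agg_cat2 ORDER BY grp_key").fetchall())  # [('A', 7.0), ('C', 3.0)]
```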

Currently the process takes 23.4 hours to aggregate the first level and 8.5 hours to create the two additional aggregate tables. By the best calculations, at 22 rows per second the write-back from the 11.5-million-record aggregate table works out to roughly six days.
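
For scale, a quick back-of-the-envelope check of those figures, assuming a constant 22 rows/sec over the full 11.5 million rows and ignoring commit intervals and indexing overhead:

```python
# Rough runtime check using only the figures quoted above.
rows = 11_500_000          # rows the write-back has to touch
rate = 22                  # observed rows per second
seconds = rows / rate
print(f"{seconds / 3600:.0f} hours  (~{seconds / 86_400:.1f} days)")
# -> 145 hours  (~6.1 days) of continuous running
```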

I hope that I'm giving you a better idea of the process. I've only been working on this for a week, and since I inherited this process from a former employee, I probably don't have all the details.

Thanks!
Chris
 
Chris,

My suggestion would be to look elsewhere for help.
Especially with large databases like the one you are working with, you may want to get in touch with some specialists who can help you out.

One way to get there is to apply for registration with core integration:


There are some real gurus there, with loads of experience with LARGE databases...

T. Blom
Information analyst
tbl@shimano-eu.com
 