Our company is currently in the process of testing an upgrade to version 6 (or 7, now that it has been released). It may be six months before we can begin using it.
The initial load of 2 years of data is 300 million rows into the source table. I am writing the aggregate IDs back to the source because this process is going to be an incremental load (receiving approximately 2 million records per month). New records are validated by a Vitria process: valid records are inserted into the source, and corrected error records are inserted into or updated in the source and also posted to a redo table. The redo table contains the key values from the source table that identify the updated records, plus the first-level aggregate ID. The aggregate IDs let the process find which aggregate each source record applies to through a many-to-one join.
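To make the write-back idea concrete, here is a minimal Python sketch of how I picture the redo table being used; the names (source, redo, source_key, agg_id, and so on) are invented for illustration and are not the actual schema:

# Many source rows map to one first-level aggregate row (many-to-one),
# so each source row carries the aggregate ID written back after the initial load.
source = {
    1001: {"category": "A", "amount": 50.0, "agg_id": 7},
    1002: {"category": "A", "amount": 25.0, "agg_id": 7},
    1003: {"category": "B", "amount": 10.0, "agg_id": 8},
}

# The redo table holds the source keys of corrected/updated records
# plus their first-level aggregate ID.
redo = [{"source_key": 1002, "agg_id": 7}]

# The incremental process then only revisits the aggregates named in the redo table.
aggregates_to_refresh = {row["agg_id"] for row in redo}
print(aggregates_to_refresh)  # {7}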
The actual process flow is as follows:
The 300 million rows are aggregated to 24 million.
The 24 million-row aggregate table is then split and aggregated into two aggregate tables by category: one is approximately 11 million rows, the other about 1.5 million. (A rough sketch of these two steps follows below.)
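As a rough Python sketch of that two-step flow (the column names level1_key, category, region, and amount are made up for illustration; the real grouping keys are different):

from collections import defaultdict

def first_level_aggregate(source_rows):
    # ~300 million detail rows -> ~24 million first-level rows,
    # grouped on the first-level key plus the attributes kept for later roll-ups.
    totals = defaultdict(float)
    for row in source_rows:
        totals[(row["level1_key"], row["category"], row["region"])] += row["amount"]
    return [{"level1_key": k, "category": c, "region": r, "amount": a}
            for (k, c, r), a in totals.items()]

def category_aggregates(level1_rows):
    # ~24 million first-level rows split by category and rolled up to a coarser key,
    # giving roughly 11 million rows in one table and 1.5 million in the other.
    tables = defaultdict(lambda: defaultdict(float))
    for row in level1_rows:
        tables[row["category"]][row["region"]] += row["amount"]
    return tables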
Currently the process takes 23.4 hours to build the first-level aggregate and another 8.5 hours to create the two additional aggregate tables. By our best estimate, at 22 rows per second the write-back from the 11.5 million-row aggregate table would take roughly six days.
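(For reference, that estimate is just 11.5 million rows / 22 rows per second, which is about 523,000 seconds, or roughly 145 hours of continuous writing, assuming the 22 rows-per-second rate holds for the whole run.)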
I hope this gives you a better idea of the process. I've only been working on it for a week, and since I inherited it from a former employee, I probably don't have all the details.
Thanks!
Chris