×
INTELLIGENT WORK FORUMS
FOR COMPUTER PROFESSIONALS

Log In

Come Join Us!

Are you a
Computer / IT professional?
Join Tek-Tips Forums!
  • Talk With Other Members
  • Be Notified Of Responses
    To Your Posts
  • Keyword Search
  • One-Click Access To Your
    Favorite Forums
  • Automated Signatures
    On Your Posts
  • Best Of All, It's Free!
  • Students Click Here

*Tek-Tips's functionality depends on members receiving e-mail. By joining you are opting in to receive e-mail.

Posting Guidelines

Promoting, selling, recruiting, coursework and thesis posting is forbidden.

Students Click Here

Best place to do an Aggregation?

Best place to do an Aggregation?

Best place to do an Aggregation?

(OP)
I am building a SSIS package to pull in and aggregate job cost data from our SQL Server OLTP system to our data warehouse.  I have built a view on the OLTP server that will pull JobNumber,  MtlQty, and MtlCost for each material item associated with the job.  In the data warehouse, I just want to store total material cost for each job.  As I see it, I have two options:

1.Tweak the view to provide:

CODE

SELECT
  JobNum,
  IsNull(Sum(MtlQty * MtlCost),0) as TotalCost

-OR-

2. Add a Derived Column transform in my SSIS package to perform the (MtlQty * MtlCost) calculation for each row, then use an Aggregate transform to sum them by job.

Will one method typically be faster than the other, or is it hit and miss depending on the usual variables (server memory, server CPU configuration, etc.)?

RE: Best place to do an Aggregation?

Since this isn't really ann aggregate function but a mathematical function you shouldn't have any issue using the derived column and just multiply the needed column together.  If you Data volume isn't to great you may look at using the Aggregate Transform in your SSIS packages, rather than your view.  Doing this would mean you hit the source system for a smaller period of time as the SUM() won't pass records to the data flow immediately.  THe Aggregate transform is a blocking transform meaning that until all records for the grouping columns are at the aggregate transform the records will not proceed to the next step.  With the addition of the need to get the total cost you could do it one of 2 ways

1 Aggregate and the Multiply  This may infact take longer as you typically want the aggregate to be one of the last steps in your data flow.

2 Multiply QTY then Cost in a derived column and then apply the aggregate transform and sum these 3 columns.


I run about 38K records through an Aggregate transform and have it group by customer and date  avg is about 1800-2500 customers a day and the over head is leass than 2 minutes.

Paul
---------------------------------------
Shoot Me! Shoot Me NOW!!!
                           - Daffy Duck

RE: Best place to do an Aggregation?

(OP)
Paul -

The paper you linked to in your other thread inspired me to run a test.  I set up three parallel threads in an SSIS data flow.  The first used a SQL Statement in the OLE DB Source to pull aggregated data (about 1M rows).  The second used a SQL Statement in the OLE DB source to pull unaggregated data (47M rows) and pass it to an Aggregation transform.  The third used a view to aggregate the data; the data source simply read from the view.  In each case I terminated the data flow with a Trash destination.

The results:

SQL Statement Aggregate: 25 minutes
Aggregate in Data Flow: 37 minutes
Aggregate in View: 50 minutes

Incidentally, the aggregation task itself in the data flow took about 4 seconds; the remainder of the time was pulling the 47M rows.

Red Flag This Post

Please let us know here why this post is inappropriate. Reasons such as off-topic, duplicates, flames, illegal, vulgar, or students posting their homework.

Red Flag Submitted

Thank you for helping keep Tek-Tips Forums free from inappropriate posts.
The Tek-Tips staff will check this out and take appropriate action.

Reply To This Thread

Posting in the Tek-Tips forums is a member-only feature.

Click Here to join Tek-Tips and talk with other members! Already a Member? Login

Close Box

Join Tek-Tips® Today!

Join your peers on the Internet's largest technical computer professional community.
It's easy to join and it's free.

Here's Why Members Love Tek-Tips Forums:

Register now while it's still free!

Already a member? Close this window and log in.

Join Us             Close