Performance problem on Teradata SQL query

(OP)
If anyone could assist me in making this query run faster, I would really appreciate it. Any comments or recommendations? I'm new to Teradata and have been using Oracle for 8 years.
Here is the EXPLAIN output:
  1) First, we lock a distinct stage_dev."pseudo table" for read on a RowHash to prevent global deadlock for stage_dev.upc.
  2) Next, we lock time_dev.lu_day for access, we lock time_DEV.lu_week for access, we lock store_dev.lu_store for access, we lock other_dev.deli_capital_prfrm_tgt for access, we lock stage_dev.div_item_day for access, and we lock stage_dev.upc for read.
  3) We execute the following steps in parallel.
       1) We do a single-AMP RETRIEVE step from time_dev.lu_day by way of the unique primary index "time_dev.lu_day.d_date = (DATE )- 6" with no residual conditions into Spool 3 (group_amps) (compressed columns allowed), which is redistributed by hash code to all AMPs.  The size of Spool 3 is estimated with high confidence to be 1 row.  The estimated time for this step is 0.01 seconds.
       2) We do an all-AMPs RETRIEVE step from other_dev.deli_capital_prfrm_tgt by way of an all-rows scan with a condition of ("other_dev.deli_capital_prfrm_tgt.launch_dt <= DATE") into Spool 4 (all_amps) (compressed columns allowed), which is redistributed by hash code to all AMPs.  Then we do a SORT to order Spool 4 by row hash.  The size of Spool 4 is estimated with no confidence to be 1,204 rows.  The estimated time for this step is 0.01 seconds.
  4) We do an all-AMPs JOIN step from time_DEV.lu_week by way of a RowHash match scan with a condition of ("time_DEV.lu_week.week_end_dt <= DATE"), which is joined to Spool 4 (Last Use) by way of a RowHash match scan.  time_DEV.lu_week and Spool 4 are joined using a merge join, with a join condition of ("(time_DEV.lu_week.week_end_dt >= launch_dt) AND (time_DEV.lu_week.week_id = week_id)").  The result goes into Spool 5 (all_amps) (compressed columns allowed), which is redistributed by hash code to all AMPs.  Then we do a SORT to order Spool 5 by row hash.  The size of Spool 5 is estimated with no confidence to be 402 rows.  The estimated time for this step is 0.01 seconds.
  5) We execute the following steps in parallel.
       1) We do an all-AMPs JOIN step from Spool 5 (Last Use) by way of a RowHash match scan, which is joined to store_dev.lu_store by way of a RowHash match scan with no residual conditions.  Spool 5 and store_dev.lu_store are joined using a merge join, with a join condition of ("store_dev.lu_store.store_id = store_id").  The result goes into Spool 6 (all_amps) (compressed columns allowed), which is redistributed by hash code to all AMPs.  Then we do a SORT to order Spool 6 by row hash.  The size of Spool 6 is estimated with no confidence to be 402 rows.  The estimated time for this step is 0.01 seconds.
       2) We do an all-AMPs RETRIEVE step from time_dev.lu_day by way of an all-rows scan with a condition of ("NOT (time_dev.lu_day.week_id IS NULL)") into Spool 7 (all_amps) (compressed columns allowed), which is redistributed by hash code to all AMPs.  Then we do a SORT to order Spool 7 by row hash.  The size of Spool 7 is estimated with high confidence to be 5,117 rows.  The estimated time for this step is 0.02 seconds.
  6) We execute the following steps in parallel.
       1) We do an all-AMPs JOIN step from Spool 6 (Last Use) by way of a RowHash match scan, which is joined to Spool 7 (Last Use) by way of a RowHash match scan.  Spool 6 and Spool 7 are joined using a merge join, with a join condition of ("(week_id = week_id) AND (week_id = week_id)").  The result goes into Spool 2 (all_amps), which is redistributed by hash code to all AMPs.  The size of Spool 2 is estimated with no confidence to be 2,814 rows.  The estimated time for this step is 0.09 seconds.
       2) We do an all-AMPs RETRIEVE step from time_DEV.lu_week by way of an all-rows scan with a condition of ("time_DEV.lu_week.week_end_dt <= DATE") into Spool 10 (all_amps) (compressed columns allowed), which is redistributed by hash code to all AMPs.  Then we do a SORT to order Spool 10 by row hash.  The size of Spool 10 is estimated with no confidence to be 244 rows.  The estimated time for this step is 0.01 seconds.
       3) We do an all-AMPs RETRIEVE step from time_dev.lu_day by way of an all-rows scan with a condition of ("(time_dev.lu_day.d_date <= ((DATE )- 6 )) AND (NOT (time_dev.lu_day.week_id IS NULL ))") into Spool 11 (all_amps) (compressed columns allowed), which is redistributed by hash code to all AMPs.  Then we do a SORT to order Spool 11 by row hash.  The size of Spool 11 is estimated with low confidence to be 3,593 rows.  The estimated time for this step is 0.02 seconds.
       4) We do an all-AMPs RETRIEVE step from other_dev.deli_capital_prfrm_tgt by way of an all-rows scan with a condition of ("other_dev.deli_capital_prfrm_tgt.launch_dt <= ((DATE )- 6)") into Spool 12 (all_amps) (compressed columns allowed), which is redistributed by hash code to all AMPs.  Then we do a SORT to order Spool 12 by row hash.  The size of Spool 12 is estimated with no confidence to be 1,204 rows.  The estimated time for this step is 0.01 seconds.
  7) We execute the following steps in parallel.
       1) We do an all-AMPs JOIN step from Spool 10 (Last Use) by way of a RowHash match scan, which is joined to Spool 11 (Last Use) by way of a RowHash match scan.  Spool 10 and Spool 11 are joined using a merge join, with a join condition of ("week_id = (week_id + 100)").  The result goes into Spool 13 (all_amps) (compressed columns allowed), which is built locally on the AMPs.  The size of Spool 13 is estimated with no confidence to be 1,200 rows.  The estimated time for this step is 0.02 seconds.
       2) We do an all-AMPs JOIN step from store_dev.lu_store by way of a RowHash match scan with no residual conditions, which is joined to Spool 12 (Last Use) by way of a RowHash match scan. store_dev.lu_store and Spool 12 are joined using a merge join, with a join condition of ("store_dev.lu_store.store_id = store_id").  The result goes into Spool 14 (all_amps) (compressed columns allowed), which is redistributed by hash code to all AMPs.  Then we do a SORT to order Spool 14 by row hash.  The size of Spool 14 is estimated with no confidence to be 1,204 rows.  The estimated time for this step is 0.02 seconds.
  8) We do an all-AMPs JOIN step from Spool 13 (Last Use) by way of a RowHash match scan, which is joined to Spool 14 (Last Use) by way of a RowHash match scan.  Spool 13 and Spool 14 are joined using a merge join, with a join condition of ("(d_date >= launch_dt) AND (week_id = week_id)").  The result goes into Spool 9 (all_amps) (compressed columns allowed), which is built locally on the AMPs.  The size of Spool 9 is estimated with no confidence to be 1,977 rows.  The estimated time for this step is 0.02 seconds.
  9) We do an all-AMPs SUM step to aggregate from Spool 9 (Last Use) by way of an all-rows scan, and the grouping identifier in field 1. Aggregate Intermediate Results are computed globally, then placed in Spool 15.  The size of Spool 15 is estimated with no confidence to be 1,483 rows.  The estimated time for this step is 0.03 seconds.
 10) We do an all-AMPs RETRIEVE step from Spool 15 (Last Use) by way of an all-rows scan into Spool 1 (all_amps) (compressed columns allowed), which is built locally on the AMPs.  The size of Spool 1 is estimated with no confidence to be 1,483 rows.  The estimated time for this step is 0.02 seconds.
 11) We execute the following steps in parallel.
      1) We do an all-AMPs RETRIEVE step from Spool 1 (Last Use) by way of an all-rows scan into Spool 17 (all_amps) (compressed columns allowed), which is redistributed by hash code to all AMPs.  Then we do a SORT to order Spool 17 by row hash.  The size of Spool 17 is estimated with no confidence to be 1,483 rows.  The estimated time for this step is 0.02 seconds.
      2) We do an all-AMPs RETRIEVE step from time_dev.lu_day by way of an all-rows scan with a condition of ("NOT (time_dev.lu_day.week_id IS NULL)") into Spool 18 (all_amps) (compressed columns allowed), which is redistributed by hash code to all AMPs.  Then we do a SORT to order Spool 18 by row hash.  The size of Spool 18 is estimated with high confidence to be 5,117 rows.  The estimated time for this step is 0.02 seconds.
 12) We execute the following steps in parallel.
      1) We do an all-AMPs JOIN step from Spool 17 (Last Use) by way of a RowHash match scan, which is joined to Spool 18 (Last Use) by way of a RowHash match scan.  Spool 17 and Spool 18 are joined using a merge join, with a join condition of ("WEEK_ID = week_id").  The result goes into Spool 2 (all_amps), which is redistributed by hash code to all AMPs.  Then we do a SORT to order Spool 2 by the sort key in spool field1 eliminating duplicate rows.  The size of Spool 2 is estimated with no confidence to be 6,598 rows.  The estimated time for this step is 0.33 seconds.
      2) We do a group-AMP RETRIEVE step from Spool 3 (Last Use) by way of an all-rows scan into Spool 22 (all_amps) (compressed columns allowed), which is duplicated on all AMPs.  The size of Spool 22 is estimated with high confidence to be 28 rows. The estimated time for this step is 0.00 seconds.
 13) We execute the following steps in parallel.
      1) We do an all-AMPs JOIN step from Spool 22 (Last Use) by way of an all-rows scan, which is joined to all partitions of stage_dev.div_item_day with a condition of ("(NOT (stage_dev.div_item_day.division_id IS NULL )) AND ((NOT (stage_dev.div_item_day.upc_id IS NULL )) AND (NOT (stage_dev.div_item_day.txn_dt IS NULL )))").  Spool 22 and stage_dev.div_item_day are joined using a product join, with a join condition of ("(1=1)").  The result goes into Spool 23 (all_amps) (compressed columns allowed), which is redistributed by hash code to all AMPs.  Then we do a SORT to order Spool 23 by row hash.  The size of Spool 23 is estimated with no confidence to be 2,981 rows.  The estimated time for this step is 0.02 seconds.
      2) We do an all-AMPs RETRIEVE step from Spool 2 (Last Use) by way of an all-rows scan with a condition of ("(NOT (CORPORATION_ID IS NULL )) AND (NOT (DIVISION_ID IS NULL ))") into Spool 24 (all_amps) (compressed columns allowed), which is redistributed by hash code to all AMPs.  Then we do a SORT to order Spool 24 by row hash.  The size of Spool 24 is estimated with no confidence to be 13,195 rows.  The estimated time for this step is 0.03 seconds.
 14) We do an all-AMPs JOIN step from Spool 23 (Last Use) by way of a RowHash match scan, which is joined to Spool 24 (Last Use) by way of a RowHash match scan.  Spool 23 and Spool 24 are joined using a merge join, with a join condition of ("(division_id = DIVISION_ID) AND (txn_dt = D_DATE)").  The result goes into Spool 25 (all_amps) (compressed columns allowed), which is redistributed by hash code to all AMPs.  Then we do a SORT to order Spool 25 by row hash.  The size of Spool 25 is estimated with no confidence to be 342,426 rows.  The estimated time for this step is 0.60 seconds.
 15) We do an all-AMPs JOIN step from Spool 25 (Last Use) by way of a RowHash match scan, which is joined to stage_dev.upc by way of a RowHash match scan with a condition of ("(NOT (stage_dev.upc.category_id IS NULL )) AND (NOT (stage_dev.upc.upc_id IS NULL ))").  Spool 25 and stage_dev.upc are joined using a merge join, with a join condition of ("(upc_id = stage_dev.upc.upc_id) AND ((stage_dev.upc.category_id = CATEGORY_ID) AND (stage_dev.upc.corporation_id = CORPORATION_ID ))").  The result goes into Spool 21 (all_amps) (compressed columns allowed), which is built locally on the AMPs.  The size of Spool 21 is estimated with no confidence to be 342,426 rows.  The estimated time for this step is 0.31 seconds.
 16) We do an all-AMPs SUM step to aggregate from Spool 21 (Last Use) by way of an all-rows scan, and the grouping identifier in field 1. Aggregate Intermediate Results are computed globally, then placed in Spool 26.  The size of Spool 26 is estimated with no confidence to be 308,184 rows.  The estimated time for this step is 0.52 seconds.
 17) We do an all-AMPs RETRIEVE step from Spool 26 (Last Use) by way of an all-rows scan into Spool 19 (all_amps), which is built locally on the AMPs.  The size of Spool 19 is estimated with no confidence to be 308,184 rows.  The estimated time for this step is 0.25 seconds.
 18) Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
  -> The contents of Spool 19 are sent back to the user as the result of statement 1.  The total estimated time is 2.29 seconds.

RE: Performance problem on Teradata SQL query

Hi zanzemaj,
it looks like some statistics are missing, because there are lots of "no confidence" estimates. And V2R6.1 is quite sensitive to missing stats...

First try a "diagnostic helpstats on for session" and check the EXPLAIN for recommended stats...
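
A minimal sketch of that workflow (the diagnostic is session-scoped; run it once, then EXPLAIN the problem query again and read the recommendations off the end of the plan):

-- Session-scoped switch: every subsequent EXPLAIN in this session
-- also lists the statistics the optimizer would like to have.
DIAGNOSTIC HELPSTATS ON FOR SESSION;

-- Re-run the EXPLAIN; the recommendations appear at the end of the
-- plan text as ready-to-use COLLECT STATISTICS statements.
EXPLAIN
SELECT ...;   -- the original query, unchanged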

And better: show the query itself, the DDL, and info about table sizes.

Dieter

RE: Performance problem on Teradata SQL query

(OP)
Thank you for the response, dnoeth. How do I execute "diagnostic helpstats on for session"? Sorry, it's my first time doing optimization. Thanks...

Here's the query:
SELECT dates.division_id,
       dates.launch_dt,
       dates.store_id,
       upc.category_id,
       SUM(CASE WHEN date_ind = 1
                 AND dates.week_id = last_week.week_id
                THEN SUM_net_amt ELSE 0 END) last_wk_sales,
       SUM(CASE WHEN date_ind = 1
                THEN SUM_net_amt ELSE 0 END) launch_thru_last_wk_sales,
       SUM(CASE WHEN date_ind = 2
                 AND dates.week_id = last_week.week_id - 100
                THEN SUM_net_amt ELSE 0 END) ly_week_sales,
       SUM(CASE WHEN date_ind = 2
                THEN SUM_net_amt ELSE 0 END) ly_launch_thru_last_wk_sales
FROM stage_dev.vdiv_item_day ag,
     (SELECT week_id
      FROM dss_dev.lu_day
      WHERE d_date = DATE - 6) last_week(week_id),
     stage_dev.cap_perf_upcs upc,
     (SELECT 1,
             str.corporation_id,
             str.division_id,
             goals.store_id,
             goals.launch_dt,
             goals.category_id,
             dy.d_date,
             dy.week_id
      FROM dss_dev.deli_capital_prfrm_tgt goals,
           dss_dev.lu_day dy,
           dss_dev.lu_week wk,
           dss_dev.lu_store str
      WHERE goals.week_id = dy.week_id
        AND dy.week_id = wk.week_id
        AND wk.week_end_dt >= goals.launch_dt
        AND wk.week_end_dt <= DATE
        AND str.store_id = goals.store_id
      UNION
      SELECT 2,
             x.corporation_id,
             x.division_id,
             x.store_id,
             x.launch_dt,
             x.category_id,
             dy.d_date ly_date,
             x.week_id
      FROM dss_dev.lu_day dy,
           (SELECT str.corporation_id,
                   str.division_id,
                   goals.store_id,
                   goals.launch_dt,
                   goals.category_id,
                   wk.week_id
            FROM dss_dev.deli_capital_prfrm_tgt goals,
                 dss_dev.lu_day dy,
                 dss_dev.lu_week wk,
                 dss_dev.lu_store str
            WHERE goals.week_id = dy.week_id
              AND dy.week_id = wk.week_id + 100
              AND dy.d_date >= goals.launch_dt
              AND wk.week_end_dt <= DATE
              AND dy.d_date <= DATE - 6
              AND str.store_id = goals.store_id
            GROUP BY 1,2,3,4,5,6
           ) x (corporation_id, division_id, store_id, launch_dt, category_id, week_id)
      WHERE x.week_id = dy.week_id
     ) dates (date_ind, corporation_id, division_id, store_id, launch_dt, category_id, d_date, week_id)
WHERE ag.txn_dt = dates.d_date
  AND ag.upc_id = upc.upc_id
  AND ag.division_id = dates.division_id
  AND upc.category_id = dates.category_id
  AND upc.corporation_id = dates.corporation_id
GROUP BY 1,2,3,4;

RE: Performance problem on Teradata SQL query

Hi zanzemaj,
"diagnostic helpstats on for session" is a valid SQL statement, so just submit it.

Some remarks:
Replace UNION with UNION ALL.
Try to replace the UNION derived tables with a single access to those 4 tables; the main difference is just this year vs. last year (see the sketch below).
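
A rough sketch of that single-access idea (untested; it shows only the shared join spine, and the branch-specific launch_dt and DATE - 6 predicates from the two original branches still have to be folded in, e.g. with CASE on date_ind):

-- A two-row helper drives this-year (offset 0) vs. last-year (offset
-- 100), so goals/lu_day/lu_week/lu_store are scanned once instead of
-- twice; UNION ALL also avoids the duplicate-elimination sort that
-- plain UNION forces.
SELECT p.date_ind,
       str.corporation_id,
       str.division_id,
       goals.store_id,
       goals.launch_dt,
       goals.category_id,
       dy.d_date,
       dy.week_id
FROM dss_dev.deli_capital_prfrm_tgt goals,
     dss_dev.lu_day dy,
     dss_dev.lu_week wk,
     dss_dev.lu_store str,
     (SELECT 1, 0 UNION ALL SELECT 2, 100) p (date_ind, wk_offset)
WHERE str.store_id = goals.store_id
  AND dy.week_id = goals.week_id - p.wk_offset  -- 0 = goal week, 100 = same week last year
  AND wk.week_id = dy.week_id
  AND wk.week_end_dt <= DATE;
  -- the branch-specific predicates (week_end_dt >= launch_dt for
  -- date_ind = 1; the d_date >= launch_dt and d_date <= DATE - 6
  -- checks for date_ind = 2) would go here, via CASE on p.date_ind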

Btw, "total estimated time is 2.29 seconds.", doesn't sound that bad, what's the actual runtime?

Dieter

RE: Performance problem on Teradata SQL query

(OP)
Ok, I will try out your remarks. Thank you so much for sharing your expertise.

Btw, I'm not sure of the actual runtime. The user just reported to me that the query takes a long time, and that's not an acceptable runtime for them.
