Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations TouchToneTommy on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

GROUP BY giving different results

Status
Not open for further replies.

gsdcrazy

Programmer
Sep 2, 2002
31
US
Gurus,

Someone in my office has written the SQL below:

select a12.distrbtn_chnl_id
,max(a14.distrbtn_chnl) distrbtn_chnl
,a13.prod_type_id
,max(a16.prod_type) prod_type
,a11.month_id
,max(a15.month_name) month_name
,max(a15.year_id) year_id
,count(distinct a11.agrmt_id) ofactivecont
from salesmi_dm_owner.agreement_makeup_summary a11
,salesmi_dm_owner.agent_appointments a12
,salesmi_dm_owner.products a13
,salesmi_dm_owner.distribution_channels a14
,salesmi_dm_owner.lookup_months a15
,salesmi_dm_owner.product_types a16
where a11.agent_apptmnt_id = a12.agent_apptmnt_id
and a11.prod_id = a13.prod_id
and a12.distrbtn_chnl_id = a14.distrbtn_chnl_id
and a11.month_id = a15.month_id
and a13.prod_type_id = a16.prod_type_id
and a11.month_id in (200207, 200307)
and a13.prod_type_id in (0, 2, 1)
and a13.prod_line_id = 1
and a11.agrmt_status_id in 1
group by a12.distrbtn_chnl_id
,a13.prod_type_id
,a11.month_id;

The SQL returns 10 rows. If you change the WHERE clause to eliminate the line "and a13.prod_type_id in (0, 2, 1)", the SQL returns 19 rows. The only "prod_type_id"s in the database for all the criteria are 1 and 2. The difference is that without this line in the WHERE clause, the "a11.month_id" portion of the WHERE clause works. In the SQL above all rows are summarized to the rolled into the "200307" month id.

The explain plans for the two SQL's are vastly different, but that is not unusual. The extra check in the WHERE clause forces a new execution plan. The execution plan for the SQL above is:

Q_PLAN
--------------------------------------------------------------------------------
SELECT STATEMENT
SORT GROUP BY
HASH JOIN
TABLE ACCESS FULL DISTRIBUTION_CHANNELS
HASH JOIN
TABLE ACCESS FULL PRODUCT_TYPES
HASH JOIN
HASH JOIN
TABLE ACCESS FULL LOOKUP_MONTHS
NESTED LOOPS
TABLE ACCESS FULL PRODUCTS
PARTITION RANGE INLIST
TABLE ACCESS BY LOCAL INDEX ROWID AGREEMENT_MAKEUP_SUMMARY
BITMAP CONVERSION TO ROWIDS
BITMAP AND
BITMAP OR
BITMAP INDEX SINGLE VALUE I_AGRMKSM_MTH_ID
BITMAP INDEX SINGLE VALUE I_AGRMKSM_MTH_ID
BITMAP INDEX SINGLE VALUE I_AGRMKSM_PROD_ID
BITMAP INDEX SINGLE VALUE I_AGRMKSM_AS_ID
TABLE ACCESS FULL AGENT_APPOINTMENTS

The execution plan without the line in the WHERE plan is:

SELECT STATEMENT
SORT GROUP BY
HASH JOIN
TABLE ACCESS FULL DISTRIBUTION_CHANNELS
HASH JOIN
TABLE ACCESS FULL PRODUCT_TYPES
HASH JOIN
HASH JOIN
TABLE ACCESS FULL LOOKUP_MONTHS
HASH JOIN
TABLE ACCESS FULL PRODUCTS
PARTITION RANGE INLIST
TABLE ACCESS FULL AGREEMENT_MAKEUP_SUMMARY
TABLE ACCESS FULL AGENT_APPOINTMENTS

Any ideas why the "GROUP BY" for month_id works when this line ("and a13.prod_type_id in (0, 2, 1)") is taken out of the WHERE clause?

If you need any other information, please let me know. This has me baffled. The use of this line should not change the results.

Thanks,
gsdcrazy
 
GSDCrazy,

I believe there are many of us "crazies" that would enjoy helping you with this, but since your SQL is not generating any syntax errors, we cannot reasonably trouble shoot this type of logic issue without some sample data. Since one of Tek-Tips rules is that we cannot e-mail one another, it makes it a bit rough to tackle this one. I believe this hurdle is why you haven't seen any responses yet.

I anyone aware of a Tek-Tips posting/bulletin-board area/methodology that would allow GSDCrazy to post some export files and make this troubleshooting easier for us?
 
SantaMufasa,

Thanks for the desire to help. We got the SQL to return the correct results by changing the "and a13.prod_type_id in (0, 2, 1)" line to use the equivalent title "and a16.prod_type in ('UNKNOWN PRODUCT TYPE','FIXED','VARIABLE')", but it still puzzles me why the parser (or whatever) ignored the MONTH_ID group by entry when we used the PROD_TYPE_ID, but recognized and used it when we eliminated the PROD_TYPE_ID condition altogether or changed it to the look-up title.

I understand not being able to troubleshoot SQL without some test data. Maybe someday we will be able to include it, but I was hoping someone might have encountered something similar before. Since the SQL is working, I guess it is just a curiosity point for me now. I had never seen part of a GROUP BY statement ignored before and to make it more intriguing it was due to a condition on a different column.

Thanks again for the thought. I guess this will remain one of those Oracle features I never really understand.

Thanks,
GSDCrazy
 
Why do you think that GROUP BY is ignored? Do you have duplicates in your result set? And can you select distinct values for salesmi_dm_owner.products.prod_type_id? Query, not tell about.
Do you use some querying tool? If so, try to issue those statements directly from sql*plus or other "not-so-clever-but-not-falsifying" program.

Regards, Dima
 
Dima,

I describe the results as one element of the GROUP BY being ignored based upon the results set I am returned. It may be an inappropriate description. I am running the SQL using SQL*Plus. I am not sure how else to explain it, but I included the detail of my analysis below to see if I did something wrong.

If the line is part of the WHERE condition I receive a results set with a MONTH_ID of only "200307" as below:

DISTRBTN_CHNL_ID DISTRBTN_CHNL PROD_TYPE_ID PROD_TYP
---------------- ---------------------------------- ------------ --------
MONTH_ID MONTH_ YEAR_ID OFACTIVECONT
---------- ------ ---------- ------------
1 INSURANCE PRODUCERS/PPAS 1 FIXED
200307 July 2003 1

2 INVESTMENT DEALER 1 FIXED
200307 July 2003 14409

2 INVESTMENT DEALER 2 VARIABLE
200307 July 2003 474135

3 REGIONAL WIREHOUSE 1 FIXED
200307 July 2003 5916

3 REGIONAL WIREHOUSE 2 VARIABLE
200307 July 2003 116422

4 FINANCIAL INSTITUTIONS 1 FIXED
200307 July 2003 155980

4 FINANCIAL INSTITUTIONS 2 VARIABLE
200307 July 2003 203507

5 NATIONWIDE RETIREMENT SOLUTIONS 2 VARIABLE
200307 July 2003 73

6 NATIONWIDE AGENCY 1 FIXED
200307 July 2003 9253

6 NATIONWIDE AGENCY 2 VARIABLE
200307 July 2003 81803


10 rows selected.

When I run the SQL with the one WHERE condition commented out, I receive a results set containing both MONTH_IDs of "200307" and "200207" as follows:

DISTRBTN_CHNL_ID DISTRBTN_CHNL PROD_TYPE_ID PROD_TYP
---------------- ---------------------------------- ------------ --------
MONTH_ID MONTH_NAME YEAR_ID OFACTIVECONT
---------- -------------------- ---------- ------------
1 INSURANCE PRODUCERS/PPAS 1 FIXED
200307 July 2003 1

2 INVESTMENT DEALER 1 FIXED
200207 July 2002 12815

2 INVESTMENT DEALER 1 FIXED
200307 July 2003 13188

2 INVESTMENT DEALER 2 VARIABLE
200207 July 2002 441328

2 INVESTMENT DEALER 2 VARIABLE
200307 July 2003 419677

3 REGIONAL WIREHOUSE 1 FIXED
200207 July 2002 3195

3 REGIONAL WIREHOUSE 1 FIXED
200307 July 2003 5674

3 REGIONAL WIREHOUSE 2 VARIABLE
200207 July 2002 94558

3 REGIONAL WIREHOUSE 2 VARIABLE
200307 July 2003 106635

4 FINANCIAL INSTITUTIONS 1 FIXED
200207 July 2002 116803

4 FINANCIAL INSTITUTIONS 1 FIXED
200307 July 2003 149198

4 FINANCIAL INSTITUTIONS 2 VARIABLE
200207 July 2002 175527

4 FINANCIAL INSTITUTIONS 2 VARIABLE
200307 July 2003 185028

5 NATIONWIDE RETIREMENT SOLUTIONS 2 VARIABLE
200207 July 2002 62

5 NATIONWIDE RETIREMENT SOLUTIONS 2 VARIABLE
200307 July 2003 34

6 NATIONWIDE AGENCY 1 FIXED
200207 July 2002 6362

6 NATIONWIDE AGENCY 1 FIXED
200307 July 2003 8938

6 NATIONWIDE AGENCY 2 VARIABLE
200207 July 2002 76470

6 NATIONWIDE AGENCY 2 VARIABLE
200307 July 2003 74561


19 rows selected.

I selected detail rows for the DISTRBTN_CHNL_ID of 5 and PROD_TYPE_ID of 2 (the smallest one) and found the counts on the second SQL to be correct for both "200307" and "200207". I found the count on the first SQL to be correct if both "200307" and "200207" data were selected and the DISTINCT clause in the COUNT was taken into consideration. 73 is the total of distinct AGRMT_IDs for a DISTRBTN_CHNL_ID of 5 and a PROD_TYPE_ID of 2 and MONTH_IDs of both "200307" and "200207". That suggested to me that the MONTH_ID portion of the group by was being ignored and that it was coincidence (random chance) that the MONTH_ID of "200307" was displayed.

I would really appreciate any insight into why this would occur. I know it is impossible to really do any testing since we can not provide dumps of the data through tek-tips, but just some reasonable ideas would be helpful. As I mentioned in another reply, we have been able to get the correct result set using the look-up value for PROD_TYPE_ID instead of PROD_TYPE_ID, but I am now sort of looking for a way to predict this behavior to set up some coding rules or hints for others in the company that may run into this.

All your input is greatly appreciated.

Thanks,
GSDCrazy

 
I may asume that those queries differ not only in where caluse. My suggestion is to execute them again. I suppose that your original query might contain a typo: something like and a11.month_id in (200307, 200307)

Regards, Dima
 
Oops,it was wrong assumption. Though as these case tends to be very strange, can you execute both statements one-by-one with autotrace and publish the whole sequence: statement-results-plan?

Regards, Dima
 
Dima,

Running autotrace was a good idea. It certainly gave me more information, but nothing that I can interpret as affecting the results. I guess this is just going to remain one of those Oracle mysteries. I have included the output from the two runs just in case someone else might see something that I do not recognize.

Thanks everyone for your patience, interest, and ideas.

gsdcrazy

col distrbtn_chnl_id format 9
col distrbtn_chnl format a34
col prod_type format a8
col month_name format a6
set pagesize 50000
set linesize 150
spool distribution.rlt
set autotrace on
select a12.distrbtn_chnl_id
,max(a14.distrbtn_chnl) distrbtn_chnl
,a13.prod_type_id
,max(a16.prod_type) prod_type
,a11.month_id
,max(a15.month_name) month_name
,max(a15.year_id) year_id
,count(distinct a11.agrmt_id) ofactivecont
from salesmi_dm_owner.agreement_makeup_summary a11
,salesmi_dm_owner.agent_appointments a12
,salesmi_dm_owner.products a13
,salesmi_dm_owner.distribution_channels a14
,salesmi_dm_owner.lookup_months a15
,salesmi_dm_owner.product_types a16
where a11.agent_apptmnt_id = a12.agent_apptmnt_id
and a11.prod_id = a13.prod_id
and a12.distrbtn_chnl_id = a14.distrbtn_chnl_id
and a11.month_id = a15.month_id
and a13.prod_type_id = a16.prod_type_id
and a11.month_id in (200207, 200307)
and a13.prod_type_id in (0, 2, 1)
and a13.prod_line_id = 1
and a11.agrmt_status_id in 1
group by a12.distrbtn_chnl_id
,a13.prod_type_id
,a11.month_id;
set autotrace off
spool off

DISTRBTN_CHNL_ID DISTRBTN_CHNL PROD_TYPE_ID PROD_TYP MONTH_ID MONTH_ YEAR_ID OFACTIVECONT
---------------- ---------------------------------- ------------ -------- ---------- ------ ---------- ------------
1 INSURANCE PRODUCERS/PPAS 1 FIXED 200307 July 2003 1
2 INVESTMENT DEALER 1 FIXED 200307 July 2003 14409
2 INVESTMENT DEALER 2 VARIABLE 200307 July 2003 474135
3 REGIONAL WIREHOUSE 1 FIXED 200307 July 2003 5916
3 REGIONAL WIREHOUSE 2 VARIABLE 200307 July 2003 116422
4 FINANCIAL INSTITUTIONS 1 FIXED 200307 July 2003 155980
4 FINANCIAL INSTITUTIONS 2 VARIABLE 200307 July 2003 203507
5 NATIONWIDE RETIREMENT SOLUTIONS 2 VARIABLE 200307 July 2003 73
6 NATIONWIDE AGENCY 1 FIXED 200307 July 2003 9253
6 NATIONWIDE AGENCY 2 VARIABLE 200307 July 2003 81803

10 rows selected.


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=12732 Card=4 Bytes=400)

1 0 SORT (GROUP BY) (Cost=12732 Card=4 Bytes=400)
2 1 HASH JOIN (Cost=12706 Card=3195 Bytes=319500)
3 2 TABLE ACCESS (FULL) OF 'DISTRIBUTION_CHANNELS' (Cost=1 Card=12 Bytes=288)

4 2 HASH JOIN (Cost=12704 Card=3195 Bytes=242820)
5 4 TABLE ACCESS (FULL) OF 'PRODUCT_TYPES' (Cost=1 Card=3 Bytes=42)

6 4 HASH JOIN (Cost=12702 Card=3195 Bytes=198090)
7 6 HASH JOIN (Cost=12072 Card=3195 Bytes=169335)
8 7 TABLE ACCESS (FULL) OF 'LOOKUP_MONTHS' (Cost=1 Card=2 Bytes=34)

9 7 NESTED LOOPS (Cost=12070 Card=49522 Bytes=1782792)

10 9 TABLE ACCESS (FULL) OF 'PRODUCTS' (Cost=2 Card=49 Bytes=539)

11 9 PARTITION RANGE (INLIST)
12 11 TABLE ACCESS (BY LOCAL INDEX ROWID) OF 'AGREEMENT_MAKEUP_SUMMARY' (Cost=12070 Card=738780 Bytes=18469500)

13 12 BITMAP CONVERSION (TO ROWIDS)
14 13 BITMAP AND
15 14 BITMAP OR
16 15 BITMAP INDEX (SINGLE VALUE) OF 'I_AGRMKSM_MTH_ID'

17 15 BITMAP INDEX (SINGLE VALUE) OF 'I_AGRMKSM_MTH_ID'

18 14 BITMAP INDEX (SINGLE VALUE) OF 'I_AGRMKSM_PROD_ID'

19 14 BITMAP INDEX (SINGLE VALUE) OF 'I_AGRMKSM_AS_ID'

20 6 TABLE ACCESS (FULL) OF 'AGENT_APPOINTMENTS' (Cost=617 Card=917390 Bytes=8256510)

Statistics
----------------------------------------------------------
1250 recursive calls
70 db block gets
418488 consistent gets
206269 physical reads
0 redo size
1536 bytes sent via SQL*Net to client
426 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
515 sorts (memory)
1 sorts (disk)
10 rows processed


col distrbtn_chnl_id format 9
col distrbtn_chnl format a34
col prod_type format a8
set pagesize 50000
set linesize 150
spool distribution2.rlt
set autotrace on
select a12.distrbtn_chnl_id
,max(a14.distrbtn_chnl) distrbtn_chnl
,a13.prod_type_id
,max(a16.prod_type) prod_type
,a11.month_id
,max(a15.month_name) month_name
,max(a15.year_id) year_id
,count(distinct a11.agrmt_id) ofactivecont
from salesmi_dm_owner.agreement_makeup_summary a11
,salesmi_dm_owner.agent_appointments a12
,salesmi_dm_owner.products a13
,salesmi_dm_owner.distribution_channels a14
,salesmi_dm_owner.lookup_months a15
,salesmi_dm_owner.product_types a16
where a11.agent_apptmnt_id = a12.agent_apptmnt_id
and a11.prod_id = a13.prod_id
and a12.distrbtn_chnl_id = a14.distrbtn_chnl_id
and a11.month_id = a15.month_id
and a13.prod_type_id = a16.prod_type_id
and a11.month_id in (200207, 200307)
-- and a13.prod_type_id in (0, 2, 1)
and a13.prod_line_id = 1
and a11.agrmt_status_id in 1
group by a12.distrbtn_chnl_id
,a13.prod_type_id
,a11.month_id;
set autotrace off
spool off

DISTRBTN_CHNL_ID DISTRBTN_CHNL PROD_TYPE_ID PROD_TYP MONTH_ID MONTH_NAME YEAR_ID OFACTIVECONT
---------------- ---------------------------------- ------------ -------- ---------- -------------------- ---------- ------------
1 INSURANCE PRODUCERS/PPAS 1 FIXED 200307 July 2003 1
2 INVESTMENT DEALER 1 FIXED 200207 July 2002 12815
2 INVESTMENT DEALER 1 FIXED 200307 July 2003 13188
2 INVESTMENT DEALER 2 VARIABLE 200207 July 2002 441328
2 INVESTMENT DEALER 2 VARIABLE 200307 July 2003 419677
3 REGIONAL WIREHOUSE 1 FIXED 200207 July 2002 3195
3 REGIONAL WIREHOUSE 1 FIXED 200307 July 2003 5674
3 REGIONAL WIREHOUSE 2 VARIABLE 200207 July 2002 94558
3 REGIONAL WIREHOUSE 2 VARIABLE 200307 July 2003 106635
4 FINANCIAL INSTITUTIONS 1 FIXED 200207 July 2002 116803
4 FINANCIAL INSTITUTIONS 1 FIXED 200307 July 2003 149198
4 FINANCIAL INSTITUTIONS 2 VARIABLE 200207 July 2002 175527
4 FINANCIAL INSTITUTIONS 2 VARIABLE 200307 July 2003 185028
5 NATIONWIDE RETIREMENT SOLUTIONS 2 VARIABLE 200207 July 2002 62
5 NATIONWIDE RETIREMENT SOLUTIONS 2 VARIABLE 200307 July 2003 34
6 NATIONWIDE AGENCY 1 FIXED 200207 July 2002 6362
6 NATIONWIDE AGENCY 1 FIXED 200307 July 2003 8938
6 NATIONWIDE AGENCY 2 VARIABLE 200207 July 2002 76470
6 NATIONWIDE AGENCY 2 VARIABLE 200307 July 2003 74561

19 rows selected.

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=13088 Card=12 Bytes=1200)

1 0 SORT (GROUP BY) (Cost=13088 Card=12 Bytes=1200)
2 1 HASH JOIN (Cost=13053 Card=4499 Bytes=449900)
3 2 TABLE ACCESS (FULL) OF 'DISTRIBUTION_CHANNELS' (Cost=1 Card=12 Bytes=288)

4 2 HASH JOIN (Cost=13051 Card=4499 Bytes=341924)
5 4 TABLE ACCESS (FULL) OF 'PRODUCT_TYPES' (Cost=1 Card=3 Bytes=42)

6 4 HASH JOIN (Cost=13049 Card=4499 Bytes=278938)
7 6 HASH JOIN (Cost=12414 Card=4499 Bytes=238447)
8 7 TABLE ACCESS (FULL) OF 'LOOKUP_MONTHS' (Cost=1 Card=2 Bytes=34)

9 7 HASH JOIN (Cost=12412 Card=69734 Bytes=2510424)
10 9 TABLE ACCESS (FULL) OF 'PRODUCTS' (Cost=2 Card=69 Bytes=759)

11 9 PARTITION RANGE (INLIST)
12 11 TABLE ACCESS (FULL) OF 'AGREEMENT_MAKEUP_SUMMARY' (Cost=12408 Card=738780 Bytes=18469500)

13 6 TABLE ACCESS (FULL) OF 'AGENT_APPOINTMENTS' (Cost=617 Card=917037 Bytes=8253333)

Statistics
----------------------------------------------------------
0 recursive calls
92 db block gets
58261 consistent gets
65658 physical reads
0 redo size
2446 bytes sent via SQL*Net to client
537 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
19 sorts (memory)
1 sorts (disk)
19 rows processed

 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top