Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chriss Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

query help - i'm an idiot 3

Status
Not open for further replies.

cLFlaVA

Programmer
Jun 14, 2004
6,450
US
hi people. i am a moron and apparently know nothing about SQL, despite what my employers may think.

i have the query below that is returning values that are sometimes 4x more than they should be. i thought i was handling for this by grouping everything. any idea what i'm doing wrong? i know i've run into this before and have just worked around it... now i'd like to know why it's happening.

Code:
select p.sub_id
     , p.sub_name
     , sum(pyb.amount) py_budget
     , sum(cyb.amount) cy_budget
     , sum(pya.amount) py_actual
     , sum(cya.amount) cy_actual
     , sum(aya.amount) ay_actual
     , sum(cyf.amount) cy_fcst
     , sum(ayf.amount) ay_fcst
  from sub_project p left join drilldown_approved pyb
    on (p.sub_id = pyb.sub_id and pyb.cost_budget_year = 2006
) left join drilldown_approved cyb
    on (p.sub_id = cyb.sub_id and cyb.cost_budget_year = 2007
) left join drilldown_actual pya
    on (p.sub_id = pya.sub_id and pya.cost_budget_year = 2006
) left join drilldown_actual cya
    on (p.sub_id = cya.sub_id and cya.cost_budget_year = 2007
) left join drilldown_actual aya
    on (p.sub_id = aya.sub_id
) left join drilldown_eac cyf
    on (p.sub_id = cyf.sub_id and cyf.cost_budget_year = 2007
) left join drilldown_eac ayf
    on (p.sub_id = ayf.sub_id)
 where p.program_id = 818
group by p.sub_id, p.sub_name

feel free to mock my code and offer insight as to how to improve...

thanks :)



*cLFlaVA
----------------------------
[tt]"quote goes here"[/tt]
[URL unfurl="true"]http://www.coryarthus.com/[/url]
 
Are you sure you posted in the right forum?

Christiaan Baes
Belgium

"In a system where you can define a factor as part of a third factor, you need another layer to check the main layer in case the second layer is not the base unit." - jrbarnett
 
My guess is that you have duplicates on the left side of the join. Are you positive that the sub_id and sub_name combinations are unique?

You might try this if you aren't sure:

Code:
[COLOR=blue]select[/color] p.sub_id
     , p.sub_name
     , sum(pyb.amount) py_budget
     , sum(cyb.amount) cy_budget
     , sum(pya.amount) py_actual
     , sum(cya.amount) cy_actual
     , sum(aya.amount) ay_actual
     , sum(cyf.amount) cy_fcst
     , sum(ayf.amount) ay_fcst
  [COLOR=blue]from[/color] (
[COLOR=blue]select[/color] [COLOR=#FF00FF]distinct[/color] sub_id
	,sub_name
[COLOR=blue]from[/color] sub_project
) p [COLOR=#FF00FF]left[/color] [COLOR=blue]join[/color] drilldown_approved pyb
    [COLOR=blue]on[/color] (p.sub_id = pyb.sub_id and pyb.cost_budget_year = 2006
) [COLOR=#FF00FF]left[/color] [COLOR=blue]join[/color] drilldown_approved cyb
    [COLOR=blue]on[/color] (p.sub_id = cyb.sub_id and cyb.cost_budget_year = 2007
) [COLOR=#FF00FF]left[/color] [COLOR=blue]join[/color] drilldown_actual pya
    [COLOR=blue]on[/color] (p.sub_id = pya.sub_id and pya.cost_budget_year = 2006
) [COLOR=#FF00FF]left[/color] [COLOR=blue]join[/color] drilldown_actual cya
    [COLOR=blue]on[/color] (p.sub_id = cya.sub_id and cya.cost_budget_year = 2007
) [COLOR=#FF00FF]left[/color] [COLOR=blue]join[/color] drilldown_actual aya
    [COLOR=blue]on[/color] (p.sub_id = aya.sub_id
) [COLOR=#FF00FF]left[/color] [COLOR=blue]join[/color] drilldown_eac cyf
    [COLOR=blue]on[/color] (p.sub_id = cyf.sub_id and cyf.cost_budget_year = 2007
) [COLOR=#FF00FF]left[/color] [COLOR=blue]join[/color] drilldown_eac ayf
    [COLOR=blue]on[/color] (p.sub_id = ayf.sub_id)
 [COLOR=blue]where[/color] p.program_id = 818
[COLOR=blue]group[/color] [COLOR=blue]by[/color] p.sub_id, p.sub_name

Hope it helps,

Alex

[small]----signature below----[/small]
I don't do any programming whatsoever

Ignorance of certain subjects is a great part of wisdom
 
without data this is very diffcult to guess, but my guess is that you need to supply date ranges for the other joins

left join drilldown_eac cyf
on (p.sub_id = cyf.sub_id and cyf.cost_budget_year = 2007
) left join drilldown_eac ayf
on (p.sub_id = ayf.sub_id) --*** where is the 2006 AND condition here?????

Denis The SQL Menace
--------------------
SQL Server Code,Tips and Tricks, Performance Tuning
SQLBlog.com, Google Interview Questions
 
alex,

i am positive that the sub_id and sub_name values are unique in the sub_project table.

denis,

sorry, i know it's a pain without data. the year condition does not exist because i'd like a sum of all years data there (ayf = all year forecast).

i do know for a fact that the joined tables have multiple rows, but i thought the sum() call would handle for that.

thanks, guys, for your help.



*cLFlaVA
----------------------------
[tt]"quote goes here"[/tt]
[URL unfurl="true"]http://www.coryarthus.com/[/url]
 
What I would do is run the data without the sums sorted by sub_id and sub_name (and with the complete data for each table). If there are too many records for this, I would find a small subset of ids that you know the sums are wrong for and additonally filter onthis. Once you see the complete data, you will likely see what is being duplicated.

"NOTHING is more important in a database than integrity." ESquared
 
SQLSister,

That's just it though - I know there are duplicates.

Basically, for each record in sub_project (let's say there are three after my selection criteria), I want to retrieve:

- Last Year's budget
- Current Year budget
(accessible in the same view, drilldown_approved)

- Last Year's actual costs
- Current Year's actual costs
- All Year actual costs
(accessible in the same view, drilldown_actual)

- Current Year forecast
- All Year forecast
(accesible in the same view, drilldown_eac)

Each of these views may have several cost records for each project. I assumed the SUM() would return one single row for each. Obviously I am missing something.

Do I need to do this with several subqueries? How do you all do this? Using application coding? I like to try to force the DB to do as much as possible...

thanks again everyone



*cLFlaVA
----------------------------
[tt]"quote goes here"[/tt]
[URL unfurl="true"]http://www.coryarthus.com/[/url]
 
>>> I assumed the SUM() would return one single row for each.

Just to be clear, is your problem inflated sums or too many rows?

Also, (another pin the tail on the donkey) do you need to filter by program_id on the right side of your joins? I may just be extra thick today, but is the sub_id unique per program or across the board?

[small]----signature below----[/small]
I don't do any programming whatsoever

Ignorance of certain subjects is a great part of wisdom
 
there are a handful of sub_ids per program_id. specifying a program id as in the example above returns 3 "subs" (projects). i would then want the data related to those three projects, in columnar form.

> Just to be clear, is your problem inflated sums or too many rows?

inflated sums, perhaps the result of too many rows :(



*cLFlaVA
----------------------------
[tt]"quote goes here"[/tt]
[URL unfurl="true"]http://www.coryarthus.com/[/url]
 
hm... I am definitely extra thick today. The same sub_id won't show up for > 1 project, right?

[small]----signature below----[/small]
I don't do any programming whatsoever

Ignorance of certain subjects is a great part of wisdom
 
and these are totally unique (no possibility that sub_id 137 could show up under *program* (not project) 818 and 915 for example), right?

Sorry to be so dumb!

Alex

[small]----signature below----[/small]
I don't do any programming whatsoever

Ignorance of certain subjects is a great part of wisdom
 
SQL:
Code:
select sub_id
     , sub_name
     , program_id
  from sub_project
 where program_id = 818

Result:
[tt]1483 Global Contracts Repository 818
1563 Contract File Prototype 818
1577 Contract Repository Foundation 818[/tt]



*cLFlaVA
----------------------------
[tt]"quote goes here"[/tt]
[URL unfurl="true"]http://www.coryarthus.com/[/url]
 
Ok, what other information do you have in your views (besides sub_id and amount). Do you have program_id?

would it be too much to assume that this:

Code:
select sub_id
     , sub_name
     , program_id
  from sub_project
 where sub_id = 1483
will only return one row? I just can't shake the feeling that its' something in you join conditions causing too many rows to be returned.


Did you try what SQL Sister suggested, maybe looking at the raw data for just one of the sub_id's?

[small]----signature below----[/small]
I don't do any programming whatsoever

Ignorance of certain subjects is a great part of wisdom
 
Alex,

me said:
That's just it though - I know there are duplicates.

Maybe I'm just misunderstanding the way this works.

more sample data:

Code:
select *
  from drilldown_actual
 where sub_id = 1563
   and cost_budget_year = 2007

[tt]
sub cat be year sppt amount
---------------------------------------
1563 73 9 2007 NULL 85428.9800
1563 outsrc 9 2007 NULL -8500.0000

[/tt]



*cLFlaVA
----------------------------
[tt]"quote goes here"[/tt]
[URL unfurl="true"]http://www.coryarthus.com/[/url]
 
I just was not clear on whether the numbers were duplicated between programs or not.

Is there a particular column in your result that seems to be giving you trouble or is it across the board?





[small]----signature below----[/small]
I don't do any programming whatsoever

Ignorance of certain subjects is a great part of wisdom
 
see that is the data as it is in one table. Now select the same data when joined to the other tables. The sum would summ all the records which show up after the join.

Example:
table1

table1_id table1_value
1 1000
2 2000

table2

table2_id, table1_id, table2_value
1 1 200
2 1 300
3 2 500

When you join these two tables then you would havea results set of

table2_id, table1_id, table2_value table1_id table1_value
1 1 200 1 1000
2 1 300 1 1000
3 2 500 2 2000

Sums for Table1 value fromthe query would be
4000 while the same sum run on just table 1 would be 3000.

That's why I said you could see why it wsa doubling up things if you ran the full data and not just the sums.


"NOTHING is more important in a database than integrity." ESquared
 
let's assume i have three projects:

[tt]
1483 Global Contracts Repository
1563 Contract File Prototype
1577 Contract Repository Foundation
[/tt]

and i have these two tables:

forecast
[tt]
1483 2007 1000
1483 2007 2000
1483 2008 3000
1483 2008 4000
.
.
.
[/tt]

actuals
[tt]
1483 2007 50
1483 2007 75
1483 2008 80
1483 2008 95
.
.
.
[/tt]

i would expect this as a result set:

[tt]
project cy actual ay actual cy budget ay budget
------------------------------------------------
1483 125 300 3000 10000
[/tt]



*cLFlaVA
----------------------------
[tt]"quote goes here"[/tt]
[URL unfurl="true"]http://www.coryarthus.com/[/url]
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top