CONDITIONAL SELECT CASE WHEN 2

kjv1611 · Dec 17, 2009

I'm not terribly sure what I should do to combat this situation, and hoped I could find an answer here.

I've got this type of SELECT QUERY already using a CASE WHEN construction for a basic summary query. It works GREAT for everything else, but I need to use a different construction for this, or else add something to the CASE WHEN statment, I suppose, in this scnerio.

Here's the SQL Code:

Code:

SELECT
	SUM(CASE WHEN CurrBal / OrigBal <= .1 THEN 1 ELSE 0 END) AS [<11%]
	,SUM(CASE WHEN CurrBal / OrigBal BETWEEN .1 AND .2 THEN 1 ELSE 0 END) AS [11% - 20%]
	,SUM(CASE WHEN CurrBal / OrigBal BETWEEN .21 AND .3 THEN 1 ELSE 0 END) AS [21% - 30%]
	,SUM(CASE WHEN CurrBal / OrigBal BETWEEN .31 AND .4 THEN 1 ELSE 0 END) AS [31% - 40%]
	,SUM(CASE WHEN CurrBal / OrigBal BETWEEN .41 AND .5 THEN 1 ELSE 0 END) AS [41% - 50%]
	,SUM(CASE WHEN CurrBal / OrigBal > .51 THEN 1 ELSE 0 END) AS [>51%]
FROM	MyTable

The problem is when there are Original Balances (OrigBal) that are $0. That's throwing me the following error:

SQL said:
Msg 8134.... Divide by zero error encountered.

Is there any quick, simple, easy way to combat this? Do I need to somehow strip the $0 origBal records out to begin with? Any thoughts/suggestions?

--

"If to err is human, then I must be some kind of human!" -Me

markros · Dec 17, 2009

Change all code /OrigBal to /NULLIF(OrigBal,0)

Code:

SELECT    COALESCE(SUM(CASE WHEN CurrBal / NULLIF(OrigBal,0) <= .1 THEN 1 ELSE 0 END),0) AS [<11%] 
etc.

kjv1611 · Dec 17, 2009

Hmm, I like that method. I did end up coming up with a different method right after I typed the question, and it seems to work, but I'll compare the results to see if they come up the same.

Here's the other method I used:

Code:

SELECT
	SUM(CASE WHEN CurrBal / OrigBal <= .1 THEN 1 ELSE 0 END) AS [<11%]
	,SUM(CASE WHEN CurrBal / OrigBal BETWEEN .1 AND .2 THEN 1 ELSE 0 END) AS [11% - 20%]
	,SUM(CASE WHEN CurrBal / OrigBal BETWEEN .21 AND .3 THEN 1 ELSE 0 END) AS [21% - 30%]
	,SUM(CASE WHEN CurrBal / OrigBal BETWEEN .31 AND .4 THEN 1 ELSE 0 END) AS [31% - 40%]
	,SUM(CASE WHEN CurrBal / OrigBal BETWEEN .41 AND .5 THEN 1 ELSE 0 END) AS [41% - 50%]
	,SUM(CASE WHEN CurrBal / OrigBal > .51 THEN 1 ELSE 0 END) AS [>51%]
FROM	(
			SELECT	CurrBal, OrigBal
			FROM	MyTable
			WHERE	OrigBal > 0
		)

But regardless of whichever I stick with, that'll be good to keep in mind for any future uses. I don't even think I've ever used nor seen the COALESCE argument. I'll look into that one further..

--

"If to err is human, then I must be some kind of human!" -Me

gmmastros · Dec 17, 2009

Where OrigBal > 0

-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom

markros · Dec 17, 2009

IMHO, you don't need to use derived table here and using WHERE condition directly should work as well.

kjv1611 · Dec 17, 2009

Hmm, okay, a 3rd method. I think now I'm going to have to ask to see which one of these should be the most accurate. Basically, if the original balance was 0, then I don't really want it in the results. I'll try gmmastros' idea in a moment as well.

For the first 2, I got these results:
Embedded SELECT statement:
<11% 11% - 20% 21% - 30% 31% - 40% 41% - 50% >51%
319513 38269 46687 60326 84095 286537

Using COALESCE:
<11% 11% - 20% 21% - 30% 31% - 40% 41% - 50% >51%
320313 38399 46701 60334 84103 299641

Note the differences.... any ideas why the differences?

I'll try the other method, and post back once again...

--

"If to err is human, then I must be some kind of human!" -Me

kjv1611 · Dec 17, 2009

The WHERE Clause took much longer than the other 2. It may just be coincidence, since this is just from one test, and there are at least a few hundred different people who could be running a query against the same source database at any given moment, not to mention a few "system" processes..

Anyhow, here are now the results from all 3:

Embedded SELECT statement - or Derived Table (I'll get the terminology right one of these days):
<11% 11% - 20% 21% - 30% 31% - 40% 41% - 50% >51%
319513 38269 46687 60326 84095 286537

Using Coalesce:
<11% 11% - 20% 21% - 30% 31% - 40% 41% - 50% >51%
320313 38399 46701 60334 84103 299641

Using original query just adding the WHERE Clause:
<11% 11% - 20% 21% - 30% 31% - 40% 41% - 50% >51%
319513 38269 46687 60326 84095 286537

The WHERE clause lines up with the Derived Table method, but also runs slowest of the 3. Why does the Coalesce query give different results than the other two?

--

"If to err is human, then I must be some kind of human!" -Me

kjv1611 · Dec 17, 2009

Oh, the time on the WHERE Clause method was right at 3 minutes, in case that helps any... 3 minutes vs 3 seconds for the other 2.

--

"If to err is human, then I must be some kind of human!" -Me

markros · Dec 17, 2009

Hmm, a bit puzzling why do you have difference in results - may be you have negative balances? If instead of > 0 you would use <>0 what will you get?

You can post the query plan using

SET STATISTICS SHOWPLAN ON command...

gmmastros · Dec 17, 2009

The COALESCE is not the important part. It's the NULLIF that is preventing the division by zero error.

For example:

Select 200/0 -- division by zero
Select 200/NULL -- no error, you get NULL for output

The NULLIF function returns NULL if the first parameter is the same as the 2nd parameter.

Anyway... There are 2 ways you could speed this up:

1. Create a covering index.

It looks like there's about a million rows in this table. I suspect there's a lot more data in this table other than the 2 columns mentioned (CurrBal & OrigBal). If you create an index that has only these two columns, then the query engine would be able to use the index for all the data it needs, without having to go to the actual table data to get it. This will reduce the number of reads tremendously, and will therefore speed up the query.

[tt][blue]
Create Index idx_Table_Name_CurrBal_OrigBal On TableName(CurrBal, OrigBal)
[/blue][/tt]

2. Create a computed column.

You could create a computed column in the table that has CurrBal/NullIf(OrigBal, 0). You could then create an index on the computed column. In this case, you would only have one value in the index, so your reads would be even better.

[tt][blue]
Alter Table TableName Add PercentBal As CurrBal/NullIf(OrigBal)

Create Index idx_TableName_PercentBal On TableName(PercentBal)
[/blue][/tt]

If you decide to go with the computed column, your final query would look like this:

Code:

SELECT
    Count(CASE WHEN PercentBal <= .1 THEN 1 END) AS [<11%]
    ,Count(CASE WHEN PercentBal BETWEEN .1 AND .2 THEN 1 END) AS [11% - 20%]
    ,Count(CASE WHEN PercentBal BETWEEN .21 AND .3 THEN 1 END) AS [21% - 30%]
    ,Count(CASE WHEN PercentBal BETWEEN .31 AND .4 THEN 1 END) AS [31% - 40%]
    ,Count(CASE WHEN PercentBal BETWEEN .41 AND .5 THEN 1 END) AS [41% - 50%]
    ,Count(CASE WHEN PercentBal > .51 THEN 1 END) AS [>51%]
FROM   TableName

-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom

gmmastros · Dec 17, 2009

Because I mentioned indexes, let me explain a little more about that.

In SQL Server, data is stored in 8K pages on the disk. If you have a wide table (lots of columns with lots of data), you will not get very many rows in to an 8K page. When data is read from, or written to disk, it's always done a page at a time. So, reducing the size of a row will increase the number of rows per page.

When you create an index, the same sort of thing occurs. All of the columns included in an index are written to disk, but only those columns. In this case, your index will be very narrow with only a couple numbers in each row. This means there will be a lot less pages to read and your performance should increase.

Make sense?

-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom

markros · Dec 17, 2009

George,

The explanations are great, but how would you explain the results discrepancy between two methods?

kjv1611 · Dec 17, 2009

Actually that makes a lot of sense, gmmastros. Frankly, I've sort of had some understanding of using the indexes to speed things along, but talking about the 8K pages part (something I don't recall learning before) makes it make MUCH more sense, now. And it is true - there are 25 columns in the table. Thankfully, though, this isn't a normal (at least for now) series of queries I'm working on. It's just something we're looking into more on a one-off. It could end up being recurring, though, so I'm always looking to better the performance of my SQL queries. Not only does it make me personally happy, but it helps keep my name out off the "black list" of the database administrators. [wink]

--

"If to err is human, then I must be some kind of human!" -Me

kjv1611 · Dec 17, 2009

As for the negative balances, I ran this SQL, and found the following results:

Code:

CREATE INDEX OrigBal_IXN ON MyTable(OrigBal)

SELECT COUNT(1) AS [Accts with Neg Orig Bal]
FROM MyTable
WHERE	OrigBal < 0

DROP INDEX OrigBal_IXN ON MyTable

[GREEN]--RESULTS - 13,951 of Total 888,531 Records[/GREEN]

--

"If to err is human, then I must be some kind of human!" -Me

markros · Dec 17, 2009

Well, it explains the discrepancy then, since you have negative balances and you were excluding them in your where condition.

Use where OrigBalance <>0 and the results from both methods should match.

kjv1611 · Dec 17, 2009

That definitely explains the discrepancies. I tried it with the <>0, and found the same results.

So, now I just need to find out whether or not I need to discard the $0 and negative balances... I'm thinking yes, but I'm going to make sure. Either way, this has been a good learning exercise. [smarty]

--

"If to err is human, then I must be some kind of human!" -Me

kjv1611 · Dec 17, 2009

Yeah, as I suspected, we're better off just not even looking at the $0 and below accounts. So, all this learning, but didn't really help the task at hand. I find that often happens to me, and then a short time later, I actually DO need what I learned... just a different situation. [smile]

Thanks again to you both.

--

"If to err is human, then I must be some kind of human!" -Me

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

CONDITIONAL SELECT CASE WHEN 2

kjv1611

Active member

markros

Programmer

kjv1611

Active member

gmmastros

Programmer

markros

Programmer

kjv1611

Active member

kjv1611

Active member

kjv1611

Active member

markros

Programmer

gmmastros

Programmer

gmmastros

Programmer

markros

Programmer

kjv1611

Active member

kjv1611

Active member

markros

Programmer

kjv1611

Active member

kjv1611

Active member

Similar threads

Part and Inventory Search

Sponsor