Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations TouchToneTommy on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

EXCEL - sorting 65000 lines by most used number

Status
Not open for further replies.

Jeanie

Technical User
Jun 14, 2000
86
US
I have a query I have run in a database program for claims data. The query has 65000 lines. Each line is for a medical claim tied to a patients ID number. Each ID number could have any number of lines, depending on the number of claims. I would like to sort the data in an EXCEL spreadsheet with the ID numbers with the most number of lines showing first. I do not want to hide any lines. So, for example, if patient 11111 has 20 lines and patient 22222 has 25 lines and patient 33333 has 1 line I want all the lines for patient 22222 to be listed first, then all the lines for 11111 then all the lines for 33333.

I do not know a whole lot about EXCEL - I can do simple formulas and create a spreadsheet...but do not know how to do the advanced stuff (I will be taking a class!).

Any help would be appreciated.

Thanks,

Jeanie
 
You need to add a count of ID to the data. It is an easy formula but will take a while to calculate in excel so you may be better off adding a COUNT of PatientId to your database query 1st

In excel, if the patient IDs are in A1:A65000, your formula would be
=countif(A1:A65000,A1)
Then copy this down to the last row in the dataset
Your sort would then be 1st on the new column and then on patientID

If you do the count in your dbquery, you can just sort by the count and then patientID - much quicker and easier

Also, as a note, excel only has the capacity for 65536 lines of data - If your query is likely to exceed this, your gonna be looking at doing this in the database or cutting down the scope of the query Rgds
Geoff
"Some cause happiness wherever they go; others whenever they go."
-Oscar Wilde
 
Jeanie,

Another option...

I can assist you in setting up the following... Using Excel's database functions, click a macro button that would:

a) extract a unique list of ID numbers to a separate sheet.

b) generate the total number of claims for each ID.

c) sort this list in order of highest to lowest number of claims.

d) permit you to pick any ID from the list, and have those specific ID records extracted to a separate sheet for viewing and/or printing. The print settings would be set automatically. All you would need to do is to use <Conrol> P <Enter> or use the Print icon.

If you'd like help with this option, feel free to email me.

Regards, ...Dale Watson dwatson@bsi.gov.mb.ca
 
I would sort by patient ID, then subtotal (count) by patient ID, then collapse the subtotals to show counts and sort descending. Then you can blow out the detail again. The only thing is I've never subtotaled with so many rows. I'm not sure how long it would take, or even if Excel can handle that many subtotaled rows.
 
Thanks for the advice...I'm not going to be able to try it however, because we discovered that we need to do something entirely different!

Will keep this suggestion, however, for future use.

Thx,

Jeanie
 
I have done similar and always end-up having columns with formulae like this (ID data in D, eg formula in row 4 col X and repeated from start to finish)

=IF($D3=D$4,X3+1,1)

and in colY

=IF($D5=D$4,X5+1,1)

then in Z

=IF(X+Y>2,X+Y-1)

then I use a macro (or manuall sort) that does
1) a sort on D
2) a cut and paste (Values) from Z to AA
3) a sort of (say) D through AA with AA as primary sort and D as second (and maybe date of claim as 3rd)

simple on paper but 6500 will taske a long time but you can switch off recalculate (in macro or manually)

The reason I use three columns is visibility for checking though you can consolidate it all into one column when confident.

WARNING it needs sacrificial rows above the top and below bottom row with data that are not going to affect the result!!!!
 
I just checked with 100 rows and this works (eg row 9) note testing for blank cells

(col X) =IF(D9=&quot;&quot;,&quot;&quot;,IF(D9=D8,X8+1,1))

(col Y) =IF(D9=&quot;&quot;,&quot;&quot;,IF(D9=D10,Y10+1,1))

(col Z) =IF(OR(X9=&quot;&quot;,Y9=&quot;&quot;),&quot;&quot;,X9+Y9-1)


This macro works provided there is a header row with cells X1 to Z1 left blank
However a dummy entry of &quot;z&quot; (lowercase) is essential to mark the end and the formulae in X to Z must continue past the last row by at least one row

Sub Macro1()
Range(&quot;D:Z&quot;).Select
'sort to get the count of repetitions
Selection.Sort Key1:=Range(&quot;D2&quot;),_
Order1:=xlAscending,_
Header:=xlYes, OrderCustom:=1,_
MatchCase:=False,_
Orientation:=xlTopToBottom
Range(&quot;Z:Z&quot;).Copy
Columns(&quot;AA:AA&quot;).Select
' fix the counts in col AA by pasting as values only
Selection.PasteSpecial Paste:=xlValues,_
Operation:=xlNone, SkipBlanks:= _
False, Transpose:=False
With Application ' switch of re-calulations
.Calculation = xlManual
.MaxChange = 0.001
End With
ActiveWorkbook.PrecisionAsDisplayed = False
Columns(&quot;D:AA&quot;).Select
' sort by counts
Selection.Sort Key1:=Range(&quot;AA2&quot;), Order1:=xlDescending, Key2:=Range(&quot;D2&quot;), _
Order2:=xlDescending, Header:=xlGuess, _
OrderCustom:=1, MatchCase:= _
False, Orientation:=xlTopToBottom
' may not be needed
Calculate
' Switch re-calculate back on
With Application
.Calculation = xlAutomatic
.MaxChange = 0.001
End With
ActiveWorkbook.PrecisionAsDisplayed = False
End Sub

Good Luck
 
There is another way that doesn't require a lot of coding.
Create a pivot table to count each patient's ID number into a new worksheet. Then insert a new column against the Patient ID in the original worksheet and insert a VLOOKUP to the pivot table:
Assuming the ID's are in column A, the lookups in B and the
pivot table is on Sheet2 running from A1 to B12345 (or wherever it ends):
(B2)=VLOOKUP(A2,Sheet2!$A$1:$B$12345,2)

You can now simply sort the original sheet on column B descending. Note that the sort will only work if there are no gaps in the main spreadsheet.

Of course to be truthful I would actually do this in Access which would get over problems if you get >65000 lines which as Geoff points out is very close to Excel's limit.

Simon Rouse
 
Uncomfortably close I would say
I know excel has 65536 row capacity but to be honest, If I ever have a worksheet that is looking at getting over even 40000 lines, I tend to bung it into Access. Excel really doesn't cope that well with looooong lists of data Rgds
Geoff
&quot;Some cause happiness wherever they go; others whenever they go.&quot;
-Oscar Wilde
 
OK Jeannie, Geoff and I agree - perhaps you ought to be going on an Access course as well!
My method definitely worked on about 30 IDs but I wouldn't like to say how well it would work with the sort of numbers you are dealing with, it will probably run like a brick!

I may be a bit old fashioned, but I really don't like working with large spreedsheets - it is so easy to make a major change without realising it.
Simon Rouse
 
Thanks for all the options!! I will try it out when I get a chance.

I actually put the same question on an ACCESS page...I know a tiny bit more about ACCESS than I do about advanced EXCEL - and, yes, I do hope to take an ACCESS course as well.

Thanks again!

Jeanie
 
I hope you got simple advice on how to do it in Access, because it's much easier!
 
well yeah - just add a count of ID and sort by it ;-) Rgds
Geoff
&quot;Some cause happiness wherever they go; others whenever they go.&quot;
-Oscar Wilde
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top