Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chriss Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Heap Sort

Status
Not open for further replies.

elziko

Programmer
Joined
Nov 7, 2000
Messages
486
Location
GB
Can someone point me towards a good example of a heap sort? Especially one that is well commented?

You see I'm using a bubble sort at the mo at its just too slow!

many thanks


elziko
 
Do you have qb4 ? There is a program there (sortdemo.bas) that demonstrates most of the sorting techniques. I could email to you if you don't have it.
David Paulson


 
Do you have qb4? There is a program (sortdemo.bas) that demonstrates different sorting techniques. I could email if you want.
David Paulson


 
If you could mail it to me I'd be very grateful:

JElzik@eatec
.co.uk

Thanks,

elziko
 
elziko,

be careful what you ask for, you may get it!

Heap sorts are quite fast FOR SOME sorting issues - BUT they are not the best for many issues. Also, if refering to the qb sorts, beware of creating disc files for the heaps. "Way back when" qb was popular, most micros did not have sufficient memory or 'graceful' versions of virtual memory, so many of hte then 'cutting edge' approaches used disc files. Today, that would be a (tragic?) mistake.

As an aside to this thread, it the soting related to your previous inquiry re sorting the strings according to the embeded number?

Also, approx how many items do you need to sort? Is an actual SORT necessary, or could the process be "indexed"?


MichaelRed
redmsp@erols.com

There is never time to do it right but there is always time to do it over
 
At the moment I'm sorting about 1000 items and this number is set to grow. Thats quite a few, I think. I've managed to speed up my bubble sort slightly by quitting out earlier if nor more sorting is necessary but that only knocks about 1/2 a second off a 6 second total.

elziko
 
Mmm, just another thought. Do you think it would be dangerous to do some sort of recursive search on this number of items?
 
?????

6 Seconds for 1K seems like a LONG time. Are you SURE the time is consumed by the SORT?

Not knocking the desire to speed this up, but if it is not the SORT which is the 'bottle neck', you will expend effort in the wrong area! If the process includes 'rotting' the item to be sorted and manipulating the entire entry for each, then much / most of the time may be involved in the other parts of the operation. If, for example, this is like the process I sent you code for recently, it is probable that the majority of the time is spent in 'rotting' the numeric value out of the string, not in the actual sort. Also, note that the code I sent operated on the WHOLE string in individual pieces for each operation. This included NINE different assignments to move an element. Since you only sent the few elements and did not indicate the context, I just responded with the example to do more-or-less EXACTLY what you posted. It was intended only as an example of how to extract the parts so the sorting could be done. Depending on the context, other overall soloutions might (easily) be preferable. For one example, if these fields were in a recordset, two queries would probably be MUCH faster. First query would create a field for the numeric part of the string, using a fnction much like a PART of the function I sent. Second query simply gets all records from the first, sorted on the numeric field created in the first. Of course, I'm foing a 'leap of faith' here that anything including 1K "things" should be put into a recordset to begin with!


MichaelRed
redmsp@erols.com

There is never time to do it right but there is always time to do it over
 
Of course, I'm foing a 'leap of faith' here that anything including 1K "things" should be put into a recordset to begin with!

Then recursive searches become somewhat moot? Just do a self join!




MichaelRed
redmsp@erols.com

There is never time to do it right but there is always time to do it over
 
They aren't in a record set. I'm writing a function that resides in a DLL so I'm just passed an array and must pass one back.

The 'rotting' (!) I do is the way I described in the other thread (converting base 26 to base 10). Without this 'rotting' the process takes about a second less. I think the reason why the sort is slow is because I'm sorting an array with two columns where the column that is used for the sort contains massive numbers. I'm sure a sort of doubles will take longer than one of integers? If this is the case then I suppose that any sort will be slow.
 
Right got it down to 1.67 seconds using a sort that puts the largest item at the bottom of the array and then does the same with the array minus the last item and so on. I dunno what its called but it takes a quarter of the time. Interestingly enough (possibly) its no quicker compiled?!?!

elziko
 
That is called "Bubble Sort". On each pass it moves the highest (or the lowest depending on your compare) to its proper position at the (end/beginning). Re-examining that position is a waste.
i.e. if N + 1 = # of items then you make N compares, followed by N-1,N-2 down to 1 which equals n(N-1)/2 or (N^2 + N) / 2. for 1001 items it comes out to 500,500 compares. If you are sorting strings and swapping it is slooooow.
Try this. Instead of moving items, it sorts the key and produces an array of indexes
Code:
Dim aryI() as long
aryI = SortStrComp(mystrs,TextCompare)
For I = 0 to Ubound(aryI)
    debug.print mystrs(aryI(I))
Next 

Public Function SortStrComp(aryS() As String, Optional ByVal lngCompareType As CompareMethod) As Long()
    ' Sort the Array
    Dim lngN        As Long, _
        lngGAP      As Long, _
        I           As Long, _
        J           As Long, _
        JGap        As Long, _
        lngI        As Long, _
        lngIGap     As Long, _
        lngBound    As Long
    Dim lngSwap     As Long
    Dim aryI()      As Long
       
    lngBound = UBound(aryS)
        
    lngN = UBound(aryS) + 1 ' Get Actual Count including 0
    ReDim aryI(lngBound) As Long
    
    For I = 0 To lngN - 1
        aryI(I) = I
    Next
    
    lngGAP = 1
    Do While (lngGAP < lngN)
        lngGAP = lngGAP * 3 + 1
    Loop
    lngGAP = (lngGAP - 1) \ 3
    
    Do While lngGAP > 0
        For I = lngGAP To lngN - 1
            JGap = I
            J = JGap - lngGAP
            Do While J >= 0
                lngI = aryI(J)
                lngIGap = aryI(JGap)
                                
                Select Case StrComp(aryS(lngI), aryS(lngIGap), lngCompareType)
                Case -1
                    Exit Do
                Case 0
                    If lngI <= lngIGap Then
                            Exit Do
                        End If
                End Select
                                
                aryI(J) = lngIGap
                aryI(JGap) = lngI
                
                JGap = J
                J = J - lngGAP
            Loop
        Next
        If lngGAP <= 1 Then Exit Do
        lngGAP = (lngGAP - 1) \ 3
    Loop
    
    SortStrComp = aryI
    
    Erase aryI
End Function
 
A way to speed up a sort is not to sort it at all. Instead of actually moving the data, have another column of numbers that point to the real values and swap those instead. This can save an awful lot of shuffling about and you only need to re-order it once at the end. Do you see what I mean.

Quicksort is good too. I'm sure I have the algorithm somewhere. Peter Meachem
peter@accuflight.com
 
Had I read it more closely, I would probably have spotted it. Apologies, it is rather late here. Peter Meachem
peter@accuflight.com
 
Is the sort I mentioned really a bubble sort? I thought a bubble sort was just where each value in the array is checked against the previous one and they are swapped if need be. This is then done ubound(array) raised to the power of ubound(array) which takes ages. The sort I mention does a similar thing but its doiing it on sucessively small arrays.

As far as the index swapping methods I dont think I can use it because I'm writing it as part of a DLL. I must accept an array an return a sorted array. I have no control as to what is being passed to or excepted back from my function. If I'm wrong then do tell me!

Well thanks everyone for your input,

elziko
 
Yes, you were doing a &quot;bubble sort&quot;. You were just doing the inefficient one. BTW, it is not the &quot;power&quot; of the UBound its is Ubound^2 (Ubound squared). You just examined each of UBound items, Ubound Times.
Code:
'* Inieffcient form
blnSwap = True
Do While(blnSwap)
    ' On each pass the largest of the remaining
    ' items, &quot;bubbles&quot; to its position
    blnSwap = false
    For I = LBound+1 To Ubound(Ary)
        If Array(I-1) > Ary(I) then
            strHold = Ary(I-1)
            Ary(I-1) = Ary(I)
            Ary(I) = strHold
             blnSwap = true
        End if
    Next
Loop
Code:
'* Twice as fast
blnSwap = True
Do 
    For J = UBound(ary) to 0 Step -1
        ' On each pass the largest of the remaining
        ' items, &quot;bubbles&quot; to its position
        blnswap = false
        For I = LBound+1 To J
            If Array(I-1) > Ary(I) then
                strHold = Ary(I-1)
                Ary(I-1) = Ary(I)
                Ary(I) = strHold
                blnSwap = true
            End if
        Next
        if blnSwap = false then exit do
    Next
Loop
The second method takes (Ubound(ary)^2+Ubound(ary))/2.
If you must rearrange the the actual items and can stand a temporary &quot;doubling&quot; of the array size then change as follows:
Code:
Public Sub SortStrComp(aryS() As String, Optional ByVal lngCompareType As CompareMethod) 
....sort sort sort
Dim  aryW() as string
aryW = aryS
For I = 0 to Ubound(aryS)
    aryS(I) = aryW(ary(I))
Next
End Sub
[code]
You still gain all the speed of not swapping strings.
Swapping strings is extremely slow because of all the memory allocation.
With mine, you get UBound allocations, with string swapping,
only God knows. I put in indexing after enduring string swapping. [URL unfurl="true"]WWW.VBCompare.Com[/URL]
 
You may possibly be interested by F. Balena's article that discusses the pro and cons of different sorting techniques (Bubble, Insert,Quick and Shell). It also provides some rules of thumb where and when to apply them:
_________________________________
In theory, there is no difference between theory and practice. In practice, there is. [attributed to Yogi Berra]
 
Interesting link BUT I believe the example Shell Sort will loop endlessly if there are 0 items to sort.
Code:
    ' distance starts at 0
    Do
        distance = distance * 3 + 1
        'IF numEls = 0 then distance = 1
    Loop Until distance > numEls 
    Do
        ' If distance = 1 then distance = 0 afer following
        distance = distance \ 3
        For index = distance + 1 To numEls
            ..........
        Next
    ' if distance = 0 then endless loop
    Loop Until distance = 1
Eliminate the loop by changing the Loop to
Loop Until distance <= 1
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top