Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chriss Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Slow Application - Would like to improve. 4

Status
Not open for further replies.

gpalmer711

IS-IT--Management
May 11, 2001
2,445
GB
Hi Guys and Gals,

This post is a little long winded so I apologise now, however I thought that it was important to give you the full situation.

I am currently writing an application in VB6. My client does a lot of selling via Amazon and wants to also have their own website to sell things from, including their full Amazon catalogue. Currently this stands at 2097 items.

I have developed the website and am working on the way to retrieve the amazon catalogue and upload it to the website. Amazon provide a service that allows you to retrieve details from their database, using this I can retrieve XML documents that contain the catalogue. Unfortunatly Amazon limit this to 10 items per XML file, so currently I have to retrieve 210 of them. Then parse the relevant data from the XML documents into arrays, then upload this to the database. I currently have it all working fine. The problem that I have is the time it is taking to complete the task.

Downloading all 210 XMl files takes only a couple of minutes on a broadband connection which is fine. Parsing all the information from them takes less than a minute, again this is also fine. However getting the data into the data base currently takes over 4 hours.

Now the package that the client is using, and can afford, only allows for Access databases. The access database that I have designed contains all information, such as orders placed, customers, etc... So I only need to update one table.

The way that I thought would be the easiest way of doing it would be create a asp document that used INSERT sql statements to add the records to the database. So on to the actual problem - Thanks for staying with me so far.

I have a form in the VB app that has a richtextbox, I then have the following code that populates the richtextbox. It is this that currently takes 4 hours and 11 minutes.

Code:
Private Function CreateSQL()
Dim strStartTime, strEndTime
strStartTime = Time
Dim intCurRec As Integer
For intCurRec = 0 To intRecordsChecked
rtbSQL.Text = rtbSQL.Text & "INSERT INTO catalogue VALUES (,"
rtbSQL.Text = rtbSQL.Text & "'" & colExchangeID(intCurRec) & "',"
rtbSQL.Text = rtbSQL.Text & "'" & colListingID(intCurRec) & "',"
rtbSQL.Text = rtbSQL.Text & "'" & colASIN(intCurRec) & "',"
rtbSQL.Text = rtbSQL.Text & "'" & colTitle(intCurRec) & "',"
rtbSQL.Text = rtbSQL.Text & "'" & colArtist(intCurRec) & "',"
rtbSQL.Text = rtbSQL.Text & "'" & colPrice(intCurRec) & "',"
rtbSQL.Text = rtbSQL.Text & "'" & colCurrency(intCurRec) & "',"
rtbSQL.Text = rtbSQL.Text & "'" & colStartDate(intCurRec) & "',"
rtbSQL.Text = rtbSQL.Text & "'" & colEndDate(intCurRec) & "',"
rtbSQL.Text = rtbSQL.Text & "'" & colStatus(intCurRec) & "',"
rtbSQL.Text = rtbSQL.Text & colQuantity(intCurRec) & ","
rtbSQL.Text = rtbSQL.Text & colQuantityAllocated(intCurRec) & ","
rtbSQL.Text = rtbSQL.Text & "'" & colCondition(intCurRec) & "',"
rtbSQL.Text = rtbSQL.Text & "'" & colSubCondition(intCurRec) & "',"
rtbSQL.Text = rtbSQL.Text & "'" & colFeaturedCategory(intCurRec) & "',"
rtbSQL.Text = rtbSQL.Text & ");"
rtbSQL.Text = rtbSQL.Text & vbCrLf
Next intCurRec
strEndTime = Time
MsgBox "Start Time = " & strStartTime & " - End Time = " & strEndTime
End Function

All variables are declared in the code, if not above in a Module.

intRecordsChecked currently = 2097 but will change as and when the catalogue changes.

Can anyone see a better/quicker way of doing this? Any input would be appreciated.

Greg Palmer
Freeware Utilities for Windows Administrators.
 
It's all in the strings. Making large strings is ALWAYS slow. Try this code instead.

Code:
Private Function CreateSQL()
    
    Dim strStartTime, strEndTime
    strStartTime = Time
    Dim intCurRec As Integer
    [!]Dim arData() As String
    Dim cTemp As String
    
    ReDim arData(intRecordsChecked)[/!]
    
    For intCurRec = 0 To intRecordsChecked
        
        [!]cTemp[/!] = "INSERT INTO catalogue VALUES (,"
        [!]cTemp = cTemp[/!] & "'" & colExchangeID(intCurRec) & "',"
        cTemp = cTemp & "'" & colListingID(intCurRec) & "',"
        cTemp = cTemp & "'" & colASIN(intCurRec) & "',"
        cTemp = cTemp & "'" & colTitle(intCurRec) & "',"
        cTemp = cTemp & "'" & colArtist(intCurRec) & "',"
        cTemp = cTemp & "'" & colPrice(intCurRec) & "',"
        cTemp = cTemp & "'" & colCurrency(intCurRec) & "',"
        cTemp = cTemp & "'" & colStartDate(intCurRec) & "',"
        cTemp = cTemp & "'" & colEndDate(intCurRec) & "',"
        cTemp = cTemp & "'" & colStatus(intCurRec) & "',"
        cTemp = cTemp & colQuantity(intCurRec) & ","
        cTemp = cTemp & colQuantityAllocated(intCurRec) & ","
        cTemp = cTemp & "'" & colCondition(intCurRec) & "',"
        cTemp = cTemp & "'" & colSubCondition(intCurRec) & "',"
        cTemp = cTemp & "'" & colFeaturedCategory(intCurRec) & "',"
        cTemp = cTemp & ");"
        arData(intCurRec) = cTemp
    Next intCurRec
    [!]rtbsql.Text = Join(arData, vbCrLf)[/!]
    strEndTime = Time
    MsgBox "Start Time = " & strStartTime & " - End Time = " & strEndTime
End Function



-George

Strong and bitter words indicate a weak cause. - Fortune cookie wisdom
 
Holy &^%$%

That is great - it now takes 1 second to do the code that you provided.

Thank you so Much

Regards



Greg Palmer
Freeware Utilities for Windows Administrators.
 
Yep! Arrays! Gotta love em.

The longer a string becomes, the slower it is to add more to it. With the array approach, no single string becomes larger than a single SQL Update statement.

I encourage you to do some reading on arrays. Specifically, look up [!]Split[/!], [!]Join[/!], [!]Redim[/!], and [!]Redim Preserve[/!]

-George

Strong and bitter words indicate a weak cause. - Fortune cookie wisdom
 
Private Function ?
Where is one or more return statements?
 
gmmastros,

Thanks for that I will update what I know about arrays asap.

TipGiver,

The return startements are to be added as I update the application, hence the reason for using a function now instead of a sub


Greg Palmer
Freeware Utilities for Windows Administrators.
 
George,

My first reaction to this was that it's mostly the object access that's creating the bottleneck, especially in light of the fact that there's a behind the scenes updating of the control going on.

(As a general rule, one should avoid an object's properties inside a loop. Rather, use a variable, and update the object after exiting the loop. Objects are much slower to access than variables--heap v. stack allocation, marshaling issues, etc.)

So I wondered, George, what would happen if one removed the array logic, and just kept updating ctemp in the loop. I suspected it would be much closer to the speed of your code than the speed of the OP's. So, I just had to find out! Here's my test code that mimics yours:

Code:
Private Sub Command1_Click()
   
    Dim strStartTime, strEndTime
    strStartTime = Timer
    Dim intCurRec As Integer
    Dim arData() As String
    Dim cTemp As String
    
    ReDim arData(5000)
    
    For intCurRec = 0 To 5000
        
        cTemp = "INSERT INTO catalogue VALUES (,"
        cTemp = cTemp & "'" & "Now " & "',"
        cTemp = cTemp & "'" & "is " & "',"
        cTemp = cTemp & "'" & "the " & " ',"
        cTemp = cTemp & "'" & "time " & " ',"
        cTemp = cTemp & "'" & "for " & "',"
        cTemp = cTemp & "'" & "all " & "',"
        cTemp = cTemp & "'" & "good " & "',"
        cTemp = cTemp & "'" & "people " & "',"
        cTemp = cTemp & "'" & "to " & "',"
        cTemp = cTemp & "'" & "come " & "',"
        cTemp = cTemp & "'" & "the " & ","
        cTemp = cTemp & "'" & "aid " & ","
        cTemp = cTemp & "'" & "of " & ","
        cTemp = cTemp & "'" & "their " & ","
        cTemp = cTemp & "'" & "country. " & ","
        cTemp = cTemp & ");"
        arData(intCurRec) = cTemp
    Next intCurRec
    Debug.Print "Elapsed Time: " & Timer - strStartTime
End Sub
[COLOR=blue][Debug window:] Elapsed Time: 0.078125[/color]

Now, I removed the array logic:
Code:
Private Sub Command2_Click()
    Dim strStartTime, strEndTime
    strStartTime = Timer
    Dim intCurRec As Integer
[COLOR=green]'    Dim arData() As String[/color]
    Dim cTemp As String     
[COLOR=green]'    ReDim arData(10000)[/color]
    
    For intCurRec = 0 To 5000
        
        cTemp = [COLOR=red][b]cTemp & [/b][/color]"INSERT INTO catalogue VALUES (,"
        cTemp = cTemp & "'" & "Now " & "',"
        cTemp = cTemp & "'" & "is " & "',"
        cTemp = cTemp & "'" & "the " & " ',"
        cTemp = cTemp & "'" & "time " & " ',"
        cTemp = cTemp & "'" & "for " & "',"
        cTemp = cTemp & "'" & "all " & "',"
        cTemp = cTemp & "'" & "good " & "',"
        cTemp = cTemp & "'" & "people " & "',"
        cTemp = cTemp & "'" & "to " & "',"
        cTemp = cTemp & "'" & "come " & "',"
        cTemp = cTemp & "'" & "the " & ","
        cTemp = cTemp & "'" & "aid " & ","
        cTemp = cTemp & "'" & "of " & ","
        cTemp = cTemp & "'" & "their " & ","
        cTemp = cTemp & "'" & "country. " & ","
        cTemp = cTemp & ");"
'        arData(intCurRec) = cTemp
    Next intCurRec
    Debug.Print "Elapsed Time: " & Timer - strStartTime

End Sub
[COLOR=blue][Debug window:] Elapsed Time: 379.0508[/color]

I first tried all this with 10000 records, and quit after about 10 minutes at nearly 7000 iterations. Clearly, each iteration of the string assignment takes exponentially longer. If I understand string assignment correctly, this is because each time you reassign a string value, you actually allocate a new block of memory and populate it over again. The bigger the block, the longer it takes to allocate and populate.

Just out of interest, I tried a fixed length string, theorizing that if the memory were already allocated, a lot of the reallocation overhead would go away:
[tt]Dim cTemp As String * 32768[/tt]
and got:
[tt]Elapsed Time: 15.83984[/tt]
This result bears out the theory.

In this particular test, the use of arrays over string allocation is 80 or so times faster, assuming a variable length string, but the removal of object access from the loop is also about 40 times faster. So, I believe we can conclude that both string assignment and object access can create orders of magnitude of performance hits. I'll pay more attention to string allocation in future!

Bob
 
Bob,

You got me to thinking. And there's something fishy going on.

First off, your loop goes from 0 to 5000 so there are 5001 iterations. Then length of the string you are building is 145 characters long. So... the resulting string would be 725,145 characters long. Since I don't have much experience with fixed length strings, I tried...

Dim cTemp As String * 725145

Of course, it wouldn't compile. Turns out...
MSDN said:
A fixed-length string can contain 1 to approximately 64K (2^16) characters

So, I kept going. I then tried
[red]Dim cTemp As String * 65000[/red]
[blue]Elapsed Time: 24.15625[/blue]

I then changed it to
[red]Dim cTemp As String * 32000[/red]
[blue]Elapsed Time: 5.546875[/blue]

The problem I have with fixed length strings is that there is no indication that you have exceeded it's length. No error. It seems to me like this could inject some hard to find bugs in your code.

For example...
Code:
Private Sub Command3_Click()
    
    Dim cTemp As String * 15
    
    cTemp = "Now is the time. blah blah, the rest is lost."
    Debug.Print cTemp
    
End Sub


-George

Strong and bitter words indicate a weak cause. - Fortune cookie wisdom
 
Aha. You're right, that IS fishy. My theory is blown.

I did a little more playing around with string lengths, and find the following: if you put any number over 65535, you get a syntax error in the VB IDE. If you put a number between 65527 and 65535 inclusive, you get "Compile error: Invalid length for fixed-length string" as soon as the variable comes into scope. So, fixed length strings are limited to 65526 characters.

In any case, my main point is that the OP's performance problems had two sources, and your code solved both of them.

bob
 
Bob,

I saw the same thing with the length of the string. Weird.

You make a good point about accessing properties of an object from within a loop. (star for that).

I don't use fixed length string. After this little exercise, I probably won't use them, ever!

-George

Strong and bitter words indicate a weak cause. - Fortune cookie wisdom
 
Well, they're required for some API calls. The logic is exactly that of a data field that has only so many characters it can accept, isn't it? The default behavior is to truncate characters that won't fit.

Thx for the star.

Bob
 
Thanks to both of you for continuing this thread it has given me some more useful tips for the future.

Greg Palmer
Freeware Utilities for Windows Administrators.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top