VB6/XPath XML Shredder speedup

(OP)
I'm working with a legacy VB6/XPath XML shredder that loads a SQL Server database. It works, but it's really slow, and I'm looking for suggestions to speed it up.

The XML files I'm working with each contain only one set of elements, not multiple sets. For example, by analogy to Microsoft's familiar books.xml sample file, my XML files look like this:

CODE -->

<bookstore>
  <book category="cooking">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
</bookstore> 

My XML files do **not** look like this:

CODE --> XML

<bookstore>
  <book category="cooking">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="children">
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="web">
    <title lang="en">XQuery Kick Start</title>
    <author>James McGovern</author>
    <author>Per Bothner</author>
    <author>Kurt Cagle</author>
    <author>James Linn</author>
    <author>Vaidyanathan Nagarajan</author>
    <year>2003</year>
    <price>49.99</price>
  </book>
  <book category="web" cover="paperback">
    <title lang="en">Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore> 

As the VB6/XPath pseudocode below shows, the shredder currently processes each XML file recursively, using a Select Case structure inside a For Each/Next loop. That works, but it's slow because every element in every XML file is visited. I'm working with hundreds of thousands of XML files, each containing many hundreds of elements, and I'm only interested in about three dozen of those elements. I know the tags that identify the elements I'm interested in; they're always the same. Are there any obvious ways to speed this up? For example, instead of recursively walking the entire XML file, can I somehow extract and parse only the three dozen or so elements I'm interested in? One complication: a few of the elements I'm interested in have an indeterminate number of child nodes, and I need to extract information from every one of those child nodes.

CODE --> VB6

Public Sub shredXML(ByRef Nodes As MSXML2.IXMLDOMNodeList)

Dim xNode As MSXML2.IXMLDOMNode

  For Each xNode In Nodes

    If xNode.nodeType = NODE_ELEMENT Then

      Select Case xNode.nodeName

        Case "element1"
          'extract stuff from element1 & load into database
        Case "element2"
          'extract stuff from element2 & load into database
        Case "element3"
          'extract stuff from element3 & load into database
        '...
        Case "elementN"
          'extract stuff from elementN & load into database

      End Select

    End If

    If xNode.hasChildNodes Then
      shredXML xNode.childNodes   'recurse into the child nodes
    End If

  Next xNode

End Sub 

RE: VB6/XPath XML Shredder speedup

BRW1,

You can extract every element1, element2, ..., elementN that occurs in a single document, at any depth, with one selectNodes method call per element name.

CODE --> Pseudo

elements = Array("element1", "element2", "element3", "elementN")

for each element in elements

  for each node in xmldoc.selectNodes("//" + element)

   extractAndLoad(node)

  end for each

end for each 

Whether this makes your extraction more efficient will depend on the distribution of the elements in the documents, but the DOM can be expected to optimize node selection far better than you can optimize any hand-written traversal of the node tree.
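
The pseudocode above might translate to VB6 roughly as follows. This is only a sketch under stated assumptions: the project references "Microsoft XML, v6.0", the element names are the placeholders used in this thread, and ShredFile and ExtractAndLoad are hypothetical names (ExtractAndLoad stands in for the existing extract-and-load code). The inner loop shows one possible way to visit every child element of a matched node, for the elements that have an indeterminate number of children.

```vb
'Sketch only. Assumes a project reference to "Microsoft XML, v6.0".
'ShredFile and ExtractAndLoad are hypothetical names, not from the original code.
Option Explicit

Public Sub ShredFile(ByVal xmlPath As String)

  Dim xmlDoc As MSXML2.DOMDocument60
  Dim wanted As Variant
  Dim elementName As Variant
  Dim xNode As MSXML2.IXMLDOMNode
  Dim xChild As MSXML2.IXMLDOMNode

  Set xmlDoc = New MSXML2.DOMDocument60
  xmlDoc.async = False             'load synchronously
  xmlDoc.validateOnParse = False   'do not validate against a DTD/schema

  If Not xmlDoc.Load(xmlPath) Then
    Err.Raise vbObjectError + 1, "ShredFile", xmlDoc.parseError.reason
  End If

  'Placeholder names; substitute the three dozen or so real tag names.
  wanted = Array("element1", "element2", "element3", "elementN")

  For Each elementName In wanted
    'Let the DOM find every occurrence of this element at any depth.
    For Each xNode In xmlDoc.selectNodes("//" & elementName)
      ExtractAndLoad xNode
      'For elements with a variable number of children,
      'visit each child element in turn.
      For Each xChild In xNode.selectNodes("*")
        ExtractAndLoad xChild
      Next xChild
    Next xNode
  Next elementName

End Sub

'Hypothetical stand-in for the existing per-element database code.
Private Sub ExtractAndLoad(ByVal xNode As MSXML2.IXMLDOMNode)
  Debug.Print xNode.nodeName & ": " & xNode.Text
End Sub
```

Two caveats. The whole file is still parsed when it is loaded; what this avoids is visiting every node from VB code. Also, if the files declare a default namespace, unprefixed names such as "//element1" will not match under XPath unless the SelectionNamespaces property is set and prefixed names are used in the query.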

RE: VB6/XPath XML Shredder speedup

(OP)
atlopes: Thanks very much. I'll pursue your suggestion and post back if (make that when) I get it working. Will probably take me several weeks.
