Best/easiest way for perl to parse xml structure?

Status
Not open for further replies.

BobMCT

IS-IT--Management
Joined
Sep 11, 2000
Messages
756
Location
US
I've seen several Perl modules that can do this, but short of installing and trying out each one, has anyone had the opportunity to find what they believe is the "ONE" that does it best?

Please share your experiences and opinions.

Thanks
 
In my experience, it's pretty difficult to deal with XML using Perl.

XML::Simple is good for really simple stuff:

Code:
<servers>
   <server name="upsilon">
      <address>168.192.0.1</address>
      <port>80</port>
   </server>
   <server name="epsilon">
      <address>168.192.0.2</address>
      <port>80</port>
   </server>
</servers>

XML::Simple would take that and turn it into a data structure like this:

Code:
$servers = {
   server => {
      upsilon => {
         address => '168.192.0.1',
         port => '80',
      },
      epsilon => {
         address => '168.192.0.2',
         port => '80',
      },
   },
};
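For reference, here is a minimal sketch of reading that same XML with XML::Simple and walking the result. The KeyAttr and ForceArray options are my own choice for the example (they make the shape predictable even with a single <server>), not something the module requires:

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

my $xml = <<'XML';
<servers>
   <server name="upsilon">
      <address>168.192.0.1</address>
      <port>80</port>
   </server>
   <server name="epsilon">
      <address>168.192.0.2</address>
      <port>80</port>
   </server>
</servers>
XML

# KeyAttr    folds the list of <server> elements into a hash keyed on "name".
# ForceArray keeps the result the same shape when only one <server> exists.
my $servers = XMLin($xml, KeyAttr => ['name'], ForceArray => ['server']);

# Hash keys come back in no particular order, so sort them for stable output.
for my $name (sort keys %{ $servers->{server} }) {
    my $s = $servers->{server}{$name};
    print "$name => $s->{address}:$s->{port}\n";
}
```

Note that the original document order of the two servers is already gone at this point, which is fine for config-style data like this.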

However... when you get embedded tags, like in HTML, XML::Simple fails:
Code:
<html>
<body>
This sentence has a <b>bold</b> word.
</body>
</html>

returns something like...
Code:
$html = {
   body => [
      "This sentence has a ",
      b => [
        "bold",
      ],
      " word",
   ],
};

Or something along those lines (if you actually run XML::Simple on an HTML file and dump the output, it's kind of like that). Long story short, it breaks your tags apart in a way that makes it impossible to do anything with them: Perl hashes are unordered, so there's no way to tell where that particular bold word actually sat in the XML.

And then XML::Parser is a little bit more powerful, but then you get into the conundrum of... "you can have an <i> inside of a <b>, or a <b> inside of an <i>, but those are two totally different things your program has to check for."
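One way around the nesting problem is probably to stop checking for specific combinations and recurse instead. Below is a rough sketch using XML::Parser's Tree style on the HTML snippet from earlier. The walk() helper and its path/output arguments are made up for this example; only the Style => 'Tree' layout comes from the module:

```perl
use strict;
use warnings;
use XML::Parser;

my $xml = <<'XML';
<html>
<body>
This sentence has a <b>bold</b> word.
</body>
</html>
XML

# Tree style returns [ $tag, $content ], where $content is
# [ \%attrs, $child_tag, $child_content, ... ] and text uses the pseudo-tag "0".
my $tree = XML::Parser->new(Style => 'Tree')->parse($xml);

# Hypothetical helper: collect non-blank text chunks in document order,
# each paired with the chain of tags that encloses it.
sub walk {
    my ($tag, $content, $path, $out) = @_;
    my ($attrs, @kids) = @$content;
    my @here = (@$path, $tag);
    while (@kids) {
        my ($name, $value) = splice(@kids, 0, 2);
        if ($name eq '0') {
            (my $text = $value) =~ s/^\s+|\s+$//g;
            push @$out, [ join('/', @here), $text ] if length $text;
        }
        else {
            # Same code path whether this is <b> in <i> or <i> in <b>.
            walk($name, $value, \@here, $out);
        }
    }
    return $out;
}

my $chunks = walk(@$tree, [], []);
print "$_->[0]: $_->[1]\n" for @$chunks;
```

Because the children live in an array rather than a hash, the position of the bold word relative to the surrounding text is preserved.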

I wonder if anybody has any better luck with XML parsing than me?

-------------
Cuvou.com | The NEW Kirsle.net
 
Kirsle, a former co-worker of mine once recommended XML::DOM over XML::Simple after running into problems with XML::Simple. I've never had a need for XML::DOM so I've never tried it, but you might want to check it out.
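For anyone curious what that looks like, here is a small untuned sketch of XML::DOM reading the servers example from earlier in the thread. The @found array is just something I made up to hold the results:

```perl
use strict;
use warnings;
use XML::DOM;

my $xml = <<'XML';
<servers>
   <server name="upsilon"><address>168.192.0.1</address><port>80</port></server>
   <server name="epsilon"><address>168.192.0.2</address><port>80</port></server>
</servers>
XML

my $doc = XML::DOM::Parser->new->parse($xml);

# Walk the <server> elements in document order and pull out name + address.
my @found;
my $servers = $doc->getElementsByTagName('server');
for my $i (0 .. $servers->getLength - 1) {
    my $server  = $servers->item($i);
    my $address = $server->getElementsByTagName('address')->item(0);
    push @found, [ $server->getAttribute('name'), $address->getFirstChild->getData ];
}
print "$_->[0] => $_->[1]\n" for @found;

# XML::DOM trees contain circular references; dispose() frees them.
$doc->dispose;
```

It is more verbose than XML::Simple, but nodes stay in the order they appear in the file.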
 
I've used XML::Simple in the past but, unless the XML is in fact simple, I would avoid using it. What does the XML file you're trying to parse look like? And what do you want to do with the info after it's parsed?
 