Best/easiest way for perl to parse xml structure?

BobMCT · May 20, 2007

I've seen several Perl modules available to accomplish this feat, but short of installing and actually trying out each one to make this determination, has anyone had the opportunity to find what they believe is the "ONE" that will accomplish this?

Please advise with your experiences and opinions.

Thanks

Kirsle · May 20, 2007

In my experience, it's pretty difficult to deal with XML using Perl.

XML::Simple is good for really simple stuff:

Code:

<servers>
   <server name="upsilon">
      <address>168.192.0.1</address>
      <port>80</port>
   </server>
   <server name="epsilon">
      <address>168.192.0.2</address>
      <port>80</port>
   </server>
</servers>

XML::Simple would take that and turn it into a data structure like this:

Code:

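# Roughly this, with KeyAttr => []; by default XML::Simple instead
# folds the list into a hash keyed on each server's name attribute.
$servers = {
   server => [
      {
         name    => 'upsilon',
         address => '168.192.0.1',
         port    => '80',
      },
      {
         name    => 'epsilon',
         address => '168.192.0.2',
         port    => '80',
      },
   ],
};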
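For anyone who wants to try this, here is a minimal sketch of feeding the XML above to XML::Simple. It assumes the module is installed from CPAN; `server_lines` is just a name made up for this example, while `ForceArray` and `KeyAttr` are real XML::Simple options.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

my $xml = <<'XML';
<servers>
   <server name="upsilon">
      <address>168.192.0.1</address>
      <port>80</port>
   </server>
   <server name="epsilon">
      <address>168.192.0.2</address>
      <port>80</port>
   </server>
</servers>
XML

# Hypothetical helper: one "name: address:port" string per <server>.
sub server_lines {
    my ($xml) = @_;

    # ForceArray keeps <server> a list even if only one is present;
    # KeyAttr => [] stops the list being folded into a hash keyed on "name".
    my $servers = XMLin($xml, ForceArray => ['server'], KeyAttr => []);

    return map { "$_->{name}: $_->{address}:$_->{port}" }
           @{ $servers->{server} };
}

print "$_\n" for server_lines($xml);
```

Setting both options explicitly is a design choice rather than a requirement: it keeps the shape of the result the same whether the file has one server or many.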

However... when you get embedded tags, like in HTML, XML::Parser fails:

Code:

<html>
<body>
This sentence has a <b>bold</b> word.
</body>
</html>

returns something like...

Code:

$html = {
   body => [
      "This sentence has a ",
      b => [
        "bold",
      ],
      " word",
   ],
};

Or something along those lines (if you actually test XML::Parser on an HTML file and dump the output, it's kind of like that). But long story short, it totally breaks apart all of your tags in such a way that it's impossible to do anything with them, because Perl hashes have no defined order and there's no way to determine where that particular bold word actually was in the XML.

And then XML::Parser is a little bit more powerful, but then you get into the conundrum of... "you can have an <i> inside of a <b>, or a <b> inside of an <i>, but those are two totally different things your program has to check for".
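One possible way around both problems is to use XML::Parser's event handlers and keep your own stack of open tags. The sketch below is only an illustration under that assumption, not a claim about the best approach. `text_runs` is a hypothetical helper name, while the `Start`, `End` and `Char` handlers are part of XML::Parser itself.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical helper: returns the text runs in document order, each one
# tagged with the path of elements that enclose it.
sub text_runs {
    my ($xml) = @_;
    my (@stack, @runs);
    my $fresh = 1;    # true when the next piece of text starts a new run

    my $parser = XML::Parser->new(
        Handlers => {
            Start => sub { my (undef, $tag) = @_; push @stack, $tag; $fresh = 1 },
            End   => sub { pop @stack; $fresh = 1 },
            Char  => sub {
                my (undef, $text) = @_;
                # expat may deliver one run of text in several pieces
                if ($fresh) {
                    push @runs, { path => join('/', @stack), text => '' };
                    $fresh = 0;
                }
                $runs[-1]{text} .= $text;
            },
        },
    );
    $parser->parse($xml);

    return grep { $_->{text} =~ /\S/ } @runs;    # drop whitespace-only runs
}

my $html = '<html><body>This sentence has a <b>bold</b> word.</body></html>';
for my $run (text_runs($html)) {
    # Counts <b> anywhere in the path, so b-inside-i and i-inside-b both match.
    my $bold = grep { $_ eq 'b' } split m{/}, $run->{path};
    print $run->{path}, ($bold ? ' [bold]' : ''), ": '$run->{text}'\n";
}
```

For the sample sentence this should report three runs in their original order, with only the middle one sitting under `<b>`. Because the check looks at the whole path, the program does not need a separate case for each nesting order.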

I wonder if anybody has any better luck with XML parsing than me?


chazoid · May 20, 2007

Kirsle, a former co-worker of mine once recommended XML::DOM over XML::Simple after running into problems with it. I've never had a need for it so I've never tried it, but you might want to check it out.
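As a rough idea of what XML::DOM looks like in use, here is a small sketch that walks the children of `<body>` in document order. It assumes XML::DOM is installed from CPAN, and `body_children` is a hypothetical helper name chosen for this example.

```perl
use strict;
use warnings;
use XML::DOM;

# Hypothetical helper: describes each child of the first <body>, in order.
sub body_children {
    my ($xml) = @_;
    my $doc  = XML::DOM::Parser->new->parse($xml);
    my $list = $doc->getElementsByTagName('body');    # NodeList in scalar context
    my $body = $list->item(0);

    my @out;
    for my $child ($body->getChildNodes) {
        if ($child->getNodeType == TEXT_NODE) {
            push @out, 'text: ' . $child->getData;
        }
        elsif ($child->getNodeType == ELEMENT_NODE) {
            push @out, 'element: ' . $child->getTagName;
        }
    }

    $doc->dispose;    # XML::DOM trees contain circular references
    return @out;
}

my $html = '<html><body>This sentence has a <b>bold</b> word.</body></html>';
print "$_\n" for body_children($html);
```

The tree keeps child nodes in a list rather than a hash, so the position of the bold word relative to the surrounding text is preserved.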

rharsh · May 20, 2007

I've used XML::Simple in the past but, unless the XML is in fact simple, I would avoid using it. What does the XML file you're trying to parse look like? And what do you want to do with the info after it's parsed?
