xml DOM parser failed to parse the word "ila" or "ilar"

Status
Not open for further replies.

Thameem

Programmer
Sep 18, 2002
30
US
Hi guys,
I have an unusual question. I have written an XML DOM parser script to parse XML files and return the results. The XML files contain data in Romanian. Parsing works in every case except when the data contains "ila" or "ilar".
Can anybody suggest a solution?
 
I can't think of anything special about those two strings. Can you post some code?
Tracy Dryden
tracy@bydisn.com

Meddle not in the affairs of dragons,
For you are crunchy, and good with mustard.
 
Here is a portion of the code:

-----------------------
use XML::DOM;    # also exports the ELEMENT_NODE and TEXT_NODE constants

my ($totalS, $totalE, $totalTime);

my $file = "test.xml";
my $dom  = XML::DOM::Parser->new;
# parsefile dies on malformed input, so "or die" only catches a false return value
my $doc  = $dom->parsefile($file) or die "unable to parse document";
my $root = $doc->getDocumentElement;

parser($root);

# Walk the tree recursively, recording the RES range and the TM text.
sub parser
{
    my ($node) = @_;

    if ($node->getNodeType == ELEMENT_NODE)
    {
        if ($node->getTagName eq "RES")
        {
            $totalS = $node->getAttribute("SN");
            $totalE = $node->getAttribute("EN");
        }

        foreach my $child ($node->getChildNodes())
        {
            parser($child);
        }
    }
    elsif ($node->getNodeType == TEXT_NODE)
    {
        my $text = $node->getParentNode->getTagName;
        if ($text eq "TM")
        {
            $totalTime = $node->getData;
        }
    }
}
----------------------
My XML file looks like this:

- <GSP VER="3.1">
<TM>0.139104</TM>
<Q>ila</Q>
- <CAT SE="ISO-8859-1">
<GN>gwd/Top/Regional/North_America/United_States/Georgia/Localities/I/Ila</GN>
<FVN>Top/Regional/North_America/United_States/Georgia/Localities/I/Ila</FVN>
</CAT>
- <RES SN="1" EN="10">
<M>73</M>
<FI />
- <NB>
<NU>/search?q=ila&hl=en&lr=lang_ro&safe=off&output=xml&start=10&sa=N</NU>
</NB>
- <R N="1" L="1">
<U> <T>ANCD - free_ilascu</T>
<RK>10</RK>
<S>FREE ILASCU. <b>...</b> Scurta prezentare: Romanii din Grupul Ilascu si anume. 1)<br> Ilie Ilascu 2) Alexandru Lesco 3) Tudor Popa 4) Andrei Ivantoc,. <b>...</b></S>
- <HAS>
- <DI>
- <CAT SE="ISO-8859-2">
<GN>gwd/Top/World/Rom%C3%A2n%C3%A3/Societate/Politic%C4%83</GN>
<FVN>Top/World/Român?/Societate/Politică</FVN>
</CAT>
<DT>Ilie Ilaşcu</DT>
<DS>Pagină care militează pentru eliberarea grupului Ilaşcu. Conţine date biografice şi noutăţi...</DS>
</DI>
<L TAG="link:" />
<C SZ="31k" TAG="cache:" />
<RT TAG="related:" />
</HAS>
</R>
- <R N="2" L="1">
<U> <T>CURIERUL ROMANESC, Apr.-Iun. 2000</T>
<RK>3</RK>
<S><b>...</b> promovate de Forumul Presei Române de Pretutindeni, una privind eliberarea grupului<br> <b>Ila</b>[cu [i alta privind monitorizarea [i mediatizarea înc`lc`rilor <b>...</b></S>
- <HAS>
<L TAG="link:" />
<C SZ="4k" TAG="cache:" />
<RT TAG="related:" />
</HAS>
</R>
- <R N="3" L="2">
<U> <T>CURIERUL ROMANESC, Oct.-Dec. 1998</T>
<RK>3</RK>
<S><b>...</b> române, arest`rile [i tratamentele inumane aplicate conduc`torilor români. Astfel,<br> Grupul <b>Ila</b>[cu - Ilie <b>Ila</b>[cu, Andrei Ivantoc, Tudor Popa - se g`se[te în <b>...</b></S>
- <HAS>
<L TAG="link:" />
<C SZ="6k" TAG="cache:" />
<RT TAG="related:" />
</HAS>
<HN>home3.swipnet.se</HN>
</R>
.....

--------
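
One way to narrow this down, offered only as a sketch: the strings "ila" and "ilar" are unlikely to matter by themselves, but results for those queries contain Romanian letters such as ş. If the file carries no encoding declaration, an XML parser must treat it as UTF-8, and a byte sequence that is not valid UTF-8 would make it stop with a "not well-formed" error. That this is the cause here is an assumption, not something the post confirms. The helper name first_non_utf8_line below is hypothetical; it uses only the core Encode module to report the first line that does not decode as UTF-8.

-----------------------
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Hypothetical helper: return the 1-based number of the first line that is
# not valid UTF-8, or 0 if every line decodes cleanly.
sub first_non_utf8_line
{
    my ($bytes) = @_;
    my $n = 0;
    for my $line (split /\n/, $bytes)
    {
        $n++;
        # FB_CROAK makes decode() die on the first malformed sequence
        my $ok = eval { decode('UTF-8', $line, FB_CROAK | LEAVE_SRC); 1 };
        return $n unless $ok;
    }
    return 0;
}

# Usage: perl check.pl test.xml
if (@ARGV)
{
    open my $fh, '<:raw', $ARGV[0] or die "cannot open $ARGV[0]: $!";
    my $bytes = do { local $/; <$fh> };
    close $fh;
    my $line = first_non_utf8_line($bytes);
    print $line ? "line $line is not valid UTF-8\n"
                : "all lines are valid UTF-8\n";
}
-----------------------

If a line is reported, two things may be worth trying: declaring the real encoding at the top of the file (the SE="ISO-8859-2" attribute suggests Latin-2), or converting the file to UTF-8 before parsing. Wrapping the parsefile call in eval and printing $@ should also show the parser's own error message together with the position where it stopped.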
 