extra characters being added to output

Status
Not open for further replies.

miraclemaker

Programmer
Joined
Oct 16, 2002
Messages
127
Location
GB
Hi guys, I'm translating some XML with XSL, and have come across the following problem:

Here's a chunk of my XML:

Code:
<long_description>
Heavy Weapon Deluxe est un jeu d’arcade o&ugrave; se d&eacute;cha&icirc;ne une action tremp&eacute;e d’adr&eacute;naline, &agrave; d&eacute;roulement lat&eacute;ral, sur le th&egrave;me du tir – dans le ...
</long_description>

Here's my XSL for that chunk:

Code:
  <xsl:template match="long_description">
    <p>
      <xsl:value-of select="." disable-output-escaping="yes" />
    </p>
  </xsl:template>

Unfortunately, this results in the following transformed document:

Code:
Heavy Weapon Deluxe est un jeu dÂ’arcade où se déchaîne une action trempée dÂ’adrénaline, à déroulement latéral, sur le thème du tir – dans le

- you can see the 'Â' character is being added before special characters like the curly quote and the dash. Is it because these are double-byte characters outside of the standard ASCII range? Any idea how to prevent the extra character from being added?

I'm using PHP / Sablotron to combine my XML and XSL.

Many thanks in advance.
 
Try changing the encoding in your xml declaration.

Jon

"Asteroids do not concern me, Admiral. I want that ship, not excuses.
 
I don't think this has to do with encoding. If I look at the source of the transformed document, I can see the extra character in there:

dÂ’arcade

- it's not in the original XML document.
 
My mistake. It was an encoding problem. I fixed it by adding this line to my XSL:

<xsl:output encoding="iso-8859-1"/>
 
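Yup. The default encoding in XML is UTF-8, and when UTF-8 output is displayed as a single-byte encoding such as ISO-8859-1, multi-byte characters almost always show up with an accented capital letter A (Â or Ã) in front of them.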
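
For anyone who wants to see the mechanism at the byte level, here is a minimal Python sketch of one way the stray 'Â' can arise. It rests on assumptions the thread does not confirm: that the source file stored the curly apostrophe as the single Windows-1252 byte 0x92, that the parser read the file as ISO-8859-1, and that the processor wrote UTF-8 because no output encoding was set. The variable names are purely illustrative.

```python
# Assumption: the curly apostrophe was stored as the Windows-1252 byte 0x92.
raw = b"d\x92arcade"

# Read as ISO-8859-1, byte 0x92 becomes the control character U+0092.
parsed = raw.decode("iso-8859-1")

# Assumption: with no output encoding set, the result is serialized as UTF-8,
# where U+0092 takes two bytes: 0xC2 0x92.
utf8_out = parsed.encode("utf-8")

# A viewer that assumes a single-byte encoding shows those two bytes as two
# characters; the first one is the stray accented capital A.
seen = utf8_out.decode("cp1252")

# Serializing as ISO-8859-1 instead writes the single byte 0x92 back out,
# which the same viewer shows as the intended apostrophe.
fixed = parsed.encode("iso-8859-1").decode("cp1252")

print(seen)
print(fixed)
```

Under these assumptions, `seen` matches the garbled text in the transformed document and `fixed` matches the intended text, which is consistent with `<xsl:output encoding="iso-8859-1"/>` making the extra character disappear.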

Chip H.


____________________________________________________________________
If you want to get the best response to a question, please read FAQ222-2244 first
 