ASP, XML, XSLT (entities problem)

Status
Not open for further replies.

timmcn

Technical User
Joined
Jun 27, 2004
Messages
1
Location
US
I license a large library of several thousand XML documents from a third-party source. (A portion of one such document is appended at the end of this note.)

I would like to pull out a specific small XML-tagged section from each document and display it in a browser.

To do so, I've created an XSLT file that transforms the section of the documents I'm interested in into HTML, and I get exactly the formatted output I want when I run the XSLT against the XML source document in the XSLT editing tool I'm using (StyleVision from Altova, if that's important). (The contents of the XSLT document are also appended below.)

Next, I created an ASP that attempts to apply the XSLT to the XML document using Microsoft's XMLDOM object on my server. Here's the code:

strXMLFile = Server.MapPath("xml_leaflets/D00001A1.xml")
strXSLFile = Server.MapPath("Most_important_info.xslt")

'Declare local variables
Dim objXML
Dim objXSL

'Instantiate the XMLDOM Object that will hold the XML file.
set objXML = Server.CreateObject("Microsoft.XMLDOM")
'Turn off asynchronous file loading.
objXML.async = false
'Load the XML file.
objXML.load(strXMLFile)

'Instantiate the XMLDOM Object that will hold the XSL file.
set objXSL = Server.CreateObject("Microsoft.XMLDOM")
'Turn off asynchronous file loading.
objXSL.async = false
'Load the XSL file.
objXSL.load(strXSLFile)

'Use the "transformNode" method of the XMLDOM to apply the
'XSL stylesheet to the XML document. Then the output is
'written to the client.
Response.Write(objXML.transformNode(objXSL))

The page seems to execute (no errors). However, the screen output in the browser is blank.

On further investigation a friend and I have noted that each XML document contains a doctype and entity declaration for infrequent/foreign characters. If we remove this section from the XML (and the entities later in the document)...BINGO, the XSL transformation works and the section gets displayed correctly in the browser.

But, I want to have these uncommon and foreign characters appear correctly to an end-user (and I would rather not edit several thousand XML documents to get that to happen.)

The question I have is: is there anything I can modify, either in the XSLT or in the ASP, that will allow these entities to appear correctly?

Thanks,

Tim

CONTENTS OF XSLT FILE:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output version="1.0" encoding="utf-8" omit-xml-declaration="no" indent="no" media-type="text/html" />
<xsl:template match="/">
<html>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<head />
<body>
<xsl:for-each select="leaflet">
<xsl:for-each select="Important">
<span style="font-family:Arial; font-size:smaller; font-weight:bold; ">Important things you should know about </span>
<xsl:for-each select="/">
<xsl:for-each select="leaflet">
<xsl:for-each select="Generic">
<span style="font-family:Arial; font-size:smaller; font-weight:bold; ">
<xsl:apply-templates />
</span>
</xsl:for-each>
</xsl:for-each>
</xsl:for-each>
<br />
<xsl:for-each select="Para">
<br />
<xsl:if test="position()">
<span style="font-family:Arial; font-size:smaller; ">
<xsl:apply-templates />
</span>
</xsl:if>
<br />
</xsl:for-each>
<xsl:for-each select="Attribute">
<br />
<xsl:if test="position()">
<span style="font-family:Arial; font-size:smaller; ">
<xsl:apply-templates />
</span>
</xsl:if>
<br />
</xsl:for-each>
<br />
</xsl:for-each>
</xsl:for-each>
</body>
</html>
</xsl:template>
</xsl:stylesheet>

REPRESENTATIVE XML DOCUMENT

<?xml version="1.0"?>
<!DOCTYPE leaflet [
<!ENTITY aacute "á">
<!ENTITY Aacute "Á">
<!ENTITY eacute "é">
<!ENTITY Eacute "É">
<!ENTITY iacute "í">
<!ENTITY Iacute "Í">
<!ENTITY oacute "ó">
<!ENTITY Oacute "Ó">
<!ENTITY uacute "ú">
<!ENTITY Uacute "Ú">
<!ENTITY mdash "--">
<!ENTITY ndash "-">
<!ENTITY rdquo """>
<!ENTITY ldquo """>
<!ENTITY rsquo "'">
<!ENTITY iquest "¿">
<!ENTITY ntilde "ñ">
]>

<leaflet author="28" statusid="R" versionnumber="3.05" functionid="11" daterevised="2/13/04 3:57:26 PM" subjectid="d00001" researcher="8">
<Generic>acyclovir (oral)</Generic>
<Pronounce>ay SYE kloe veer</Pronounce>
<Brand>
<genericname ext="2953">acyclovir</genericname>
<brandname ext="1152">Zovirax</brandname>
</Brand>
<H1>What is the most important information I should know about acyclovir?</H1>
<Important>
<Attribute icontype="finish">Take all of the acyclovir that has been prescribed for you even if you begin to feel better.
Your symptoms may start to improve before the infection is completely treated.</Attribute>
<Para>Treatment with acyclovir should be started as soon as possible after the first appearance of
symptoms (e.g. tingling, burning, blisters).</Para>
<Para>Herpes infections are contagious and you can infect other people, even during treatment. Avoid
letting infected areas come into contact with other people. Wash your hands frequently to prevent
transmission.</Para>
</Important>
<H11>What does my medication look like?</H11>
<LookLike>
<Para>Acyclovir is available with a prescription under the brand name Zovirax. Other brand or generic
formulations may also be available. Ask your pharmacist any questions you have about this medication,
especially if it is new to you.</Para>
<List>
<ListItem>
<Para>Zovirax 200 mg&mdash;blue capsules</Para>
</ListItem>
<ListItem>
<Para>Zovirax 400 mg&mdash;white, shield-shaped tablets</Para>
</ListItem>
<ListItem>
<Para>Zovirax 800 mg&mdash;light-blue, oval tablets</Para>
</ListItem>
<ListItem>
<Para>Zovirax suspension 200 mg/5 mL&mdash;off-white,
banana-flavored suspension</Para>
</ListItem>
</List>
</LookLike>
</leaflet>
 
I have no experience with embedded DTDs like this, but I can make a few guesses.
They all come down to: your source seems a bit smelly.

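1 In the DTD part, characters like 'á' are used, but the XML declaration does not contain an encoding declaration. Without one, a parser assumes UTF-8 (or UTF-16 if there is a byte order mark), so if the files are actually saved in something like ISO-8859-1, those characters are invalid byte sequences.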

2 <!ENTITY rdquo """> seems to me syntactically strange: shouldn't some escape for double-quote have been used?

3 The DTD part declares some entities for the document type 'leaflet', but it does not describe the element 'leaflet' at all. Therefore, the actual node 'leaflet' does not seem to me to be valid according to the DTD. I'm surprised a parser actually accepts this, unless it is not validating.

As for a solution: if you can't change the source (as it is third-party), it might not be worthwhile to dig too deep into this DTD hocus-pocus.
Maybe it pays off to experiment with removing that part and replacing the entity references by string manipulation before loading the text into the DOMDocument?
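
The string-manipulation idea can be sketched roughly as follows. This is only an illustration, written in Python rather than ASP/VBScript so that it can be run on its own. The function name inline_entities and the regular expressions are assumptions of this sketch, not anything from the original post, and it assumes every entity value is plain literal text, as in most of the sample document. A declaration such as the rdquo one, as it appears in the sample, would not be matched, and that reference would be left untouched.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Matches a DOCTYPE with an internal subset: <!DOCTYPE name [ ... ]>
DOCTYPE_RE = re.compile(r"<!DOCTYPE\s+[^\[>]*\[(.*?)\]\s*>", re.DOTALL)
# Matches simple declarations of the form <!ENTITY name "value">
ENTITY_RE = re.compile(r"<!ENTITY\s+([\w.-]+)\s+(?:\"([^\"]*)\"|'([^']*)')\s*>")
# Entities every XML parser already knows; these are left alone.
PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}


def inline_entities(xml_text):
    """Hypothetical helper: drop the DOCTYPE block and replace each
    declared entity reference with its (escaped) replacement text."""
    match = DOCTYPE_RE.search(xml_text)
    if match is None:
        return xml_text  # nothing to strip

    # Collect name -> value from the internal subset.
    entities = {}
    for name, double_quoted, single_quoted in ENTITY_RE.findall(match.group(1)):
        entities[name] = double_quoted or single_quoted

    # Cut the DOCTYPE block out of the document text.
    body = xml_text[:match.start()] + xml_text[match.end():]

    def substitute(ref):
        name = ref.group(1)
        if name in PREDEFINED or name not in entities:
            return ref.group(0)  # leave unknown or built-in references as-is
        return escape(entities[name])

    # Numeric references such as &#225; do not match this pattern.
    return re.sub(r"&([A-Za-z_][\w.-]*);", substitute, body)


sample = """<?xml version="1.0"?>
<!DOCTYPE leaflet [
<!ENTITY mdash "--">
]>
<leaflet><Para>Zovirax 200 mg&mdash;blue capsules</Para></leaflet>"""

root = ET.fromstring(inline_entities(sample))
print(root.find("Para").text)  # Zovirax 200 mg--blue capsules
```

The same steps (read the file as text, cut out the DOCTYPE block, substitute the entity references, then hand the resulting string to the parser) should carry over to the ASP page, presumably by using the DOM's loadXML method on the cleaned string instead of load on the file path.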

I hope some dtd-guru wanders by to give you some more hints,
good luck
 