I am trying to encode an XML document I created so that it is UTF-8 compliant. So far, the string I get back has not been encoded, and I am not sure why. Here is part of what I did.
import java.io.*;
import org.w3c.dom.*;
import org.apache.xml.serialize.*;
import org.apache.xerces.parsers.DOMParser;
import org.apache.xerces.dom.*;
import org.xml.sax.InputSource;
/**
 * Serialize the DOM tree to a String.
 */
public static String serializeDOMTree(Document document, int indent) throws Exception {
    StringWriter writer = new StringWriter();
    OutputFormat outputFormat = new OutputFormat(document, "UTF-8", true);
    outputFormat.setIndent(indent);
    outputFormat.setIndenting(indent > 0);
    outputFormat.setLineWidth(0);
    outputFormat.setPreserveSpace(false);
    // Use CR LF (0x0d, 0x0a) as the line separator.
    outputFormat.setLineSeparator("\r\n");
    XMLSerializer serializer = new XMLSerializer(writer, outputFormat);
    serializer.serialize(document);
    return writer.toString();
}

public static void main(String[] args) {
    Document doc = new DocumentImpl();
    Element message = doc.createElement("message");
    doc.appendChild(message);
    Element agent = doc.createElement("trial");
    agent.setAttribute("agentid", "This is only a TEST@");
    message.appendChild(agent);
    try {
        String mess = serializeDOMTree(doc, 1);
        System.out.println(mess);
    } catch (Exception e) {
        // Report the failure instead of silently swallowing it.
        e.printStackTrace();
    }
}
Why does it keep giving me non-encoded output?
I get:
<?xml version="1.0" encoding="UTF-8"?>
<message>
<trial agentid="This is only a TEST@"/>
</message>
What I should be seeing is something like this (I forgot what @ encodes to, so I just used %12):
<?xml version="1.0" encoding="UTF-8"?>
<message>
<trial agentid="This+is+only+a+TEST%12"/>
</message>
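To be clear about the form I am expecting: it looks like what java.net.URLEncoder produces for a string. Here is a minimal sketch, assuming that is the kind of encoding I mean. The class name UrlEncodeSketch and the helper encodeValue are names I made up for illustration, and nothing here comes from Xerces.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class UrlEncodeSketch {

    /**
     * Hypothetical helper: percent-encodes a value using UTF-8 bytes.
     * Spaces become '+', and reserved characters become %XX escapes.
     */
    static String encodeValue(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Same attribute value as in my DOM example above.
        System.out.println(encodeValue("This is only a TEST@"));
    }
}
```

With this, @ comes out as %40 rather than my %12 placeholder, and the spaces come out as +.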
What am I doing wrong?