Using JTidy to parse HTML when div and span are available


(OP)
I have the function below. After calling it, the HTML gets corrupted: notice how the </span> tags are moved to just after the first line instead of staying at the end. See the input and output below. Help please.

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import org.w3c.tidy.Tidy;

public ByteArrayOutputStream convertToXHTML(String htmlString) {
    ByteArrayOutputStream xhtmlByteOutStream = new ByteArrayOutputStream();
    if (htmlString != null && !htmlString.isEmpty()) {
        // Convert HTML to XHTML using the JTidy API
        Tidy tidy = new Tidy();

        tidy.setXHTML(true);
        tidy.setDocType("omit");
        tidy.setQuoteMarks(false);
        tidy.setQuoteAmpersand(false);
        tidy.setQuoteNbsp(false);
        tidy.setFixUri(true);
        tidy.setMakeBare(true);
        tidy.setJoinStyles(true);

        // getBytes() uses the platform default charset
        tidy.parse(new ByteArrayInputStream(htmlString.getBytes()),
                xhtmlByteOutStream);
    }
    // Empty stream if the input was null or empty
    return xhtmlByteOutStream;
}

--------------------
The htmlString that is sent to the above function:
<html><head><title>Message Template</title><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"/><style type="text/css">@page { size: letter; margin-top:1.0in;margin-bottom:1.0in;margin-left:1.0in;margin-right:1.0in;} body { line-height:100%;} </style></head><body><span style="font-size: 12px; font-family: Arial;"><span style="font-size: 12px; font-family: Arial;"><span style="font-size: 12px; font-family: Arial;">First line<br><div style="text-align: center;" align="center">second line<br></div>third line</span></span></span></body></html>


Output after the call to the above function:

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="generator"
content="HTML Tidy for Java (vers. 26 Sep 2004), see www.w3.org" />
<title>Message Template</title>
<meta http-equiv="Content-Type"
content="text/html; charset=ISO-8859-1" /><style type="text/css">
/*<![CDATA[*/
@page { size: letter; margin-top:1.0in;margin-bottom:1.0in;margin-left:1.0in;margin-right:1.0in;} body { line-height:100%;}
/*]]>*/
</style>
</head>
<body>
<span style="font-size: 12px; font-family: Arial;"><span
style="font-size: 12px; font-family: Arial;"><span
style="font-size: 12px; font-family: Arial;">First line<br />
</span></span></span>
<div style="text-align: center;" align="center">second line<br />
</div>
third line
</body>
</html>
 

RE: Using JTidy to parse HTML when div and span are available

I've never worked with JTidy, but I see most people use the parseDOM method, like here

Cheers,
Dian

RE: Using JTidy to parse HTML when div and span are available

I think JTidy has made the best guess it can. The point is that span is an inline element, whereas div is a block-level element. An inline element may contain other inline elements but not block-level elements, even in plain HTML. Hence the original HTML string passed to it is already not valid HTML. JTidy apparently does what it can to salvage the situation: it closes the open spans before the div and produces its best-guess XHTML.
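To make the nesting rule concrete, here is a minimal sketch that scans an HTML string and reports every block-level element opened inside an inline one. It does not use JTidy and is not how JTidy works internally; the class name NestingCheck, the method findBlockInsideInline, and the short element lists are assumptions made up for this illustration, and the regex-based scan ignores edge cases such as ">" inside attribute values.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NestingCheck {
    // Small illustrative subsets, not the complete HTML element lists.
    private static final Set<String> INLINE = Set.of("span", "b", "i", "em", "strong", "a", "font");
    private static final Set<String> BLOCK = Set.of("div", "p", "table", "ul", "ol", "h1", "h2", "h3");
    private static final Set<String> VOID = Set.of("br", "hr", "img", "meta", "input", "link");

    // Group 1 is "/" for a closing tag, group 2 is the tag name.
    private static final Pattern TAG = Pattern.compile("<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>");

    /** Returns one message per block-level element that is opened inside an inline element. */
    public static List<String> findBlockInsideInline(String html) {
        List<String> problems = new ArrayList<>();
        Deque<String> open = new ArrayDeque<>(); // innermost open element is first
        Matcher m = TAG.matcher(html);
        while (m.find()) {
            String name = m.group(2).toLowerCase();
            boolean closing = !m.group(1).isEmpty();
            if (VOID.contains(name)) {
                continue; // void elements never stay open
            }
            if (closing) {
                // Pop up to and including the matching open tag, if there is one.
                if (open.contains(name)) {
                    while (!open.pop().equals(name)) {
                        // discard unclosed inner elements
                    }
                }
                continue;
            }
            if (BLOCK.contains(name)) {
                // Walk the ancestors from innermost to outermost.
                for (String ancestor : open) {
                    if (INLINE.contains(ancestor)) {
                        problems.add("<" + name + "> inside <" + ancestor + ">");
                        break;
                    }
                }
            }
            open.push(name);
        }
        return problems;
    }

    public static void main(String[] args) {
        String body = "<span>First line<br><div>second line<br></div>third line</span>";
        System.out.println(findBlockInsideInline(body));
    }
}
```

Running a check like this on the htmlString before handing it to JTidy would flag the div inside the spans, which is the spot where the output diverges from the input. Swapping the outer spans for divs in the template is one way to keep the styling around all three lines.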
