Need to escape SGML

Status
Not open for further replies.

1DMF

Programmer
Joined
Jan 18, 2005
Messages
8,795
Location
GB
Hello,

Is there a module like URI::Escape for escaping SGML, i.e.

& = &amp;
< = &lt;
> = &gt;

I've tried to use =~ s/'/\&#39;/; but it didn't replace my apostrophe, and my page won't validate with...

Error Line 153 column 195: non SGML character number 146.
...y to configure?<br /><br />If you don¼/strong>?t have the time or technical knowledge

Not sure why it has that weird symbol and half tag; it's only a single quote in the text. Any ideas why I'm getting this behaviour?

thanks,
1DMF


"In complete darkness we are all the same, only our knowledge and wisdom separates us, don't let your eyes deceive you.
 
Offhand, I can't think of anything specifically for what you're looking for.
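
For the handful of characters listed in the question, a small lookup table and one substitution may be enough. This is only a minimal sketch: the sub name `escape_sgml` and the choice of characters are assumptions for illustration, not something from the original post. The `/g` flag matters here, since without it only the first match is replaced.

```perl
use strict;
use warnings;

# Hypothetical lookup table: characters that are special in SGML/HTML text
# and the entity each one is replaced with.
my %ENTITY = (
    '&' => '&amp;',
    '<' => '&lt;',
    '>' => '&gt;',
    '"' => '&quot;',
    "'" => '&#39;',
);

# Hypothetical helper (the name is illustrative only).
sub escape_sgml {
    my ($text) = @_;
    # One pass with /g, so the '&' inside a generated entity
    # is never escaped a second time.
    $text =~ s/([&<>"'])/$ENTITY{$1}/g;
    return $text;
}

print escape_sgml(q{If you don't have <time> & "knowledge"}), "\n";
# prints: If you don&#39;t have &lt;time&gt; &amp; &quot;knowledge&quot;
```

The CPAN module HTML::Entities provides an `encode_entities` function for the same job, if installing a module is an option.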

It's possible that your single quote is not the ASCII-39 single quote, which would be why your regexp doesn't match. I don't know where your data is coming from, but if someone typed that text in a word processor, it may have been replaced with a "prettier" character for the apostrophe.
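
One way to check is to list every character outside the ASCII range along with its number. The sketch below is only an illustration: the sub name `find_non_ascii` and the sample string are made up, and it assumes the text arrives as raw single-byte data. The validator's "character number 146" is byte 0x92, which in Windows-1252 is the curly right single quote.

```perl
use strict;
use warnings;

# Hypothetical helper: report each non-ASCII character in a string
# with its zero-based offset and decimal character number.
sub find_non_ascii {
    my ($text) = @_;
    my @hits;
    while ($text =~ /([^\x00-\x7F])/g) {
        # pos() points just past the match, so step back one character.
        push @hits, sprintf('offset %d: character number %d',
                            pos($text) - 1, ord $1);
    }
    return @hits;
}

# Made-up sample with a Windows-1252 apostrophe (byte 0x92) in "don't".
my $sample = "If you don\x92t have the time";
print "$_\n" for find_non_ascii($sample);
# prints: offset 10: character number 146
```

A plain `'` is character number 39, so a pattern written with `'` will never match character 146.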
 
Hi Ishnid,

The text was entered via a standard HTML <textarea> tag and then saved into SQL.

I have a few of these errors on what look like simple characters when displayed on the screen, but they fail W3C validation and show up as weird characters, like
Just click the ?Who we are page? to find out more.

the squares are shown on the screen as
“Who we are page”

What can I do to resolve this?
 
I copied the text and pasted it into a pure ASCII editor and, yup, you were right: the apostrophes showed as back ticks, and the quotes " showed as fancy upside-down ones.

I re-entered the text and hey presto
This Page Is Valid XHTML 1.0 Transitional!

I believe I originally copied the text from an email, so that must have been the problem, although my Outlook is set to HTML and so is my colleague's. Oh well, it's fixed now.

:-)
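
As an alternative to retyping the text, the curly quotes could be converted before the page is written out. The following is a rough sketch under one loud assumption: the stored text contains Windows-1252 bytes (0x91 to 0x94), which is what "character number 146" suggests. The sub name `fix_smart_quotes` is hypothetical. If the data is really UTF-8 or already decoded Unicode, the characters would need to be matched differently.

```perl
use strict;
use warnings;

# Assumption: input holds Windows-1252 bytes. Each curly quote byte is
# mapped to the numeric entity for the matching Unicode character.
my %QUOTE_ENTITY = (
    "\x91" => '&#8216;',   # left single quote
    "\x92" => '&#8217;',   # right single quote / apostrophe
    "\x93" => '&#8220;',   # left double quote
    "\x94" => '&#8221;',   # right double quote
);

# Hypothetical helper (the name is illustrative only).
sub fix_smart_quotes {
    my ($text) = @_;
    $text =~ s/([\x91-\x94])/$QUOTE_ENTITY{$1}/g;
    return $text;
}

print fix_smart_quotes("Just click the \x93Who we are page\x94 to find out more."), "\n";
# prints: Just click the &#8220;Who we are page&#8221; to find out more.
```

Swapping the entities in the table for plain `'` and `"` would give the same result as re-entering the text by hand.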