Getting non-ASCII characters in HTML

Status
Not open for further replies.

angelblade27 (Programmer; joined Jul 30, 2006; 10 messages; US)
I'm using WWW::Mechanize to submit messages to a board, then checking the resulting page to see whether it contains the submitted text. The caption I'm looking for is <'Hérrœ & e Paßwört´s'>-thursday
I'm getting the -thursday, but not the first part, in $mechanize->content()
 
OK, thank you for sharing that with us. Did you have a question?

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
Yeah, does Mechanize not support non-ASCII characters?
 
The content() method fetches the raw HTML from the page. How are you examining the content the method returns?

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
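
One way to examine a returned string without the terminal or browser hiding anything is to print every non-ASCII element as an escape. The sketch below is only an illustration: `$raw` is a hypothetical stand-in for a UTF-8 encoded page body (it is an assumption that the board serves UTF-8), and `visible` is a made-up helper name, not part of any module.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical stand-in for a page body received as UTF-8 bytes.
# Assumption: the real page is UTF-8; the thread does not say.
my $raw = encode('UTF-8', "Caption: Pa\x{DF}w\x{F6}rt-thursday");

# Hypothetical helper: show printable ASCII as-is and everything
# else as \x{..}, so bytes and characters can be told apart.
sub visible {
    my ($s) = @_;
    return join '', map {
        my $c = ord;
        ($c >= 0x20 && $c < 0x7F) ? chr($c) : sprintf('\\x{%X}', $c);
    } split //, $s;
}

# Decode the bytes into Perl characters before inspecting them.
my $chars = decode('UTF-8', $raw);

print visible($raw),   "\n";   # each non-ASCII character shows up as two bytes
print visible($chars), "\n";   # each non-ASCII character shows up as one code point
```

If the escapes for the expected characters appear in this kind of dump, the data is present and the problem is more likely in how it is displayed or compared than in the fetch itself.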
 
I'm using a regex to get the data between two tags (where the text should be), then passing it through an HTML::Parser method which converts the HTML encoding, e.g. &amp;, into the actual character.
if ($caption) {
    # Capture everything inside the caption span (non-greedy, case-insensitive, . matches newlines)
    my ($caption_src) = $mech->content() =~ m|<span\ id="imageCaption">(.*?)</span>|si;
    # Decode the HTML entities in the capture and compare with the expected caption
    $test->is_eq( $self->decode_source_text($caption_src), $caption, "Image caption matches?" );
}
I printed $caption_src before calling decode_source_text(), and also the decoded result, but I'm still not seeing the non-ASCII characters, just the day part.
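
The capture-then-decode flow described above can be sketched in isolation, using only core modules. Everything here is an assumption made for illustration: the sample page is invented, it is assumed to be UTF-8 and to escape <, > and & as entities, and `decode_basic_entities` is a hypothetical minimal stand-in for decode_source_text(), whose implementation is not shown in the thread. In real code, `decode_entities` from HTML::Entities would be the usual choice.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Expected caption from the thread: <'Hérrœ & e Paßwört´s'>-thursday
# (written with escapes so the script does not depend on file encoding).
my $expected = "<'H\x{E9}rr\x{153} & e Pa\x{DF}w\x{F6}rt\x{B4}s'>-thursday";

# Hypothetical page body as UTF-8 bytes, with markup characters escaped.
my $html = encode('UTF-8',
    qq{<span id="imageCaption">&lt;'H\x{E9}rr\x{153} &amp; e Pa\x{DF}w\x{F6}rt\x{B4}s'&gt;-thursday</span>});

# Hypothetical stand-in for decode_source_text(): handles numeric
# entities and four named ones, nothing more.
sub decode_basic_entities {
    my ($text) = @_;
    my %named = (lt => '<', gt => '>', quot => '"', amp => '&');
    $text =~ s/&#x([0-9a-fA-F]+);/chr(hex($1))/ge;
    $text =~ s/&#([0-9]+);/chr($1)/ge;
    $text =~ s/&(lt|gt|quot|amp);/$named{$1}/g;
    return $text;
}

# Decode bytes to characters first, then apply the regex from the thread.
my $page = decode('UTF-8', $html);
my ($caption_src) = $page =~ m|<span\ id="imageCaption">(.*?)</span>|si;
my $decoded = decode_basic_entities($caption_src);

binmode(STDOUT, ':encoding(UTF-8)');
print "raw capture: $caption_src\n";
print "decoded:     $decoded\n";
print $decoded eq $expected ? "match\n" : "no match\n";
```

Note that the decoded caption begins with a literal <...> pair. If such a string is printed into an HTML page or viewed in a browser, that part may be treated as a tag and not displayed, so checking the output in a terminal or with an escaped dump is safer.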
 
OK, maybe someone else can help, I don't know what the problem could be.

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 