I actually use
Code:
# Remove tags for security
$my_var =~ s/<[^>]*>//gi;
For all user input, which I'm hoping allows nothing through, but please advise if this is not the case!
Looks pretty bulletproof to me:
Code:
[kirsle@firefly ~]$ perl
my $html = q{
<html>
<title>Some &lt;title&gt;</title>
<body
bgcolor="black">
<
script
type="text/javascript"
>
window.alert("omg");
<
/script
>
<b>hello <i>world</i></b>
<img
src=""
onerror="alert('omg')"
>
</html>};
$html =~ s/<[^>]*>//gi;
print "What got through: $html\n";
__END__
What got through:
Some &lt;title&gt;
window.alert("omg");
hello world
[kirsle@firefly ~]$
One note on the flags: the "s" modifier only changes what "." matches (it lets "." match newlines), and this regexp has no "." in it. The negated class [^>] already matches newlines, which is why the multi-line tags in my example got stripped with just "gi". I've run into problems in the past where regexps that span multiple lines didn't work unless I had the "s" flag on, but those would have been patterns that used "."; adding "s" here does no harm and changes nothing. The "i" is likewise harmless, since the pattern contains no letters.
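To check the flag question, here is a minimal sketch. The test string is made up for illustration and uses Windows-style \r\n line endings inside a tag. It runs the same substitution with and without "s" so the two results can be compared:

```perl
use strict;
use warnings;

# Hypothetical input: a tag spread over several lines with \r\n line endings.
my $html = "before<b\r\nclass=\"x\"\r\n>bold</b>after";

# Same pattern, once without and once with the /s modifier.
(my $without_s = $html) =~ s/<[^>]*>//g;
(my $with_s    = $html) =~ s/<[^>]*>//gs;

print "Without /s: $without_s\n";
print "With /s:    $with_s\n";
```

Both lines should come out the same, since [^>] matches \r and \n on its own.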
There's something else I want to touch on that I discovered at my last job. Using &lt; and &gt; to block HTML can sometimes break, depending on how the HTML is used. For instance, we had a help desk system that was heavily tied into e-mail, so mail would come in from "Some Name <name@domain.com>" in that format. The generated HTML on our end was roughly like this:
Code:
<a href="ticket.html?id=12345" onMouseOver="showPopup('Some Name &lt;some_name@domain.com&gt;')" onMouseOut="hidePopup()">
Some Name &lt;some_...</a>
The idea was that a long name or e-mail address would be truncated with "..." for display inside the table, and a div would pop up on mouseover to display the full e-mail address. Notice there's &lt; and &gt; in there, which seems good, right?
The problem is that the browser decodes entities inside attribute values when it parses the page. When you "view source" you get the original code, but you need to do a "view generated source" (or a File->Save Page As) to see Firefox's internal view of the page. In Firefox's view, it looked like this:
Code:
onMouseOver="showPopup('Some Name <some_name@domain.com>')"
The visible effect was that, in the JavaScript div popup, the e-mail address was invisible and only "Some Name" was displayed, because it was trying to render <some_name@domain.com> as an HTML tag, which naturally doesn't work.
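Here's a rough sketch of why one round of escaping wasn't enough in that spot. It assumes the popup inserts its text as HTML (for example via innerHTML), which I never confirmed. The helpers escape_html and decode_once are hypothetical names, and decode_once is only a crude stand-in for the decoding a browser does on attribute values:

```perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that matter in HTML text.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;    # must run first so later entities are not re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Crude stand-in for the entity decoding a browser applies to attribute values.
sub decode_once {
    my ($text) = @_;
    $text =~ s/&lt;/</g;
    $text =~ s/&gt;/>/g;
    $text =~ s/&amp;/&/g;    # must run last so "&amp;lt;" becomes "&lt;", not "<"
    return $text;
}

my $sender = 'Some Name <some_name@domain.com>';

my $once  = escape_html($sender);   # what the help desk page contained
my $twice = escape_html($once);     # one extra round for the attribute layer

print "Escaped once, after attribute decoding:  ", decode_once($once), "\n";
print "Escaped twice, after attribute decoding: ", decode_once($twice), "\n";
```

With a single round, the JavaScript string ends up holding a literal <some_name@domain.com>. With two rounds, it holds the entity form, which would display as text if the popup treats it as HTML. This sketch doesn't cover quoting the value for the JavaScript string itself (a name containing ' would still break out), so treat it as an illustration of the layering, not a complete fix.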
I was only tech support so it wasn't my place to fix this, but I tested it by exploiting this. I sent an e-mail to the help desk with a name that contained <b> tags, to see that in the popup div, my name would appear in bold text. And then the developers fixed it.
This is just an odd case of how web browsers treat certain parts of a page, but things like this need to be watched out for too. &lt; and &gt; are generally a good idea, but you have to keep in mind how they're going to be used.
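Along the same lines, one case the tag-stripping test earlier doesn't cover, as far as I can tell, is a tag that is never closed. The regexp needs a ">" to match, so a dangling "<img ..." passes through untouched, and a browser may still treat it as a tag once it meets a ">" later in the page. A minimal sketch, with a made-up input and a hypothetical escape_html helper, of stripping first and then escaping whatever is left:

```perl
use strict;
use warnings;

# Hypothetical helper: escape characters that are special in HTML.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Made-up user input: the tag is never closed with ">".
my $input = q{hello <img src="" onerror="alert('omg')"};

# The strip finds no complete tag, so nothing is removed.
(my $stripped = $input) =~ s/<[^>]*>//g;
print "Stripped: $stripped\n";

# Escaping afterwards neutralises the stray "<" and the quotes.
my $safe = escape_html($stripped);
print "Escaped:  $safe\n";
```

Whether escaping alone is enough still depends on where the value ends up (plain text, an attribute, a JavaScript string), as the help desk story shows.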