HTML::Entities - hash copy

Does anyone see a problem with doing something
as simple as this with form input?

The object is to create two hashes from one: one for printing back to the
client safely (encoded), and the other to store the data in its original, non-encoded format.

my %in_c = %in;  # copy my hash

foreach $key (keys %in) {
    $in{$key} = encode_entities($value = $in{$key});  # encode %in in place
}

So now I have two hashes:
%in (encoded)
%in_c (decoded/raw)

The idea, since I have many print statements throughout
the program, is to use %in for printing and to save %in_c for our internal use.

It seems to work, and it seems better to do this once at the beginning of the program than to encode/decode multiple times throughout the program.



RE: HTML::Entities - hash copy

Err.. this is a perl question.

HTML::Entities is a perl module.

If you have ever done any perl cgi programming
this should be very obvious and would also
be very clear as to the question.

So if you don't know what you are
responding about... don't respond.


RE: HTML::Entities - hash copy

Your proposal sounds fine to me, but I don't see the point in going over the contents of the hash twice.  Why not encode as you copy it?

CODE --> perl

# assign values to %in_c initially

# skip this
# my %in_c = %in;  # copy my hash

foreach $key (keys %in_c) {
    $in{$key} = encode_entities($value = $in_c{$key});  # encode while copying into %in
}

The only disadvantage I can think of is that, should you decide in future to manipulate some of the data in this hash, you may forget to update both versions, with confusing results.

tgmlify - code syntax highlighting for your tek-tips posts

RE: HTML::Entities - hash copy

"you may forget to update both versions"
That was a good point I hadn't thought about, but values are never changed inline; they are assigned to a variable and modified there.

Since this script generates several dynamic web pages throughout
almost 6000 lines of code, and about 80% of the way through also
saves data out to a file, I was thinking it would be easiest to
have an encoded copy for printing back to the client when needed and a raw/decoded copy to save out. Otherwise I would be encoding/decoding things many times.

It was really this line that worried me....

$in{$key} = encode_entities($value = $in{$key});

But it seems to work OK.


RE: HTML::Entities - hash copy

What is the $value= assignment for?  Does it work without it?  (I haven't used encode_entities()).

RE: HTML::Entities - hash copy

HTML::Entities is used as part of a safety measure to help fight
against "cross site scripting".

It's possible for someone to enter malicious HTML code and
have it written back to the client browser.

Such as...

<IMG SRC="javascript:alert('XSS');">

HTML::Entities converts the special characters in the input
to their HTML entities, which browsers display as text instead of interpreting as markup.


decoded/raw input = <IMG SRC="javascript:alert('XSS');">

input encoded to HTML entities and written back to client =

&lt;IMG SRC=&quot;javascript:alert(&#39;XSS&#39;);&quot;&gt;

So basically I'm taking the raw form input value for each key, encoding it, and assigning the encoded value back to the key.
That makes it safer to display back to the client browser.
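
As a minimal sketch of the encoding described above, the following runs the example string through encode_entities and back through decode_entities. The variable names are illustrative only, and the encoded line it prints should match the one quoted in this post.

```perl
use strict;
use warnings;
use HTML::Entities;   # exports encode_entities and decode_entities by default

# The sample input from the post above.
my $raw = q{<IMG SRC="javascript:alert('XSS');">};

# Used for its return value, encode_entities leaves $raw untouched
# and hands back an encoded copy.
my $encoded = encode_entities($raw);

print "raw:     $raw\n";
print "encoded: $encoded\n";

# decode_entities reverses the encoding.
my $decoded = decode_entities($encoded);
print "round trip ok\n" if $decoded eq $raw;
```

The encoded copy contains none of the characters a browser would treat as markup or attribute delimiters, so it can be written into a page as plain text.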


Hope that made sense.


RE: HTML::Entities - hash copy

I guess you made me think about it a little more...

>What is the $value= assignment for?  Does it work without it?


$in{$key} is a reference, to get the value of the
reference you need to assign it, thus $value=$in{$key}

once you have the value, you can encode the value,
encode_entities($value), and once encoded you can store the
value back to the reference, $in{$key}=

Errr...something like that.


RE: HTML::Entities - hash copy

$in{$key} is a string, not a reference. If it were a reference, then  $value  would be one too. So the assignment inside the function call is useless, unless  $value  is used elsewhere.
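
A small sketch of this point, assuming a made-up form field called name: ref() shows the hash value is a plain string, and encode_entities returns an encoded copy when its return value is used, so no intermediate $value is needed. It only modifies its argument in place when called in void context.

```perl
use strict;
use warnings;
use HTML::Entities;

my %in = (name => q{<b>"Bob"</b>});   # hypothetical form input

# ref() returns an empty string for anything that is not a reference.
my $kind = ref $in{name};
print "hash value is a plain string\n" if $kind eq '';

# Return value used: %in is left alone and an encoded copy comes back.
my $copy = encode_entities($in{name});

# Void context: the argument itself is encoded in place.
my $scratch = $in{name};
encode_entities($scratch);

print "copy:    $copy\n";
print "scratch: $scratch\n";
```

Both forms give the same encoded text; the difference is only whether the original variable is overwritten.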

http://www.xcalcs.com : Online engineering calculations
http://www.megamag.it : Magnetic brakes for fun rides
http://www.levitans.com : Air bearing pads

RE: HTML::Entities - hash copy

So I have a, perhaps silly, question - if the data needs to be encoded such that it won't be executable, why store it in a non-encoded form? Doesn't that create the possibility that someone could innocently use the data later but forget to encode it again? Why not convert it to the 'safe' version straight away and store the data that way for future use?

As far as your code goes, a few comments:
  • Variable Naming
    • It probably doesn't seem like it now, but it would be pretty easy to use the 'wrong' hash when printing your information
    • I might suggest something more like %in_dirty, %in_clean
  • Duplicating your hash
    • There don't need to be two steps to duplicate the hash: one to make a copy, then a second to process each element. See the code below for an example.


foreach my $key (keys %in_dirty) {
  $in_clean{$key} = encode_entities($in_dirty{$key});
}

And, if you decide to store the 'encoded' data, you don't need the second hash at all. You could encode and store the data in a single hash while you're reading the data in.

RE: HTML::Entities - hash copy

Well...... Thanks for all the great input
and the heads up.


This was the line that was troubling me.

$in{$key} = encode_entities($value = $in{$key});

It does work without the "$value".


This was a really good way to do it.

foreach my $key (keys %in_dirty) {
  $in_clean{$key} = encode_entities($in_dirty{$key});
}

As everyone has been pointing out, having duplicate
hashes can cause problems. Err... well, it has caused
some issues, and I now believe it's not worth the trouble.

I'm going to keep one hash and only encode
the HTML entities where user input is being printed back
to the client.

print encode_entities($in{xx});

Makes it simpler: one hash, encoded where the
actual problem may exist.
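
One way the single-hash approach could look, as a sketch only: the helper name h and the sample data are hypothetical and not from the script discussed here. The raw value stays in %in, and encoding happens only at the point where the value is written into HTML.

```perl
use strict;
use warnings;
use HTML::Entities;

my %in = (AD1 => q{12 "Main" St <rear>});   # hypothetical raw form data

# Hypothetical helper: encode one field at the moment it is printed.
# A missing field is treated as an empty string.
sub h {
    my ($key) = @_;
    return encode_entities($in{$key} // '');
}

my $html = '<INPUT TYPE="hidden" NAME="AD1" VALUE="' . h('AD1') . '">';
print "$html\n";
```

Because %in itself is never modified, the same hash can still be written out to a file in its original form.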

I can see where people might opt to use "Apache::TaintRequest"
since it overrides the print request and handles it there.


RE: HTML::Entities - hash copy

If you are still worried about the number of times you'll have to repeatedly encode (presumably many thousands; otherwise don't worry), there is an intermediate way to do it: write your own encode function that, before calling  encode_entities(),  checks whether the value has already been encoded, something along the lines of


sub my_encode_entities($key){
This requires  %in_clean  and  %in_dirty  to be global, but that can easily be changed by using references if you mind.
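
The function body did not survive in this thread, so the following is only a guess at the caching idea described above, not the original code. It assumes the two global hashes named in the post; the sample data and the $calls counter are added purely for illustration.

```perl
use strict;
use warnings;
use HTML::Entities;

# Package globals, as the post assumes. The sample value is made up.
our %in_dirty = (comment => q{<script>alert(1)</script>});
our %in_clean;

my $calls = 0;   # counts real encode_entities calls, for illustration only

# Encode a field the first time it is asked for, then reuse the cached copy.
sub my_encode_entities {
    my ($key) = @_;
    unless (exists $in_clean{$key}) {
        $calls++;
        $in_clean{$key} = encode_entities($in_dirty{$key});
    }
    return $in_clean{$key};
}

print my_encode_entities('comment'), "\n" for 1 .. 3;
print "encode_entities ran $calls time(s)\n";
```

The trade-off is the one raised earlier in the thread: if a value in %in_dirty changes after its first use, the cached entry in %in_clean goes stale unless it is deleted as well.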

RE: HTML::Entities - hash copy

prex1, thanks for the input.

I decided to go with encoding HTML entities
at the source of concern....


print '<INPUT TYPE="hidden" NAME="AD1" VALUE="'.encode_entities($in{AD1}).'">';

On the next submit the value comes back through CGI's param() already decoded,
since the browser converts the entities back to characters when it reads the attribute.

