perl data typing issue... 1

Status
Not open for further replies.

shadedecho

OK, I have only a basic familiarity with Perl; my specialty is PHP. However, I am trying to use Perl (enabled by mod_perl) to configure my Apache instance (httpd.conf). I have that part working, but I've run into a frustrating data-typing problem when trying to construct (from MySQL queries, in this case) the right kind of arrays/hashes/etc. to pass back to the httpd.conf file.

So, rather than this being a question about the httpd.conf mod_perl configuration (from my web searching, not a particularly well known or often used feature as I am trying to do it, apparently), I just need to ask some general questions about how to construct the right data type, and then I can apply it to my code.

So, if I run this code in my httpd.conf:

Code:
$temp{'blah'} = [
        {
                a => '123',
                b => '456',
                c => '789'
        },
        {
                d => '789',
                e => '456',
                f => '123'
        }
];

print Dumper $temp{'blah'};

I get:
Code:
$VAR1 = [
          {
            'a' => '123',
            'b' => '456',
            'c' => '789'
          },
          {
            'd' => '789',
            'e' => '456',
            'f' => '123'
          }
        ];
as the output. This is the data structure I want.

I'm not sure what to use to refer to $temp or $temp{'blah'}. Is this an array, or a hash, or a list, or something else? In PHP, I'd call this a 3-dimensional (associative) array. The first dimension is the associative key "blah", alongside which I could add other entries like "blah2". The second dimension is numerically indexed (each of the two sets of { } being an anonymous, unkeyed item added to the "blah" item). The third dimension is another associative array, with the characters "a", "b", etc. acting as the indexes, and the string values "123" and so on associated with them.

In any case, the code above creates this structure explicitly. I need to recreate the same structure using loops and successive additions, as records are returned one at a time from my DB call.

My strategy was to collect the "a", "b", and "c" items into some sort of temporary variable (perhaps called a hash, of some sort?), and then once it was completely filled, then "push"ing that temp variable onto the end of the $temp{'blah'} "array".

Secondly, as I alluded to earlier, I also need a way to push some of those temporary variable items onto $temp{'blah2'}, for instance. My code logic already makes all those decisions.

Here's some sample code that demonstrates how I *was* trying to do this, and being unsuccessful:
Code:
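my @temp_var;

# each push adds a separate one-key anonymous hash to the array
push @temp_var, {'a' => '123'};
push @temp_var, {'b' => '456'};
push @temp_var, {'c' => '789'};

# the three hashes are flattened into three separate elements here
push @{ $temp{'blah'} }, @temp_var;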

When I did this, and then did a "print Dumper $temp{'blah'};", i got as output:
Code:
$VAR1 = [
          {
            'a' => '123'
          },
          {
            'b' => '456'
          },
          {
            'c' => '789'
          }
        ];

Notice this data type is different in structure from the desired one in that each {'a' => '123'} item is on the $temp{'blah'} array, rather than each being grouped as one item. Essentially, it's like my data-typing errors/wrong code eliminated one of the intended dimensions.
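
The difference comes down to how many hash references get pushed. A minimal sketch, assuming %temp is an ordinary lexical hash and using the sample values from above: pushing one anonymous hash that holds all three keys adds a single grouped element, whereas pushing three one-key hashes adds three elements.

```perl
use strict;
use warnings;
use Data::Dumper;

my %temp;

# One anonymous hash holding all three keys becomes ONE array element.
push @{ $temp{'blah'} }, { a => '123', b => '456', c => '789' };

# A second anonymous hash becomes the second element.
push @{ $temp{'blah'} }, { d => '789', e => '456', f => '123' };

$Data::Dumper::Sortkeys = 1;    # stable key order, easier to compare runs
print Dumper $temp{'blah'};
```

The @{ $temp{'blah'} } form creates the inner array on first use, so no separate initialisation is needed.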

can anyone help me get a correct set of code to create the kind of structure i want, including how to declare the temporary variables, and use them, and print them out with Dumper so I can effectively test my code?
 
I'm not sure what to use to refer to $temp or $temp{'blah'}. is this an array, or a hash, or a list, or something else?

It's a hash whose value is a reference to an array of anonymous hashes. You could loosely call it a 3-dimensional structure, but the data types are mixed:

hash -> array -> hash

to pull out bits of data:

print $temp{'blah'}->[0]->{a};
print $temp{'blah'}->[1]->{d};


or as loops:

Code:
foreach my $i (@{$temp{'blah'}}) {
   foreach my $key (keys %{$i}) {
      print $i->{$key},"\n";
   }
}

unless the main hash %temp will have more keys, it might be easier to just make that an array to begin with:

Code:
# parentheses, not square brackets: [ ] would store a single array reference in @temp
@temp = (
        {
                a => '123',
                b => '456',
                c => '789'
        },
        {
                d => '789',
                e => '456',
                f => '123'
        }
);

and if the hash keys inside the anonymous hashes will be unique there is probably no need to even use the anonymous hashes. That type of structure is good when you want to use repeated instances of the same hash keys:

Code:
# one anonymous hash per record, all sharing the same keys
@temp = (
        {
                name => 'joe',
                age => '45',
                sex => 'male'
        },
        {
                name => 'jane',
                age => '28',
                sex => 'female'
        }
);

the first three tutorials on this page will also be helpful to understand references and anonymous storage and such:

 
I read through those tutorials, and the material on references makes a whole lot of sense and helps a lot. Thank you. However, I'm still struggling with one thing, and I think it has to do (at least partially) with variable scoping.

Firstly, is it true (as it appears from my testing) that saying "my $test" outside of an if-block, and then inside the if-block saying it again, actually creates two different variables, one in the "outer" scope and one in the "inner" scope?

I think this is true, at least it strongly appears that way. If it is, it seems like a difficult problem to solve that I need to be able to re-create a hash, inside of an if-block, but doing so in the "outer" scope it was originally created in, so that code outside the if-block can see the changes.
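
A small sketch of that scoping behaviour, using made-up variable names: a second "my" inside the block declares a new variable that shadows the outer one, while a plain assignment (no "my") inside the block changes the outer variable.

```perl
use strict;
use warnings;

my $test   = 'outer';
my %record = ( a => '123' );

if (1) {
    my $test = 'inner';          # a NEW variable, visible only in this block
    %record = ( a => 'rrr' );    # no "my": this replaces the OUTER hash's contents
}

print "$test\n";         # the outer $test was never touched
print "$record{a}\n";    # the outer %record was changed inside the block
```

So the hash can be declared once outside the if-block and refilled inside it without running into the scoping problem.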

The reason I need to do this is I do in fact need to do, as you stated in your last example, repeated instances of the same hash keys.

my earlier example data structure should have been this:
Code:
%temp = (
  'blah' => [
             {
                a => '123',
                b => '456',
                c => '789'
             },
             {
                a => 'rrr',
                b => 'sss',
                c => 'ttt'
             }
            ]
);

so, in code, i have been trying (with my new adjustments from what i learned from your first post) to do this:
Code:
my %temp_var;
$temp_var{'a'} = '123';
$temp_var{'b'} = '456';
$temp_var{'c'} = '789';

my %temp;
push @{ $temp{'blah'} }, \%temp_var;

$temp_var{'a'} = 'rrr';
$temp_var{'b'} = 'sss';
$temp_var{'c'} = 'ttt';

push @{ $temp{'blah'} }, \%temp_var;

I realize that, because of references, the second set of assignments to %temp_var actually modifies the original hash whose reference is already sitting in $temp{'blah'}[0]. So what I really need right before those assignments is a way to recreate the %temp_var hash, but without the scoping issues. If I place "my %temp_var" right before the second set of assignments, my code above works fine and creates the desired structure.

But, since my *actual* code logic for this project is in a more complicated looping structure that is going through mysql results, the "recreation" of %temp_var with "my %temp_var" won't work because it happens inside of an if-statement, which is inside of the looping block, and therefore because of the scoping observations I mentioned above with "my" inside of an if-block, the change made to the %temp_var variable inside that if-block is *not* seen outside of the if-block.

Furthermore, let me state that it's critical that the second dimension be an array, as in my example structure. The only way I know to append anonymously to the end of an array is with "push". So my plan has been to gather all the "a", "b", and "c" data into a %temp_var hash over successive loop iterations on the result set, push that %temp_var hash all at once onto the end of the $temp{'blah'} array, then "reset"/"recreate" the %temp_var hash, fill it with more values from the result set, push the "new" %temp_var hash onto the end of $temp{'blah'} again, and so on.
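
One way that plan could be sketched, under stated assumptions: @rows is a hypothetical stand-in for the database result set, and the rule that key "c" ends a record is invented purely for the example. The temporary hash is held as a reference declared once outside the loop, and a fresh anonymous hash is assigned to it (without "my") inside the if-block.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical stand-in for rows fetched one at a time from the database.
my @rows = (
    [ 'blah', 'a', '123' ], [ 'blah', 'b', '456' ], [ 'blah', 'c', '789' ],
    [ 'blah', 'a', 'rrr' ], [ 'blah', 'b', 'sss' ], [ 'blah', 'c', 'ttt' ],
);

my %temp;
my $current = {};    # declared once, in the outer scope

foreach my $row (@rows) {
    my ( $group, $key, $value ) = @$row;
    $current->{$key} = $value;

    # Assumed rule for this sketch only: key 'c' marks the end of a record.
    if ( $key eq 'c' ) {
        push @{ $temp{$group} }, $current;
        $current = {};    # plain assignment: the outer variable now holds a fresh hash
    }
}

$Data::Dumper::Sortkeys = 1;
print Dumper \%temp;
```

Because each record gets its own anonymous hash, later records cannot overwrite earlier ones, and no stale keys carry over from one record to the next.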

It bears noting that I have found that also the following code doesn't work, and I don't know why. Perhaps the answer has bearing on what I'm trying to do, I don't know:
Code:
my %temp_var;
$temp_var{'a'} = '123';
$temp_var{'b'} = '456';
$temp_var{'c'} = '789';

push @{ $temp{'blah'} }, \%temp_var;
push @{ $temp{'blah'} }, \%temp_var;

My guess would have been that this just would have put two side-by-side references to the temp_var hash into the array. But it doesn't work that way.
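
A possible explanation, offered as an assumption since the post does not show the actual output: the two pushes do store two references to the same hash, but Data::Dumper prints the second one as a back-reference ($VAR1->[0]) instead of repeating the contents, which can look as if the push failed. The sketch below adds "my %temp" so that it runs under strict.

```perl
use strict;
use warnings;
use Data::Dumper;

my %temp;
my %temp_var = ( a => '123', b => '456', c => '789' );

push @{ $temp{'blah'} }, \%temp_var;
push @{ $temp{'blah'} }, \%temp_var;

# Both elements hold the same address, so Dumper shows the second one
# as a back-reference to the first rather than printing the hash again.
my $dump = Dumper $temp{'blah'};
print $dump;

# A change made through one element is visible through the other.
$temp{'blah'}[0]{a} = 'changed';
print $temp{'blah'}[1]{a}, "\n";
```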
 
ok, well I figured it out, i think... at least, my code is now working (mostly), so i *hope* that means I figured the major part of it out.

instead of:

push @{ $temp{'blah'} }, \%temp_var;

I am now doing:

push @{ $temp{'blah'} }, {%temp_var};

If I understood the tutorials correctly, the difference between them is that the first one pushes the actual reference (address) of temp_var into the array, while the second one with the { } around the %temp_var actually creates another reference (to perhaps a copy, i guess?) of %temp_var. Am I correct?

This change now lets me do two pushes of the same hash onto the array, as i had mentioned above didn't work, plus it lets me push the temp_var, then change it, then push it again.
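
A short sketch of the difference, with illustrative names ($alias and $copy are not from the original code): \%temp_var is a reference to the existing hash, while { %temp_var } builds a new anonymous hash filled with the current keys and values.

```perl
use strict;
use warnings;

my %temp_var = ( a => '123' );

my $alias = \%temp_var;       # reference to the original hash
my $copy  = { %temp_var };    # new anonymous hash with copied keys and values

$temp_var{a} = 'rrr';

print $alias->{a}, "\n";    # follows the change made to %temp_var
print $copy->{a},  "\n";    # keeps the value from the moment of copying
```

The copy is shallow: if a value is itself a reference, the copy and the original still share whatever that reference points to.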

the *only* thing that remains now it seems is to account for the case where temp_var has "a", "b" and "c" in it, and the next set of iterations only set "a" and "b", but not "c", which *should imply* that "c" is not wanted. If I were actually re-creating temp_var each time, this wouldn't be an issue. But since I'm not, and don't even know how to correctly, old data from a previous iteration is still in the %temp_var hash.

I know I could manually do delete, but what's a good compact way of knowing what's in a hash and feeding that to a delete? Is there a way to have a loop that would delete all the elements in a hash, which would effectively "reset" my hash as I need it to?
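
Two compact ways to empty a hash in place, sketched with the sample values from above: assigning an empty list, or deleting a hash slice built from the hash's own keys. The counter variables are there only to show the result.

```perl
use strict;
use warnings;

my %temp_var = ( a => '123', b => '456', c => '789' );

%temp_var = ();                           # assign an empty list: hash is now empty
my $left_after_clear = keys %temp_var;    # number of keys remaining

%temp_var = ( a => 'rrr', b => 'sss', c => 'ttt' );
delete @temp_var{ keys %temp_var };       # hash-slice delete, same effect
my $left_after_delete = keys %temp_var;

print "$left_after_clear $left_after_delete\n";
```

Either form also empties the hash as seen through any reference created earlier with \%temp_var, so a copy such as { %temp_var } needs to be pushed before the reset.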
 
FWIW:
I solved the copy and delete issues, with simple while loops. my code works fine.

Then I realized that I need to preserve order in the hashes, since some Apache configuration depends on the order of the directives.

so I found Tie::IxHash which allows you to tie() your hashes to this class, and it transforms the hash into an order-preserving hash, almost auto-magically. beautiful piece of code.

my pushing of various references onto the arrays and stuff was causing some of the ordered-hash stuff to be lost, so i had to reorganize my code, but i got it so that this part was working.

then I realized that i also need to support duplicate keys in the hash. apache also allows this, and in some cases it does the "overwriting" concept (standard hash), but in some cases it actually accepts multiple configurations with the same name.

so i found Tie::DxHash, which is the same as IxHash except with duplicate key support. sounds great right?

Not exactly. I've found a bug in Tie::DxHash which makes it unusable for me right now. Basically, once you call the "keys" function (or other similar iterator-based functions) on the hash, it is no longer usable for adding things, and I haven't found a way to force the internal iterator to reset.

this bug is *not* present in the IxHash version, so i've had to switch back temporarily, and then wait for someone to fix DxHash so I can fully support the apache configuration.
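
While waiting on a fix, one core-Perl fallback that keeps both insertion order and duplicate keys is a plain array of [name, value] pairs. This is only a sketch of the idea, not a drop-in replacement for a tied hash: the values below are made up, and whether the code that consumes the data accepts this shape is an assumption that would need checking.

```perl
use strict;
use warnings;

# Each entry is a [name, value] pair; the array keeps insertion order
# and allows the same name to appear more than once.
my @directives;
push @directives, [ 'Listen',     '80' ];
push @directives, [ 'Listen',     '443' ];
push @directives, [ 'ServerName', 'example.com' ];

# Iterating over the list does not stop later additions from showing up.
my @names_before = map { $_->[0] } @directives;
push @directives, [ 'Listen', '8080' ];

# Collect every value stored under a given name, in insertion order.
my @listen = map { $_->[1] } grep { $_->[0] eq 'Listen' } @directives;
print join( ',', @listen ), "\n";
```

The trade-off is that lookup by name becomes a linear scan (the grep above) instead of a direct hash access.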
 
FYI: looks like the bug in Tie::DxHash is somehow related to the iteration functions not correctly updating all internal pointers, because after a call to one of the iteration functions (or feeding the hash to Dumper for instance), I *can* still add more values to the hash, but they do not then show up in a subsequent iteration over the hash.

However, I can access those new items directly, by index, except that I get this warning: "Use of uninitialized value in string eq at /usr/local/lib/perl5/site_perl/5.8.8/Tie/DxHash.pm line 69." So, apparently, something like an "end-of-hash" pointer or something is not being updated appropriately after iteration.

anyone have any ideas? anyone good enough at perl to look at the code for this package and see how to fix this problem?
 