Using Perl data structures

sirtificate · Jun 2, 2010

I am trying to develop a Perl script for a genealogical application, which I plan to open source, that stores collections of up to 15 hash key/value pairs in approximately 500 discrete, indexed array elements. The hash pairs are collected in the first phase of the script. The second phase iterates over the array and processes the retrieved hash collections, sorting each by value and processing the hash keys in that order.

I've included the code I've put together so far that aims to deal with a single array element, but it fails at the point that I try to put a hash collection into an array element.

I'd really appreciate help with the code to manipulate these data structures - or maybe there is a better way to achieve what I'm trying to do.

#!/usr/bin/perl

use strict;
my ($x, $key, %hash, %new_hash, @array);
sub hashValueAscendingNum
{
$hash{$a} <=> $hash{$b};
}

#Create the hash
for($x=0; $x<5; $x++)
{
$hash{$x} = $x-5;
}

#Print the hash
foreach $key (keys %hash)
{
print "Pre-sort xref - " . $key . " weight - " . $hash{$key} . "\n";
}

#Store the hash collection in an array element

@array[1] = %hash;

print "Array = $array[1]\n";

#Retrieve the hash from the array into a new hash

%new_hash = @array[1];

#Print the new hash in ascending order by hash value

foreach $key (sort hashValueAscendingNum (keys(%new_hash)))
{
print "Post-sort xref - " . $key . " weight - " . $new_hash{$key} . "\n";
}

PinkeyNBrain · Jun 2, 2010

I'm not sure I have a good grasp on what your final goal is, so assisting will be a little ad hoc. One thing to keep in mind about Perl is that it will let you do a lot of really creative things that can be way beyond what you started out doing. While

Code:

@array[1] = %hash;

works, the parser will grumble about it (with warnings enabled), indicating that

Code:

$array[1] = %hash;

**may** be what you really wanted. Perhaps you were unknowingly wanting

Code:

$array[1] = \%hash;

, in which case something like

Code:

foreach my $hash_ref (@array) {
   next unless ref($hash_ref) eq 'HASH';   # skip elements that hold no hash reference, e.g. $array[0]
   # An inline sort block sees the lexical $hash_ref, so this also runs under "use strict"
   foreach my $key (sort { ${$hash_ref}{$a} <=> ${$hash_ref}{$b} } keys(%{$hash_ref})) {
      print "Post-sort xref - " . $key . " weight - " . ${$hash_ref}{$key} . "\n";
      # This will work too
      #print "Post-sort xref - " . $key . " weight - " . $hash_ref->{$key} . "\n";
   };
};

could have potential (I tend to be liberal w/ ;'s. For any of you who don't care for them, please just look the other way)
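For anyone following along, here is a minimal self-contained sketch of the reference approach, assuming the same five-entry test hash as the original script. The names %family and @community are made up for illustration. It stores a copy of the hash as an anonymous hash reference, which matters once the same working hash is reused for several array elements, and then sorts the keys by value.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical names: %family and @community are illustrative only.
my (%family, @community);

# Build the same test data as the original script: keys 0..4, values -5..-1
$family{$_} = $_ - 5 for 0 .. 4;

# Store a reference to a *copy* of the hash. Using \%family instead would
# make every element that stores it point at the one shared hash.
$community[1] = { %family };

# Retrieve the reference and sort its keys numerically by value
my $hash_ref = $community[1];
my @sorted = sort { $hash_ref->{$a} <=> $hash_ref->{$b} } keys %{$hash_ref};

print "Post-sort xref - $_ weight - $hash_ref->{$_}\n" for @sorted;
```

Printing $community[1] directly would show something like HASH(0x...), which is the reference itself rather than the hash contents.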

The project you've outlined above (as I'm reading/understanding it) looks like it could become fairly complex. A suggestion would be to write several rounds of pseudocode first to help plan out what the code is going to do.

sirtificate · Jun 2, 2010

Hi,

Let me take one step backwards and explain what I'm trying to do. I'm a genealogist with a computer science background (non-Perl) and I have a theory that it ought to be possible to create a series of family trees from transcribed census data by associating individuals into family groups.

As a test sample I have 4,500 records in a MySQL database table, which I process in enumeration sequence (by house). I then assign a weighting to each individual in the 'proto-family', aiming to identify the husband, wife and children. I aim to sequence the family members so that the husband, if there is one, appears first, the wife next, followed by the children.

Translated to a Perl script, I aim for each family to be a hash in which each entry is an ID with a weighting, and for each hash to be an element in the community array (approx 500 elements in the sample). The end result will be to iterate through the array, sort each hash based on the weighting, and output family records through the Gedcom Perl package.

I hope this explains what I'm trying to do in sufficient detail - if not please ask for further clarification.

Thanks

John

PinkeyNBrain · Jun 2, 2010

It's been a while since I worked with census data and I've never worked with Gedcom. Considering your reference to a CS background I'll stay fairly high level to move this along. As I'm starting I'll admit that some of my programming style may not be the most elegant on the first pass.
- First read in the data (is your SQL a single table?). I'm guessing a data structure similar to

Code:

$census_data_raw{$house_id}{$psudo_fam_id}{$member_id}{'age'} = $age ;
$census_data_raw{$house_id}{$psudo_fam_id}{$member_id}{'gender'} = $gender ;

will hold the data fairly well. I'm adding a psudo_fam var as your data may show homes with more than one family unit (live-in grandparents, for example). Once read in, loop back over the data using your weighting scheme to build a second hash that looks something like

Code:

$census_data_sorted{$house_id}{$psudo_fam_id}{$weight} = $member_id;

From here use the sorted hash to loop over the data for printing, from within that you can use the member_id to link back into the census_data_raw hash to get age, gender, etc.
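A rough sketch of those three passes, under some loud assumptions: the sample rows are made up, weight_for() is only a placeholder and not the actual weighting scheme, and weights are assumed to be unique within a family (two members with the same weight would overwrite each other in the sorted hash).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up sample rows: [house_id, psudo_fam_id, member_id, age, gender]
my @rows = (
    [ 1, 1, 'p3', 12, 'F' ],
    [ 1, 1, 'p1', 40, 'M' ],
    [ 1, 1, 'p2', 38, 'F' ],
    [ 1, 1, 'p4',  9, 'M' ],
);

# Placeholder weighting, NOT the poster's scheme: adult male first,
# adult female second, then children from oldest to youngest.
sub weight_for {
    my ($age, $gender) = @_;
    return 1 if $age >= 18 && $gender eq 'M';
    return 2 if $age >= 18;
    return 100 - $age;
}

my (%census_data_raw, %census_data_sorted);

# Pass 1: read the rows into the nested "raw" hash
for my $row (@rows) {
    my ($house_id, $psudo_fam_id, $member_id, $age, $gender) = @$row;
    $census_data_raw{$house_id}{$psudo_fam_id}{$member_id} =
        { age => $age, gender => $gender };
}

# Pass 2: index each member by weight (assumes weights are unique
# within a family; equal weights would overwrite one another)
for my $house_id (keys %census_data_raw) {
    for my $fam_id (keys %{ $census_data_raw{$house_id} }) {
        my $members = $census_data_raw{$house_id}{$fam_id};
        for my $member_id (keys %$members) {
            my $w = weight_for(@{ $members->{$member_id} }{qw(age gender)});
            $census_data_sorted{$house_id}{$fam_id}{$w} = $member_id;
        }
    }
}

# Pass 3: walk the sorted index and link back into the raw data
my @order;
for my $house_id (sort { $a <=> $b } keys %census_data_sorted) {
    for my $fam_id (sort { $a <=> $b } keys %{ $census_data_sorted{$house_id} }) {
        my $by_weight = $census_data_sorted{$house_id}{$fam_id};
        for my $w (sort { $a <=> $b } keys %$by_weight) {
            my $id = $by_weight->{$w};
            my $p  = $census_data_raw{$house_id}{$fam_id}{$id};
            push @order, $id;
            print "house $house_id family $fam_id: $id age $p->{age} $p->{gender}\n";
        }
    }
}
```

If ties in weight are possible in the real data, one option is to store an array of member IDs under each weight rather than a single ID.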

sirtificate · Jun 2, 2010

Spot on, I'm aiming for 80:20 or better success automatically rather than going through 200 family groups manually.

Even though that's what I'm planning to do in order to produce stats for a public lecture I'm due to give in August.

John
