I have come across a case where two identical data structures are stored in two different files using the Storable module, yet the sizes of the two files differ by about 10%.
Any ideas why this is so?
> perl compare.pl destdighash destdighashnew
structures of destdighash and destdighashnew are identical.
Died at compare.pl line 12.
me@linux:~/SECUR/KPI> ls -l destdig*
-rw-r--r-- 1 me users 8872 2007-10-20 02:00 destdighash
-rw-r--r-- 1 me users 7938 2007-10-25 21:25 destdighashnew
Here is the script:
use strict;
use Storable;
use Data::Compare;
require('dumpvar.pl');
my $file1=shift; my $file2=shift;#e.g. destdighash, destdighashnew
if(-e "$file1"){ $main::rdestdighash=retrieve("$file1");}else{die "could not find $file1 to get hashref1 from\n";}
if(-e "$file2"){ $main::rdestdighashnew=retrieve("$file2");}else{die "could not find $file2 to get hashref2 from\n";}
print "structures of $file1 and $file2 are ",
Compare($main::rdestdighash, $main::rdestdighashnew) ? "" : "not ", "identical.\n";
die;
My only guess as to the size difference is that the original file was created on an older (though also Linux) system.
Still, I cannot figure out why this would happen.
My guess would be that the difference is not caused by the file being created on an older Linux system in and of itself, but by it being created with an older version of Storable. It would make sense for the module to maintain backward compatibility when retrieving. But given that Storable is at version 2.16, it has most likely had a few efficiency improvements over the years.
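One way to check this guess is to read the header that Storable writes at the start of each file. The sketch below assumes a Storable release that provides Storable::file_magic; the helper name describe_storable_file and the two temporary files are made up for illustration and stand in for destdighash and destdighashnew.

```perl
use strict;
use warnings;
use Storable qw(store nstore);
use File::Temp qw(tempdir);

# Hypothetical helper: summarize the Storable header of one file.
sub describe_storable_file {
    my ($path) = @_;
    my $info = Storable::file_magic($path)
        or die "$path does not look like a Storable file\n";
    return {
        bytes    => -s $path,          # size on disk
        version  => $info->{version},  # file format version, not the module version
        netorder => $info->{netorder}, # true if written with nstore
        # The next two are only reported for native-order files.
        intsize   => $info->{intsize},
        byteorder => $info->{byteorder},
    };
}

# Stand-in data: the same hash written once in native and once in network order.
my $dir  = tempdir(CLEANUP => 1);
my $data = { map { ("key$_" => $_ * 1000) } 1 .. 200 };
store($data,  "$dir/native.sto");
nstore($data, "$dir/network.sto");

my %summary = map { $_ => describe_storable_file("$dir/$_.sto") } qw(native network);
for my $name (sort keys %summary) {
    my $s = $summary{$name};
    printf "%-8s %6d bytes, format %s, netorder=%s, intsize=%s\n",
        $name, $s->{bytes}, $s->{version}, $s->{netorder},
        defined $s->{intsize} ? $s->{intsize} : 'n/a';
}
```

Running the same helper on the two real files would show whether they differ in format version, byte order or integer size, which narrows down where the extra bytes come from.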
In the end, unless the documentation states that there should be a one-to-one relationship between a data structure and the size of its stored file, I would not expect one. There is not even such a relationship between Perl's basic data structures, such as hashes, and their memory usage.
Makes sense. The answer I got from perl-porters was:
"I see more white space in the second file. I see more non-ASCII characters in the second
file.
I speculate that the systems differ in either or both of: integer size (32-bit vs
64-bit), locale (ASCII vs Unicode). I can't tell from here, though. Some information
about the character of the systems may help someone help you figure it out."
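The effect of integer representation can be reproduced on a single machine. The sketch below is only an illustration with made-up data, not a diagnosis of the two files above: it serializes the same thousand values once as integers and once as strings, and the integers additionally in network order, then prints the resulting sizes.

```perl
use strict;
use warnings;
use Storable qw(freeze nfreeze);

# The same values held two ways: as integers and as strings.
my $as_ints    = [ 1 .. 1000 ];
my $as_strings = [ map { sprintf('%d', $_) } 1 .. 1000 ];

# Serialize first, before any comparison makes Perl cache
# a string form of the integers.
my %size = (
    int_native  => length freeze($as_ints),     # native integer width
    int_network => length nfreeze($as_ints),    # portable network order
    str_native  => length freeze($as_strings),  # values stored as strings
);

# A content comparison sees no difference between the two arrays.
my $same_content = join(',', @$as_ints) eq join(',', @$as_strings);
print "contents match: ", ($same_content ? "yes" : "no"), "\n";
printf "%-12s %d bytes\n", $_, $size{$_} for sort keys %size;
```

If the script that wrote the older file produced its values as strings and the newer one as numbers (or the reverse), or if one side used nstore, a comparison by content would still report the structures as identical while the files differ in size.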