Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chriss Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Analyze perl script - Recursive calls - crazy code! 6

Status
Not open for further replies.

BrianAtWork

Programmer
Apr 12, 2004
148
US
I found a perl script that gives you the disk usage of a directory and outputs it in a tree structure.
This isn't coded the best, but it does bring up some interesting techniques that I thought PaulTEG, WarBlade, Kevin, and others might enjoy explaining? There were a few lines in this code especially that I was wondering if people could explain to me.
So here's the code:
Code:
#!/usr/bin/perl

@lines = `du -k @ARGV`;
chomp(@lines);
&input($top = pop @lines);
&output($top);
exit $?;

sub input {
    local($root, *kid, $him) = @_[0,0];
    while (@lines && &childof($root, $lines[$#lines])) {
    &input($him = pop(@lines));
    push(@kid, $him);
    }
    @kid = &sizesort(*kid);
}

sub output {
    local($root, *kid, $prefix) = @_[0,0,1];
    local($size, $path) = split(' ', $root);
    $path =~ s!.*/!!;
    $line = sprintf("%${width}d %s", $size, $path);
    print $prefix, $line, "\n";
    $prefix .= $line;
    $prefix =~ s/\d /| /; $prefix =~ s/[^|]/ /g;
    local($width) = $kid[0] =~ /(\d+)/ && length("$1");
    for (@kid) { &output($_, $prefix); };
}

sub sizesort {
    local(*list, @index) = shift;
    sub bynum { $index[$b] <=> $index[$a]; }
    for (@list) { push(@index, /(\d+)/); }
    @list[sort bynum 0..$#list];
}

sub childof {
    local(@pair) = @_;
    for (@pair) { s/^\d+\s+//g; }
    index($pair[1], $pair[0]) >= 0;
}

In the "input" subroutine, there is what I am guessing is a pointer to the array @kid. Can someone explain what happens on the last line of the "input" sub?
And can someone explain the 2nd to last line in the "output" sub? It's a little confusing.

It's a Friday, so I just thought I would generate some discussion. This code is a little hard to read, but I thought the more experienced people could explain some of the "features"?

Thanks in advance!

Brian
 
I think that this is hideous code and I certainly wouldn't have picked it for a teaching example. The author clearly didn't understand the difference between local() and my() and uses bizarre and dangerous syntax for passing arrays around: \@ is preferable to using globs any day. Some of the constucts are simply showing off, eg [tt]local($root, *kid, $him) = @_[0,0];[/tt]. This is hard to read, hard to maintain, arcane for the sake of it and generaly bad coding.

You'll learn much more by throwing this away and writing your own. We'll help if you get stuck.

Yours,

fish

&quot;As soon as we started programming, we found to our surprise that it wasn't as easy to get programs right as we had thought. Debugging had to be discovered. I can remember the exact instant when I realized that a large part of my life from then on was going to be spent in finding mistakes in my own programs.&quot;
--Maurice Wilkes
 
hehehe.... ripped that code a new hole. It is unusual to say the least.

*kid is similar to a reference. It's called a typeglob. typeglobs are faster than references though because there is no dereferencing. But don't be too tempted to use them. For small scripts I see no problem, but they can lead to buggy code if you're not careful. There is a very good chapter in "Advanced Perl Programming" explaining typeglobs and aliases. If I remember correctly, typeglob aliases is the only way to pass filehandles to other sub routines, which is something I never seem to have a need for anyway.

@kid = &sizesort(*kid);

the above line is sending an alias of @kid to the sub routine sizesort();

local($width) = $kid[0] =~ /(\d+)/ && length("$1");

this line just assigns the value of the pattern match (\d+) to $width if length("$1") is true. I guess it's just making sure the match isn't empty. I'm not sure why they author used local($width) though.

 
If I remember correctly, typeglob aliases is the only way to pass filehandles to other sub routines

I think that this used to be true but, these days, I'd use a scalar reference to an IO::File object rather than a typeglob if I wanted to pass a filehandle as an argument.

...which makes me wonder: was this code written before my() appeared in perl?

For those not into globs, *kid is a sort of referece to everything global called kid, including $kid, @kid and %kid. Which thing is being referred to is worked out by the context of the dereference, or explicitly using, eg, *tie{SCALAR} or *tie{ARRAY}.

Why not use them? First of all you arguably shouldn't be using globals without a very good reason anyway (sparsely populated namespaces contain fewer suprises); secondly you have to really know what you are doing to work out what would happen to a scalar variable called $tie if you are mucking round with *tie to get to @tie. It's really not worth the bother and risk to save one dereference per loop unless your code is extremely time-critical.

I used to spend long hours optimising code whereas now I choose to write clear code. If it runs too slowly, I put it on a faster box and justify the expense by the hours saved on development and maintenance and the reduced probability of subtle bugs and their knock-on effects. Bugs in production code can cost a lot of money and uncalculable good-will and if the price of avoiding them is a second processor, then it's a bargain.

</rant>

f

&quot;As soon as we started programming, we found to our surprise that it wasn't as easy to get programs right as we had thought. Debugging had to be discovered. I can remember the exact instant when I realized that a large part of my life from then on was going to be spent in finding mistakes in my own programs.&quot;
--Maurice
 
* for that, problems can be code related, but their solutions sometimes aren't.

If a faster processor, second processor, or RAM is a option, calculate a developers time in relation to the cost of upgrading.

I'm not for a minute advocating the use of sloppy or inefficient code, but suggesting that a more cost effective route to a solution could be to look elsewhere

--Paul

Spend an hour a week on CPAN, helps cure all known programming ailments ;-)
 
Thanks for the star. If you've ever had an entire company sitting twiddling their thumbs waiting for you to sort out a problem caused by some arcane usage that probably felt clever at the time and is now simply incomprehensible, you'd spend a lot to avoid being in the same position again.

I used to think that the "target audience" for the code was primarily the computer. Now I think that it's probably going to be me, three years down the line, so I try and be kinder.

f

PS - are you still cigless?

[&quot;]As soon as we started programming, we found to our surprise that it wasn't as easy to get programs right as we had thought. Debugging had to be discovered. I can remember the exact instant when I realized that a large part of my life from then on was going to be spent in finding mistakes in my own programs.[&quot;]
--Maur
 
No --cigless used to be my sig, now I'm trying to use that space to evangelicize(sp?) CPAN.

I was thinking about giving up the cigarettes once ... I was 12
--Paul

Spend an hour a week on CPAN, helps cure all known programming ailments ;-)
 
I think that this used to be true but, these days, I'd use a scalar reference to an IO::File object rather than a typeglob if I wanted to pass a filehandle as an argument.

Yes, thanks for making that clear.

I think you're right about the code maybe being old.
 
Thanks everyone for the wise words. I totally agree this code is hideous - that's why my subject line said "Crazy Code" - it sure is crazy! I guess it wasn't the best specimen for a teaching example, but when I found this code I saw a few things I hadn't seen in Perl before and thought maybe I could get some collective wisdom and what was going on in the code.

Thanks for the suggestion on re-writing it. I think I'll pass unless I'm really bored - This code doesn't get used, I just found it in the archives and was curious about the strange methods.

And thanks for the explanation on the typeglob - apparently I haven't missed much by not using it before, but I may brush up on them a little just to get a better understanding of Perl.

Thanks everyone for your replies! I appreciate it!

Brian
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top