×
INTELLIGENT WORK FORUMS
FOR COMPUTER PROFESSIONALS

Log In

Come Join Us!

Are you a
Computer / IT professional?
Join Tek-Tips Forums!
  • Talk With Other Members
  • Be Notified Of Responses
    To Your Posts
  • Keyword Search
  • One-Click Access To Your
    Favorite Forums
  • Automated Signatures
    On Your Posts
  • Best Of All, It's Free!
  • Students Click Here

*Tek-Tips's functionality depends on members receiving e-mail. By joining you are opting in to receive e-mail.

Posting Guidelines

Promoting, selling, recruiting, coursework and thesis posting is forbidden.

Students Click Here

Jobs

Sorting

How do I customize Perl's sort function? by Geekatron
Posted: 3 May 04 (Edited 3 May 04)

Managing lists of things is something Perl is exceptionally good at.  As such, it comes with a predefined "sort" function that can be called anywhere, on any kind of list, and can even be reversed using "reverse". (This is covered in any basic Perl reference so I won't go into their uses here.)

However, what if the kind of sorting you need to do is more complex than simply ordering a list numerically or alphabetically?  No worries! Perl comes with a built-in framework by which you can painlessly create custom sort routines and use them just like you would use the basic "sort" function.  Here's how to make use of this often-overlooked but powerful feature of Perl.


                 BUILDING A CUSTOM SORT

IMPORTANT: Custom sort subroutines are not normal subroutines! They have some special rules:

(1)  Sort subroutines do not accept arguments like normal
     subroutines.

(2)  Two values being compared are automatically
     passed-by-reference into the sort subroutine as
     $a and $b. Don't try to change their values or
     override them.

(3)  A sort subroutine can only return one of three
     values: -1 , 0,  or 1.

(4)  Don't call sort routines recursively. The compiler
     will be angry with you.


More on these later, but I want you to keep these in mind while you look at the examples below.


So, lets look at an example. Suppose I have the following hash:

%hashfoo = (
              'eep' , 23,
              'eek' , 4,
              'ack' , 17
           );

If I wanted to print my hash sorted by keys, i'd be home free:

foreach $key (sort keys %hashfoo) {
  print "$key => $hashfoo{$key}\n";
}

But alas, I am a high-maintenance programmer; i want to do it MY way, and sort it by value instead. Here's how I can do that with a subroutine.

As I mentioned, I don't have to pass anything into my sort subroutine - when I call it as a sort, it will automatically get two values passed in to be compared, $a and $b. I don't have to get them out of @_, or do a secret voodoo dance, or anything - they just appear, *POOF*, like magic.

In the example above, sort is getting sent hash keys. So all I need to do is compare the corresponding hash VALUES, and return a signal (-1, 0, or 1) indicating which value is bigger.

sub my_sort {
  # got hash keys $a and $b automatically
  return -1 if $hashfoo{$a} < $hashfoo{$b};
  return 0 if $hashfoo{$a} == $hashfoo{$b};
  return 1 if $hashfoo{$a} > $hashfoo{$b};
}

This will organize my keys neatly, by value, from smallest to largest. However, I've spelled out the last three lines
of the sub so you can see exactly what is happening. Now I want to compact it a bit:

sub my_sort {
  # got hash keys $a and $b automatically
  return $hashfoo{$a} <=> $hashfoo{$b};
}

The <=> operator automatically does for you what the three return lines in the first sub example did, only it's smaller. Small is good!


Ok, Now I have my subroutine. How do I use it?

The "sort" function in Perl can take an extra argument - the name of a subroutine (which we just wrote) to use for
sorting:

sort <subroutine> <list of things to sort>

So now my loop to print them out looks like this:

foreach $key (sort my_sort keys %hashfoo) {
  print "$key => $hashfoo{$key}\n";
}

So, on the first line, "keys %hashfoo" pulls out all of the keys from my hash and returns them as a list. Sort then feeds them to the 'my_sort' sub, one pair at a time, and orders each pair until it has sorted the whole list. Foreach then loops over the sorted keys and prints them out. Woohoo!

Here's the whole script so far:

%hashfoo = (
              'eep' , 23,
              'eek' , 4,
              'ack' , 17
           );

foreach $key (sort my_sort keys %hashfoo) {
  print "$key => $hashfoo{$key}\n";
}

exit;

# custom sort by by hash value
sub my_sort {
  # got hash keys $a and $b automatically
  return $hashfoo{$a} <=> $hashfoo{$b};
}



Now, suppose I am even wierder, and want to do things to those values before I compare them. Easy, I'll just add code to do that in my sub.

sub my_sort {
  # got hash keys $a and $b automatically
  
  # do math silliness to the values
  my $a_val = 100 - $hashfoo{$a};
  my $b_val = 100 - $hashfoo{$b};
  
  # compare the modified values instead of the originals
  return $a_val <=> $b_val;
}

But wait! I want the order reversed from what the <=> gives me. Do I have to split that back up into three statements and switch the < and > rules?

No. I'll just use Perl's handy "reverse" function which reverses a list for me. It works the same with a custom sort as with a normal one:

foreach $key (reverse sort my_sort keys %hashfoo) {
  print "$key => $hashfoo{$key}\n";
}

One final thing: suppose I want to compare two strings instead of two numbers in my subroutine? No problem - just replace the '<=>' operator with 'cmp'.
 
That's all there is to it, really. What you do to sort the values in the privacy of your own sort subroutine is your business; nobody else need ever know.



                                  
       EXAMPLE: SORTING FILES BY LAST MODIFIED DATE

Here's a more useful example of using a custom sort subroutine to print a list of files sorted by last modified date.


#!/usr/bin/perl

# get list of text files in current folder (unix filesystem)
@files = `ls -1 *.txt`;


foreach $file (sort by_last_mod @files) {
  print "$file\n";
}

exit;

# custom sort routine
sub by_last_mod {
  # vars $a and $b automatically passed in

  # perl function 'stat' returns array of info on a file
  # 10th element of the stat array is last modified date,
  # returned as number of seconds since 1/1/1970.
  my $adate = (stat($a))[9]; # get last modified date
  my $bdate = (stat($b))[9]; # get last modified date

  return $adate <=> $bdate;
}


Hope this is useful!

--G

Back to Perl FAQ Index
Back to Perl Forum

My Archive

Close Box

Join Tek-Tips® Today!

Join your peers on the Internet's largest technical computer professional community.
It's easy to join and it's free.

Here's Why Members Love Tek-Tips Forums:

Register now while it's still free!

Already a member? Close this window and log in.

Join Us             Close