Sorting Arrays?

Guest_imported

I have an array of lines, each line containing tab-delimited fields, e.g.
array name: @data
$data[0] = "Fred\tBloggs\tMyhouse";
$data[1] = "Tony\tBlair\tDowning St";

etc.
How do I sort the array in Perl using a particular field, e.g. sort by address?
 
Here's the sort command you need:
Code:
@sorted = sort MySort @data;
BUT, you'll also need the following subroutine to do the actual sorting:
Code:
sub MySort {
   my @a = split("\t", $a);
   my @b = split("\t", $b);
   my $sortfield = 3;
   return $a[$sortfield-1] cmp $b[$sortfield-1];
}
The above will sort on the THIRD field of each line. To sort on something else, change the value of $sortfield.

It's probably more efficient to hard code the subscripts instead of using $sortfield-1, but I did it that way to illustrate that the THIRD field is the element with subscript 2 (array subscripts start at 0), and to illustrate WHERE you need the sortfield to be in the sort routine.
Tracy Dryden
tracy@bydisn.com

Meddle not in the affairs of dragons,
For you are crunchy, and good with mustard.
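
For anyone who wants to try the subroutine approach end to end, here is a minimal sketch. The sample data and the name by_field are made up for illustration; the logic is the same as MySort above, with $sortfield moved outside the sub so it is easy to change.

```perl
use strict;
use warnings;

# Hypothetical sample data: first name, last name, address, tab-separated.
my @data = (
    "Tony\tBlair\tDowning St",
    "Fred\tBloggs\tMyhouse",
    "Ann\tOther\tAcacia Ave",
);

# 1-based number of the field to sort on; 3 means the address.
my $sortfield = 3;

sub by_field {
    # $a and $b are the two lines that sort is currently comparing.
    my @fields_a = split /\t/, $a;
    my @fields_b = split /\t/, $b;
    return $fields_a[$sortfield - 1] cmp $fields_b[$sortfield - 1];
}

my @sorted = sort by_field @data;
print "$_\n" for @sorted;
```

With this data the lines come out in address order: Acacia Ave, Downing St, Myhouse.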
 
Another route: The good ol' Schwartzian Transform.

@sorted = map{$_->[1]} sort{$a->[0] cmp $b->[0]} map{my @s=split/\t/,$_;[$s[i],$_]}@array;

Where i is the zero-based index of the field you're sorting on. The bare i is only a placeholder; put a number or a variable such as $i there.

I haven't tested that, but I think memory serves well enough.

brendanc@icehouse.net
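
A runnable version of that one-liner might look like the sketch below. The sample data and the variable $i are assumptions added for illustration; $i stands in for the placeholder i and is set to 2 so the sort is on the third field.

```perl
use strict;
use warnings;

# Hypothetical sample data: first name, last name, address, tab-separated.
my @array = (
    "Tony\tBlair\tDowning St",
    "Fred\tBloggs\tMyhouse",
    "Ann\tOther\tAcacia Ave",
);

my $i = 2;    # zero-based index of the sort field (2 = third field)

# Read the chain from the bottom up.
my @sorted =
    map  { $_->[1] }                        # 3. keep only the original line
    sort { $a->[0] cmp $b->[0] }            # 2. compare the extracted keys
    map  { [ (split /\t/, $_)[$i], $_ ] }   # 1. build [key, original line]
    @array;

print "$_\n" for @sorted;
```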
 
Brendan, even I can't make much sense of that! :-)
Tracy Dryden
tracy@bydisn.com

Meddle not in the affairs of dragons,
For you are crunchy, and good with mustard.
 
Well, it goes som'n like this:
For each element in your array, you do a split and create a separate two-element array: one element is the item you're sorting on, and the other is the original string. You then pass the list of those to the sort function and do your cmp on the single-item element, then pass the sorted result back to map() and drop the single item, leaving you with the same elements you started with, but sorted.

This splits each line only once instead of on every comparison, and avoids calling a separate subroutine for each comparison, which can save a lot of time.

Blame Randal Schwartz.

brendanc@icehouse.net
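
The three steps described above can also be written out one at a time, which may make the transform easier to follow. This is only an illustrative sketch: the intermediate names @pairs and @sorted_pairs, the sample data, and $i are all made up here.

```perl
use strict;
use warnings;

# Hypothetical sample data: first name, last name, address, tab-separated.
my @array = (
    "Fred\tBloggs\tMyhouse",
    "Tony\tBlair\tDowning St",
);
my $i = 2;    # zero-based index of the field to sort on

# Step 1: pair each line with its sort key as [key, original line].
my @pairs = map { my @s = split /\t/, $_; [ $s[$i], $_ ] } @array;

# Step 2: sort the pairs by comparing only the keys.
my @sorted_pairs = sort { $a->[0] cmp $b->[0] } @pairs;

# Step 3: throw the keys away and keep the original lines.
my @sorted = map { $_->[1] } @sorted_pairs;

print "$_\n" for @sorted;
```

The chained one-liner does exactly this without naming the intermediate lists.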
 
The map version worked perfectly.
Just one more quick question: is there a way to change it to sort by one field first, then another field?
 
Go ahead Brendan - can you do that Schwartzian transform and use TWO sort fields? :-)

The map version may be more efficient, but I think I'll stick to using a sort subroutine. It's a lot easier to understand and maintain.

Would it make my code more efficient if I put the subroutine code in an inline block instead of making it a subroutine?
Tracy Dryden
tracy@bydisn.com

Meddle not in the affairs of dragons,
For you are crunchy, and good with mustard.
 
Yes, you can... In the dummy element for the sort, concatenate the two (or more) fields you're sorting on during the first map{} process. So the code is:
Code:
@sorted = map{$_->[1]} sort{$a->[0] cmp $b->[0]} map{my @s=split/\t/,$_;[$s[i].$s[j],$_]}@array;
As far as inlining all your code goes, I think there are more considerations than just efficiency (as you mentioned): readability, organization, reuse, etc. For something like a sort, where the final code is one line, the logic is simple, AND a subroutine would have to be called over and over, I don't bother writing a subroutine... But in most of my programs, you'll see that the main(), if you will, section contains very little logic, and instead just tracks the variables I use.

Anyway.. hope this helps.

brendanc@icehouse.net
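
One caveat with concatenating the keys: "ab" followed by "c" and "a" followed by "bc" both become "abc", so different field pairs can compare as equal. An alternative sketch, not what the post above uses, keeps the two keys as separate elements and falls through to the second comparison only when the first is a tie. The sample data and the variables $i and $j are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data: first name, last name, address, tab-separated.
my @array = (
    "Tony\tBlair\tDowning St",
    "Fred\tBloggs\tMyhouse",
    "Tony\tAdams\tHighbury",
);

my ($i, $j) = (0, 1);    # sort by first name, then by last name

my @sorted =
    map  { $_->[2] }                                          # original line
    sort { $a->[0] cmp $b->[0] or $a->[1] cmp $b->[1] }       # key 1, then key 2
    map  { my @s = split /\t/, $_; [ $s[$i], $s[$j], $_ ] }   # [key1, key2, line]
    @array;

print "$_\n" for @sorted;
```

Because cmp returns 0 on a tie, the `or` only evaluates the second comparison when the first fields are equal.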
 
Tracy, I may not have read the last portion of your post correctly. If you meant, would sort { } be more efficient than sort MySort... yes, indeed it would.

Hope this helps,

brendanc@icehouse.net
 
Yes, I was talking about sort { }. I've used that form when the code in the block was not too complex. I use the subroutine form when the code reaches a certain level of complexity. I also use the subroutine form when I want to set a variable to contain the name of the subroutine to be used in the sort. That ability comes in pretty handy.

I've found that perl is so efficient in most of what it does that I rarely have to really worry about which way of doing something is the more efficient. In most cases it doesn't seem to make a significant difference. When that's the case I frequently choose clarity of code over the slight increase in efficiency. Of course, that's just my way of doing things. Fortunately every programmer seems to have his/her own opinion about how things should be done, otherwise we'd all still be programming in COBOL <shudder>.
Tracy Dryden
tracy@bydisn.com

Meddle not in the affairs of dragons,
For you are crunchy, and good with mustard.
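
Choosing the sort routine through a variable might look something like the sketch below. It stores a code reference rather than a name string, which is a choice made here so that it runs under "use strict"; the subs by_name and by_address, the variable $order, and the sample data are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sample data: first name, last name, address, tab-separated.
my @data = (
    "Tony\tBlair\tDowning St",
    "Fred\tBloggs\tMyhouse",
    "Ann\tOther\tAcacia Ave",
);

# Two comparison routines; each compares one field of the two lines.
sub by_name    { (split /\t/, $a)[0] cmp (split /\t/, $b)[0] }
sub by_address { (split /\t/, $a)[2] cmp (split /\t/, $b)[2] }

# Pick the routine at run time and keep a reference to it in a scalar.
my $order  = 'address';
my $sorter = $order eq 'name' ? \&by_name : \&by_address;

# sort accepts a scalar variable holding the routine in place of a sub name.
my @sorted = sort $sorter @data;

print "$_\n" for @sorted;
```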
 
I use the subroutine form when the code reaches a certain level of complexity

Keep in mind that Perl doesn't care about line breaks and spacing. The inlining can be spaced out just as you might organize a subroutine, yet you'd still keep that performance boost. The Schwartzian transform could just as easily be written as:
Code:
@sorted =
    map {
        $_->[1]                   # keep only the original line
    } sort {
        $a->[0] cmp $b->[0]       # compare the precomputed keys
    } map {
        my @s = split /\t/, $_;
        [ $s[i] . $s[j], $_ ]     # [sort key, original line]
    } @array;

I also use the subroutine form when I want to set a variable to contain the name of the subroutine to be used in the sort. That ability comes in pretty handy.

It is. Considering the inherent laziness in programmers, I can't imagine that there's a Perl function that isn't handy in one way or another. :-) Although, for a small set, an if/else statement along with a sort{} for each condition would be speedier. ;-)

Fortunately every programmer seems to have his/her own opinion about how things should be done, otherwise we'd all still be programming in COBOL <shudder>.

'tis true. And though, nowadays, most of what I write strongly emphasizes efficiency and trades my ability to read it quickly for the user's ability to access it quickly, if it weren't for the people that wrote clean, easy on the eyes code, I probably wouldn't have learned to program in the first place.

Take care,

brendanc@icehouse.net
 
All very good points!
Tracy Dryden
tracy@bydisn.com

Meddle not in the affairs of dragons,
For you are crunchy, and good with mustard.
 