Perl sort Function



Perl sort Function Menu:

1. The syntax forms and a few examples
2. How to sort a hash by keys
3. How to sort a hash by values
4. How to sort an array of arrays
5. How to sort using the Schwartzian Transform
6. How to sort a matrix by multiple columns




1. The syntax forms and a few examples


The Perl sort function sorts a LIST in alphabetical or numerical order and returns the sorted list. Keep in mind that the argument list itself remains unchanged; sort returns a new, sorted list. In this short tutorial, I’ll give you a few examples of how to use the Perl sort function in your scripts.

The syntax forms of the Perl sort function are as follows:

sort SUBNAME LIST
sort BLOCK LIST
sort LIST

The third syntax form is the simplest of them and uses the standard (string) comparison order. See the following example:

my @array = sort qw(map 23 Perl 101 11 while 1 scalar 102);
print "@array\n";
# it prints: 1 101 102 11 23 Perl map scalar while

@array = sort qw(23 101 11 1 102);
print "@array\n";
# it prints: 1 101 102 11 23

The problem is that capital letters have lower ASCII values than lowercase letters, so words beginning with a capital letter come first, as you can see in our example. One alternative is to convert all the list elements to lowercase or uppercase for the comparison and then perform the sort. You’ll see an example below.

As you can see from the second example shown above, even if all the elements of the list are numbers, this simple sort compares them as strings, so 101 and 102 come before 11.
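
A small sketch of two points made so far: sort leaves its argument untouched and returns a new list, and the default comparison is stringwise. The variable names @numbers and @sorted are only illustrative.

```perl
use strict;
use warnings;

my @numbers = qw(23 101 11 1 102);
my @sorted  = sort @numbers;     # default comparison is stringwise

print "original: @numbers\n";    # original: 23 101 11 1 102
print "sorted:   @sorted\n";     # sorted:   1 101 102 11 23
```

The original array still holds its elements in the order in which they were assigned.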

In practice you usually need to tell sort how to compare the elements so that the result fits your task. This is where the first and second syntax forms of the Perl sort function come in. SUBNAME is the name of a subroutine that describes how to order the elements of the list. Instead of a subroutine, you can provide a BLOCK, which acts as an anonymous in-line subroutine.

The Perl sort function uses two special variables, $a and $b. They hold the two elements of the list that sort is currently comparing in order to decide how to order the list.

Besides these special variables, the comparison code usually relies on one of two operators: cmp (the string comparison operator) or <=> (the numerical comparison operator). So you can sort a list either in string order or in numerical order. Please recall how these two operators work:

  • cmp returns -1, 0 or 1 depending on whether the left argument is stringwise less than, equal to, or greater than the right argument
  • <=> returns -1, 0 or 1 depending on whether the left argument is numerically less than, equal to, or greater than the right argument
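
To make the two bullets above concrete, here is a minimal sketch that evaluates both operators on a few literal values (the values themselves are arbitrary). The last line shows why the choice of operator matters for numbers stored as strings.

```perl
use strict;
use warnings;

# Each comparison yields -1, 0 or 1.
my @string_results  = ('apple' cmp 'banana', 'pear' cmp 'pear', 'b' cmp 'a');
my @numeric_results = (2 <=> 10, 7 <=> 7, 10 <=> 2);

# Compared as strings, "2" comes after "10" because "2" gt "1".
my $two_vs_ten = '2' cmp '10';

print "cmp: @string_results\n";        # cmp: -1 0 1
print "<=>: @numeric_results\n";       # <=>: -1 0 1
print "'2' cmp '10': $two_vs_ten\n";   # '2' cmp '10': 1
```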

Let’s go back to our previous example where we tried to sort a list of numbers. You can do this either by defining a subroutine where you describe how to order the list or by using an in-line subroutine in a block.

In the first case, you can see how this works in the following example:

# define a subroutine
sub numSort {
  if ($a < $b) { return -1; }
  elsif ($a == $b) { return 0;}
  elsif ($a > $b) { return 1; }
}

# invoke the Perl sort function with the subroutine
my @array = sort numSort qw(23 101 11 1 102);

print "@array\n";
# it prints: 1 11 23 101 102

The previous example is for the first syntax form. You could shorten this code by using the second syntax form of the Perl sort function as in the example below:

my @array = sort {$a <=> $b} qw(23 101 11 1 102);

Please note that when you use the cmp and <=> comparison operators explicitly, it matters which of $a and $b is on the left side of the operator. For instance, if you use $a <=> $b the list will be sorted in ascending numerical order, and if you use $b <=> $a the list will be sorted in descending numerical order.
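
The following sketch puts the two orderings side by side. It also shows reverse applied to an ascending sort, which gives the same descending result here; the variable names are only illustrative.

```perl
use strict;
use warnings;

my @numbers = qw(23 101 11 1 102);

my @ascending  = sort { $a <=> $b } @numbers;
my @descending = sort { $b <=> $a } @numbers;          # $a and $b swapped
my @reversed   = reverse sort { $a <=> $b } @numbers;  # same result as @descending

print "@ascending\n";    # 1 11 23 101 102
print "@descending\n";   # 102 101 23 11 1
print "@reversed\n";     # 102 101 23 11 1
```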

If you have a mixed list containing both numbers and strings, you can use the following code to sort it:

# Perl sort function with the second syntax form
my @array = sort {$a <=> $b || $a cmp $b}
            qw(map 23 Perl 101 11 while 1 scalar 102);

print "@array\n";
# it prints: Perl map scalar while 1 11 23 101 102

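The numbers will be sorted in numerical order and the strings in ASCII order. The strings come first because a non-numeric string evaluates to 0 in a numerical comparison. For the same reason, Perl emits "isn't numeric" warnings for this example if you run it with use warnings or the -w flag, so leave them out if you want to run it as it is.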
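
If you prefer to keep warnings enabled, one possible alternative (not the approach used above) is to separate the numbers from the strings first and sort each group with the matching operator. This sketch assumes the looks_like_number function from the core Scalar::Util module; the array names are illustrative.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

my @mixed = qw(map 23 Perl 101 11 while 1 scalar 102);

# Sort each group with the operator that suits it.
my @strings = sort { $a cmp $b } grep { !looks_like_number($_) } @mixed;
my @numbers = sort { $a <=> $b } grep { looks_like_number($_) } @mixed;

# Strings first, then numbers, as in the example above.
my @sorted = (@strings, @numbers);

print "@sorted\n";   # Perl map scalar while 1 11 23 101 102
```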

If you want to sort a list of strings in case-insensitive alphabetical order, you can, as I mentioned before, use either the lc or the uc function, as in the following example:

my @array = sort {lc $a cmp lc $b} qw(map Perl while scalar);

print "@array\n";
# it prints: map Perl scalar while

In this case the strings will be sorted in ascending lexicographical order, regardless of whether they contain capital letters. Please notice that the lc function doesn't modify the values held in $a or $b; it only returns a lowercase copy of them.


2. How to sort a hash by keys


Considering hashes, you must know that Perl stores the items internally in an order of its own. So, generally you can’t keep your hash items in a specific order, unless you use the Tie::IxHash Perl module, which preserves the order in which the hash elements were added. But you can access the items in any order you want. The following code snippet shows you how to print the elements of a hash ordered by their keys:

# define a hash
my %hash = (one => 1, two => 2, three => 3, four => 4);

# using Perl sort function with a foreach loop
foreach my $key (sort keys %hash) {
  print "$key: $hash{$key}\n";
}

The order in which the hash stores its pairs remains unchanged, but we process the hash elements in the order we need. In the above example, the keys function returns a list of the hash keys, the Perl sort function sorts this list in ascending alphabetical order (the default), the foreach loop traverses the sorted list, and the print function displays each key/value pair.

It produces the following output:

four: 4
one: 1
three: 3
two: 2
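
The same keys can of course be visited in other orders by giving sort a block. As a sketch, the snippet below reuses the hash from above and orders the keys first in descending alphabetical order and then by key length, with ties broken alphabetically. The array names are illustrative.

```perl
use strict;
use warnings;

my %hash = (one => 1, two => 2, three => 3, four => 4);

# Descending alphabetical order of the keys
my @descending = sort { $b cmp $a } keys %hash;

# Shortest keys first; keys of equal length in alphabetical order
my @by_length = sort { length($a) <=> length($b) || $a cmp $b } keys %hash;

print "@descending\n";   # two three one four
print "@by_length\n";    # one two four three
```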

3. How to sort a hash by values


You can access the hash elements in a specific order of their values. See the following code:

# define a hash
my %hash = (1 => 'compile', 2 => 'binary', 
            3 => 'ascii',   4 => 'digit');

# print the hash ordered pairs 
foreach (sort {$hash{$b} cmp $hash{$a}} keys %hash) {
  print "$_: $hash{$_}\n";
}

Here we used the Perl sort function with the cmp (string comparison) operator, and the elements of the hash are printed in descending alphabetical order of their values. The keys function returns a list of the hash keys.

The elements of this list are assigned to $a and $b for the comparisons, and the notation $hash{$a} means the value that corresponds to the hash key held in $a. The Perl sort function returns the sorted list of keys to the foreach loop, which iterates through it using the $_ special variable. The output is as follows:

4: digit
1: compile
2: binary
3: ascii
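
When the values are numbers, the same idea works with <=>. The sketch below uses a made-up %score hash (not part of the example above) and sorts the keys by value in descending order, falling back to the key itself when two values are equal.

```perl
use strict;
use warnings;

# Hypothetical data: names mapped to scores
my %score = (alice => 90, bob => 75, carol => 90, dave => 60);

# Highest score first; equal scores are ordered by name
my @ranking = sort { $score{$b} <=> $score{$a} || $a cmp $b } keys %score;

foreach my $name (@ranking) {
  print "$name: $score{$name}\n";
}
# it prints:
# alice: 90
# carol: 90
# bob: 75
# dave: 60
```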

4. How to sort an array of arrays


If you want to sort the elements of each sub-array in an array of arrays, you can use the Perl sort function and do something like the following code:

# define an array of arrays
my @AOA = ([25, 49, 33, 200], [145, 32], [11, 121, 78]);

# sort and print the @AOA array
foreach my $item1 (@AOA){
  foreach my $item2 (sort {$b <=> $a} @{$item1}){
    print "$item2 ";  
  }
  print "\n";
}

You know that in Perl a multidimensional array is an ordinary array whose elements are references to other arrays (or hashes or some other objects). An array of arrays is a two-dimensional array whose elements are references to other arrays.

In our example, [25, 49, 33, 200] creates a reference to an anonymous array holding (25, 49, 33, 200), so our @AOA array has three scalar elements that are references to the following lists: (25, 49, 33, 200), (145, 32), (11, 121, 78). Using the Perl sort function, the above code sorts the elements of each list in descending numerical order. To do this we used a nested foreach.

In the outer foreach, $item1 is assigned in turn each element of the @AOA array, so it contains a reference to a list. In the inner foreach, we used the @{$item1} notation to dereference the $item1 reference. The Perl sort function is invoked on the elements of each sub-array (the three lists presented above).

Finally, each sub-array is printed on a separate line, as you can see from the output of the script:

200 49 33 25
145 32
121 78 11
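
The example above sorts inside each sub-array. If you want to reorder the sub-arrays themselves instead, you can compare the references through $a and $b. This is only a sketch: it orders the rows by their first element and, separately, by their number of elements.

```perl
use strict;
use warnings;

my @AOA = ([25, 49, 33, 200], [145, 32], [11, 121, 78]);

# Order the rows by their first element
my @by_first = sort { $a->[0] <=> $b->[0] } @AOA;

# Order the rows by how many elements they contain
my @by_size = sort { scalar(@$a) <=> scalar(@$b) } @AOA;

print join(' ', @$_), "\n" for @by_first;
# it prints:
# 11 121 78
# 25 49 33 200
# 145 32
```

In both cases the contents of each row stay as they were; only the order of the rows changes.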



5. How to sort using the Schwartzian Transform


In the following we intend to sort the lines of a multi-line string in ascending order by the last field of each line, using the Perl sort function and the technique known as the Schwartzian Transform.

Keep in mind that the algorithm takes longer to describe than it is complicated. At first sight it looks a bit complicated, but if you take your time to read this, you’ll see how simple and clear it is. As a bonus, you will get a lot of practice with the map function.

A sample string consisting of 4 lines is as follows:

three 13  3  1  91 3
one   11  5  1  45 1
two   12  7  1  33 2
four  14  1  1  26 4

and we want to order these lines by the last column (the numbers 3, 1, 2, 4).

Using the Schwartzian Transform, we execute the following steps:

  1. The string is split into a list whose elements are the lines of the string, without the ending newline; we do this by using the split function
  2. Using the map function, the above list is turned into a list of references; each reference points to an anonymous array consisting of two elements: the original line and the value of the last field of this line
  3. Using the Perl sort function, the list of references is sorted in ascending order by the second element of the anonymous arrays
  4. The map function is used again to get back the original lines, but this time in the desired order
  5. Using the join function, the above list is converted back into a single string

Here’s the code:

my $str = "three 13  3  1  91 3
one   11  5  1  45 1
two   12  7  1  33 2
four  14  1  1  26 4";

$str = join "",
       map { $_->[0]."\n" }
       sort { $a->[1] <=> $b->[1] }
       map { [$_, (split)[-1]] }
       split /\n/, $str;

print "$str\n";

And the output:

one   11  5  1  45 1
two   12  7  1  33 2
three 13  3  1  91 3
four  14  1  1  26 4

The join, map, sort and split functions each take a list as their last argument, so a chain like this one is evaluated from right to left: each function receives the list produced by the expression on its right. In our code, the split function returns a list to the second map, the second map returns a list to the Perl sort function, and so on, until finally the join function returns a string that is assigned to the $str variable, as you can see in the following diagram:

join <- map <- sort <- map <- split
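
If the chained form is hard to follow, the same pipeline can be sketched with one named variable per stage. The variable names (@lines, @decorated, @ordered, @restored, $result) are only illustrative; the operations are the ones from the code above, applied in the order in which they are actually executed.

```perl
use strict;
use warnings;

my $str = join "\n",
          "three 13  3  1  91 3",
          "one   11  5  1  45 1",
          "two   12  7  1  33 2",
          "four  14  1  1  26 4";

my @lines     = split /\n/, $str;                          # split
my @decorated = map  { [ $_, (split)[-1] ] } @lines;       # map: [line, last field]
my @ordered   = sort { $a->[1] <=> $b->[1] } @decorated;   # sort by the last field
my @restored  = map  { $_->[0] . "\n" } @ordered;          # map: back to plain lines
my $result    = join "", @restored;                        # join

print $result;
```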

Below, we’ll describe the main steps of the algorithm and show the intermediate results, so you can better understand what happens at each step.

The script begins by assigning the sample string to the $str variable. The string consists of 4 lines separated by newline characters.

The script continues with a compound statement which, as we mentioned before, must be read from right to left when written on a single line; to make the code more readable, we split it across several lines, so here it reads from bottom to top.

Step 1.

split /\n/, $str;

The split function converts $str into a list, using the newline as delimiter; every line of the string, without its newline, becomes an element of this list, so we’ll have a list with 4 elements. For instance, the first element of the list is:

three 13  3  1  91 3

Step 2.

map { [$_, (split)[-1]] }

The map function takes as argument the list returned by the split function and returns a list of references, where each reference points to an anonymous array consisting of two elements: the original line and the last field of the line.

Each element of the argument list is assigned in turn to $_, and inside the map block the [$_, (split)[-1]] construct returns a reference to a new anonymous array that consists of two elements:

  • $_
  • (split)[-1], where split is called without arguments, which means that it splits the line stored in $_ on whitespace; [-1] selects the element of the resulting list to return (an index of -1 means the last element of the list)

To see how the map function works, I used the following code:

my @list = ("three 13  3  1  91 3", "one   11  5  1  45 1",
            "two   12  7  1  33 2", "four  14  1  1  26 4");

my @refList = map { [$_, (split)[-1]] } @list;

# see what it is in @refList
use Data::Dumper;
print Dumper(@refList);

Here the @list array variable contains the list returned by the split function at Step 1. The list of references returned by map is stored in the @refList array variable. To see the content of the @refList array, the Data::Dumper module was used.

The output of the above script is as follows:

$VAR1 = [
          'three 13  3  1  91 3',
          '3'
        ];
$VAR2 = [
          'one   11  5  1  45 1',
          '1'
        ];
$VAR3 = [
          'two   12  7  1  33 2',
          '2'
        ];
$VAR4 = [
          'four  14  1  1  26 4',
          '4'
        ];



Step 3.

sort { $a->[1] <=> $b->[1] }

The Perl sort function takes as argument the list of references described above and orders it by the second element of the sub-arrays. Here [1] is the index of that element within each sub-array and <=> is the numerical comparison operator.

Because $a->[1] appears on the left side of <=>, the Perl sort function orders the list of references in ascending order of the second element of the sub-arrays. To print the list returned by the Perl sort function, we again use the Data::Dumper module and run the following code:

my $str = "three 13  3  1  91 3
one   11  5  1  45 1
two   12  7  1  33 2
four  14  1  1  26 4";

my @refList = sort { $a->[1] <=> $b->[1] }
              map { [$_, (split)[-1]] }
              split /\n/, $str;

use Data::Dumper;
print Dumper(@refList);

It will produce the following output:

$VAR1 = [
          'one   11  5  1  45 1',
          1
        ];
$VAR2 = [
          'two   12  7  1  33 2',
          2
        ];
$VAR3 = [
          'three 13  3  1  91 3',
          3
        ];
$VAR4 = [
          'four  14  1  1  26 4',
          4
        ];

As you can see, this time the list of references is sorted in ascending numerical order by the second element of the sub-arrays.

Step 4.

map { $_->[0]."\n" }

Now we must get rid of the second element of the sub-arrays and turn the list of references back into a simple list whose elements are the lines of the initial string.

The list of references returned by the Perl sort function is the argument of this map function. Each element of that list, which as you know is a scalar, is assigned in turn to $_. Here $_->[0] is the first element (index 0) of the current sub-array, and it is concatenated with a newline character. The result is an ordered list of the lines of the initial string.

See the code:

my $str = "three 13  3  1  91 3
one   11  5  1  45 1
two   12  7  1  33 2
four  14  1  1  26 4";

my @list = map { $_->[0]."\n" }
           sort { $a->[1] <=> $b->[1] }
           map { [$_, (split)[-1]] }
           split /\n/, $str;

use Data::Dumper;
print Dumper(@list);

The output is as follows:

$VAR1 = 'one   11  5  1  45 1
';
$VAR2 = 'two   12  7  1  33 2
';
$VAR3 = 'three 13  3  1  91 3
';
$VAR4 = 'four  14  1  1  26 4
';

You can see the presence of the newline after the last field of each element of this list.

Step 5.

$str = join "",

The second argument of the join function is the list described at the previous step. Using the empty string "" as delimiter, the elements of the list are concatenated back into a single string. Finally, we’ll get as output:

one   11  5  1  45 1
two   12  7  1  33 2
three 13  3  1  91 3
four  14  1  1  26 4

i.e., the lines of the string are sorted in ascending order by their last field.
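
One way to see what the extra map steps buy you is to count how often the sort key is extracted. In the sketch below, last_field is a hypothetical helper that increments a counter every time it runs. A plain sort calls it twice per comparison, while the Schwartzian Transform calls it once per line. The exact count for the plain sort depends on how many comparisons Perl's sort performs, so no figure is shown for it.

```perl
use strict;
use warnings;

my @lines = ("three 13  3  1  91 3", "one   11  5  1  45 1",
             "two   12  7  1  33 2", "four  14  1  1  26 4");

my $calls = 0;

# Hypothetical helper: returns the last field of a line and counts its calls
sub last_field {
  my ($line) = @_;
  $calls++;
  my @fields = split ' ', $line;
  return $fields[-1];
}

# Plain sort: the key is extracted again inside every comparison
my @plain = sort { last_field($a) <=> last_field($b) } @lines;
my $plain_calls = $calls;

# Schwartzian Transform: the key is extracted exactly once per line
$calls = 0;
my @transformed = map  { $_->[0] }
                  sort { $a->[1] <=> $b->[1] }
                  map  { [ $_, last_field($_) ] } @lines;
my $st_calls = $calls;

print "plain sort:  $plain_calls key extractions\n";
print "transformed: $st_calls key extractions\n";   # 4, one per line
```

With only four lines the difference is small, but it grows with the length of the list and with the cost of extracting the key.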


6. How to sort a matrix by multiple columns


Here is an example of how to use the Perl sort function to sort a matrix by its columns, giving priority to the first column, then to the second, and so on. Please note that the following algorithm doesn't change the order of the items within a row, only the order of the rows.

Let’s use the Perl sort function to sort the following matrix, whose elements are either numbers or strings (but we don’t mix numbers and strings in the same column):

                    5,   'aaa',  33,  'bbb',  12 
                    11,  'asd',  121, 'bnm',  16 
                    5,   'aaa',  22,  'ewq',  13 
                    5,   'abde', 123, 'aqq',  15 
                    5,   'aaa',  33,  'ccc',  11  
                    5,   'abde', 78,  'azxx', 14

First we’ll store this matrix in an array of arrays, and next we’ll sort the array by the elements of the sub-arrays. The elements of the @AOA array are references to the rows of the matrix, each sub-array holding the items of one row.

See the code first:

# define an array of anonymous arrays
my @AOA = (
    [5,   'aaa',  33,  'bbb',  12 ],
    [11,  'asd',  121, 'bnm',  16 ],
    [5,   'aaa',  22,  'ewq',  13 ],
    [5,   'abde', 123, 'aqq',  15 ],
    [5,   'aaa',  33,  'ccc',  11 ],
    [5,   'abde', 78,  'azxx', 14 ]
);

# using Perl sort function to sort the @AOA array

@AOA = sort {
                $b->[0] <=> $a->[0] ||
                $a->[1] cmp $b->[1] ||
                $a->[2] <=> $b->[2] ||
                $b->[3] cmp $a->[3];
            } @AOA; 
                   

# print the @AOA array 
foreach my $item1 (@AOA){
  foreach my $item2 (@{$item1}){
    print "$item2\t";  
  }
  print "\n";
}

As you can see, we use the Perl sort function to sort the array in descending numerical order by the first column (index 0 of the sub-arrays), then in ascending alphabetical order by the second column, and so on. You don’t need to sort the matrix by all its columns; for instance, we don’t use the last column here. The || operator expresses, from left to right, the priority of the columns: a comparison further to the right is evaluated only when all the comparisons before it return 0 (a tie).

Finally, the @AOA is printed using a nested foreach.

Here is the output:

11      asd     121     bnm     16
5       aaa     22      ewq     13
5       aaa     33      ccc     11
5       aaa     33      bbb     12
5       abde    78      azxx    14
5       abde    123     aqq     15
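
If the list of columns changes often, the comparison can also be driven by data instead of being written out by hand. The following is only a sketch of that idea: the @spec array and the by_spec subroutine are hypothetical names, and each rule holds a column index, a comparison type ('num' or 'str') and a direction ('asc' or 'desc'). With the rules shown, it reproduces the ordering above.

```perl
use strict;
use warnings;

my @AOA = (
    [5,   'aaa',  33,  'bbb',  12 ],
    [11,  'asd',  121, 'bnm',  16 ],
    [5,   'aaa',  22,  'ewq',  13 ],
    [5,   'abde', 123, 'aqq',  15 ],
    [5,   'aaa',  33,  'ccc',  11 ],
    [5,   'abde', 78,  'azxx', 14 ]
);

# Hypothetical sort specification: [column index, type, direction]
my @spec = ([0, 'num', 'desc'], [1, 'str', 'asc'],
            [2, 'num', 'asc'],  [3, 'str', 'desc']);

# Comparison subroutine: apply the rules in order until one breaks the tie
sub by_spec {
  for my $rule (@spec) {
    my ($col, $type, $dir) = @$rule;
    my ($x, $y) = $dir eq 'desc' ? ($b, $a) : ($a, $b);
    my $result = $type eq 'num' ? $x->[$col] <=> $y->[$col]
                                : $x->[$col] cmp $y->[$col];
    return $result if $result;
  }
  return 0;
}

my @sorted = sort by_spec @AOA;

print join("\t", @$_), "\n" for @sorted;
```

This uses the first syntax form (sort SUBNAME LIST), so the subroutine reads the two rows being compared from $a and $b.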

I hope this tutorial cleared up a few aspects of using the Perl sort function in your scripts.
