The Perl grep function filters a list and returns only those elements that match a certain criterion; in other words, it filters out the elements that don’t match a condition. Normally, the function returns a list that contains fewer elements than the original list (i.e. a sublist of the original list). In this short tutorial I’ll show you a few quick examples of how you can use the Perl grep function in your Perl scripts.
To begin, let’s start with the syntax forms of the Perl grep function:
grep BLOCK LIST
grep (EXPR, LIST)
where:
- BLOCK – contains one or more statements delimited by braces; the last statement in the block determines whether the block evaluates to true or false. The block is evaluated for each element of the list and, if the result is true, that element is added to the returned list. If you need a more sophisticated filter that consists of multiple lines of code, consider using the Perl grep function with a block.
- EXPR – represents any expression that supports $_, in particular a regular expression. The expression is applied against each element of the list and if the result of evaluation is true, the current element will be appended to the returned list.
- LIST – represents a list of elements or an array
Note that in a scalar context the Perl grep function will return the number of times the BLOCK or the EXPR is evaluated true.
How does it work? Well, the Perl grep function iterates through the elements of the list and for each iteration step:
- sets $_ to the current element of the list
- evaluates the BLOCK or EXPR against $_
- if the result of evaluation is true, in list context adds the value of $_ to the output list and in scalar context increments the count of matched elements
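The difference between list and scalar context can be sketched as follows. The data and variable names here are illustrative and do not come from the examples in this tutorial.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# illustrative data
my @numbers = (10, 15, 20, 25, 30);

# list context: grep returns the elements for which the block is true
my @big = grep { $_ > 18 } @numbers;

# scalar context: grep returns how many elements matched
my $how_many = grep { $_ > 18 } @numbers;

print "matches: @big\n";        # matches: 20 25 30
print "count: $how_many\n";     # count: 3
```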
Because $_ is an alias to the current element of the list (not a copy), you can modify the elements through it. However, avoid this feature if you want to write clear and robust code: to modify the elements of an array by running a particular expression or function against them, use the map function instead (keep in mind: grep to filter, map to modify).
The Perl grep function is very convenient any time you need to loop through a list in order to extract the subset of elements that match a certain condition.
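To see why modifying $_ inside grep is best avoided, here is a small sketch with made-up data. It shows grep changing the source array as a side effect, and map producing a new list while leaving the source untouched.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @prices = (5, 12, 8);    # illustrative data

# side effect: $_ aliases each element, so this changes @prices itself
my @kept = grep { $_ *= 2 } @prices;
print "@prices\n";          # 10 24 16

# clearer: map builds a new list and leaves the source array alone
my @source  = (5, 12, 8);
my @doubled = map { $_ * 2 } @source;
print "@source\n";          # 5 12 8
print "@doubled\n";         # 10 24 16
```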
Please look at the following simple example of using the Perl grep function:
#!/usr/bin/perl
use strict;
use warnings;
# initialize an array
my @array = qw(3 4 5 6 7 8 9);
# first syntax form:
my @subArray = grep { $_ & 1 } @array;
# second syntax form
#my @subArray = grep $_ & 1, @array;
print "@subArray\n";
# displays: 3 5 7 9
The above script extracts from @array a subset that contains the odd numbers only. The Perl grep function iterates through @array and at each iteration step it aliases $_ to the current element of the array; then it evaluates the condition $_ & 1 and if the result is true (i.e. the number is odd), the element is appended to @subArray. Instead of using the Perl grep function, the same thing could be accomplished with a foreach loop as well:
#!/usr/bin/perl
use strict;
use warnings;
# initialize an array
my @array = qw(3 4 5 6 7 8 9);
my @subArray = ();
foreach (@array) {
push @subArray, $_ if $_ & 1;
}
print "@subArray\n";
# displays: 3 5 7 9
As you can see, the Perl grep function is a kind of shorthand (you don’t need to iterate through the array explicitly). Keep in mind that in some circumstances a foreach loop could be faster, so benchmark both versions if you are not sure.
The following code snippets show some ways to use the Perl grep function.
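On the benchmarking advice above: one possible way to compare the two approaches is the core Benchmark module. This is only a sketch; the array size and the sub names with_grep and with_foreach are assumptions chosen for illustration, and the timings will differ from machine to machine.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my @array = (1 .. 10_000);    # illustrative test data

# the two candidates, wrapped in subs so they can be timed
sub with_grep {
    my @odd = grep { $_ & 1 } @array;
    return scalar @odd;
}

sub with_foreach {
    my @odd;
    foreach (@array) {
        push @odd, $_ if $_ & 1;
    }
    return scalar @odd;
}

# run each candidate for at least one CPU second and print a comparison table
cmpthese(-1, {
    'grep'    => \&with_grep,
    'foreach' => \&with_foreach,
});
```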
| Get the number of the array elements that match a pattern |
As mentioned before, in a scalar context the Perl grep function returns a count of the elements that match the pattern. To illustrate this, look at the following code:
#!/usr/bin/perl
use strict;
use warnings;
# initialize an array
my @words = qw(John has selected red blue bed);
my $count = grep /ed$/, @words;
print "\$count = $count\n";
It produces the output:
$count = 3
which is the number of words that end in ed.
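A few context details are easy to get wrong when counting with grep. The sketch below reuses the @words list; the other variable names are illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @words = qw(John has selected red blue bed);

# in a condition, the count is true when at least one element matches
my $has_ed = (grep { /ed$/ } @words) ? 'yes' : 'no';

# parentheses on the left give list context: this is the first match, not the count
my ($first) = grep { /ed$/ } @words;

# inside a list (such as print's arguments) grep is in list context,
# so scalar() is needed to get the count
my @report = ('matches:', scalar(grep { /ed$/ } @words));

print "$has_ed\n";     # yes
print "$first\n";      # selected
print "@report\n";     # matches: 3
```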
| Looking for a string that contains metacharacters |
First, look at the following example:
#!/usr/bin/perl
use strict;
use warnings;
# initialize an array
my @operators = ('+', '-', '*', '/', '+=',
'-=', '*=', '/=', '**=');
my $count = grep /\Q**=/, @operators;
print "found operator **=\n" if $count>0;
In our example we initialize the @operators array with a few Perl operators. The Perl grep function counts how many times the exponentiation assignment operator (**=) is found in our list. Because * is a regex metacharacter (it causes the preceding character to be matched 0 or more times), one way to do this is by using the \Q sequence. Remember that \Q tells Perl where to start escaping special characters and \E tells it where to stop. In our example we don’t need the \E sequence.
If you leave \Q out, so that the line of code becomes:
my $count = grep /**=/, @operators;
you’ll get an error message like the following:
Quantifier follows nothing in regex; marked by <-- HERE in m/* <-- HERE *=/ at …
Going back to our script, the output is as follows:
found operator **=
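A related sketch: \Q is also useful when the search string is held in a variable, and you may want an exact match rather than a substring match. The $wanted variable and the result array names below are assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @operators = ('+', '-', '*', '/', '+=', '-=', '*=', '/=', '**=');
my $wanted    = '*=';    # illustrative search string held in a variable

# \Q...\E escapes the interpolated variable; unanchored, so '**=' matches too
my @containing = grep { /\Q$wanted\E/ } @operators;

# anchored with \A and \z, only the exact string matches
my @exact_re = grep { /\A\Q$wanted\E\z/ } @operators;

# plain string equality needs no escaping at all
my @exact_eq = grep { $_ eq $wanted } @operators;

print "@containing\n";    # *= **=
print "@exact_re\n";      # *=
print "@exact_eq\n";      # *=
```

For a plain exact match, the eq version is usually the simplest choice because no regex is involved.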
| Get the even or odd elements from a list |
Please look at the following code:
#!/usr/bin/perl
use strict;
use warnings;
# initialize an array
my @array = qw(1 2 3 4 5 6 7 8);
# select the elements at odd and even positions (counting from 1)
my $flag = 0;
my @oddArray = grep { $flag ^= 1 } @array;
my @evenArray = grep { !($flag ^= 1) } @array;
print "\@oddArray = @oddArray\n";
print "\@evenArray = @evenArray\n";
Please remember that the bitwise exclusive-or (XOR) operator ^ sets a bit to 1 if the corresponding bits in its operands are different and to 0 otherwise. The ^= operator combines XOR with assignment, so in our code $flag ^= 1 means $flag = $flag ^ 1. In the above example the code within the grep block toggles between true (1) and false (0) at each iteration, so elements are selected by their position in the list, not by their value; the values happen to be odd and even here because the list runs from 1 to 8. Note also that the second grep starts with $flag equal to 0 only because @array has an even number of elements.
The output produced is as follows:
@oddArray = 1 3 5 7
@evenArray = 2 4 6 8
In the above code you could replace the line my $flag = 0; with my $flag = 1; and modify the next two lines accordingly:
my @oddArray = grep { !($flag ^= 1) } @array;
my @evenArray = grep { $flag ^= 1 } @array;
The output is the same as before.
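If you want to make the choice between filtering by value and filtering by position explicit, one possible sketch is shown below. The data and variable names are illustrative; the position-based version runs grep over the indices and then takes an array slice, so no flag variable is needed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @array = qw(10 11 13 14 16 17);    # illustrative data

# filter by value: keep the numbers that are odd or even
my @odd_values  = grep { $_ % 2 } @array;
my @even_values = grep { $_ % 2 == 0 } @array;

# filter by position: grep over the indices, then take an array slice
# (index 0 is the first element, so even indices are the 1st, 3rd, 5th, ...)
my @at_even_index = @array[ grep { $_ % 2 == 0 } 0 .. $#array ];
my @at_odd_index  = @array[ grep { $_ % 2 } 0 .. $#array ];

print "@odd_values\n";       # 11 13 17
print "@even_values\n";      # 10 14 16
print "@at_even_index\n";    # 10 13 16
print "@at_odd_index\n";     # 11 14 17
```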
| Extract the list elements that occur a given number of times |
I’ll write a subroutine for this. Look at the following example:
# initialize an array
my @array = qw(1 21 3 3 3 3 7 21 1 5 5 9);
print "Occur once: @{getNumbers(\@array, 1)}\n";
print "Occur twice: @{getNumbers(\@array, 2)}\n";
sub getNumbers
{
    my ($newArray, $n) = (shift, shift);
    my %count = ();
    my @tempArray = grep {$count{$_} == $n}
                    grep {++$count{$_} == 1}
                    @$newArray;
    # print the %count hash
    print "\%count:\n";
    while ( my ($key, $value) = each(%count) ) {
        print "$key => $value\n";
    }
    return \@tempArray;
}
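And the output (the order of the %count lines may vary, because Perl does not keep hash entries in a fixed order):
%count: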
1 => 2
21 => 2
3 => 4
7 => 1
9 => 1
5 => 2
Occur once: 7 9
%count:
1 => 2
21 => 2
3 => 4
7 => 1
9 => 1
5 => 2
Occur twice: 1 21 5
The getNumbers subroutine takes two arguments: a reference to an array and the number of times an element must occur. It returns a reference to an array that contains the elements that occur the given number of times. For instance, the code line:
print "Occur once: @{getNumbers(\@array, 1)}\n";
will print: Occur once: 7 9, i.e. the elements of @array that occur exactly once. The first argument passed to the subroutine is a reference to @array (i.e. \@array), and the second argument is the number 1. Because our subroutine returns a reference to the array with the selected elements, we need to dereference it in order to print its elements, so we used the @ sign and enclosed the subroutine call in curly braces. Please note that calling the subroutine with 1 as the second argument gives you the unique elements of the array (those that occur just once).
Next I’ll say some words about the code within the body of the getNumbers subroutine.
We used the shift function to take the two arguments off @_ and we initialized the %count hash that we use in our algorithm. The %count hash will contain pairs of ($key, $value) where $key is an element of @array and $value is the number of times this element occurs. To show how it works, I printed the %count hash.
The next two code lines:
grep {++$count{$_} == 1}
@$newArray;
represent the second argument of the first grep (the first argument of the first grep is the block {$count{$_} == $n}), so they are evaluated in list context, i.e. the second grep returns a list. The second grep runs to completion before the first grep starts, so these two lines build the %count hash completely and return a list of the distinct elements of @$newArray in order of first appearance, i.e. (1, 21, 3, 7, 5, 9). Next, the first grep extracts from this list only those elements that occur $n times and stores them in @tempArray. After printing the %count hash, the subroutine returns a reference to @tempArray.
Printing the %count hash within the subroutine body is not necessary; I included it to show you how the algorithm works.
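If the chained grep calls are hard to follow, the same result can be reached in two explicit steps: count first, then filter. The sketch below is a hypothetical variant, not the subroutine from this tutorial; the name get_numbers_two_pass and its internal variables are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# hypothetical variant of getNumbers: count first, then filter
sub get_numbers_two_pass {
    my ($array_ref, $n) = @_;

    my %count;
    my @first_seen;    # distinct values in order of first appearance
    for my $item (@$array_ref) {
        # post-increment returns the old count, which is false only the first time
        push @first_seen, $item unless $count{$item}++;
    }

    # keep the values whose final count equals $n
    return [ grep { $count{$_} == $n } @first_seen ];
}

my @array = qw(1 21 3 3 3 3 7 21 1 5 5 9);
print "Occur once: @{ get_numbers_two_pass(\@array, 1) }\n";     # Occur once: 7 9
print "Occur twice: @{ get_numbers_two_pass(\@array, 2) }\n";    # Occur twice: 1 21 5
```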
I hope my notes were clear enough; if not, keep looking at the code, test it and modify it, and you will understand how it works.
In conclusion, you can use grep any time you need to select elements from a list or an array.