Perl grep Function


The Perl grep function filters a list and returns only those elements that match a certain criterion; in other words, it filters out the elements which don’t match a condition.

Normally, the function returns a list that contains fewer elements than the original list (i.e. a sublist of the original list).

In this short tutorial I’ll show you a few quick examples of how you can use the Perl grep function in your Perl scripts.



Let’s start with the two syntax forms of the Perl grep function:

grep BLOCK LIST
grep (EXPR, LIST)

where:
  • BLOCK – contains one or more statements delimited by braces; the last statement in the block determines whether the block evaluates to true or false. The block is evaluated for each element of the list and, if the result is true, that element is added to the returned list. If you need a more sophisticated filter that spans multiple lines of code, consider using the Perl grep function with a block.
  • EXPR – represents any expression that uses $_, in particular a regular expression match. The expression is evaluated for each element of the list and, if the result is true, the current element is appended to the returned list.
  • LIST – is a list of elements or an array


Note that in scalar context the Perl grep function returns the number of times the BLOCK or the EXPR evaluates to true.

How does it work? Well, the Perl grep function iterates through the elements of the list and for each iteration step:

  • sets $_ to the current element of the list
  • evaluates the BLOCK or EXPR against $_
  • if the result of evaluation is true, in list context adds the value of $_ to the output list and in scalar context increments the count of matched elements

Because $_ is an alias to the current element of the list rather than a copy, modifying $_ inside grep modifies the original list. Avoid relying on this if you want clear and robust code: to transform the elements of an array by running an expression or function against them, use the map function instead (keep in mind: grep to filter, map to modify).
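
The aliasing of $_ is easy to see in a small sketch. The array @fruits and its contents are made up for illustration; the point is only to contrast a read-only grep, a map that builds a new list, and a grep whose block changes $_:

```perl
#!/usr/bin/perl

use strict;
use warnings;

# hypothetical sample data, for illustration only
my @fruits = qw(apple banana cherry);

# grep only reads $_ here, so @fruits stays untouched
my @with_an = grep { /an/ } @fruits;

# map builds a new, modified list and leaves @fruits as it was
my @upper = map { uc $_ } @fruits;

# modifying $_ inside grep changes @fruits itself, because $_ is an alias;
# s/a/A/ returns true only where a substitution took place
my @changed = grep { s/a/A/ } @fruits;

print "@with_an\n";   # banana
print "@upper\n";     # APPLE BANANA CHERRY
print "@changed\n";   # Apple bAnana
print "@fruits\n";    # Apple bAnana cherry
```

The last line shows the side effect: the original array was changed by a function that was only supposed to filter it.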

The Perl grep function is convenient any time you need to loop through a list to extract the subset of elements that match a certain condition.

Please look at the following simple example of using the Perl grep function:

#!/usr/bin/perl

use strict;
use warnings;

# initialize an array
my @array = qw(3 4 5 6 7 8 9);

# first syntax form:
my @subArray = grep { $_ & 1 } @array;

# second syntax form
#my @subArray = grep $_ & 1, @array; 

print "@subArray\n";
# displays: 3 5 7 9

The above script extracts from @array a subset that contains the odd numbers only. The Perl grep function iterates through @array and at each iteration step it stores the current element of the array in $_; then it evaluates the condition $_ & 1 and, if the result is true (i.e. the number is odd), the element stored in $_ is appended to @subArray.

Instead of using the Perl grep function, the same thing can be accomplished with a foreach loop:

#!/usr/bin/perl

use strict;
use warnings;

# initialize an array
my @array = qw(3 4 5 6 7 8 9);

my @subArray = ();
foreach (@array) {
  push @subArray, $_ if $_ & 1;
}
print "@subArray\n";
# displays: 3 5 7 9

As you can see, the Perl grep function is a kind of shorthand (you don’t need to iterate through the array explicitly). Keep in mind that in some circumstances a foreach loop could be faster, so benchmark both if you are not sure.
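
If you want to measure it yourself, here is a minimal sketch that uses the core Benchmark module. The test data (the integers from 1 to 10,000) and the subroutine names with_grep and with_foreach are my own choices for illustration, and the timings you get will depend on your Perl version and machine:

```perl
#!/usr/bin/perl

use strict;
use warnings;
use Benchmark qw(cmpthese);

# hypothetical test data: the integers 1 .. 10_000
my @array = (1 .. 10_000);

# filter with grep
sub with_grep {
    my @odd = grep { $_ & 1 } @array;
    return \@odd;
}

# filter with an explicit foreach loop
sub with_foreach {
    my @odd;
    foreach (@array) {
        push @odd, $_ if $_ & 1;
    }
    return \@odd;
}

# run each candidate for at least 1 CPU second and print a comparison table
cmpthese(-1, {
    'grep'    => \&with_grep,
    'foreach' => \&with_foreach,
});
```

The table printed by cmpthese shows the rate of each variant and the relative difference between them.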



The following code snippets will show you some ways to use the Perl grep function.

Get the number of the array elements that match a pattern


As mentioned before, in scalar context the Perl grep function returns a count of the elements that match the pattern. To illustrate this, look at the following code:

#!/usr/bin/perl

use strict;
use warnings;

# initialize an array
my @words = qw(John has selected red blue bed);
my $count = grep /ed$/, @words;

print "\$count = $count\n"; 

It produces the output:

$count = 3

which is the number of words that end in ed.
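
Whether you get the count or the matching elements depends only on the context in which grep is called, which is an easy thing to get wrong. The following sketch reuses the @words array from above; the variable name $first is just for illustration:

```perl
#!/usr/bin/perl

use strict;
use warnings;

my @words = qw(John has selected red blue bed);

# scalar assignment: grep returns the number of matches
my $count = grep { /ed$/ } @words;

# parentheses on the left force list context:
# $first receives the first matching element, not the count
my ($first) = grep { /ed$/ } @words;

print "count = $count\n";   # count = 3
print "first = $first\n";   # first = selected

# print supplies list context, so use scalar() to get the count inline
print "matches: ", scalar(grep { /ed$/ } @words), "\n";   # matches: 3
```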

Looking for a string that contains metacharacters


First, look at the following example:

#!/usr/bin/perl

use strict;
use warnings;

# initialize an array
my @operators = ('+', '-', '*', '/', '+=', 
                 '-=', '*=', '/=', '**=');
my $count = grep /\Q**=/, @operators;
print "found operator **=\n" if $count>0;

In our example we initialize the @operators array with a few Perl operators. The Perl grep function counts how many elements of the list contain the exponentiation assignment operator (**=). Because * is a regex metacharacter (it causes the preceding character to be matched 0 or more times), one way to do this is to use the \Q sequence. Remember that \Q tells Perl where to start escaping special characters and \E tells it where to stop. In our example we don’t need the \E sequence, because the escaping should run to the end of the pattern.

If you leave \Q out, so that the line of code becomes:

my $count = grep /**=/, @operators;

you’ll get an error message like this one:

Quantifier follows nothing in regex; marked by 
    <-- HERE in m/* <-- HERE *=/ at … 

Going back to our script, its output is as follows:

found operator **=
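
When the text you are looking for is stored in a variable, the same escaping applies. Here is a short sketch under that assumption; the variable $wanted is hypothetical and not part of the script above. It also shows the quotemeta function, which does the same escaping as \Q, and a plain eq comparison for the case where you want an exact match rather than a substring match:

```perl
#!/usr/bin/perl

use strict;
use warnings;

my @operators = ('+', '-', '*', '/', '+=',
                 '-=', '*=', '/=', '**=');

# hypothetical variable holding the text to look for
my $wanted = '*=';

# \Q...\E escapes the metacharacters of the interpolated variable;
# the pattern is unanchored, so it matches '*=' anywhere in the element
my @containing = grep { /\Q$wanted\E/ } @operators;

# quotemeta does the same escaping as \Q, as a function call
my $escaped   = quotemeta $wanted;
my @same_list = grep { /$escaped/ } @operators;

# for an exact match no regex is needed: compare with eq
my @exact = grep { $_ eq $wanted } @operators;

print "@containing\n";   # *= **=
print "@same_list\n";    # *= **=
print "@exact\n";        # *=
```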

How to toggle through the elements of an array


The following example shows you how to get every other element from an array:

#!/usr/local/bin/perl

use strict;
use warnings;

# initialize an array
my @array = qw(a b c d e f g h i);

my $flag = 0;
my @array1 = grep { $flag ^= 1 } @array;
$flag = 1;
my @array2 = grep { $flag ^= 1 } @array;

print "\@array1 = @array1\n";
print "\@array2 = @array2\n";

Please remember that the bitwise exclusive OR (XOR) operator ^ sets a bit to 1 if the corresponding bits in its operands are different and to 0 otherwise.

The ^= operator combines the XOR operator with the assignment operator, so in our code $flag ^= 1 means $flag = $flag ^ 1.

In the above example the code within the grep block toggles $flag between true (1) and false (0) at each iteration of the Perl grep function. The first grep starts with $flag set to 0, so it keeps the first, third, fifth, ... elements; the second grep starts with $flag set to 1, so it keeps the others.

The output produced is as follows:

@array1 = a c e g i
@array2 = b d f h 
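
If you prefer to avoid the external $flag variable, one alternative (my own variation, not part of the script above) is to run grep over the indexes of the array and then take an array slice. The names @even_idx and @odd_idx are made up for this sketch:

```perl
#!/usr/bin/perl

use strict;
use warnings;

my @array = qw(a b c d e f g h i);

# grep over the indexes 0 .. $#array, then take an array slice
my @even_idx = grep { $_ % 2 == 0 } 0 .. $#array;   # 0 2 4 6 8
my @odd_idx  = grep { $_ % 2 == 1 } 0 .. $#array;   # 1 3 5 7

my @array1 = @array[@even_idx];
my @array2 = @array[@odd_idx];

print "\@array1 = @array1\n";   # @array1 = a c e g i
print "\@array2 = @array2\n";   # @array2 = b d f h
```

Because the block depends only on the index, the result does not change if the grep is run more than once.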



Extract the list elements that occur a given number of times


I’ll write a subroutine for this. Look at the following example:

#!/usr/bin/perl

use strict;
use warnings;

# initialize an array
my @array = qw(1 21 3 3 3 3 7 21 1 5 5 9);

print "Occur once: @{getNumbers(\@array, 1)}\n";
print "Occur twice: @{getNumbers(\@array, 2)}\n";


sub getNumbers
{
  my ($newArray, $n) = (shift, shift);
  my %count = (); 

  my @tempArray = grep {$count{$_} == $n}
                      grep {++$count{$_} == 1} 
                           @$newArray;
  
  # print the %count hash
  print "\%count:\n";
  while ( my ($key, $value) = each(%count) ) {
        print "$key => $value\n";
  }

  return \@tempArray; 
}

And the output:

%count:
1 => 2
21 => 2
3 => 4
7 => 1
9 => 1
5 => 2
Occur once: 7 9
%count:
1 => 2
21 => 2
3 => 4
7 => 1
9 => 1
5 => 2
Occur twice: 1 21 5

The getNumbers subroutine takes 2 arguments: a reference to an array and the number of times an element must occur in order to be selected. It returns a reference to a subarray that contains the elements that occur the given number of times. For instance, the code line:

print "Occur once: @{getNumbers(\@array, 1)}\n";

will print: Occur once: 7 9, i.e. the elements of @array that occur once. The first argument passed to the subroutine is a reference to @array (i.e. \@array), and the second argument is the number 1. Because our subroutine returns a reference to the array with the selected elements, to print its elements we need to dereference it, so we put the @ sign in front and wrapped the subroutine call in curly braces.

Please note that calling the subroutine with 1 as the second argument gives you the elements that occur exactly once in the array.

Next I’ll say a few words about the code within the body of the getNumbers subroutine.

We used the shift function to pull the two arguments off @_, and we initialized the %count hash that the algorithm uses. The %count hash holds ($key, $value) pairs, where $key is an element of @array and $value is the number of times that element occurs. To show how it works, I print the %count hash.

The next two code lines:

                      grep {++$count{$_} == 1} 
                           @$newArray;

form the second argument of the first (outer) grep, whose first argument is the block {$count{$_} == $n}, so they are evaluated in list context, i.e. the second (inner) grep returns a list. The inner grep therefore runs to completion first: it fills the %count hash and, because its block is true only the first time a value is seen, it returns the distinct elements of @$newArray in order of first appearance, i.e. (1, 21, 3, 7, 5, 9).

Next, the first grep will extract from this list only those elements that occur $n times and will store them in @tempArray. After printing the %count hash, the subroutine will return a reference to @tempArray.

Printing the %count hash is not necessary within the subroutine body; I included it only to show you how the algorithm works.
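
If the nested grep calls are hard to follow, the same idea can be written in two separate passes. This is only a sketch of an alternative, not the subroutine discussed above: the name elements_occurring and the %seen hash are made up for illustration, and the debugging output is left out:

```perl
#!/usr/bin/perl

use strict;
use warnings;

my @array = qw(1 21 3 3 3 3 7 21 1 5 5 9);

# hypothetical helper name; takes an array reference and a count
sub elements_occurring {
    my ($list, $n) = @_;

    # first pass: count every element
    my %count;
    $count{$_}++ for @$list;

    # second pass: keep the first occurrence of each element seen $n times
    my %seen;
    return [ grep { $count{$_} == $n && !$seen{$_}++ } @$list ];
}

my $once  = elements_occurring(\@array, 1);
my $twice = elements_occurring(\@array, 2);

print "Occur once: @$once\n";     # Occur once: 7 9
print "Occur twice: @$twice\n";   # Occur twice: 1 21 5
```

The counting step is separated from the filtering step, so each grep block has a single job and no side effects on %count.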


How to use grep to return true or false


To see this topic, click here to watch a video on YouTube, where you'll find a complete example.
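
As a minimal sketch of the idea (my own example with made-up data, not the one from the video): in a condition grep is evaluated in scalar context, so it returns the number of matches, and that number is true when at least one element matched. The any function used in the last line comes from the core List::Util module (version 1.33 or later is assumed):

```perl
#!/usr/bin/perl

use strict;
use warnings;
use List::Util qw(any);

# hypothetical sample data
my @colors = qw(red green blue);

# in a condition grep is evaluated in scalar context: it returns the
# number of matches, which is true when at least one element matched
if (grep { $_ eq 'blue' } @colors) {
    print "blue is in the list\n";
}

# a count of 0 is false
my $has_pink = grep { $_ eq 'pink' } @colors;
print "pink is not in the list\n" unless $has_pink;

# any() from List::Util stops at the first match instead of scanning
# the whole list
print "green is in the list\n" if any { $_ eq 'green' } @colors;
```

For short lists the difference does not matter; for long lists any can save work because grep always examines every element.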

* * *


I hope my notes were clear enough; if not, keep looking at the code, test and modify it, and you will understand how it works.

In conclusion, you can use grep any time you need to select elements from a list or an array.


Exercises


These exercises give you the opportunity to write some script code of your own that uses the Perl grep function. They are covered in full in my Perl Functions for List Data, where I show you how to work with this important function in detail.

1. Use the grep function to extract from the following array:
my @names = qw(JOHN Peter Alice MARY Michael);
the string values that contain capital letters only.
2. Write a short script that reads a few lines from STDIN and, at the end, prints the number of non-empty lines.
3. Use grep to extract from the list: ('red', 'cyan', 'yellow', 'blue' ) the colors that contain either of the letters 'y' and 'w'.
4. Given the following initializations:
my @array = qw(15 25 3 17 14);
$array[10] = 29;
$array[25] = 35;
use grep to return a list with the defined elements of @array.
5. Given the following hash:
my %hash = qw(1 one 2 two 3 three 4);
use grep to return a list of the defined hash values.
6. This exercise asks you to use the grep and map functions to extract elements from a hash whose keys or values match a given regex. Given the following hash:
my %hash = ( 
   Color1 => 'grey', color2 => 'Lightgreen',
   shape1 => 'circle', shape2 => 'rectangle',
   color4 => 'lightblue', Color5 => 'darkblue' );
a) Create a hash from it with the entries whose keys match the word 'color' (case-sensitive)
b) Create a hash from it with the entries whose values match the word 'light' (case-insensitive)
7. Use grep to delete the negative numbers from the following array:
my @array = (12, 25, -7, 19, -25);
8. Given the following arrays:
my @array1 = (1, 2, 3, 5, 7, 23, 8, 14, 95, 19);
my @array2 = (3, 14, 6, 22, 88, 19, 100);
a) find the intersection of the 2 arrays
b) find the difference between @array1 and @array2
c) find the difference between @array2 and @array1
9. This exercise asks you to use grep to find a specific string/substring in all the inner arrays of an array of arrays (@AoA). Given the following array of arrays:
my @AoA = ( 
  [ qw(Yellow Blue LightBlue DeepPink) ],
  [ qw(DarkGreen DarkMagenta LightYellow) ],
  [ 'Red', 'LightCyan', 'LightCoral', 'Green']
);
return a list with the elements that match the word 'light' (case-insensitive).
10. Given the following array of hashes (@AoH):
my @AoH = (
  {name=>'John',  age=>23},
  {name=>'Alice', age=>31},
  {name=>'Peter', age=>45},);
extract a sub-array where the hash key 'age' has a value greater than 30. Print the sub-array as:
Alice is 31 years old.
Peter is 45 years old.
11. Given the following hash of arrays (%HoA):
my %HoA = ( 
  group1 => ['usr1', 'usr2', 'usr3'],  group2 => ['usr5', 'usr4', 'usr2'],
  group3 => ['usr7', 'usr8', 'usr1'],  group4 => ['usr5', 'usr2', 'usr4'],
);
use the grep function to get and print the groups where usr1 belongs.
12. Given a hash of hashes (%HoH):
my %HoH = (
  1 => { name  => 'John', age => 20 },
  2 => { name  => 'Marry', age => 25 },
  3 => { name  => 'Patricia', age => 30 },
  4 => { name  => 'John', age => 20 },
  5 => { name  => 'John', age => 20 },
  6 => { name  => 'Patricia', age => 30 }
);
where each inner hash has 2 keys: name and age, extract into an array the unique values associated with the key name in these inner hashes (i.e. ('John', 'Marry', 'Patricia')). At the same time, print how many times each name appears.

