Perl substr Function
The Perl substr function is one of the most important string functions in the language. Its basic job is to retrieve a substring of a given string, but it does considerably more than that: it can also replace part of a string in place.
In this short tutorial we’ll show you some examples of how to use it to manipulate strings, either alone or together with other string functions such as index or length.
A lot of string manipulation can be done with regular expressions, but in many cases the built-in string functions are more straightforward and often take less time to execute.
To begin with, let’s look at the syntax forms available for the Perl substr function:
substr EXPR, OFFSET, LENGTH, REPLACEMENT
substr EXPR, OFFSET, LENGTH
substr EXPR, OFFSET
where:
- EXPR is a string expression from which the substring will be extracted
- OFFSET is an index from where the substring to be extracted starts
- LENGTH is the length of the substring to extract
- REPLACEMENT is a string that will replace the substring
As with other built-in functions, the parentheses are optional. As you can see above, some arguments are mandatory and others are optional: you must supply at least the string expression (EXPR) and the position (OFFSET) where the substring starts.
Before reviewing the parameters of the Perl substr function, remember that in Perl the first character of a string has index 0, the second has index 1, and so on. Historically this could be changed through the special variable $[, which holds the index of the first character of a string and defaults to 0. Don’t rely on it: $[ has long been deprecated, and in recent Perl versions (5.30 and later) assigning it any value other than 0 is a fatal error.
Take a moment to look at the following example, which shows a basic use of the Perl substr function:
my $names = "John Peterson Anne Mike";
my $oneName = substr($names, 5, 8);
print "$names\n";
#it prints John Peterson Anne Mike
print "$oneName\n";
#it prints Peterson
Please note that the value of the $names variable didn’t change after using the Perl substr function.
We can use the Perl substr function either as an ordinary value (for instance, in comparisons) or as an lvalue, such as the left-hand side of an assignment. In the latter case, if EXPR is a string variable, the value of that variable is modified. See the next block of code:
my $names = "Alin Fred John";
substr($names, 5, 4) = "Mary";
print "$names\n";
# it prints Alin Mary John
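When the position of the substring is not known in advance, index can locate it first and length can supply the LENGTH argument. The following is only a minimal sketch; the variable names $team, $old and $new are illustrative, and it assumes that only the first occurrence should be replaced:

```perl
use strict;
use warnings;

my $team = "Alin Fred John";
my $old  = "Fred";
my $new  = "Mary";

# index returns -1 when $old does not occur in $team
my $pos = index($team, $old);
if ($pos >= 0) {
    # replace exactly length($old) characters starting at $pos
    substr($team, $pos, length $old) = $new;
}
print "$team\n";    # prints Alin Mary John
```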
And now let’s get back to our parameters.
OFFSET could be:
- positive – the substring starts that far from the beginning of the string
- negative – the substring starts that far from the end of the string
- 0 - that means that the substring starts at the first character of the string
If OFFSET is entirely outside the string (for instance, a string has 10 characters and OFFSET is greater than 10), the Perl substr function returns undef and, when warnings are enabled, generates a warning. An OFFSET exactly equal to the length of the string is still allowed and yields an empty string.
If substr is used as an lvalue and the substring lies entirely outside the string, a fatal error is raised. The following snippet illustrates the cases discussed above:
my $names = "Alin Fred Peter";
my $oneName = substr($names, -10, 4);
print "Name: $oneName\n";
$oneName = substr $names, -100, 4;
print "Name: $oneName\n";
substr($names, 20, 5) = "Alice";
This code produces the following output:
Name: Fred
Name:
substr outside of string at 1.pl line 6.
Here 1.pl is the name of my script. If you run the script with the -w switch (perl -w 1.pl, from a command prompt on a Windows machine in my case), you’ll get the warning messages too.
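The boundary between "at the end of the string" and "outside the string" is easy to get wrong, so here is a small sketch that checks it directly. It assumes warnings are enabled; the __WARN__ handler and the eval block are there only so the script can collect the warning and survive the fatal error, and the variable names are illustrative:

```perl
use strict;
use warnings;

my $str = "0123456789";    # 10 characters, valid indices 0..9

# Collect warnings in an array instead of printing them to STDERR
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my $at_end = substr($str, 10);    # OFFSET == length: empty string, no warning
my $beyond = substr($str, 11);    # OFFSET > length: undef plus a warning

print "at end: ", (defined($at_end) ? "defined, length " . length($at_end) : "undef"), "\n";
# prints at end: defined, length 0
print "beyond: ", (defined($beyond) ? "defined" : "undef"), "\n";
# prints beyond: undef
print "warnings collected: ", scalar(@warnings), "\n";
# prints warnings collected: 1

# As an lvalue, the same out-of-range OFFSET is fatal; eval traps the exception
my $ok = eval { substr($str, 11, 1) = "x"; 1 };
print "lvalue assignment failed: $@" unless $ok;
```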
LENGTH could be:
- omitted – the function returns all the characters from the OFFSET position up to the end of the string
- positive – the function returns at most LENGTH characters, beginning at the OFFSET position
- negative – the function returns the substring that starts at the OFFSET position and leaves that many characters off the end of the string
- 0 – the returned substring is empty, and no warning is issued
See the example below, which also covers some boundary cases:
my $names = "Alex James Abby Shannon Monica";
my $strNames = substr $names, 11; # length omitted
print "$strNames\n";
# prints Abby Shannon Monica
$strNames = substr $names, 24, 100; # length = 100
print "$strNames\n";
# prints Monica
$strNames = substr $names, 24, -2; # length = -2
print "$strNames\n";
# prints Moni
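Negative OFFSET and negative LENGTH values can also be combined. The sketch below reuses the same string; the variable names $last, $middle and $inner are just illustrative:

```perl
use strict;
use warnings;

my $names = "Alex James Abby Shannon Monica";

# negative OFFSET, LENGTH omitted: the last 6 characters
my $last = substr($names, -6);
print "$last\n";      # prints Monica

# start 14 characters from the end and leave the last 7 characters off
my $middle = substr($names, -14, -7);
print "$middle\n";    # prints Shannon

# skip the first 5 characters and leave the last 7 characters off
my $inner = substr($names, 5, -7);
print "$inner\n";     # prints James Abby Shannon
```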
And now some examples using substr as an lvalue:
my $names = "Alex James Abby Shannon Monica";
substr($names, 11, 4) = "Alexandra";
print "$names\n";
# prints Alex James Alexandra Shannon Monica
In the example above, the substring "Abby" (found at offset 11) is replaced by the string "Alexandra", although this string is longer than 4 (the LENGTH supplied to the substr function). As you can see, $names now has more characters than it had initially.
Next, look at an example where the assigned string is shorter than the LENGTH supplied to the Perl substr function:
my $names = "Alex James Abby Shannon Monica";
substr($names, 11, 4) = "Tom";
print "$names\n";
# prints Alex James Tom Shannon Monica
After the assignment, $names is shorter than the initial string.
You can play around with these examples to see how the Perl substr function works in other similar situations.
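Two such situations are worth trying: a LENGTH of 0 inserts text without replacing anything, and assigning an empty string deletes the selected characters. A short sketch, using a shortened sample string:

```perl
use strict;
use warnings;

my $names = "Alex James Abby";

# LENGTH 0 selects nothing, so the new text is inserted at OFFSET 5
substr($names, 5, 0) = "Tom ";
print "$names\n";    # prints Alex Tom James Abby

# assigning the empty string deletes the 5 selected characters ("Alex ")
substr($names, 0, 5) = "";
print "$names\n";    # prints Tom James Abby
```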
REPLACEMENT
The examples above showed how to use the Perl substr function as an lvalue when you need to replace a substring with another one. Another way to do this is to use the REPLACEMENT parameter of the substr function, as in the following example:
my $names = "Alex James Abby Shannon Monica";
substr $names, 16, 7, "Alexandra";
# it will replace "Shannon" with "Alexandra"
print "$names\n";
# it prints Alex James Abby Alexandra Monica
As you have seen in the example above, the value of $names changes after the replacement (just as it does when substr is used as an lvalue).
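One difference from the lvalue form is that the four-argument call returns the substring that was replaced, so you can capture the old text and install the new one in a single step. A small sketch, where $removed is an illustrative name:

```perl
use strict;
use warnings;

my $names = "Alex James Abby Shannon Monica";

# the return value is the text that stood at OFFSET 16, LENGTH 7 before the call
my $removed = substr($names, 16, 7, "Alexandra");

print "$removed\n";    # prints Shannon
print "$names\n";      # prints Alex James Abby Alexandra Monica
```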
Finally, I’ll show you a mini script application where you can see how to use the Perl substr function together with other string functions.
How to use substr to get the column fields from a flat file database
A flat file database consists of a number of records delimited by a separator, which in most cases is the newline ("\n") character. In this case we say that each record is specified on a single line. Each record consists of one or more fields, either of fixed width or delimited by some special character like whitespace or a comma.
For instance, let’s suppose that each record of the file customers.txt includes the fields Name, Phone and ZipCode, and that the entire file has only three records, as in the following table:
| Name | Phone | ZipCode |
| John Abbot | 872-321-1212 | 55416 |
| Clark Eliot | 205-321-1200 | 20037 |
| Johnny Randolph | 345-767-3476 | 33702 |
Fixed-width columns
We’ll first examine the case where the fields have fixed width: Name – 20, Phone – 12 and ZipCode – 5. If we print the file, we’ll get something like this:
John Abbot          872-321-121255416
Clark Eliot         205-321-120020037
Johnny Randolph     345-767-347633702
The following block of code reads the file and prints each record on a single line, with the fields separated by commas:
open my $fh, '<', "customers.txt" or die "Cannot open customers.txt: $!";
while (<$fh>)
{
    # chomp off the possible ending newline from $_
    chomp;
    # Name occupies the first 20 characters
    my $name = substr($_, 0, 20);
    # trim the trailing spaces
    $name =~ s/ +$//;
    # Phone occupies the next 12 characters
    my $phone = substr($_, 20, 12);
    # delete all '-' characters
    $phone =~ s/-//g;
    # ZipCode occupies the last 5 characters
    my $zipCode = substr($_, 20+12, 5);
    print $name, ",", $phone, ",", $zipCode, "\n";
}
close $fh;
Running this snippet produces the following output:
John Abbot,8723211212,55416
Clark Eliot,2053211200,20037
Johnny Randolph,3457673476,33702
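For fixed-width records, the built-in unpack function is a common alternative to several substr calls. The sketch below is not part of the script above: it builds one sample record in memory with sprintf (an assumption made so that the example runs without customers.txt) and splits it with the template "A20 A12 A5", where each A field is space-padded text and the trailing spaces are stripped automatically:

```perl
use strict;
use warnings;

# Sample record built in memory, in the layout described above:
# Name (20 characters), Phone (12), ZipCode (5)
my $record = sprintf("%-20s%-12s%-5s", "John Abbot", "872-321-1212", "55416");

# "A20 A12 A5" reads three space-padded fields and strips their trailing spaces
my ($name, $phone, $zipCode) = unpack("A20 A12 A5", $record);

# delete all '-' characters, as in the substr version
$phone =~ s/-//g;

print join(",", $name, $phone, $zipCode), "\n";
# prints John Abbot,8723211212,55416
```

With unpack the trailing-space trim on the name is no longer needed, while substr remains the more direct choice when only one field is wanted.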
Columns delimited by a separator
The next example illustrates the case where the fields are delimited by a separator character such as a comma. In this case the content of our file will be:
John Abbot,872-321-1212,55416
Clark Eliot,205-321-1200,20037
Johnny Randolph,345-767-3476,33702
Because we want to show you how to use the Perl substr function to access the fields of a record, we won’t use the split function here (although it would be easier). The next code sample shows how you could implement it:
open my $fh, '<', "customers.txt" or die "Cannot open customers.txt: $!";
while (<$fh>)
{
    # chomp off the possible ending newline
    chomp;
    # the name runs up to the first comma
    my $pos1 = index($_, ",");
    my $name = substr($_, 0, $pos1);
    # the phone lies between the first and the second comma
    my $pos2 = index $_, ",", $pos1+1;
    my $phone = substr($_, $pos1+1, $pos2-$pos1-1);
    # delete all - characters
    $phone =~ s/-//g;
    # LENGTH omitted: take everything after the second comma
    my $zipCode = substr($_, $pos2+1);
    print $name, ",", $phone, ",", $zipCode, "\n";
}
close $fh;
The output is the same as in the previous example.
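The index and substr steps above can be generalized to fetch any field by its number. The helper field_at below is hypothetical (it is not a Perl built-in and is not in the original script), and the sketch assumes the fields themselves never contain the separator. The call to split at the end shows the more usual way to get the same fields:

```perl
use strict;
use warnings;

# Hypothetical helper: return field number $n (counting from 0) of $line,
# or undef when the line has fewer fields.
sub field_at {
    my ($line, $n, $sep) = @_;
    my $start = 0;
    # skip $n separators to find where the wanted field begins
    for (1 .. $n) {
        my $pos = index($line, $sep, $start);
        return undef if $pos < 0;
        $start = $pos + length $sep;
    }
    # the field ends at the next separator, or at the end of the line
    my $end = index($line, $sep, $start);
    return $end < 0 ? substr($line, $start)
                    : substr($line, $start, $end - $start);
}

my $line = "Clark Eliot,205-321-1200,20037";
print field_at($line, 0, ","), "\n";    # prints Clark Eliot
print field_at($line, 2, ","), "\n";    # prints 20037

# split returns all the fields in one call
my @fields = split /,/, $line;
print "$fields[1]\n";                   # prints 205-321-1200
```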