Perl pack Function





The Perl pack function takes a TEMPLATE and a LIST of values. It converts each value according to the format specified by the template, concatenates the results into a string, and returns that string.

Its main purpose is to turn data (numbers and strings) into a sequence of bytes that can be easily used by external applications.

You can use the Perl pack function to prepare binary data either for writing to a file or for network transmission.

The reverse of this function is the unpack function, which takes a sequence of bytes and converts it back into numbers and strings for further processing.

The syntax form of the Perl pack function is as follows:

STRING = pack TEMPLATE, LIST

The TEMPLATE consists of a sequence of characters as shown in the table below. One or more modifiers may follow some letters in the template (for instance, each letter may optionally be followed by a number giving a repeat count; or a * for the repeat count means to use however many items are left).
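As a rough illustration of repeat counts (a sketch only; the record layout and the variable names are made up for this example), the template below packs a short text, one 16-bit number and a list of bytes, then unpacks them again:

```perl
use strict;
use warnings;

# 'A6' = text padded to 6 bytes, 'n' = one 16-bit big-endian number,
# 'C*' = all remaining list items as unsigned bytes
my $record = pack 'A6 n C*', 'perl', 258, 1, 2, 3;

my ($name, $number, @bytes) = unpack 'A6 n C*', $record;
print length($record), "\n";       # 11 (6 + 2 + 3 bytes)
print "$name $number @bytes\n";    # perl 258 1 2 3
```

Spaces inside a template are ignored, so they can be used to separate the items visually.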

In this short tutorial I intend to review some of the most frequent template sequences used by the Perl pack function and exemplify them with a few appropriate examples.




The following table shows you some of the most frequent template characters:

 a   A string with arbitrary binary data, will be null padded
 A   A text (ASCII) string, will be space padded
 b   A bit string (ascending bit order inside each byte, like vec()) 
 B   A bit string (descending bit order inside each byte)
 c   A signed char (8-bit) value
 C   An unsigned char (octet) value
 d   A double-precision float in the native format
 f   A single-precision float in the native format
 h   A hex string (low nybble first)
 H   A hex string (high nybble first)
 i   A signed integer value
 I   An unsigned integer value
 l   A signed long (32-bit) value
 L   An unsigned long value
 n   An unsigned short (16-bit) in "network" (big-endian) order
 N   An unsigned long (32-bit) in "network" (big-endian) order
 s   A signed short (16-bit) value
 S   An unsigned short value
 U   A Unicode character number
 v   An unsigned short (16-bit) in "VAX" (little-endian) order
 V   An unsigned long (32-bit) in "VAX" (little-endian) order
 x   A null byte
 X   Back up a byte


 a  A string with arbitrary binary data, will be null padded

The following example shows you how to deal with the Perl pack function and the 'a' template:

#!/usr/local/bin/perl

use strict;
use warnings;

my $str = pack 'a7', '123a';     # "123a\0\0\0"

# split the string into an array of characters
my @array = split //,$str;

# converts the elements of the array into their
# equivalent hex codes
@array = map( sprintf("%x", ord), @array);

# print the array with spaces between elements
print "@array\n";

# it prints: 31 32 33 61 0 0 0

The code begins with a call to the Perl pack function. $str receives the result, 'a7' is the template and '123a' is the string to be converted. The 7 in the template is a repeat count: null bytes are appended until the resulting string is 7 bytes long (a longer string would be truncated to 7 bytes).

The remaining lines of the code let you see the content of $str as hexadecimal character codes.

If you have a list of strings to be converted, you can use the 'x' (repetition) operator to build the template, like in the following line of code. Note that an 'a' without a count takes only the first character of its string, and the 'a' items left without a value produce null bytes:

my $str = pack 'a' x 7, '12', '34', '56';     # "135\0\0\0\0"
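To see why only the first character of each string survives in the line above, it may help to compare a plain 'a' with 'a2'. This is a minimal sketch; the variable names are chosen just for the illustration:

```perl
use strict;
use warnings;

# 'a' without a count takes one character from each list item
my $one = pack 'a' x 3, '12', '34', '56';

# 'a2' takes two characters from each list item
my $two = pack 'a2' x 3, '12', '34', '56';

print "$one $two\n";    # 135 123456
```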

You can use the Perl pack function with the 'a' template to convert a string into a string followed by a null, which can be used in a C program:

my $cStr = pack ('a*x', $perlStr);

Here 'a*' packs the whole string (a plain 'a' would pack only its first character) and the 'x' character appends a null byte as the rightmost character of the string.

 A  A text (ASCII) string, will be space padded

This template is similar to the previous one, except that the string is padded with spaces instead of nulls. For instance, in the first 'a' example you can use the line:

my $str = pack 'A7', '123a';     # "123a    "

instead of:

my $str = pack 'a7', '123a';     # "123a\0\0\0"

You’ll get as output: 31 32 33 61 20 20 20 where 20 is the hex code for the space character.
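The two templates also differ when unpacking: 'a' returns the data unchanged, while 'A' strips the trailing padding. A small sketch with illustrative variable names:

```perl
use strict;
use warnings;

my $padded_nul   = pack 'a7', '123a';    # padded with nulls
my $padded_space = pack 'A7', '123a';    # padded with spaces

my $from_a = unpack 'a7', $padded_nul;     # nulls are kept
my $from_A = unpack 'A7', $padded_space;   # trailing spaces are stripped

print length($from_a), " ", length($from_A), "\n";    # 7 4
```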


 b  A bit string (ascending bit order inside each byte, like vec())

The 'b' format of the Perl pack function packs strings consisting of 0 and 1 characters to bytes. A byte consists of a group of 8 bits as in the following figure:

 1 0 1 1 0 0 1 0 
MSB           LSB

LSB means here the least significant bit and it is sometimes referred to as the rightmost bit. MSB is the most significant bit and is sometimes referred to as the leftmost bit. In the above example, MSB = 1 and LSB = 0.

The 'b' format means that the bits are specified in ascending order, from LSB to MSB: the first character of the string becomes the least significant bit of the byte. For instance, in the next line of code:

my $nr = ord pack ('b8', '10110010');

the $nr variable is assigned 77 = 1 + 4 + 8 + 64, because the string is read from the LSB upwards and the resulting byte is 01001101 in the usual MSB-first notation.

In this representation, the count refers to the number of bits to be packed - in the above example the count is 8.
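The value produced by 'b8' can be checked by hand. The sketch below (variable names are illustrative) gives the character at position i the weight 2**i and compares the sum with the packed byte:

```perl
use strict;
use warnings;

my $bits  = '10110010';
my @chars = split //, $bits;

# with 'b', the character at index $i is bit $i, so its weight is 2**$i
my $sum = 0;
for my $i (0 .. $#chars) {
    $sum += $chars[$i] * 2**$i;
}

my $packed = ord pack 'b8', $bits;
print "$sum $packed\n";    # 77 77
```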

You can use the Perl pack function with the 'b*' format to translate a string of 0’s and 1’s into a bit string, and the unpack function to get back the list of 0’s and 1’s from the bit string. Here’s an example:

#!/usr/local/bin/perl

use strict;
use warnings;

my @bitArray = qw(1 0 0 0 1 1 1 1 0 0 1 1);
my $bitString = pack 'b*', join('', @bitArray);

@bitArray = split(//, unpack('b*', $bitString));
print "@bitArray\n";
# it prints:      1 0 0 0 1 1 1 1 0 0 1 1 0 0 0 0

Please note that our initial array of bits had only 12 elements, so the Perl pack function padded the last byte of $bitString with four 0 bits.

 B  A bit string (descending bit order inside each byte)

The 'B' template is similar to the 'b' template, except that the bits are specified in descending order, from MSB to LSB, as in the figure above. For instance, in the next line of code:

my $nr = ord pack ('B8', '10110010');

the $nr variable is assigned 178 = 2 + 16 + 32 + 128
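Since 'B' uses the usual MSB-first notation, its result can be cross-checked with Perl's built-in oct and sprintf functions. A short sketch, with variable names chosen for the example:

```perl
use strict;
use warnings;

my $bits = '10110010';

my $from_pack = ord pack 'B8', $bits;        # pack the bit string
my $from_oct  = oct "0b$bits";               # parse it as a binary literal
my $back      = sprintf '%08b', $from_pack;  # format the number as bits again

print "$from_pack $from_oct $back\n";        # 178 178 10110010
```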

You can use the Perl pack function with the 'B*' format in a similar way as shown in a previous example for the 'b' format:

#!/usr/local/bin/perl

use strict;
use warnings;

my @bitArray = qw(1 0 0 0 1 1 1 1 0 0 1 1);
my $bitString = pack 'B*', join('', @bitArray);

@bitArray = split(//, unpack('B*', $bitString));
print "@bitArray\n";
# it prints:      1 0 0 0 1 1 1 1 0 0 1 1 0 0 0 0

 c  A signed char (8-bit) value

The 'c' template format is for signed char values. Its usage is similar to the 'C' template format, described below.
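The difference between the signed and unsigned char formats shows up when the same byte is unpacked both ways. A minimal sketch (the value -56 is just an example):

```perl
use strict;
use warnings;

my $byte = pack 'c', -56;                # one byte with the value 0xC8

my $as_unsigned = unpack 'C', $byte;     # read the byte as unsigned
my $as_signed   = unpack 'c', $byte;     # read the byte as signed

print "$as_unsigned $as_signed\n";       # 200 -56
```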

 C  An unsigned char (octet) value

The 'C' format is used for unsigned characters. Here are a few examples:

#!/usr/local/bin/perl

use strict;
use warnings;

my $str = pack 'CCCC', 97, 98, 99, 100, 101, 102;
# 97 is the numeric value of the ASCII 'a' character
print "$str\n";     # abcd

# 3 is a count for the number of characters packed
$str = pack 'C3', 97, 98, 99, 100, 101, 102;
print "$str\n";     # abc

# x is the repetition operator
$str = pack 'C' x 5, 97, 98, 99, 100, 101, 102;
print "$str\n";     # abcde

# the '*' is like a wildcard for more of the same.
$str = pack 'C*', 97, 98, 99, 100, 101, 102;
print "$str\n";     # abcdef

The following example shows you how to use the Perl pack function with the 'C*' template in conjunction with other Perl functions:

#!/usr/local/bin/perl

use strict;
use warnings;

my $str = pack('C*',map ord,split(//,'This is Perl'));
print "$str\n";
# it prints: This is Perl

The split function creates a list from the string 'This is Perl', with each character becoming one element. The map function runs the ord function on each element and returns a list with the numeric codes of the characters. Finally, the Perl pack function with the 'C*' template (for unsigned characters) packs all the numbers of the list (if you put a * character inside the template, you don’t need to count the elements of the list argument).


 d  A double-precision float in the native format

The 'd' format of the Perl pack function is for 64-bit floating point numbers in native machine format. Its usage is similar to the 'f' template format (see below).
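A small sketch of the difference between the two float formats, assuming a perl built with the usual double-precision numbers (a build with long doubles could behave differently for 'd'); the variable names are illustrative:

```perl
use strict;
use warnings;

my $value = 23.13421;

my $via_double = unpack 'd', pack('d', $value);   # 64-bit round trip
my $via_float  = unpack 'f', pack('f', $value);   # 32-bit round trip

print length(pack('d', $value)), " ", length(pack('f', $value)), "\n";   # 8 4
print $via_double == $value ? "d: unchanged\n" : "d: changed\n";
print $via_float  == $value ? "f: unchanged\n" : "f: changed\n";
```

Under that assumption the 'd' round trip returns the original number, while the 'f' round trip returns a value rounded to single precision.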

 f  A single-precision float in the native format

The 'f' format of the Perl pack function is for 32 bit floating point in a native machine format. Because of the variety of floating formats around, it’s possible that floating point data written on one machine may not be readable on another – as in the case that the two machines have different endianness. You can use this format like in the following line of code:

my $float = pack 'f', 23.13421;

where $float will contain the number in a native float format. To extract the number from this string, you need to use the unpack function:

my $nr = unpack 'f', $float;

Or you can use the Perl pack function by following the 'f' specifier with a count, if you know how many floats you want to pack:

my $floats = pack 'f2', 3.14, 2.287;

If you have more single-precision float numbers to pack, you can use the '*' repeat pack-format that will pack all the available float numbers from the list:

#!/usr/local/bin/perl

use strict;
use warnings;

my @floatArray = (23.13421, 112.78, 77.896);
@floatArray = unpack ('f*', pack('f*', @floatArray));
print "@floatArray\n";
# it displays: 23.1342105865479 112.779998779297 77.8960037231445

Here the Perl pack function will return a string with 3 single-precision float numbers packed into the specific native machine format. The unpack function will unpack the 3 numbers from the pack resulting string into an array.

Finally, the array with the result is printed. As you can notice, the content is only approximately equal to the content of the initial array: a single-precision float keeps about 7 significant decimal digits, so the extra digits show the rounding error introduced by the 32-bit format.

 h  A hex string (low nybble first)

The 'h' template format is for packing a hex string by putting the low nibble first. Its usage is similar to the 'H' template format, described below.

 H  A hex string (high nybble first)

The 'H' template format of the Perl pack function is for packing a hex string by putting the high nibble first. If you want to get back the unaltered value of the string, you can use the unpack function but with the same template format. If you use unpack with 'h' format, you’ll get the bytes in the same order but with their nibbles reversed, as you can notice in the next snippet:

my $str = pack 'H*', '6162636465';
print unpack ('H*', $str), "\n";  # it prints: 6162636465
print unpack ('h*', $str), "\n";  # it prints: 1626364656

Here I put a * character inside the template, to avoid counting the hex characters of the string argument.
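The same bytes can be produced with either format, provided the digits of each pair are swapped. The sketch below also shows what happens with an odd number of hex digits; the variable names are only for illustration:

```perl
use strict;
use warnings;

my $high = pack 'H*', '6162636465';   # high nybble first
my $low  = pack 'h*', '1626364656';   # same bytes, low nybble first
print "$high $low\n";                 # abcde abcde

# an odd number of hex digits is padded with a zero nybble
my $odd = pack 'H*', '616';
print unpack('H*', $odd), "\n";       # 6160
```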

 i  A signed integer value

This template format of the Perl pack function generates a signed integer and you can use it like this:

my $integer = pack 'i', 150;

The number 150 will be converted into the format used to store integers on your machine and the result will be stored into the $integer variable. If you have many integers to pack, you can use the '*' repeat pack-format that will pack all the integers available in the list:

#!/usr/local/bin/perl

use strict;
use warnings;

my @integerArray = (150, 160, 170, 180, 190);
@integerArray = unpack ('i*', pack('i*', @integerArray));
print "@integerArray\n";
# it displays: 150 160 170 180 190

Here the Perl pack function returns a string with 5 integers packed into the integer format specific to your machine. The unpack function unpacks the 5 integers from the packed string into an array. Finally, the array with the result is printed. As you can notice, the content is equal to the content of the initial array.

But the 'i' format is machine dependent, so if you pack a list of integers into a string and then unpack it on another machine, you may get back a list of unexpected values.

 I  An unsigned integer value

If you need to pack unsigned integers, you can use the 'I' template format of the Perl pack function. See the 'i' format examples above; the usage is similar.
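Because the size of a native integer depends on the machine, it can be useful to check it. The sketch below uses the standard Config module to report the size and shows how a negative number packed with 'i' reads back through 'I'; the variable names are illustrative and the printed values depend on your platform:

```perl
use strict;
use warnings;
use Config;

my $packed = pack 'I', 150;
print length($packed), " bytes, intsize = $Config{intsize}\n";

# -1 packed as a signed integer has all bits set,
# so it reads back as the largest unsigned integer
my $wrapped = unpack 'I', pack('i', -1);
print "$wrapped\n";
```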


 l  A signed long (32-bit) value

The 'l' format generates a signed long value, which in Perl is exactly 32 bits (four bytes). The order of the bytes depends on whether the machine is little- or big-endian. See the following lines of code for a short example:

my $str = pack('l', 0x61626364);
print "$str\n";

This code creates a four-byte string consisting of either dcba if the machine is little-endian or abcd if the machine is big-endian. Here 0x61, 0x62, 0x63, 0x64 are the hexadecimal ASCII codes of the a, b, c, d characters.
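If the byte order must not depend on the machine, the '<' and '>' modifiers can be appended to the format. This is a sketch that assumes Perl 5.10 or newer, where these modifiers are available:

```perl
use strict;
use warnings;

my $big    = pack 'l>', 0x61626364;   # force big-endian order
my $little = pack 'l<', 0x61626364;   # force little-endian order

print "$big $little\n";               # abcd dcba
```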

 L  An unsigned long value

The 'L' format of the Perl pack function generates an unsigned long value; its usage is similar to the signed long format. Its length is exactly 32 bits, which could differ from the long type of the local C compiler.

 n  An unsigned short (16-bit) in "network" (big-endian) order

The 'n' format tells the Perl pack function to create an unsigned short in network byte order. This byte order is used by the TCP/IP protocols, so you need this format (or 'N' for bigger numbers) when you build or parse binary network data. You can use it like in the following line of code:

my $nr = pack 'n', 1234, 235;

Because we didn’t provide a repeat count inside the template, the Perl pack function packs just the first number and returns it in the $nr variable. The second number (235) from the list is ignored.
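Dumping the packed string in hex makes this visible. In the sketch below (variable names are illustrative) a repeat count of 2 is what it takes to pack both numbers:

```perl
use strict;
use warnings;

my $one  = pack 'n',  1234, 235;   # only 1234 is packed
my $both = pack 'n2', 1234, 235;   # both numbers are packed

print unpack('H*', $one), "\n";    # 04d2
print unpack('H*', $both), "\n";   # 04d200eb
```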

 N  An unsigned long (32-bit) in "network" (big-endian) order

The 'N' format tells the Perl pack function to create an unsigned long in network byte order. You can use it like the 'n' template format. Here’s a short example:

my $nrs = pack 'N*', 45320..45325;

my @array = unpack 'N*', $nrs;
print "@array\n";
# it displays: 45320 45321 45322 45323 45324 45325

If you use the '*' repeat pack-format, you don’t need to provide the count of the numbers you intend to pack. The unpack function was used to extract the numbers from the packed $nrs string and populate an array with them.
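One place where a 32-bit value in network order shows up is an IPv4 address. The following sketch (the address is an arbitrary example) converts four octets to a single number and back:

```perl
use strict;
use warnings;

# four octets in network order form one 32-bit big-endian number
my $packed_ip = pack 'C4', 192, 168, 1, 10;
my $as_number = unpack 'N', $packed_ip;

# and back again: number -> four bytes -> dotted notation
my $dotted = join '.', unpack('C4', pack('N', $as_number));

print "$as_number $dotted\n";    # 3232235786 192.168.1.10
```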

 s  A signed short (16-bit) value

This format is for signed short numbers. If you transfer data across the network or onto a disk of another computer, you must consider the endianness of your computers, because integers and floating-point numbers could be stored in memory in different byte orders. So you must take this into consideration when you use the 's' format. A short example of how to use it:

my $i16 = pack 's*', 21, 77, 100, 256;

In this example the 's' format is associated with '*', which allows you to use the Perl pack function to pack as many short integers as you have in your list. You can determine the endianness of your system by using this format, as you can see in the example below:

#!/usr/local/bin/perl

use strict;
use warnings;

my $v = unpack("h*", pack("s", 1));
if ($v eq '1000') {
  print "Little endian system\n";
} elsif ($v eq '0010') {
  print "Big endian system\n";
} else {
  print "Unknown endian format\n";
}
print "$v\n";   
# on my Windows system it displays: 1000

On my local Windows computer, after running this code I received the message: 'Little endian system'. The unpack function with the 'h*' format was used to show the packed number as hex digits, low nybble first, so the little-endian bytes 01 00 are displayed as 1000.
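Another way to run the same check, sketched here as an alternative rather than a replacement, is to compare the native 'S' format with the fixed-order 'v' and 'n' formats:

```perl
use strict;
use warnings;

my $native = pack 'S', 1;    # native byte order

# 'v' is always little-endian and 'n' is always big-endian
my $order = $native eq pack('v', 1) ? 'little-endian'
          : $native eq pack('n', 1) ? 'big-endian'
          :                           'unknown';

print "$order\n";
```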

 S  An unsigned short value

The 'S' format is for unsigned short integers; its usage is similar to the 's' format, see above.


 U  A Unicode character number

The 'U' template format of the Perl pack function allows you to pack a Unicode code point number into a character string, which Perl stores internally as UTF-8. Unicode associates characters with integers, and UTF-8 stores each character using only as many bytes as it needs, from one to four. For instance, the next example packs the smiley face character (U+263A):

my $utfSmiley = pack 'U', 0x263A;
print "length of \$utfSmiley = ", length($utfSmiley),
      ", length of 0x263A = ", length(0x263A), "\n";
# it displays: length of $utfSmiley = 1, length of 0x263A = 4

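Note that the two lengths measure different things: length($utfSmiley) counts one character, while length(0x263A) counts the four digits of the decimal number 9786. To get the code point back from the packed string, you can use the unpack function with the same 'U' format.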
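To look at the actual UTF-8 bytes behind the character, the string can be encoded with the built-in utf8::encode function. A minimal sketch; $bytes is a name introduced only for this example:

```perl
use strict;
use warnings;

my $utfSmiley = pack 'U', 0x263A;   # one character, U+263A

my $bytes = $utfSmiley;             # work on a copy
utf8::encode($bytes);               # turn the character into its UTF-8 bytes

print length($utfSmiley), " ", length($bytes), " ",
      unpack('H*', $bytes), "\n";   # 1 3 e298ba
```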

Because of the endianness of a system, integers and floating-point numbers may be stored in a different byte order on different machines, so if you move binary data across the network you can expect to run into format issues. One way to avoid this for non-negative integers is the 'U' format, because UTF-8 is a byte sequence that does not depend on byte order. You can use the Perl pack function to pack a sequence of numbers as UTF-8 encoded characters on one computer and use the unpack function on another. See the following example, where we use the Perl pack function to pack a few integers into a UTF-8 string:

my @integers = (1234, 23, 456, 789);
my $utfIntegers = pack 'U*', @integers;

@integers = unpack 'U*', $utfIntegers;
print "@integers\n";
# it displays: 1234 23 456 789

You can use the 'U' format to encode the Unicode characters of an alphabet. For instance, the Unicode Hebrew block ranges from 0x0590 to 0x05ff. The following example shows you how to pack and unpack the code points of the Hebrew block:

my $utfHebr = pack 'U*', 0x0590..0x05ff;
my @UniHebr = unpack 'U*', $utfHebr;


 v  An unsigned short (16-bit) in "VAX" (little-endian) order

The 'v' format is for 16-bit unsigned short numbers. It is similar to the 'n' format but uses little-endian order. When you need to pack unsigned short numbers in little-endian format regardless of the machine, you should use this format. The next line of code shows you how to use the Perl pack function with it:

my $nr = pack 'v', 3167;

To get back the number, you can use the unpack function.
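The difference from the 'n' format is easy to see in a hex dump. A short sketch with illustrative variable names:

```perl
use strict;
use warnings;

my $little = pack 'v', 3167;           # 3167 is 0x0C5F
my $big    = pack 'n', 3167;

print unpack('H*', $little), "\n";     # 5f0c (low byte first)
print unpack('H*', $big), "\n";        # 0c5f (high byte first)
print unpack('v', $little), "\n";      # 3167
```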

 V  An unsigned long (32-bit) in "VAX" (little-endian) order

The 'V' template format is for unsigned long (32-bit) numbers; its usage is similar to the previous format.

 x  A null byte

You can use the Perl pack function with the 'x' format if you want to pack a null byte. The following example puts a null between the a, b, c characters. The result is stored in the $str variable.

my $str = pack 'CxCxC', 97..99;
print "$str\n";  # the string is "a\0b\0c"
# on a typical terminal it displays: a b c (the null bytes may be invisible)
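To make the null bytes visible, the packed string can be dumped in hex. The sketch below also uses a repeat count on 'x'; $gap is a name made up for the example:

```perl
use strict;
use warnings;

my $str = pack 'CxCxC', 97 .. 99;
print unpack('H*', $str), "\n";    # 6100620063

# a repeat count inserts several null bytes at once
my $gap = pack 'C x3 C', 97, 98;
print unpack('H*', $gap), "\n";    # 6100000062
```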


 X  Back up a byte

The 'X' format of the Perl pack function is used to move one byte backwards in the string. Here’s an example:

my $binaryString = pack ('C4X2', 97..105);
print unpack ('C*',$binaryString), "\n"; 
# it displays: 9798

In this code 97..105 are the decimal values of the a-i ASCII characters; the characters 99 and 100 were removed by 'X2', and the characters 101-105 were not packed at all because there isn’t any specifier for them inside the template. The unpack function shows that only the first two characters remain in the packed string.
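When more template items follow the 'X', they overwrite the bytes that were backed over. A minimal sketch of this, using the same character range:

```perl
use strict;
use warnings;

# pack a, b, c, d; back up two bytes; then pack e, f over c, d
my $str = pack 'C4 X2 C2', 97 .. 102;

print "$str\n";    # abef
```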

Dictionary

  • The terms big-endian and little-endian come from Jonathan Swift's Gulliver's Travels and refer to the order in which the bytes of a multi-byte value are stored in memory. For instance, a word like 0x1234 is stored as (0x34 0x12) on a little-endian machine (least significant byte first) and as (0x12 0x34) on a big-endian machine. The x86 and x86-64 processors that most Windows systems run on are little-endian.
  • A nibble is a single hex digit of four bits (a half byte) and there are two nibbles in a byte.
  • UTF-8 is a variable-length character encoding for Unicode; it encodes each character in 1 to 4 octets, with the single-octet encodings used for the 128 US-ASCII characters (from Wikipedia).

See perldoc perlpacktut for additional information.
