PROGRAMMING
LINUX MAGAZINE Issue 19 • 2002

Perl: THINKING IN LINE NOISE

Perl is a language steeped in the history and evolution of Unix (and by
extension Linux) platforms, so it’s only right that it should have a place
here at Linux Magazine. Dean Wilson and Frank Booth begin our journey with
an overview of Perl and its syntax.

Originally released in 1987, Perl has spread from niche to niche (including
CGI, databases and XML), assimilating buzzwords that stray unwittingly into
its path, such as Object Orientation, Bio-informatics and Aspect Oriented
Programming. For all of the above reasons, Perl is renowned as a
‘glue-language’: it interacts with most popular applications.

The Comprehensive Perl Archive Network (CPAN) repository is one of the
jewels in Perl’s crown (groan). It provides a library of language extension
modules as comprehensive as J2EE and the .NET framework, including a set of
APIs that enables integration with other languages such as C, C++ and Java,
to name but a few.

Perl’s strength has always been its active user community, which created
and maintains sites such as CPAN (http://www.cpan.org), Perl.com
(http://www.perl.com), use.perl (http://use.perl.org), Perl Monks
(http://www.perlmonks.org) and various geographically diverse Perl Mongers
groups.

As well as having sites devoted to it, Perl runs some of the busiest sites
on the Net, including geek-havens Slashdot (http://www.slashdot.org) and
Kuro5hin (http://www.kuro5hin.org). In fact Perl is so widely used on the
Web that it’s often referred to as the duct tape of the Internet.

Perl is a terse but high-level language that removes the burdens of memory
allocation, the distinction between primitive data types, file handling and
the need to constantly reinvent the wheel. It is because Perl allows the
developer such freedom and functionality within so few key-strokes that Perl
has the semi-deserved reputation of resembling line-noise.

Perl’s integrated regular expression handling (a superset of the POSIX
standard), together with the variety of operators provided to manipulate,
describe and access Perl’s data structures, has meant Perl had to spill over
to lesser-used areas of the keyboard or adopt a larger, more esoteric
vocabulary. Of course, big words can sometimes obscure meaning (just take
the last sentence as an example), so more obscure keyboard symbols were
adopted instead. Here’s a snippet of Perl code seen in production software
that illustrates why Perl’s syntax is so easily misunderstood and
consequently decried:

$/=0; $_=<>; tr/A-Z/a-z/;
%_=map{$_,1}/[a-z0-9_.-]+@[a-z0-9._-]{3,67}(?=\W)/g;@_=sort keys%_;

Although it has to be said that Perl needn’t be written like this.

Getting Perl

Any moderately recent and well-stocked Linux distribution will come complete
with an installed Perl interpreter, a full set of Perl core modules and the
standard (and copious!) documentation in pod format.

If your install does not include Perl then there are two paths open to you:
you can either get a binary distribution from your distro’s package
repository or download and compile your own. As this is a beginner’s
tutorial we will cover getting the package and installing it rather than
compiling your own; compiling is more than adequately covered in the INSTALL
file in the root of the source code tarball.

Installing the binary package varies with your Linux distribution but can be
summarised as:

rpm-based
Step 1: Download the package from either your distro’s repository or from one of
the links at http://www.rpmfind.net or http://www.perl.com.
Step 2: As root, issue the ‘rpm -i <perlpackage>’ command.

Debian
Debian saves you the time wasted fetching the package by hand and instead
allows you to get by with the following:
Step 1: apt-get update
Step 2: apt-get install perl

While Debian makes the initial install simpler, for some packages that have
external dependencies you are reliant upon the apt-get mechanism. For
example, modules that use Libmagick or expat (an XML parser) must be
installed via apt-get or will require modification of the source to allow a
successful install.

Scalars

In case you were wondering, the snippet shown earlier finds email addresses
in a file, removes duplicates and sorts them alphabetically. The code is
confusing due to the high frequency of special characters (sigils). The most
common and essential of these in everyday programming is $.

$ denotes a scalar variable. In Perl, scalar variables are used to hold
numbers, text and many more types of data. For example:

$percent = 12.7;      # Assign 12.7 to the variable $percent
$count = 1;           # Assign the value 1 to the variable $count
$name = 'Guido';      # Assign the string 'Guido' to $name
$beast = $name;       # Copy the value of $name to $beast

Below are the most popular methods to alter numeric scalars:

$count = $count + 1;  # $count now equals 2
$count += 1;          # $count now equals 3
$count++;             # $count now equals 4

The first example is probably the simplest to understand: $count is set to
the value of $count + 1. The operator += in the second example is shorthand
for the same operation; the same shorthand exists for the multiply, subtract
and divide operators, amongst others. The final line of code uses the
post-increment operator ++, which adds one to the existing value of $count.
There is also a post-decrement operator -- that subtracts one from $count.

Perl has a rich variety of ways to assign and manipulate strings.

$curry = 'Chicken';       # This sets $curry to: Chicken
$curry = "$curry Phaal";  # This sets $curry to: Chicken Phaal

In these examples the value of $curry is manipulated using string operators.
As with numeric operators the strings are assigned using the equals
operator. In the second example the use of a variable inside double quotes
replaces the variable $curry with its currently assigned value; the official
term for this is “interpolation of the variable”.

$mistake = '$curry';      # This sets $mistake to literally: $curry

Unlike its predecessor, the above example uses single quotes, which prevent
the variable from being interpolated: it returns the literal value within
the quotes.

$word = $curry . ' is ';  # This sets $word to: Chicken Phaal is
$word .= 'tasty';         # This sets $word to: Chicken Phaal is tasty

The dot operator ‘.’ is used to concatenate values together. In these last
two examples the dot operator is used to append strings to $word; in the
latter case using the same philosophy as the += operator. Note that
concatenating single-quoted strings to a variable does not stop the variable
itself from being evaluated: $curry is not wrapped in quotes, so its value
is used.

Perl allows us to use the string operators on numbers (it treats the numbers
purely as characters) and strings as numbers (by taking the numeric part of
the string until the first non-numeric character):

$count = 3;               # Set the value of $count to 3
$order = "$count $curry"; # Set $order to: 3 Chicken Phaal
$count += $order;         # $count = 6

In this example the numeric part of the string (3) is added to the value of
$count; the remaining part of the string $order is ignored, although Perl
will remark on the non-numeric string when run under warnings.

$order = $count . $order; # $order is now: 63 Chicken Phaal

Using concatenation the value of $count is prepended to $order.

Listing the ways

While scalar variables are useful in day-to-day programming, they alone are
not adequate for more complex programs. Every modern language has developed
more complex data types such as arrays and hashes; Perl is no exception.
Perl’s arrays are indexed by integers and dynamically sized: you don’t need
to set a maximum size of an array when you create it, and the array will
resize itself as elements are added and removed.

@Foodgroups = ('curry', 'kebabs', "ice cream");

In the previous example we create an array called Foodgroups and populate it
with three values. Note that the values can be single or double quoted and
that the rules of scalar quoting apply in the assignment. All arrays in Perl
are indicated by the @ character, are indexed by integers and start at 0, so
in the example curry is at position 0 and ice cream is at position 2.
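The array basics used throughout this article can be tried out with a short script. What follows is only a minimal sketch: the lower-case lexical names and the use of push (a core Perl function the article does not cover) are our own assumptions, chosen so the script runs under strict and warnings.

```perl
use strict;
use warnings;

# Lower-case lexical names are this sketch's own, not the article's.
my @foodgroups = ('curry', 'kebabs', "ice cream");

my $first = $foodgroups[0];        # one element is a scalar, hence the $
my $count = scalar(@foodgroups);   # number of elements

push @foodgroups, 'beer';          # the array grows on demand
my $grown = scalar(@foodgroups);

my $beyond = $foodgroups[10];      # no such index: undef comes back
my $after  = scalar(@foodgroups);  # reading did not resize the array

print "first=$first count=$count grown=$grown after=$after\n";
```

Reading past the end leaves the array untouched, which is why $after still matches $grown.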
# Prints "After curry we have ice cream"
print "After $Foodgroups[0] we have $Foodgroups[2]\n";
Notice that in the example’s ‘print’ statement we use the scalar $ sigil
rather than the @ for array; this is because we are accessing a single
scalar at the position given in the square brackets, a value known as a
subscript. If you wish to change a value in an array and you know its
position you can use the same syntax without impacting the rest of the
array. If you try to retrieve a value from an index that does not exist
then undef will be returned and the size of the array will not be changed.
$Foodgroups[2] = 'beer';
# Prints "After curry we have beer"
print "After $Foodgroups[0] we have $Foodgroups[2]\n";
Being able to directly access a value by its index is useful, but in many
cases the programmer needs to work on the start or the end of the array.
Reaching the end of a dynamically sized array is easier than you might
think, using what are known as negative subscripts, which count backwards
from the last element:
print $Foodgroups[-1]; # Prints "beer"
If you try to retrieve a value from a non-existent position using a
negative subscript then the undef value is returned and the size of the
array is not modified. If you try to store a value in a non-existent
negative position the Perl interpreter will generate a fatal error.
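To see both behaviours side by side, here is a small sketch. The variable names and the use of eval to trap the fatal error are our own assumptions for illustration; the article itself does not cover eval.

```perl
use strict;
use warnings;

my @foodgroups = ('curry', 'kebabs', 'beer');

my $last        = $foodgroups[-1];    # last element
my $second_last = $foodgroups[-2];    # one before the last
my $missing     = $foodgroups[-10];   # no such position: undef

# Storing at a non-existent negative position is fatal, so the attempt
# is wrapped in eval to let the script carry on and inspect the error.
my $ok    = eval { $foodgroups[-10] = 'salad'; 1 };
my $error = $ok ? '' : $@;

print "last=$last second_last=$second_last\n";
print "store failed: $error" if $error;
```

The array still holds three elements afterwards, since neither the failed read nor the failed store resized it.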
While working with arrays is comparatively simple, an area many people new
to Perl find confusing is the difference between the length of an array
(its number of elements) and the last position in the array. Because the
last position is a scalar value, we again use the $.
print $#Foodgroups;        # Last position. This prints 2
print scalar(@Foodgroups); # Number of elements. This prints 3
In the second line of the example we introduce a new function, ‘scalar’.
While Perl is often smart enough to convert variables automatically to suit
the current context, in places where the usage is ambiguous and more than
one usage may appear correct we can give the interpreter a helping hand. By
using the ‘scalar’ function we tell Perl to give us the length. If we run
the snippet again without the ‘scalar’ function then we get a completely
different result:
print @Foodgroups;   # This prints 'currykebabsice cream'
print "@Foodgroups"; # This prints 'curry kebabs ice cream'
$" = ' and ';
print "@Foodgroups"; # This prints 'curry and kebabs and ice cream'
In the first line of the example we print the array without telling Perl
which context to use, so it picks the most obvious one (to itself), list
context, and prints all of the array’s values one after another. The second
line of code wraps the array in double quotes and the values are printed
out in a more readable form. The spaces that are emitted from seemingly
nowhere are dictated by another of Perl’s implicit predefined variables,
$", or the “List Separator” as it’s known in Perl parlance. If you set this
variable directly, as we do in the third line, and then reprint the array
in a double quoted string, each element of the array is printed with the
separator between them.
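The difference between last position, length and the two ways of printing can be checked in one go. This is a sketch under our own naming; it also uses local inside a bare block, an assumption on our part, so that the change to the list separator does not leak into the rest of a program.

```perl
use strict;
use warnings;

my @foodgroups = ('curry', 'kebabs', 'ice cream');

my $last_index = $#foodgroups;         # last position
my $length     = scalar(@foodgroups);  # number of elements

my $default_join = "@foodgroups";      # elements joined by $" (a space)

my $custom_join;
{
    local $" = ' and ';                # change the separator in this block only
    $custom_join = "@foodgroups";
}

print "$last_index $length\n";         # prints: 2 3
print "$default_join\n";               # prints: curry kebabs ice cream
print "$custom_join\n";                # prints: curry and kebabs and ice cream
```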
As arrays are collections of values it is often
desirable to iterate through an array, repeating an
operation for each element. There are two simple
ways of doing this and the first way illustrates one of
the places where inexperienced Perl programmers can
confuse array position and array length. Given below
are four small for loops, two that are valid and do as
expected and two that do not. See if you can pick
out which are which:
for ($i=0; $i < @Foodgroups; $i++) {
    print "$Foodgroups[$i]\n";
}
for ($i=0; $i <= $#Foodgroups; $i++) {
    print "$Foodgroups[$i]\n";
}
for ($i=0; $i <= @Foodgroups; $i++) {
    print "$Foodgroups[$i]\n";
}
for ($i=0; $i < $#Foodgroups; $i++) {
    print "$Foodgroups[$i]\n";
}
The first two examples are both valid: they will iterate through the array,
incrementing $i on each pass, so that each indexed value will be printed
once.
The final two examples are both incorrect. The third loop executes the loop
body once too often (if there are three things in @Foodgroups, the loop
still executes when $i is 3, which is incorrect as it’s not a valid
position). The final loop executes the body of the loop one time too few
(if the final element is at position 2, the loop stops after executing the
body of the loop with $i set to 1).
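One way to convince yourself which loop conditions are off by one is to count the passes rather than print the elements. The sketch below does that for a three-element array; the %passes tally and its labels are hypothetical names of our own.

```perl
use strict;
use warnings;

my @foodgroups = ('curry', 'kebabs', 'ice cream');
my %passes;   # loop condition => number of times the body ran

for (my $i = 0; $i <  @foodgroups;  $i++) { $passes{'less than length'}++ }
for (my $i = 0; $i <= $#foodgroups; $i++) { $passes{'up to last index'}++ }
for (my $i = 0; $i <= @foodgroups;  $i++) { $passes{'up to length'}++ }
for (my $i = 0; $i <  $#foodgroups; $i++) { $passes{'less than last index'}++ }

for my $condition (sort keys %passes) {
    print "$condition: $passes{$condition}\n";
}
```

With three elements, the two correct conditions give three passes, ‘up to length’ gives four and ‘less than last index’ gives two.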
It is common to use either a for loop (shown above) or a foreach loop to
operate on every item in an array without knowing anything about the array
other than its existence. The most visible difference between the two is
that foreach loops use an alias for the value rather than storing an index.
This is useful when it’s unnecessary to know the index positions:
foreach $Food (@Foodgroups) {
    print "$Food is bad for you\n";
}
Or, if you want to make your code a little more
implicit and you call a number of functions in the
loop that use $_ as their default variable you can
execute the loop without an alias yet still have it
process the values:
foreach (@Foodgroups) {
    print "$_ is bad for you";
    print length, "\n";
}
The above foreach loop will print out the message with each value and then
print out the length of each value. This is possible because, in the
absence of an argument, length operates on $_, and print then prints the
value that length returns.
Perl’s ability to use implicit values is both one of its benefits and one
of its banes, depending on how sensibly it’s used. The for and foreach
loops are almost identical in functionality and can be used interchangeably
(Perl in fact treats the two keywords as synonyms); you should use the
version that is easier to read in your code.
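Because the foreach loop variable is an alias rather than a copy, assigning to it changes the array itself. The following sketch, with names of our own choosing and the core function uc to upper-case a string, shows that alongside the implicit $_ form.

```perl
use strict;
use warnings;

my @foodgroups = ('curry', 'kebabs', 'ice cream');

# $food is an alias for each element in turn, so this rewrites the array.
foreach my $food (@foodgroups) {
    $food = uc $food;
}

# With no loop variable named, each element is aliased to $_ instead,
# and length falls back on $_ when given no argument.
my @lengths;
foreach (@foodgroups) {
    push @lengths, length;
}

print "@foodgroups\n";   # prints: CURRY KEBABS ICE CREAM
print "@lengths\n";      # prints: 5 6 9
```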
Hashes – associative arrays for lazy typists
An associative array is a data structure that is accessed by a string value
called a “key” rather than by a numeric index, as seen in arrays. In Perl,
associative arrays are used so frequently they’re called “hashes”, which is
easier to say.
The % symbol denotes that a variable is a hash. As with arrays, hashes
utilise brackets to access individual elements; for hashes, curly braces
are used. To assign a value to a hash we need to specify both the key and
the value:
$hash{'key'} = 10; # %hash now has the key 'key' with the value 10
Again the $ prefix is used when accessing an element of the data structure
because the element will be a scalar variable. So when accessing an
element, the only thing differentiating the hash from a normal array is the
shape of the brackets: for a hash, curly braces encapsulate the key.
The following example illustrates how those
brackets alter the semantics of the entire line:
%a;                  # An associative array
@a;                  # A traditional array
$num = 3;            # A numeric scalar value
# Assign a value to each
$a[$num] = 'Array';  # Puts "Array" in the 4th element of @a
$a{$num} = 'Hash';   # Associates "Hash" with the key 3 in the hash %a
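That @a and %a really are separate variables despite sharing a name can be confirmed with a short sketch. It declares both with my so that it runs under strict, which is our own addition to the article’s example.

```perl
use strict;
use warnings;

my (@a, %a);   # two unrelated variables that happen to share a name
my $num = 3;

$a[$num] = 'Array';   # square brackets: element 3 of the array @a
$a{$num} = 'Hash';    # curly braces: key 3 of the hash %a

my $array_size = scalar(@a);        # positions 0 to 3 now exist
my $hash_size  = scalar(keys %a);   # only one key was stored

print "array holds $array_size elements, hash holds $hash_size key\n";
```

The array has grown to four elements (the first three undefined) while the hash holds a single key.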
The keys in a hash are unique, so if a value is assigned to an existing
key, the previous value will be overwritten and lost. At first this seems
to be a disadvantage, but it’s one of Perl’s most heavily exploited
features, as we will discuss later.
$hash{six} = 6; # Sets the value of the key 'six' to 6
$hash{six} = 9; # Sets the value of the key 'six' to 9, no longer 6
Initialising a hash is similar to the methods used for
arrays. A hash can be initialised with a full
complement of keys and values. Hashes utilise array
and list operators but the manner in which the data
is manipulated is subtly different.
%numbers = ('one', 1, 'two', 2, 'three', 3);
This expression assigns the following keys and values to the hash:
key    value
one    1
two    2
three  3
The hash knows to pick the first element as a key and the next as a value:
elements are read from the list and alternately given the role of key or
value. Perl will complain (when run under warnings) if the list contains an
odd number of elements. However, the last key will still be included and
its value will be undef.
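The odd-length case can be observed directly. In this sketch the warning is captured with a $SIG{__WARN__} handler, a mechanism the article does not cover and which we assume here purely so that the complaint can be examined rather than just scrolling past on the terminal.

```perl
use strict;
use warnings;

# Collect warnings in an array instead of letting them go to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# Five elements: 'three' has no partner to act as its value.
my %numbers = ('one', 1, 'two', 2, 'three');

my $has_three   = exists $numbers{three}  ? 'yes' : 'no';
my $three_value = defined $numbers{three} ? 'defined' : 'undef';

print "key present: $has_three, value: $three_value\n";
print "Perl said: $warnings[0]" if @warnings;
```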
The => operator is used to improve the legibility of list assignments when
initialising a hash. It allows us to quickly differentiate the keys and
values within the list: the item to the left of => is the key of the
element and the item to the right is the value. Using the => operator also
means that the key needn’t be wrapped in quotes if it’s a single word.
%hash = ( six => 6, seven => 7, ten => 10);
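Since => is essentially a comma that also quotes a single word to its left, the two styles of initialisation should build identical hashes. A minimal sketch of that check, using hash names of our own invention:

```perl
use strict;
use warnings;

my %with_commas = ('six', 6, 'seven', 7, 'ten', 10);
my %with_arrows = (six => 6, seven => 7, ten => 10);

# Start from "same number of keys", then compare key by key.
my $same = keys(%with_commas) == keys(%with_arrows) ? 1 : 0;
for my $key (keys %with_commas) {
    $same = 0 unless exists $with_arrows{$key}
                  && $with_arrows{$key} == $with_commas{$key};
}

print $same ? "identical\n" : "different\n";
```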
Hashes have a few functions of their own as well as borrowing many of the
list functions. The most popular are:
keys    returns the keys found in a hash as a list
values  returns the values of the hash as a list
Each of these functions returns a list, which will be in seemingly random
order. If order is needed it must be imposed using the function ‘sort’.
%a = (ten => 10, nine => 9, eight => 8, seven => 7);
@b = keys %a;   # Places keys in array
@c = values %a; # Places values in array
The order of elements in @b (the array of key elements) may be: nine,
seven, eight and ten. The order of elements in @c will then be: 9, 7, 8,
10. Whatever the order turns out to be, keys and values return their
elements in the same order as each other, so each key in @b lines up with
its value at the same position in @c.
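Both claims, that keys and values line up and that sort imposes an order, can be checked with a few lines. The names below are our own, and sorting by value with a comparison block is an extra the article has not introduced, so treat this as a sketch rather than the authors’ method.

```perl
use strict;
use warnings;

my %number = (ten => 10, nine => 9, eight => 8, seven => 7);
my @names  = keys %number;
my @values = values %number;

# Count the positions at which the key and value lists agree.
my $aligned = grep { $number{ $names[$_] } == $values[$_] } 0 .. $#names;

my @by_name  = sort keys %number;                               # alphabetical
my @by_value = sort { $number{$a} <=> $number{$b} } keys %number;  # numeric

print "aligned at $aligned of ", scalar(@names), " positions\n";
print "@by_name\n";    # prints: eight nine seven ten
print "@by_value\n";   # prints: seven eight nine ten
```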
The functions keys and values are frequently used
to traverse the contents of a hash; there are several
methods of accessing every element of a hash:
for my $key (keys %a) {
    $value = $a{$key};
    print "$key => $value\n";
}
This is probably the most common way of accessing all the elements in a
hash. It uses a for loop in the same manner as we would to access all the
elements of an array.
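The article mentions several traversal methods but shows only one. As an assumption about what else might have been meant, the sketch below pairs the keys loop (sorted, for a stable order) with Perl’s built-in each function, which hands back one key and value pair per call.

```perl
use strict;
use warnings;

my %number = (ten => 10, nine => 9, eight => 8, seven => 7);

# Method 1: loop over the (sorted) keys and look each value up.
my @lines;
for my $key (sort keys %number) {
    push @lines, "$key => $number{$key}";
}

# Method 2: each returns a (key, value) pair until the hash is exhausted.
my $total = 0;
while (my ($key, $value) = each %number) {
    $total += $value;
}

print "$_\n" for @lines;
print "total: $total\n";   # prints: total: 34
```

The keys loop is the easier of the two to combine with sort; each is handy when the order does not matter.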
There are two more functions used with hashes: delete and exists. Both can
also be used with arrays, but are more commonly seen in code relating to
hashes.
The delete function removes elements from
hashes. Removing an element with the key ‘fred’ is
expressed in the following way:
delete $passwd{'fred'};
exists is a function that, given a key in a hash, will
return true if that element is present. To make full
use of the exists function we need to use it with a
conditional operator. In the example we use if. When
running under warnings, it is prudent to use exists
before calling an element of a hash if it’s doubtful
that the key is present.
%passwd = (fred => 'xSSx13A0Oav', root => 'root');
if ( exists($passwd{$user}) ) {
    print "Success, that user was found\n";
}
else {
    print "Sorry, that user was not found\n";
}
if tests the return value from the exists function; if the hash element
does exist then if will run the code wrapped in the curly braces that
follow it. When the test evaluates to false, the ‘if’ statement disregards
the first set of curly braces and executes the contents of the curly braces
following else, if an else is present.
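It is worth seeing why exists is the right test for presence: a key can be present while its value is undefined. In this sketch the guest entry is a hypothetical addition of ours, and defined is a core function the article has not yet introduced.

```perl
use strict;
use warnings;

# 'guest' is a made-up entry: the key is present but its value is undef.
my %passwd = (fred => 'xSSx13A0Oav', root => 'root', guest => undef);

my $guest_exists  = exists  $passwd{guest}  ? 1 : 0;   # key is there
my $guest_defined = defined $passwd{guest}  ? 1 : 0;   # value is not
my $nobody_exists = exists  $passwd{nobody} ? 1 : 0;   # no such key

# delete removes the element and hands back the value it held.
my $removed     = delete $passwd{fred};
my $fred_exists = exists $passwd{fred} ? 1 : 0;

print "guest: exists=$guest_exists defined=$guest_defined\n";
print "nobody: exists=$nobody_exists\n";
print "removed '$removed', fred exists=$fred_exists\n";
```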
Here are some simple examples of the common
uses for hashes in Perl:
● Creating a look-up table of values to substitute:
# The keys are quoted: unquoted, 10.3.1.0 would be read as a version string
%dns = ('10.3.1.0' => 'firewall', '10.3.2.0' => 'email',
        '10.3.0.1' => "bob's machine");
print "$dns{$ip_address}\n";
● Removing duplicates from a structure by exploiting
a hash’s use of unique keys:
@a = (1,2,1,2,4,6,7,2,1,10,6,7,8,8); # Initialise the array
%a = map { $_ => 1 } @a;             # Make a hash where the keys are the elements of @a
@a = keys %a;                        # Reassign @a so that it contains unique values
At the beginning of this example @a contains
(1,2,1,2,4,6,7,2,1,10,6,7,8,8). After being filtered through the hash, @a
contains only the unique values, for example (7,8,1,2,10,4,6); the order in
which keys returns them is arbitrary.
Focusing on line 2 of the above example, map is used to create a key-value
pair for the hash from each element of @a. The value being assigned is
irrelevant. The only important function occurring is taking place
implicitly: for keys that are already in existence the value will be
overwritten (by an identical value), since keys in a hash are unique.
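Both uses can be sketched in a form that runs under strict and warnings. The host_for helper, the %seen hash and the fallback text are hypothetical names of our own; the grep idiom shown is a common variation on the article’s technique that keeps the first occurrence of each value in its original order, not the authors’ own code.

```perl
use strict;
use warnings;

# Look-up table: quoted keys, with a fallback for unknown addresses.
my %dns = (
    '10.3.1.0' => 'firewall',
    '10.3.2.0' => 'email',
    '10.3.0.1' => "bob's machine",
);

sub host_for {
    my ($ip) = @_;
    return exists $dns{$ip} ? $dns{$ip} : 'unknown host';
}

# Removing duplicates: a key is only "new" the first time it is seen.
my @numbers = (1,2,1,2,4,6,7,2,1,10,6,7,8,8);
my %seen;
my @unique = grep { !$seen{$_}++ } @numbers;      # original order kept
my @sorted = sort { $a <=> $b } keys %seen;       # numeric order imposed

print host_for('10.3.2.0'), "\n";   # prints: email
print host_for('10.9.9.9'), "\n";   # prints: unknown host
print "@unique\n";                  # prints: 1 2 4 6 7 10 8
print "@sorted\n";                  # prints: 1 2 4 6 7 8 10
```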
Perl documentation
Perl has a wealth of documentation which comes with the standard distribution.
It covers every aspect of the Perl language and is viewed using your computer’s
default pager program. Perldoc pages resemble manpages, citing examples of
use and pertinent advice.
There are many parts to the Perl documentation. To list the categories, type
the following command at the shell prompt:
perldoc perl
The page displayed for the previous example has two columns; the left column
lists the mnemonic titles and the right column a description of the topic.
perlsyn   Perl syntax
perldata  Perl data structures
perlop    Perl operators and precedence
perlsub   Perl subroutines
To invoke documentation for a subject, simply type perldoc and the mnemonic
for that topic on the command line. The example below will display the
documentation for “Perl Syntax”.
perldoc perlsyn
A further use of perldoc is to read the usage for any of Perl’s functions;
this is done by calling perldoc with the -f option and the function name as
an argument. The following example will display the documentation for the
function map.
perldoc -f map
Perldoc also provides quick access to frequently asked questions about Perl:
the -q option searches the FAQ for the keyword given as an argument.
perldoc -q punctuation

More Related Content

PDF
newperl5
PPTX
Scalar expressions and control structures in perl
PPT
Perl 101 - The Basics of Perl Programming
PDF
Tutorial perl programming basic eng ver
PDF
perltut
PDF
Perl programming language
PPTX
PPT
Perl Basics with Examples
newperl5
Scalar expressions and control structures in perl
Perl 101 - The Basics of Perl Programming
Tutorial perl programming basic eng ver
perltut
Perl programming language
Perl Basics with Examples

What's hot (16)

ODP
Perl Introduction
PDF
Perl_Part5
PDF
Perl Programming - 02 Regular Expression
PPT
Perl Presentation
PPT
Perl tutorial
PPTX
Subroutines in perl
PDF
Perl_Part6
PPT
Regular expressions in Perl
PPTX
Strings,patterns and regular expressions in perl
ODP
Introduction to Modern Perl
PPTX
Unit 1-introduction to perl
PPTX
Perl names values and variables
PDF
Python regular expressions
PPTX
Unit 1-scalar expressions and control structures
PPTX
Lecture3 php by okello erick
PPTX
LISP:Symbols and packages in lisp
Perl Introduction
Perl_Part5
Perl Programming - 02 Regular Expression
Perl Presentation
Perl tutorial
Subroutines in perl
Perl_Part6
Regular expressions in Perl
Strings,patterns and regular expressions in perl
Introduction to Modern Perl
Unit 1-introduction to perl
Perl names values and variables
Python regular expressions
Unit 1-scalar expressions and control structures
Lecture3 php by okello erick
LISP:Symbols and packages in lisp
Ad

Viewers also liked (16)

PPTX
Cambridge BizServe Corporate Overview
PPTX
лях
PDF
CTE- Express computer , July 2015
PDF
Cambridge Technology posts 3 crore profit in Q1.
PPT
Cтенд 2016
PDF
The Ordinary Millionaire Ian Marais
PPTX
Week van de veiligheid - veilig werken in en om de winkel
PDF
Final Draft (Final Print)
DOCX
Factores que afectan la distribucion de planta
PDF
PPTX
Question 3 of media evaluation
PPT
Ativ 1 conheça me
DOCX
Anit No Need Foe Tha Price Of LORD.Pt.1.html.doc
DOCX
Extreme coarse of action.Pt.2.Pic.doc
DOCX
Insurance Optimization
Cambridge BizServe Corporate Overview
лях
CTE- Express computer , July 2015
Cambridge Technology posts 3 crore profit in Q1.
Cтенд 2016
The Ordinary Millionaire Ian Marais
Week van de veiligheid - veilig werken in en om de winkel
Final Draft (Final Print)
Factores que afectan la distribucion de planta
Question 3 of media evaluation
Ativ 1 conheça me
Anit No Need Foe Tha Price Of LORD.Pt.1.html.doc
Extreme coarse of action.Pt.2.Pic.doc
Insurance Optimization
Ad

Similar to Perl_Part1 (20)

PDF
perltut
PPTX
Perl slid
PDF
Perl_Part4
PDF
newperl5
PPTX
Unit 1-subroutines in perl
ODP
Beginning Perl
PPTX
The Bund language
DOCX
What is the general format for a Try-Catch block Assume that amt l .docx
PPTX
python and perl
PDF
Introduction to PERL Programming - Complete Notes
PPT
PERL - complete_guide_references (1).ppt
PPT
PERL - complete_Training_Modules_Ref.ppt
PPT
Introduction to perl scripting______.ppt
PDF
Perl_Part7
PDF
Perl intro
PPT
Plunging Into Perl While Avoiding the Deep End (mostly)
ODP
Php Learning show
PPT
P H P Part I, By Kian
PPT
Introduction to perl_ a scripting language
PPTX
Shell scripting
perltut
Perl slid
Perl_Part4
newperl5
Unit 1-subroutines in perl
Beginning Perl
The Bund language
What is the general format for a Try-Catch block Assume that amt l .docx
python and perl
Introduction to PERL Programming - Complete Notes
PERL - complete_guide_references (1).ppt
PERL - complete_Training_Modules_Ref.ppt
Introduction to perl scripting______.ppt
Perl_Part7
Perl intro
Plunging Into Perl While Avoiding the Deep End (mostly)
Php Learning show
P H P Part I, By Kian
Introduction to perl_ a scripting language
Shell scripting

Perl_Part1

  • 1. PROGRAMMING 55LINUX MAGAZINEIssue 19 • 2002 O riginally released in 1987, Perl has spread from niche to niche (including CGI, Databases and XML), assimilating buzzwords that stray unwittingly into its path, such as Object Orientation, Bio-informatics and Aspect Oriented Programming). For all of the above reasons, Perl is renowned as a ‘glue-language’: it interacts with most popular applications. The Comprehensive Perl Archive Network (CPAN) repository is one of the jewels in Perl’s crown (groan). It provides a library of language extension modules as comprehensive as J2EE and the .NET framework, providing a set of APIs that enables integration with other languages including C, C++ and Java, to name but a few. Perl’s strength has always been its active user community, which created and maintained sites such as CPAN (http://www.cpan. org), Perl.com (http:// www.perl.com), use.perl (http://use.perl.org), Perl Monks (http://www.perlmonks.org) and various geographically diverse Perl Mongers groups. As well as having sites devoted to it, Perl runs some of the busiest sites on the Net, including geek- havens Slashdot (http://www.slashdot.org) and Kuro5hin (http://www.kuro5hin.org). In fact Perl is so widely used on the Web that it’s often referred to as the duct tape of the Internet. Perl is a terse but high-level language that removes the burdens of memory allocation, the distinction between primitive data types, file handling and the need to constantly reinvent the wheel. It is because Perl allows the developer such freedom and functionality within so few key-strokes that Perl has the semi-deserved reputation of resembling line-noise. 
Perl’s integrated regular expression handling (a super-set of the POSIX standard) – the variety of operators provided to manipulate, describe and access Perl’s data-structures – has meant Perl had to spill over to lesser-used areas of the keyboard or adopt a larger, more esoteric vocabulary. Of course, big words can sometimes obscure meaning – just take the last sentence as an example – so more obscure keyboard symbols were instead adopted. Here’s a snippet of Perl code seen in production Perl THINKING IN LINE NOISE Perl is a language steeped in the history and evolution of Unix (and by extension Linux) platforms, so it’s only right that it should have a place here at Linux Magazine. Dean Wilson and Frank Booth begin our journey with an overview of Perl and its syntax Getting Perl Any moderately recent and well-stocked Linux distribution will come complete with an installed Perl interpreter, a full set of Perl core modules and the standard (and copious!) documentation in pod format. If your install does not include Perl then there are two paths open to you; you can either get a binary distribution from your distro’s package repository or download and compile your own. As this is a beginner’s tutorial we will cover getting the package and installing it rather than compiling your own; this topic is more than adequately covered in the INSTALL file in the root of the source code tarball. Installing the binary package varies more upon your Linux distribution but can be summarised as: rpm-based Step 1: Download the package from either your distro’s repository or from one of the links at http://www.rpmfind.net or http://www.perl.com. Step 2: As root, issue the ‘rpm -i <perlpackage>’ command. Debian Debian saves you the wasted time fetching the package by hand and instead allows you to get by with the following: Step 1: apt-get update. Step 2: apt-get install perl. 
While Debian makes the initial install simpler; for some packages that have external dependencies you are reliant upon the apt-get mechanism, as an example modules that use Libmagick or expat (an XML parser) must be installed via apt-get or will require modification of the source to allow a successful install.
software that illustrates why Perl's syntax is so easily misunderstood and consequently decried:

    $/=0; $_=<>; tr/A-Z/a-z/;
    %_=map{$_,1}/[a-z0-9_.-]+@[a-z0-9._-]{3,67}(?=\W)/g;@_=sort keys%_;

Although it has to be said that Perl needn't be written like this.

Scalars

In case you were wondering, the previous example finds email addresses in a file, removes duplicates and sorts them alphabetically. The code is confusing due to the high frequency of special characters (sigils). The most common and essential of these in everyday programming is $, which denotes a scalar variable. In Perl, scalar variables are used to hold numbers, text and many more types of data. For example:

    $percent = 12.7;     # Assign 12.7 to the variable $percent
    $count = 1;          # Assign the value 1 to the variable $count
    $name = 'Guido';     # Assign the string 'Guido' to $name
    $beast = $name;      # Copy the value of $name to $beast

Below are the most popular methods to alter numeric scalars:

    $count = $count + 1; # count now equals 2
    $count += 1;         # count now equals 3
    $count++;            # count now equals 4

The first example is probably the simplest to understand: $count is set to the value of $count + 1. The operator += in the second example is shorthand for the same operation; equivalent shorthands exist for multiplication, subtraction and division, amongst others. The final line of code uses the post-increment operator ++, which adds one to the existing value of $count. There is also a post-decrement operator -- that subtracts one from $count.

Perl has a rich variety of ways to assign and manipulate strings.

    $curry = 'Chicken';        # This sets $curry to: Chicken
    $curry = "$curry Phaal";   # This sets $curry to: Chicken Phaal

In these examples the value of $curry is manipulated using string operators. As with numeric operators, the strings are assigned using the equals operator. In the second example the use of a variable inside double quotes replaces the variable $curry with its currently assigned value; the official term for this is "interpolation of the variable".

    $mistake = '$curry';       # This sets $mistake to literally: $curry

Unlike its predecessor, the above example uses single quotes, which prevent the variable from being interpolated: it returns the literal value within the quotes.

    $word = $curry . ' is ';   # This sets $word to: Chicken Phaal is
    $word .= 'tasty';          # This sets $word to: Chicken Phaal is tasty

The dot operator '.' is used to concatenate values together. In these last two examples the dot operator is used to append strings to $word; in the latter case using the same philosophy as the += operator. Note that concatenating a single-quoted string to a variable does not stop the variable from being evaluated: $curry sits outside the quotes, so its value is used.

Perl allows us to use the string operators on numbers (it treats the numbers purely as characters) and strings as numbers (by taking the numeric part of the string until the first non-numeric character):

    $count = 3;                # Set the value of $count to 3
    $order = "$count $curry";  # Set $order to: 3 Chicken Phaal
    $count += $order;          # $count = 6

In this example the numeric part of the string (3) is added to the value of $count; the remaining part of the string $order is ignored.

    $order = $count . $order;  # $order is now: 63 Chicken Phaal

Using concatenation the value of $count is prepended to $order.

Listing the ways

While scalar variables are useful in day-to-day programming, they alone are not adequate for more complex programs. Every modern language has developed more complex data types such as arrays and hashes; Perl is no exception. Perl's arrays are indexed by integers and dynamically sized: you don't need to set a maximum size of an array when you create it, and the array will resize itself as elements are added and removed.

    @Foodgroups = ('curry', 'kebabs', "ice cream");

In the previous example we create an array called Foodgroups and populate it with three values. Note that the values can be single or double quoted and that the rules of scalar quoting apply in the assignment. All arrays in Perl are indicated by the @ character, are indexed by integers and start at 0, so in the example curry is at position 0 and ice cream is at position 2.
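As a rough sketch of how the quoting rules for scalars carry over to values stored in an array, the snippet below builds a small list from the same $curry value. The array name @menu and the "with rice" strings are made up for illustration and do not come from the article.

```perl
use strict;
use warnings;

my $curry = 'Chicken Phaal';

# @menu is a hypothetical array used only for this illustration.
my @menu = (
    $curry,               # the current value of $curry
    "$curry with rice",   # double quotes: $curry is interpolated
    '$curry with rice',   # single quotes: the text is stored literally
);

print "$menu[0]\n";   # Chicken Phaal
print "$menu[1]\n";   # Chicken Phaal with rice
print "$menu[2]\n";   # $curry with rice
```

Interpolation happens once, when the quoted string is evaluated; printing the third element later does not expand the stored text again.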
    # Prints "After curry we have ice cream"
    print "After $Foodgroups[0] we have $Foodgroups[2]\n";

Notice that in the example's 'print' statement we use the scalar $ sigil rather than the @ for array; this is because we are accessing a single scalar, at the position given by the value in the square brackets (called a subscript). If you wish to change a value in an array and you know its position, you can use the same syntax without affecting the rest of the array. If you try to retrieve a value from an index that does not exist then undef will be returned and the size of the array will not be changed.

    $Foodgroups[2] = 'beer';
    # Prints "After curry we have beer"
    print "After $Foodgroups[0] we have $Foodgroups[2]\n";

While being able to directly access a value by its index is useful, in many cases the programmer needs to work on the start or the end of the array. Reaching the end of a dynamically sized array without first working out its length is easier than you might think, using what are known as negative subscripts:

    print $Foodgroups[-1];        # Prints "beer"

If you try to retrieve a value from a non-existent position using a negative subscript then the undef value is returned and the size of the array is not modified. If you try to store a value in a non-existent negative position the Perl interpreter will generate a fatal error.

While working with arrays is comparatively simple, an area many people new to Perl find confusing is the difference between the length of an array (its number of elements) and the last position in the array. Because the last position is a scalar value, again we use the $.

    print $#Foodgroups;           # Last position. This prints 2
    print scalar(@Foodgroups);    # Number of elements. This prints 3

In the second line of the example we introduce a new function, 'scalar'.
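The points above about the last position, the element count and out-of-range reads can be sketched in a few lines. This is a minimal illustration; the variable names other than @Foodgroups are invented here, and the index 10 is simply an arbitrary position past the end.

```perl
use strict;
use warnings;

my @Foodgroups = ('curry', 'kebabs', 'beer');

my $last_index = $#Foodgroups;         # last position: 2
my $count      = scalar(@Foodgroups);  # number of elements: 3
my $last       = $Foodgroups[-1];      # negative subscript counts from the end: 'beer'
my $missing    = $Foodgroups[10];      # reading past the end yields undef

print "last index $last_index, count $count, last element $last\n";
print "position 10 is ", (defined $missing ? 'defined' : 'undef'), "\n";
print "the array still holds ", scalar(@Foodgroups), " elements\n";
```

Reading a position that does not exist leaves the array at three elements; only storing to a position beyond the end would make it grow.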
While Perl is often smart enough to do automatic conversion of variables to suit the current context, in places where the usage is ambiguous and more than one usage may appear correct we can give the interpreter a helping hand. By using the 'scalar' function we tell Perl to give us the length; if we print the array again without the 'scalar' function then we get a completely different result:

    print @Foodgroups;      # This prints 'currykebabsice cream'
    print "@Foodgroups";    # This prints 'curry kebabs ice cream'
    $" = ' and ';
    print "@Foodgroups";    # This prints 'curry and kebabs and ice cream'

In the first line of the example we print the array without telling Perl a context, so it picks the most obvious one (to itself) and prints all of the array's values run together. The second line of code wraps the array in double quotes and the values are printed out in a more readable form. The spaces that are emitted from seemingly nowhere are dictated by another of Perl's predefined variables, $" or the "List Separator" as it's known in Perl parlance. If you set this variable directly, as we do in the third line, and then reprint the array in a double-quoted string, each element of the array is printed with the separator between them.

As arrays are collections of values it is often desirable to iterate through an array, repeating an operation for each element. There are two simple ways of doing this, and the first way illustrates one of the places where inexperienced Perl programmers can confuse array position and array length. Given below are four small for loops, two that are valid and do as expected and two that do not.
See if you can pick out which are which:

    for ($i=0; $i < @Foodgroups; $i++) { print "$Foodgroups[$i]\n"; }
    for ($i=0; $i <= $#Foodgroups; $i++) { print "$Foodgroups[$i]\n"; }
    for ($i=0; $i <= @Foodgroups; $i++) { print "$Foodgroups[$i]\n"; }
    for ($i=0; $i < $#Foodgroups; $i++) { print "$Foodgroups[$i]\n"; }

The first two examples are both valid; they will iterate through the array, incrementing $i on each pass, so that each indexed value will be printed once. The final two examples are both incorrect. The third loop executes its body once too often (if there are three things in @Foodgroups, the loop executes when $i is 3, which is incorrect as it's not a valid position). The final loop executes its body one time too few (if the final element is at position 2, the loop stops after executing the body with $i set to 1).

It is common to use either a for loop (shown above) or a foreach loop to operate on every item in an array without knowing anything about the array other than its existence. The most visible difference between the two is that foreach loops use an alias for the value rather than storing an index. This is useful when it's unnecessary to know the index positions:

    foreach $Food (@Foodgroups) {
        print "$Food is bad for you\n";
    }

Or, if you want to make your code a little more implicit and you call a number of functions in the loop that use $_ as their default variable, you can execute the loop without a named alias yet still have it process the values:

    foreach (@Foodgroups) {
        print "$_ is bad for you";
        print length, "\n";
    }

The above foreach loop will print out the message with each value and then print out the length of each value. This is possible because, in the absence of an argument, length operates on $_, and print then prints the value that length returns. Perl's ability to use implicit values is both one of its benefits and banes, depending on how sensibly it's used.

The for and foreach loops are almost identical in functionality and can be used interchangeably; you should use the version that is easier to read in your code.

Hashes: associative arrays for lazy typists

An associative array is a data structure that is accessed by a string value called a "key" rather than by an index, as seen in arrays. In Perl, associative arrays are used so frequently they're called "hashes", which is easier to say. The % symbol denotes that a variable is a hash. As with arrays, hashes use brackets to access individual elements; for hashes, curly braces are used. To assign a value to a hash we need to specify both the key and the value:

    $hash{'key'} = 10;    # %hash now has a key with a value 10

Again the $ prefix is used when accessing an element of the data structure, because the element will be a scalar variable. So the only thing differentiating the hash element from a normal array element is the shape of the brackets. For a hash the curly braces encapsulate the key.
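A minimal sketch of storing, updating and reading hash elements by key follows. The hash %stock, its keys and its numbers are invented for this illustration.

```perl
use strict;
use warnings;

# %stock and its contents are hypothetical, used only to show the syntax.
my %stock;
$stock{'curry'}  = 10;   # create the key 'curry' with the value 10
$stock{'kebabs'} = 4;    # create a second key
$stock{curry}   += 5;    # a single-word key may be left unquoted inside the braces

my $curries = $stock{'curry'};   # read one element back as a scalar
print "We have $curries curries and $stock{'kebabs'} kebabs\n";
```

Each access uses the $ sigil because a single element is a scalar, even though the container itself is written %stock.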
The following example illustrates how those brackets alter the semantics of the entire line:

    %a;                   # An associative array
    @a;                   # A traditional array
    $num = 3;             # A numeric scalar value

    # Assign a value to each
    $a[$num] = 'Array';   # Puts "Array" in the 4th element of @a
    $a{$num} = 'Hash';    # Associates "Hash" with the key 3 in the hash %a

The keys in a hash are unique, so if a value is assigned to an existing key, the previous value will be overwritten and lost. At first this seems to be a disadvantage, but it's one of Perl's most heavily exploited features, as we will discuss later.

    $hash{six} = 6;       # Sets the value of the key 'six' to 6
    $hash{six} = 9;       # Sets the value of the key 'six' to 9, no longer 6

Initialising a hash is similar to the methods used for arrays. A hash can be initialised with a full complement of keys and values. Hashes use array and list operators, but the manner in which the data is manipulated is subtly different.

    %numbers = ('one', 1, 'two', 2, 'three', 3);

This expression assigns the following keys and values to the hash:

    key      value
    one      1
    two      2
    three    3

The hash knows to pick the first element as a key and the next as a value. Elements are read from the list and alternately given the role of key or value. Perl will complain (when run under warnings) if the list contains an odd number of elements. However, the last key will still be included and its value will be undef.

The => operator is used to improve the legibility of list assignments when initialising a hash. It allows us to quickly differentiate the keys and values within the list. The item to the left of => is the key of the element; the item to the right is the value. Using the => operator also means that the key needn't be wrapped in quotes if it's a single word.
    %hash = (six => 6, seven => 7, ten => 10);

Hashes have a few explicit functions as well as borrowing many of the list functions. The most popular are:

    keys      which returns the keys found in a hash as a list
    values    which returns the values of the hash as a list

Each of these functions returns a list, which will be in seemingly random order. If order is needed it must be imposed using the function 'sort'.
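As a short sketch of imposing an order with sort, the snippet below sorts the keys of the hash from the example above, first alphabetically and then by their values. The array names @by_name and @by_value are made up for this illustration.

```perl
use strict;
use warnings;

my %hash = (six => 6, seven => 7, ten => 10);

# Default sort compares the keys as strings.
my @by_name = sort keys %hash;

# A comparison block sorts the keys numerically by the value each one holds.
my @by_value = sort { $hash{$a} <=> $hash{$b} } keys %hash;

print "By name:  @by_name\n";    # By name:  seven six ten
print "By value: @by_value\n";   # By value: six seven ten
```

The hash itself stays unordered; only the list returned by sort has a defined order, so the sort has to be repeated wherever ordered output is wanted.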
    %a = (ten => 10, nine => 9, eight => 8, seven => 7);
    @b = keys %a;      # Places keys in array
    @c = values %a;    # Places values in array

The order of elements in @b (the array of keys) may be: nine, seven, eight and ten. The order of elements in @c will then be: 9, 7, 8, 10. Whatever the order turns out to be, keys and values return their elements in the same order as each other, so each key lines up with its value.

The functions keys and values are frequently used to traverse the contents of a hash; there are several methods of accessing every element of a hash:

    for my $key (keys %a) {
        $value = $a{$key};
        print "$key => $value\n";
    }

This is probably the most common way of accessing all elements in a hash. It uses a for loop in the same manner as we would to access all the elements of an array.

There are two more functions used with hashes: delete and exists. Both can also be used with arrays, but are more commonly seen in code relating to hashes. The delete function removes elements from hashes. Removing an element with the key 'fred' is expressed in the following way:

    delete $passwd{'fred'};

exists is a function that, given a key in a hash, will return true if that element is present. To make full use of the exists function we need to use it with a conditional; in the example we use if. When running under warnings, it is prudent to use exists before accessing an element of a hash if it's doubtful that the key is present.

    %passwd = (fred => 'xSSx13A0Oav', root => 'root');
    if ( exists($passwd{$user}) ) {
        print "Success, that user was found\n";
    } else {
        print "Sorry, that user was not found\n";
    }

if tests the return value from the exists function; if the hash element does exist then if will run the code wrapped in the curly braces that follow it. When the test evaluates to false, the 'if' statement skips the first set of curly braces and executes the contents of the curly braces following else, if else is present.
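To see exists and delete working together, here is a minimal sketch built on the %passwd hash from the example above. The helper describe_user is hypothetical, introduced only so the same test can be run before and after the deletion.

```perl
use strict;
use warnings;

my %passwd = (fred => 'xSSx13A0Oav', root => 'root');

# describe_user is a made-up helper: it reports whether a key is present.
sub describe_user {
    my ($user) = @_;
    if (exists $passwd{$user}) {
        return "Success, $user was found";
    }
    else {
        return "Sorry, $user was not found";
    }
}

print describe_user('fred'), "\n";   # fred is present at this point
delete $passwd{'fred'};              # remove the key and its value
print describe_user('fred'), "\n";   # fred is no longer present
print describe_user('root'), "\n";   # other keys are unaffected
```

Note that exists checks for the key itself, so a key whose value is undef still counts as present.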
Here are some simple examples of the common uses for hashes in Perl:

● Creating a look-up table of values to substitute:

    # The addresses are quoted: => only auto-quotes keys that are single words
    %dns = ('10.3.1.0' => 'firewall', '10.3.2.0' => 'email', '10.3.0.1' => "bob's machine");
    print "$dns{$ip_address}\n";

● Removing duplicates from a structure by exploiting a hash's use of unique keys:

    @a = (1,2,1,2,4,6,7,2,1,10,6,7,8,8);   # Initialise the array
    %a = map { $_ => 1 } @a;               # Make a hash where the keys are the elements of @a
    @a = keys %a;                          # Reassign @a so that it contains unique values

At the beginning of this example @a contains (1,2,1,2,4,6,7,2,1,10,6,7,8,8); after being filtered through the hash @a contains the unique values in no particular order, for example (7,8,1,2,10,4,6). Focusing on line 2 of the above example, map is used to create a key-value pair for the hash. The value being assigned is irrelevant. The only important step takes place implicitly: for keys that are already in existence the value will be overwritten (by an identical value), since keys in a hash are unique.

Perl documentation

Perl has a wealth of documentation which comes with the standard distribution. It covers every aspect of the Perl language and is viewed using your computer's default pager program. Perldoc pages resemble manpages, citing examples of use and pertinent advice. There are many parts to the Perl documentation. To list the categories, type the following command at the shell prompt:

    perldoc perl

The page displayed by this command has two columns; the left column lists the mnemonic titles and the right column a description of the topic.

    perlsyn     Perl syntax
    perldata    Perl data structures
    perlop      Perl operators and precedence
    perlsub     Perl subroutines

To invoke documentation for a subject, simply type perldoc and the mnemonic for that topic on the command line. The example below will display the documentation for "Perl Syntax".
    perldoc perlsyn

A further use of perldoc is to read the usage for any of Perl's functions; this is done by calling perldoc with the -f option and the function name as an argument. The following example will display the documentation for the function map.

    perldoc -f map

Perldoc also provides quick access to frequently asked questions about Perl.

    perldoc -q punctuation