EIW Fall 2003 Lecture Notes
Perl was originally developed as a tool for report generation; the name PERL stands for "Practical Extraction and Report Language". Perl can now do much more: the addition of systems programming facilities to the language means that perl can be used to develop applications that require access to system services (including network applications).
Although perl is not a true interpreted language (a script is first compiled to an internal form and then executed), the perl executable is often referred to as "the perl interpreter".
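A minimal sketch of a first script, assuming a conventional greeting as the text to print (the message and the file name are illustrative only):

```perl
# hello.pl (hypothetical file name): send one line of text to STDOUT
print("Hello World\n");
```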
The perl print function sends a string to STDOUT. Although the above example shows the argument inside parentheses, in perl you can omit the parentheses when calling functions, so this is also acceptable:
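For instance, a call to print without parentheses might look like this (the greeting text is only an illustration):

```perl
# print called without parentheses around its argument
print "Hello World\n";
```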
To run a perl script you need to get and install the perl distribution
on your PC. The ActiveState Perl distribution for Windows is, at the
time of writing, the latest and most complete port of perl to Windows
(there are other ports). You need to know where the perl program itself
is installed; chances are it is something like C:\PERL\BIN\PERL. You
may want to add the path to the perl executable to your PATH so you
don't have to type the full name each time. I did this by adding
C:\PERL\BIN to the PATH set in the file
C:\autoexec.bat. Windows NT/2000 users can add perl to the path
using the system administration control in the control panel.
To run a perl script that you've saved in the file "foo.pl" you would
type the following at the DOS command line: perl foo.pl
or possibly c:\perl\bin\perl foo.pl if you didn't change
your DOS PATH. This will start the perl interpreter (the program
perl.exe) and give it your script as input. If there are syntax
problems with your script, perl will print out some error messages and
quit. If your script is OK, perl will go ahead and run it, and any print
statements will result in output sent to the screen. Below is an
example of running the perl script shown above (at a DOS prompt):
|
The simplest type of perl variable is a scalar. A scalar variable can hold a single value, although it can hold any kind of value. In other programming languages there are different kinds of variables for holding integers, floating point numbers, and strings - in perl there is just one kind of variable and it can hold any of these.
Every variable has a name that is made up from alphanumeric characters
(you can also use the underscore character '_'). Additionally, scalar
variables all start with the '$' character. The following are valid perl
scalar variables: $foo, $foo_blah, $foofoofoofoofoofoofoofoofoo.
Assigning a value to a scalar variable in perl looks just like C/C++. Here are some examples:
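A few assignments of this kind, as a sketch with illustrative variable names and values:

```perl
$pi    = 3.141593;       # a floating point number
$count = 17;             # an integer
$name  = "foo";          # a string
$count = "seventeen";    # the same variable can later hold a string
print "$pi $name $count\n";   # prints "3.141593 foo seventeen"
```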
Like C/C++ each statement ends with a semi-colon. Unlike C/C++, you can assign a string constant to a variable. Remember that perl variables are not associated with any specific data type - you can assign numbers or strings to any scalar variable.
There are other kinds of perl variables; for example, there are list variables that are like arrays in C/C++. While a scalar variable holds a single value, a list variable holds multiple values. Whenever we refer to a perl scalar we are talking about a simple constant or variable that can hold a single value (numeric or string).
We've already seen some Perl constants, for example the number
3.141593 and the string "foo" are scalar
constants. Perl supports the same notation for floating point
constants as C/C++, so you can have constants that look like
6.02E23 and -1.5E-3. Integer constants are
kinda obvious, although you have to make sure you don't start an
integer constant with a 0, as perl takes this as a signal that the
constant is written in octal (base 8) or, when the 0 is followed by
an x, in hexadecimal (base 16).
If you don't know about octal or hexadecimal representation don't
worry, just don't ever start an integer constant with a 0.
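A short sketch of how a leading 0 changes the value of an integer constant (the variable names are illustrative):

```perl
$decimal = 17;      # seventeen
$octal   = 017;     # leading 0 means octal: 1*8 + 7 = 15
$hex     = 0x17;    # leading 0x means hexadecimal: 1*16 + 7 = 23
print "$decimal $octal $hex\n";   # prints "17 15 23"
```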
String constants can be enclosed in either single or double quotes, so
these are both string constants: 'I am a string' and
"I am a string in double quotes". However, if you are a
C/C++ programmer and used to embedding stuff like newlines
\n or tabs \t in string constants you need
to use double quotes, since perl doesn't interpret backslash
sequences such as \n inside a singly quoted string (the only
exceptions are \\ and \'). For example, the
following string constants are different:
'Hello\n'
"Hello\n"
The first string (in single quotes) has 7 characters, the last two characters are '\' and 'n'. The second string has 6 characters, the last being a newline represented by \n. The following table shows some of the special backslash escaped characters recognized by perl (in double quoted strings):
| Escape | Meaning |
|---|---|
| \n | Newline |
| \r | Return |
| \t | Tab |
| \a | Bell |
| \\ | Backslash |
| \" | Double quote |
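The difference between the two quoting styles can be checked with perl's built-in length function. A minimal sketch, with illustrative variable names:

```perl
$single = 'Hello\n';   # backslash and n are kept as two ordinary characters
$double = "Hello\n";   # \n is turned into a single newline character
print length($single), "\n";   # prints 7
print length($double), "\n";   # prints 6
```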
Perl supports the usual set of mathematical operators so you can do stuff like this:
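A minimal sketch of the basic arithmetic operators, using illustrative values:

```perl
$x = 7;
$y = 2;
$sum        = $x + $y;    # 9
$difference = $x - $y;    # 5
$product    = $x * $y;    # 14
$quotient   = $x / $y;    # 3.5 (division is not truncated to an integer)
print "$sum $difference $product $quotient\n";   # prints "9 5 14 3.5"
```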
In addition to the operators +, -,
/ and *, perl supports the **
exponentiation operator (just like Fortran), so the expression
$y**2 is $y to the 2nd power and
10**1.87 is 10 to the power 1.87.
Perl also supports the % modulo operator, just like C/C++.
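The exponentiation and modulo operators can be tried out as follows (a sketch with illustrative values; printf is used only to round the output):

```perl
$y = 3;
$square    = $y ** 2;      # 9
$power     = 10 ** 1.87;   # roughly 74.13
$remainder = 17 % 5;       # 2
printf("%d %.2f %d\n", $square, $power, $remainder);
```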
Perl supports a string concatenation operator that combines two
strings. The symbol used for this operator is a single period
(.). Here are some examples that show this operator in
action:
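A small sketch of concatenation, assuming illustrative variable names and values:

```perl
$first = "Dave";
$last  = "Hollinger";
$full  = $first . " " . $last;        # "Dave Hollinger"
$greeting = "Hello, " . $full . "!";  # "Hello, Dave Hollinger!"
print $greeting . "\n";
```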
Perl also supports a string repetition operator. The symbol used for
the repetition operator is x (the letter x). The string
to be repeated is on the left of the operator and an integer
repetition count is on the right, as in the following examples:
| Expression | Value |
|---|---|
| "M" x 4 | "MMMM" |
| "Hello" x 2 | "HelloHello" |
| "joe" x (5-2) | "joejoejoe" |
Perl supports the typical set of comparison operators, although it supports both numeric and string comparisons. Since scalar data can be either string or numeric, you have to tell perl whether to use a numeric or a string comparison operator! All the comparison operators result in a value of True or False (more on how perl represents this later). The following table shows both sets of comparison operators:
| Comparison | Numeric Operator | String Operator |
|---|---|---|
| Equal | == | eq |
| Not Equal | != | ne |
| Less than | < | lt |
| Greater than | > | gt |
| Less than or equal to | <= | le |
| Greater than or equal to | >= | ge |
When comparing strings, perl uses the ASCII value of each character as the basis of comparison. The first character of each string is compared, and if they are different, the string whose first character has the greater ASCII value is "greater than" the other. If the first characters are the same, the comparison moves on to the next character, and so on. You don't really need to know the ASCII value of each character to understand this, since 'a' is less than 'b' is less than 'c' (and so on).
NOTE: '0' is less than '1' is less than '2', ... This means that the string "1876" is less than the string "4".
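The same pair of values can compare differently depending on which operator is used. A minimal sketch (the C-style ?: operator is used here only to turn the result into a printable word):

```perl
$numeric = (1876 < 4)      ? "true" : "false";   # numeric comparison: false
$string  = ("1876" lt "4") ? "true" : "false";   # string comparison: true, '1' sorts before '4'
print "1876 <  4 is $numeric\n";
print "1876 lt 4 is $string\n";
```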
Perl numeric and comparison operators follow the same precedence rules as in
C/C++. The repetition operator x has higher
precedence than the string concatenation operator . (x binds like * and /,
while . binds like + and -) and both are
left associative. Parentheses can be used to force the order of evaluation
(as in C/C++).
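A quick check of how . and x combine, with parentheses used to force the other order (variable names are illustrative):

```perl
$x_first = "a" . "b" x 3;      # x is applied first: "abbb"
$grouped = ("a" . "b") x 3;    # parentheses force the concatenation first: "ababab"
print "$x_first $grouped\n";   # prints "abbb ababab"
```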
We have already seen how to assign a value to a scalar variable using
the = assignment operator. In addition to changing the
value of a variable, an expression involving the assignment operator
itself has a value. So, just like the expression 2+3 has
the value 5, the expression $x = 2 + 3 has a value. The
value of an assignment expression is the variable that was just
assigned, holding its new value. This makes it possible to do things like
this:
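Two illustrative uses of an assignment as a value (the variable names are assumptions, not taken from the original listing):

```perl
$x = $y = $z = 0.0;          # chained assignment: all three variables become 0
$next = ($start = 10) + 1;   # $start becomes 10, then $next becomes 11
print "$x $y $z $start $next\n";   # prints "0 0 0 10 11"
```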
Assignment is right-associative, so the first example above is the same as $x = ($y = ($z = 0.0)).
Perl automatically converts values (variables or constants) between
numeric and string depending on the context. For example, if an
expression tries to apply the numeric addition operator +
to a string the string will first be converted to a number. The
following example expressions involve automatic data conversions:
| Expression | What happens |
|---|---|
| 2 * "3.141593" | The string is converted to a number before the multiplication. |
| (117 lt 23) | Both numbers are converted to strings (the string comparison operator forces this). The result is true, since "117" lt "23" (when compared as strings). |
When converting from a string to a number, perl ignores any leading
whitespace and any trailing non-numeric stuff. For example,
the string " 0.35HiJoe" would be converted to the number
0.35. If there is nothing at the start of the string that looks like a number, the
conversion results in the value 0. For example, the expression
12 * "HiDave" has the value 0 since "HiDave"
would result in the value 0.
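These conversions can be observed directly. A minimal sketch (adding 0 is just a way to force a numeric context):

```perl
$product = 2 * "3.141593";      # 6.283186
$partial = " 0.35HiJoe" + 0;    # 0.35: leading space and trailing text are ignored
$nothing = 12 * "HiDave";       # 0: no leading number in "HiDave"
print "$product $partial $nothing\n";   # prints "6.283186 0.35 0"
```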
Since perl does this data conversion automatically, and by default doesn't warn you that it's doing anything (warnings can be turned on with the -w switch), you can easily make simple mistakes that are very hard to find. Here is an illustration of a common mistake:
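A sketch of this kind of mistake, assuming illustrative names:

```perl
$name = "Dave";
# Numeric == converts both strings to numbers; both become 0, so the test succeeds
if ($name == "Joe") {
    print "Hello Joe\n";    # printed even though $name is "Dave"
}
```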
Since the numeric comparison operator == is used in
the above expression, each of the strings is converted to a number, and
both have the numeric value 0, so they are (numerically) equal. The author of
the above code probably wanted the following:
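A sketch of a test on two names written with the string comparison operator (the names are illustrative):

```perl
$name = "Dave";
# eq compares the two values character by character as strings
if ($name eq "Joe") {
    print "Hello Joe\n";    # not printed, since "Dave" and "Joe" differ
}
```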
This code uses the string comparison operator eq.
What is the value of each of the following expressions?
"HelloWorld" x "1"
17 + "13"
17 + "thirteen"
("Senator" . " " . "Hillary") == "Senator Hillary"
("Senator" . " " . "Hillary") eq "Senator Hillary"
"987654321" gt "9871654321"
10 x 2
10 . 2
We've already seen that inside double quoted strings perl interprets
some character sequences as special - for example the sequence "\n"
means a newline. Perl also does variable interpolation of
scalar variables inside doubly quoted strings (and not inside singly
quoted strings). Interpolation means that the occurrence of the
variable name is replaced by the value of the variable. For example,
suppose we have a variable named $college that currently
has the value "RPI". The string "I go to
$college" would become "I go to RPI" since perl
finds the variable $college inside a doubly quoted
string.
Suppose I have the following variables and corresponding values:
| Variable Name | Current Value |
|---|---|
| $num | 17 |
| $number | 23 |
What does perl do with the string "The result is $number"?
Perl uses the longest variable name it can match - so in the above
case it would use the variable $number and not the
variable $num. The resulting string would therefore be
"The result is 23". We could force perl to use the
variable $num by putting curly braces around the variable
name: "The result is ${num}ber", in this case the
resulting string would be "The result is 17ber".
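The interpolation rules above can be tried out with a short script (a sketch using the values from the text; the $s1 to $s4 names are illustrative):

```perl
$college = "RPI";
$num     = 17;
$number  = 23;
$s1 = "I go to $college";          # "I go to RPI"
$s2 = "The result is $number";     # "The result is 23"
$s3 = "The result is ${num}ber";   # "The result is 17ber"
$s4 = 'I go to $college';          # single quotes: no interpolation
print "$s1\n$s2\n$s3\n$s4\n";
```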
If you want to create a string that has a $ before
something that might be a variable name you must do something to tell
perl not to do variable interpolation. For example, suppose you want
to print out the string "The variable name is $foo" and
want exactly that string, not the string that would result from
variable interpolation. There are a couple of ways to deal with this
problem:
- You can make sure that any part of the string you don't want to be
interpolated is in single quotes:
"The variable name is " . '$foo'
- You can also escape the $ character by putting a
backslash in front of it. This tells perl that this specific character is
simply a $, so its special significance
as the prefix for scalar variable names is ignored. In this case the
string would look like this: "The variable name is \$foo".
The backslash tells perl not to treat foo as a variable name
even though it has a leading $.
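Both approaches give the same string, which a short sketch can confirm (the value given to $foo is an arbitrary illustration):

```perl
$foo = "surprise";
$joined  = "The variable name is " . '$foo';   # single quotes around the $foo part
$escaped = "The variable name is \$foo";       # backslash in front of the $
$interp  = "The variable name is $foo";        # for contrast: interpolation happens
print "$joined\n$escaped\n$interp\n";
```

The first two lines printed are identical and contain a literal $foo; the third ends in surprise.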
You can read a scalar value from standard input (typically the
terminal in which you are running your perl program) using the cryptic
notation <STDIN>. You can put this statement
anywhere in an expression where it would be valid to put a scalar
constant or variable reference. Some Examples:
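A sketch of two ways <STDIN> might appear in a script, assuming illustrative prompts and variable names:

```perl
print "What is your name? ";
$name = <STDIN>;            # read one line into a variable
print "Enter a number: ";
$double = 2 * <STDIN>;      # use the line directly inside an expression
print "Hello $name";        # $name still ends with the newline the user typed
print "Twice your number is $double\n";
```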
Perl reads an entire line of input (stops reading once it hits a newline)
each time it sees <STDIN> in a script. If there is no
input available the perl program will wait for a complete line (it will
also stop trying to read once it sees an End of File marker). If you have
a perl script that reads one line into the variable $age and then
another line into the variable $weight,
and the user types the line 17 200, the variable
$age will get the value "17 200" and the
program will wait for another line to assign a value to the variable
$weight.
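A sketch of a script that behaves this way, assuming these two prompts:

```perl
print "Enter your age: ";
$age = <STDIN>;       # gets the whole first line, e.g. "17 200" plus the newline
print "Enter your weight: ";
$weight = <STDIN>;    # waits for a second line of input
print "Age: $age";
print "Weight: $weight";
```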
<STDIN> newlines and chop
The <STDIN> input operator returns an entire line
including the newline character. If you are reading a number
this doesn't matter since when interpreting the string as a number
perl will ignore any trailing stuff in the string that is not numeric. So this
will work:
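A sketch of reading a number and using it in arithmetic (the prompt and the $later variable are illustrative):

```perl
print "How old are you? ";
$age = <STDIN>;          # e.g. "17" plus a newline if the user types 17 and presses Enter
$later = $age + 20;      # the string is converted to a number before the addition
print "In 20 years you will be $later\n";
```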
Assuming the user types a single numeric value followed by a newline
(pressing the Enter key) this script will do what we want. The reason
that $age+20 works is that perl converts the string
assigned to the variable $age to a number when it sees the
mathematical plus operator.
When reading strings we often need to get rid of the newline that perl
leaves for us. The chop function will chop the last character of
a string off and throw it away - leaving everything except the last character.
If we don't use chop in the following example we won't get what we want:
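A sketch of a script that reads two names without chop, assuming illustrative prompts and variable names:

```perl
print "Enter your first name: ";
$first = <STDIN>;     # the value ends with the newline the user typed
print "Enter your last name: ";
$last = <STDIN>;      # so does this one
print "Your name is $first $last. Welcome!\n";
```

Because each variable still ends in a newline, the final message is broken across three lines instead of appearing on one.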
If the user types Dave and presses Enter, then
Hollinger and presses Enter, the output of the program
will look like this: (user input shown in italics)
|
We can use chop to get rid of the newlines and get what we want:
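A sketch of a name-reading script that uses chop, with illustrative prompts and variable names:

```perl
print "Enter your first name: ";
$first = <STDIN>;
chop($first);         # remove the trailing newline
print "Enter your last name: ";
$last = <STDIN>;
chop($last);          # remove the trailing newline
print "Your name is $first $last. Welcome!\n";
```

Note that chop removes the last character whatever it is, so it should only be applied to a value that is known to end in a newline.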
This would produce the following:
|
You can also use something like chop($name=<STDIN>).
The perl printf function is very similar to the C printf function: it creates an output string from a format and a list of values and sends it to STDOUT. The familiar printf formatting options from C are available in perl, so you can do stuff like this:
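A minimal sketch of a few common formats, with illustrative values:

```perl
$name = "Dave";
$age  = 17;
$pi   = 3.141593;
printf("%s is %d years old\n", $name, $age);   # Dave is 17 years old
printf("pi is about %.2f\n", $pi);             # pi is about 3.14
printf("%5d|%-5s|\n", 42, "ab");               # "   42|ab   |"
```

Here %5d right-justifies the number in a field of width 5 and %-5s left-justifies the string in a field of the same width.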