[ab] is "either an a or a b".
. is "any character (except newline)".
The entire expression is /[ab]..[ab]/
[^xyz] is "anything except an x, y or a z".
The entire expression is /A[^xyz]/
[0123456789] is "any digit".
[0-9] is another way of saying "any digit".
\d is yet another way of saying "any digit".
The entire expression could be any of the following:
/[0123456789][0123456789][0123456789][0123456789][0123456789]/
/[0-9][0-9][0-9][0-9][0-9]/
/\d\d\d\d\d/
/\d{5}/ (we didn't cover this!)
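As a quick check that the four expressions above behave alike, here is a minimal sketch that runs each of them against two sample strings. The sample strings and the helper name five_digit_flags are made up for illustration, and the comparison assumes plain ASCII input.

```perl
use strict;
use warnings;

# Four equivalent ways to write "five digits in a row".
my @patterns = (
    qr/[0123456789][0123456789][0123456789][0123456789][0123456789]/,
    qr/[0-9][0-9][0-9][0-9][0-9]/,
    qr/\d\d\d\d\d/,
    qr/\d{5}/,
);

# Hypothetical helper: returns one 1/0 flag per pattern for the given string.
sub five_digit_flags {
    my ($text) = @_;
    return map { $text =~ $_ ? 1 : 0 } @patterns;
}

# Illustrative inputs only: one with five digits in a row, one without.
for my $sample ("zip 12180", "only 1234 here") {
    print "$sample: ", join(" ", five_digit_flags($sample)), "\n";
}
```

Every pattern should report the same flag for a given string, which is the point of listing them as alternatives.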
/\$\w+/
Can't do it without something we haven't covered yet! If you
try to use something like /\s+/ it will match any string that contains any whitespace.
<A HREF=blahblah>).
/<[aA]\s+[hH][rR][eE][fF]=.*>/
/([a-zA-Z])\1/
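To see what the backreference in this expression does, the sketch below wraps it in a small helper and tries a few words. The helper name has_double_letter and the test words are assumptions chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the same letter appears twice in a row.
# \1 must repeat exactly what the parentheses captured, including its case.
sub has_double_letter {
    my ($word) = @_;
    return $word =~ /([a-zA-Z])\1/ ? 1 : 0;
}

for my $word ("book", "bok", "aA", "11") {
    print "$word: ", has_double_letter($word), "\n";
}
```

Note that "aA" does not match, because the backreference compares the captured text exactly, and "11" does not match, because the class only admits letters.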
<H2>Hi
Dave</H2> and so should <TITLE>The Test
Answers</TITLE>, but this should not match
<TITLE>Not a match</H2>.
/<(\w+)>.*<\/\1>/
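The sketch below tries this expression on the sample strings, once as written and once with the "s" modifier added. It assumes the text between the tags may contain a newline held in a single string, in which case "." needs the "s" modifier to get past it. Both helper names are hypothetical.

```perl
use strict;
use warnings;

# The expression as written: "." stops at a newline.
sub has_tag_pair_one_line {
    my ($text) = @_;
    return $text =~ /<(\w+)>.*<\/\1>/ ? 1 : 0;
}

# Same expression with the "s" modifier, so "." also matches a newline.
sub has_tag_pair {
    my ($text) = @_;
    return $text =~ /<(\w+)>.*<\/\1>/s ? 1 : 0;
}

for my $text ("<H2>Hi Dave</H2>", "<H2>Hi\nDave</H2>", "<TITLE>Not a match</H2>") {
    (my $shown = $text) =~ s/\n/\\n/g;    # make the newline visible in the output
    print "$shown: ", has_tag_pair_one_line($text), " ", has_tag_pair($text), "\n";
}
```

The mismatched pair is rejected either way, since \1 must repeat the tag name captured in the opening tag.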
while (<>) { # read input one line at a time
s/0/zero/g; # replace all "0"s with "zero"
s/1/one/g; # replace all "1"s with "one"
s/2/two/g; # replace all "2"s with "two"
s/3/three/g; # replace all "3"s with "three"
s/4/four/g; # replace all "4"s with "four"
s/5/five/g; # replace all "5"s with "five"
s/6/six/g; # replace all "6"s with "six"
s/7/seven/g; # replace all "7"s with "seven"
s/8/eight/g; # replace all "8"s with "eight"
s/9/nine/g; # replace all "9"s with "nine"
print;
}
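The loop above applies one substitution per digit. As an alternative sketch (not the approach used in the answer), the same result can be had with a single substitution that looks each digit up in a hash. The hash %word_for and the helper spell_digits are names invented for this example.

```perl
use strict;
use warnings;

# Lookup table from digit to its English name.
my %word_for = (
    0 => 'zero', 1 => 'one', 2 => 'two',   3 => 'three', 4 => 'four',
    5 => 'five', 6 => 'six', 7 => 'seven', 8 => 'eight', 9 => 'nine',
);

# Hypothetical helper: replace every ASCII digit with its name in one pass.
sub spell_digits {
    my ($line) = @_;
    $line =~ s/([0-9])/$word_for{$1}/g;
    return $line;
}

print spell_digits("Room 101"), "\n";
```

Either approach works here because none of the replacement words contains a digit, so an earlier substitution can never feed a later one.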
<H1>,</H1> tag pairs with
<H3>,</H3> tags.
while (<>) { # read input one line at a time
s/<H1>/<H3>/g; # replace all "<H1>" with "<H3>"
s/<\/H1>/<\/H3>/g; # replace all "</H1>" with "</H3>"
print;
}
Here is a better way! A single expression that can replace start or end tags.
while (<>) { # read input one line at a time
s/<(\/?)H1>/<$1H3>/g; # replace all "<H1>" with "<H3>"
# and all "</H1>" with "</H3>"
print;
}
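To make the optional-slash capture concrete, here is a small sketch that applies the single substitution to a string instead of to standard input. The helper name retag_h1 and the sample text are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: (\/?) captures "/" for an end tag and the empty
# string for a start tag, and $1 puts whatever was captured back.
sub retag_h1 {
    my ($html) = @_;
    $html =~ s/<(\/?)H1>/<$1H3>/g;
    return $html;
}

print retag_h1("<H1>Title</H1> and <H2>other</H2>"), "\n";
```

The match is case-sensitive as written, so a lowercase <h1> would be left alone unless the "i" modifier were added.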
while (<>) { # read input one line at a time
s/<[^>]*>//g; # remove anything that starts with "<"
# and ends with ">"
print;
}
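The following sketch applies the same substitution to sample strings rather than to input lines. The helper name strip_tags and the samples are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: delete everything from "<" up to the next ">".
# [^>]* cannot run past the first ">", so each tag is removed separately.
sub strip_tags {
    my ($html) = @_;
    $html =~ s/<[^>]*>//g;
    return $html;
}

print strip_tags("<B>bold</B> and <I>italic</I>"), "\n";
print strip_tags("<A\nHREF=x>link</A>"), "\n";
```

The second sample shows that [^>] also matches a newline, so a tag split across lines is removed when the whole text is in one string. The line-at-a-time loop above never sees such a tag in one piece, so it would miss it.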
<HEAD> tag and the
</HEAD> tag. Keep in mind that in HTML newlines
mean nothing - any part of a document can be split across lines in any possible way.
HINT: It is much easier to read the entire sequence of lines into a single perl scalar variable. If the newlines are kept, that single string contains newlines, and we need the "s" modifier on the substitute command if we want "." to match a newline.
@lines = <>; # read everything until EOF
chomp(@lines); # get rid of all newlines
$_ = join("",@lines); # combine lines into one giant string
# remove everything between the first <HEAD> and the last </HEAD> (tags included)
# the "s" modifier lets ".*" match a newline, in case any remain in the string
s/(.*?)<HEAD>.*<\/HEAD>(.*)/$1$2/s;
#print out whatever remains.
print;
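As a sketch of the same idea on an in-memory string, the helper below uses the substitution from the answer, and a second helper uses a shorter form without the capture groups. The shorter form is a simplification introduced here, not part of the original answer, and both helper names are hypothetical.

```perl
use strict;
use warnings;

# The substitution from the answer: keep what comes before the first <HEAD>
# and after the last </HEAD>, and drop everything in between.
sub remove_head {
    my ($doc) = @_;
    $doc =~ s/(.*?)<HEAD>.*<\/HEAD>(.*)/$1$2/s;
    return $doc;
}

# Shorter variant (an assumption of this sketch): match only the part to
# delete and replace it with nothing.
sub remove_head_short {
    my ($doc) = @_;
    $doc =~ s/<HEAD>.*<\/HEAD>//s;
    return $doc;
}

my $doc = "<HTML><HEAD>\n<TITLE>Test</TITLE>\n</HEAD><BODY>hi</BODY></HTML>";
print remove_head($doc), "\n";
print remove_head_short($doc), "\n";
```

Because the sample string keeps its newlines, the "s" modifier is what allows ".*" to reach the closing tag.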
Joe Smith\t88\t92\t77\n
Your program will accept input in the form of lines that contain
name, value pairs with an equal sign (=) between the
name and the value. Here is a sample input file:
name = Joe Student
test1 = 86
test2 = 77
homework = 33
name = Jane Smith
test1 = 98
test2 = 35
homework = 85
For this input, the output should be this (\t is a tab):
Joe Student\t86\t77\t33
Jane Smith\t98\t35\t85
Here is one way to do this:
#!/usr/bin/perl
# read in all the lines
@lines = <>;
# get rid of all newlines
chomp(@lines);
# remove junk from each line
foreach $i (@lines) {
$i =~ s/[^=]+=\s*(.*)/$1/;
}
#now loop over all lines, handling 4 at a time
for ($i=0;$i<=$#lines;$i=$i+4) {
print join("\t",@lines[$i..$i+3]), "\n";
}
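The same steps can be sketched as a function that takes the input lines as a list, which makes it easy to try on sample data. The function name pairs_to_rows is hypothetical, and the sketch assumes, as the program above does, one name/value pair per line and four pairs per record.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the value after each "=", then join the
# values four at a time with tabs.
sub pairs_to_rows {
    my @lines = @_;
    my @values;
    for my $line (@lines) {
        chomp(my $copy = $line);    # work on a copy without the newline
        push @values, $1 if $copy =~ /^[^=]+=\s*(.*)$/;
    }
    my @rows;
    while (my @group = splice(@values, 0, 4)) {
        push @rows, join("\t", @group);
    }
    return @rows;
}

my @input = (
    "name = Joe Student\n", "test1 = 86\n", "test2 = 77\n", "homework = 33\n",
    "name = Jane Smith\n",  "test1 = 98\n", "test2 = 35\n", "homework = 85\n",
);
print "$_\n" for pairs_to_rows(@input);
```

Using splice to peel off four values at a time avoids the index arithmetic of the for loop, at the cost of emptying the temporary @values array as it goes.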