$x[0] = 1;          # assign numeric constant
$x[1] = "string";   # assign string constant
print $m[$y];       # access via variable
$x[$c] = $b[$d];    # copy elements
$x[$i] = $b[$i];
$x[$i+$j] = 0;      # expressions are okay also
$x[$i]++;           # increment element
$y = $x[$i++];      # increment index (not the element)
You can assign a list literal to an array or to a list of scalars:
($x, $y, $z) = (1, 2, 3);         # $x = 1, $y = 2, $z = 3
($m, $n) = ($n, $m);              # this swap actually works
@nums = (1..10);                  # $nums[0] = 1, $nums[1] = 2, ...
($x, $y, $z) = (1, 2);            # $x = 1, $y = 2, $z is undef
@t = ();                          # @t is now empty (no elements)
($x[1], $x[0]) = ($x[0], $x[1]);  # swap works!
@kudomono = ('apple', 'orange');  # list with 2 elements
@kudomono = qw/ apple orange /;   # same list with 2 elements
Sometimes you can operate over an entire array. Use the @array name:
@x = @y;            # copy y to x
@y = 1..1000;       # parentheses are not required
@lines = <STDIN>;   # very useful! reads all remaining input lines
print @lines;       # works in Perl 5
@a = ('a','b','c','d');
print @a;
# yields abcd

@a = ('a','b','c','d');
print "@a";
# yields a b c d
Generally, if you specify an array in a scalar context, the value returned is the number of elements in the array.
@array1 = ('a',3,'b',4,'c',5);   # assign array1 the values of list
@array2 = @array1;               # assign array2 the values in array1
$m = @array2;                    # $m now has value 6
$n = $m + @array1;               # $n now has the value 12
Perl arrays can be of any size, and the number of elements can vary during execution.
my @fruit;             # has no elements initially
$fruit[0] = "apple";   # now has one element
$fruit[1] = "orange";  # now has two elements
$fruit[99] = "plum";   # now has 100 elements, most of which are undef
Perl has a special scalar form $#arrayname that returns a scalar value that is equal to the index of the last element in the array.
for($i = 0; $i <= $#arr1; $i++) {
    print "$arr1[$i]\n";
}
You can also use this special scalar form to truncate an array:
@arr = (0..99);   # arr has 100 elements
$#arr = 9;        # now it has 10
print "@arr";
# yields 0 1 2 3 4 5 6 7 8 9
A negative array index is treated as being relative to the end of the array:
@arr = 0..99;
print $arr[-1];   # similar to using $arr[$#arr]
# yields 99
print $arr[-2];
# yields 98
push @nums, $i;
push @answers, "yes";
push @a, 1..5;
push @a, @answers;     # appends the elements of @answers to @a
pop @a;
push(@a,pop(@b));      # moves the last element of @b to end of @a
@a = ();
@b = ();
push(@b,pop(@a));      # @b now has one undef value
@a = 0..9;
unshift @a, 99;          # now @a = (99,0,1,2,3,4,5,6,7,8,9)
unshift @a, ('a','b');   # now @a = ('a','b',99,0,1,2,3,4,5,6,7,8,9)
$x = shift @a;           # now $x = 'a', and @a starts with 'b'
You can use foreach to process each element of an array or list. It follows the form:
foreach $SCALAR (@ARRAY or LIST) {
    <statement list>
}
(You can also use map for similar purposes.)
foreach $x (@x) {
    print "$x\n";
}
map {print "$_\n";} @a;
foreach $item (qw/ apple pear lemon /) {
    push @fruits, $item;
}
map {push @fruits, $_} qw/ apple pear lemon /;
$_ is the default variable (and is used in the previous map() examples.) It is used as a default at various times, such as when reading input, writing output, and in the foreach and map constructions.
while(<STDIN>) {
    print;
}
$sum = 0;
foreach(@arr) {
    $sum += $_;
}
map { $sum += $_ } @arr;
Reading from <> causes a program to read from the files specified on the command line, or from stdin if no files are specified.
#!/usr/bin/perl -w
use strict;
while(<>) {
    print;
}
You can use this either with stdin or by naming files as arguments.
There is a built-in array called @ARGV which contains the command line arguments passed in by the calling program.
Note that unlike C, $ARGV[0] is the first argument, not the name of the Perl program being invoked.
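#!/usr/bin/perl -w
# do the equivalent of a shell's echo:
use strict;
my $arg;   # named $arg rather than $a, which sort reserves for its own use
# defined() keeps an argument of "0" from ending the loop early
while(defined($arg = shift @ARGV)) {
    print "$arg ";
}
print "\n";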
#!/usr/bin/perl -w
# count the number of arguments
use strict;
my $count = 0;
map { $count++ } @ARGV;
print "$count\n";
Perl has three interesting operators to affect looping: next, last, and redo.
The next operator starts the next iteration of a loop immediately, much as continue does in C.
#!/usr/bin/perl -w
# sum the positive elements of an array to demonstrate next
use strict;
my $sum = 0;
my @arr1 = -10..10;
foreach(@arr1) {
    if($_ < 0) {
        next;
    }
    $sum += $_;
}
print $sum;
The last operator exits the loop immediately, much as break does in C.
#!/usr/bin/perl -w
# read up to 100 items, print their sum
use strict;
my $sum = 0;
my $count = 0;
while(<>) {
    $sum += $_;
    $count++;
    if($count == 100) {
        last;
    }
}
print "\$count == $count, \$sum == $sum \n";
The rarely used redo operator goes back to the beginning of a loop block, but it does not retest any boolean conditions, it does not execute any increment-type code, and it does not change any positions within arrays or lists.
#!/usr/bin/perl -w
# demonstrate the redo operator
use strict;
my @strings = qw/ apple plum pear peach strawberry /;
my $answer;
foreach(@strings) {
    print "Do you wish to print '$_'? ";
    chomp($answer = uc(<>));
    if($answer eq "YES") {
        print "PRINTING $_ ...\n";
        next;
    }
    if($answer ne "NO") {
        print "I don't understand your answer '$answer'! Please use either YES or NO!\n";
        redo;
    }
}
The reverse function behaves differently depending on context.
If used to return a list, it reverses the input list.
If used to return a scalar, it first concatenates the elements of the
input list and then reverses all of the characters in that string.
You can also reverse a hash, in which case the returned hash has the
keys and values swapped from the original hash. (If several keys in
the original hash share the same value, which of those keys ends up
as the value in the new hash is unpredictable.)
#!/usr/bin/perl -w
# demonstrate the reverse function
use strict;
my @strings = qw/ apple plum pear peach strawberry /;
print "\@strings = @strings\n";
my @reverse_list = reverse(@strings);
my $reverse_string = reverse(@strings);
print "\@reverse_list = @reverse_list\n";
print "\$reverse_string = $reverse_string\n";
#!/usr/bin/perl -w
# demonstrate the reverse operator
use strict;
my %strings = ( 'a-key' , 'a-value', 'b-key', 'b-value', 'c-key', 'c-value' );
print "\%strings = ";
map {print " ( \$key = $_ , \$value = $strings{$_} ) "} (sort keys %strings);
print " \n";
my %reverse_hash = reverse(%strings);
print "\%reverse_hash = ";
map {print " ( \$key = $_ , \$value = $reverse_hash{$_} ) "} (sort keys %reverse_hash);
print " \n ";
#!/usr/bin/perl -w
# demonstrate the reverse operator for hash with duplicate values
use strict;
my %strings = ( 'a-key' , 'x-value', 'b-key', 'x-value', 'c-key', 'x-value' );
print "\%strings = ";
map {print " ( \$key = $_ , \$value = $strings{$_} ) "} (sort keys %strings);
print " \n";
my %reverse_hash = reverse(%strings);
print "\%reverse_hash = ";
map {print " ( \$key = $_ , \$value = $reverse_hash{$_} ) "} (sort keys %reverse_hash);
print " \n ";
#!/usr/bin/perl -w
# demonstrate the reverse operator in scalar context
use strict;
my $test = reverse(qw/ 10 11 12 /);
print "\$test = $test\n";
# yields $test = 211101
The sort function is only defined to work on lists, and will only return sensible items in a list context. By default, sort sorts lexically.
# Example of lexical sorting
@list = 1..100;
@list = sort @list;
print "@list ";
# yields
#   1 10 100 11 12 13 14 15 16 17 18 19 2 20 21 22 23 24 25 26 27 28 29
#   3 30 31 32 33 34 35 36 37 38 39 4 40 41 42 43 44 45 46 47 48 49
#   5 50 51 52 53 54 55 56 57 58 59 6 60 61 62 63 64 65 66 67 68 69
#   7 70 71 72 73 74 75 76 77 78 79 8 80 81 82 83 84 85 86 87 88 89
#   9 90 91 92 93 94 95 96 97 98 99
You can define an arbitrary sort function. Our earlier mention of the <=> operator comes in handy now:
# Example of numerical sorting
@list = 1..100;
@list = sort { $a <=> $b } @list;
print "@list";
# yields
#   1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
#   26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
#   51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75
#   76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100
The $a and $b in the function block are actually package global variables, and should not be declared by you as my variables.
@words = qw/ apples Pears bananas Strawberries cantaloupe grapes Blueberries /;
@words_alpha = sort @words;
@words_noncase = sort { uc($a) cmp uc($b) } @words;
print "\@words_alpha = @words_alpha\n";
print "\@words_noncase = @words_noncase\n";
# yields:
#   @words_alpha = Blueberries Pears Strawberries apples bananas cantaloupe grapes
#   @words_noncase = apples bananas Blueberries cantaloupe grapes Pears Strawberries
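The comparison block can hold any expression built on $a and $b. As a minimal sketch (the sample data below is hypothetical and not from the examples above), swapping $a and $b gives a descending sort, and chaining comparisons with or gives a tie-breaker:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data, chosen only for illustration.
my @nums  = (10, 9, 100, 1);
my @words = qw/ pear fig banana kiwi /;

# Swapping $a and $b reverses the order: numeric, descending.
my @descending = sort { $b <=> $a } @nums;

# Compare by length first; fall back to string order when lengths tie.
my @by_length = sort { length($a) <=> length($b) or $a cmp $b } @words;

print "@descending\n";   # prints 100 10 9 1
print "@by_length\n";    # prints fig kiwi pear banana
```

Note that $a and $b are used here without my, in keeping with the rule above.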
We have already used a few examples of hashes. Let's go over exactly what is happening with them:
$names{12101} = 'James';
$names{12101} = 'Bob';    # overwrites value 'James'
$name = $names{12101};    # retrieve value 'Bob'
$name = $names{11111};    # a key that was never stored returns undef
%hash = ('1', '1-value', 'a', 'a-value', 'b', 'b-value');
@array = ('a');
print $hash{@array};      # yields 1-value (@array is taken in scalar context, so the key is 1)
%names = (1, 'Bob', 2, 'James');
foreach(sort(keys(%names))) {
    print "$_ --> $names{$_}\n";
}
# yields
#   1 --> Bob
#   2 --> James
map { print "$_ --> $names{$_}\n"; } sort(keys(%names));
# yields
#   1 --> Bob
#   2 --> James
As might have been gleaned from before, you can use the % character to refer to a hash as a whole:
%new_hash = %old_hash;
%fruit_colors = ( 'apple' , 'red' , 'banana' , 'yellow' );
%fruit_colors = ( 'apple' => 'red' , 'banana' => 'yellow' );
print "%fruit_colors\n";   # only prints '%fruit_colors', not keys
@fruit_colors = %fruit_colors;
print "@fruit_colors\n";   # now you get output...
# yields banana yellow apple red (the order of the pairs is not guaranteed)
You can extract just the hash keys into an array with the keys function. You can extract just the hash values into an array with the values function.
%fruit_colors = ( 'apple' => 'red' , 'banana' => 'yellow' );
@keys = keys(%fruit_colors);
@values = values(%fruit_colors);
print "\@keys = '@keys' , \@values = '@values'\n";
# yields @keys = 'banana apple' , @values = 'yellow red'
# (the order may differ, but keys and values always line up with each other)
Perl has a "stateful" function each that allows you to iterate through the keys or the key-value pairs of a hash.
%fruit_colors = ( 'apple' => 'red' , 'banana' => 'yellow' );
while( ($key, $value) = each(%fruit_colors) ) {
    print "$key --> $value\n";
}
Note: if you need to reset the iterator referred to by each, you can just make a call to either keys(%fruit_colors) or values(%fruit_colors) – so don’t do that accidentally!
%fruit_colors = ( 'apple' => 'red' , 'banana' => 'yellow' );
while( ($key, $value) = each(%fruit_colors) ) {
    print "$key --> $value\n";
    # ...
    @k = keys(%fruit_colors);   # resets iterator!!!
}
# yields an endless loop!
#   banana --> yellow
#   banana --> yellow
#   banana --> yellow
#   banana --> yellow
#   banana --> yellow
#   ...
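One way to sidestep the shared iterator altogether is to loop over the list of keys instead of calling each. The following is a minimal sketch reusing the %fruit_colors data; @lines is a hypothetical array added only to collect the output.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %fruit_colors = ( 'apple' => 'red', 'banana' => 'yellow' );
my @lines;    # hypothetical helper: collects the formatted pairs

# keys() is evaluated once, before the loop starts, so calling keys()
# or values() again inside the body cannot restart this loop.
foreach my $key (sort keys %fruit_colors) {
    my @k = keys(%fruit_colors);    # harmless here
    push @lines, "$key --> $fruit_colors{$key}";
}

print "$_\n" foreach @lines;
# prints:
#   apple --> red
#   banana --> yellow
```

Sorting the keys also makes the output order predictable, which each does not promise.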
You can check if a key exists in a hash with the exists function:
if(exists($hash{'SOMEVALUE'})) {
    # the key 'SOMEVALUE' is present
}
You can remove a key-value pair from a hash with delete:
delete($hash{'SOMEVALUE'});
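A minimal sketch of exists and delete working together follows. The data is hypothetical, and defined (another Perl built-in, not covered above) is brought in only to contrast "the key is present" with "the value is defined".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical data: 'plum' is a key whose value is undef.
my %fruit_colors = ( 'apple' => 'red', 'banana' => 'yellow', 'plum' => undef );

# exists() asks whether the key is present; defined() asks about the value.
my $plum_exists  = exists($fruit_colors{'plum'})  ? "yes" : "no";
my $plum_defined = defined($fruit_colors{'plum'}) ? "yes" : "no";

# delete() removes the pair and returns the value that was stored.
my $removed      = delete($fruit_colors{'apple'});
my $apple_exists = exists($fruit_colors{'apple'}) ? "yes" : "no";

print "plum exists: $plum_exists, plum defined: $plum_defined\n";
print "removed: $removed, apple exists: $apple_exists\n";
# prints:
#   plum exists: yes, plum defined: no
#   removed: red, apple exists: no
```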
printf in Perl is very similar to that of C.
printf is most useful when printing scalars. Its first (non-filehandle) argument is the format string, and any other arguments are treated as a list of scalars:
printf "%s %s %s %s", ("abc", "def") , ("ghi", "jkl");
# yields abc def ghi jkl
Some of the common format attributes are
printf "%7d\n", 123;
# yields "    123" (padded on the left to 7 characters)
printf "%10s %-10s\n","abc","def";
# yields "       abc def       " (right-justified, then left-justified)
printf "%10.5f %010.5f %-10.5f\n",12.1,12.1,12.1;
# yields "  12.10000 0012.10000 12.10000  "
$a = 10;
printf "%0${a}d\n", $a;
# yields 0000000010
Much information can be found at man perlre.
Perl builds support for regular expressions into the language itself, as awk does but to a greater degree. Most languages instead simply give access to a regular expression library (C, PHP, JavaScript, and C++, for instance, all go this route).
Perl regular expressions can be used in conditionals, where if you find a match then it evaluates to true, and if no match, false.
$_ = "howdy and hello are common";
if(/hello/) {
    print "Hello was found!\n";
} else {
    print "Hello was NOT found\n";
}
# yields Hello was found!
/abc/            # Matches "abc"
/a.c/            # Matches "a" followed by any character (except newline) and then a "c"
/ab?c/           # Matches "ac" or "abc"
/ab*c/           # Matches "a" followed by zero or more "b" and then a "c"
/ab|cd/          # Matches "ab" or "cd"
/a(b|c)+d/       # Matches "a" followed by one or more "b" or "c", and then a "d"
/a[bcd]e/        # Matches "abe", "ace", or "ade"
/a[a-zA-Z0-9]c/  # Matches "a" followed by one alphanumeric character followed by "c"
/a[^a-zA-Z]/     # Matches "a" followed by anything other than an alphabetic character
You can use the following as shortcuts to represent character classes:
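As a minimal sketch of the most common shortcuts: \d matches a digit, \w a "word" character (letter, digit, or underscore), and \s a whitespace character, while \D, \W, and \S match the opposites. The sample string and variable names below are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample line containing a space and a tab.
my $line = "room 42\tfloor_3";
my ($digits, $word, $nonspace) = ("", "", "");

if ($line =~ /(\d+)/)    { $digits   = $1; }   # first run of digits
if ($line =~ /(\w+)$/)   { $word     = $1; }   # word characters at the end
if ($line =~ /^(\S+)\s/) { $nonspace = $1; }   # non-whitespace before the first space

# \D is the opposite of \d: this tests for a string with no digits at all.
my $no_digits = ("abc" =~ /^\D+$/) ? "yes" : "no";

print "digits='$digits' word='$word' nonspace='$nonspace' no_digits=$no_digits\n";
# prints digits='42' word='floor_3' nonspace='room' no_digits=yes
```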
You can specify numbers of repetitions using a curly bracket syntax:
a{1,3}   # "a", "aa", or "aaa"
a{2}     # "aa"
a{2,}    # two or more "a"
Perl regular expression syntax lets you work with context by defining a number of "anchors": \A, ^, \Z, $, \b.
/\ba/      # Matches if "a" appears at the beginning of a word
/a$/       # Matches if "a" appears at the end of a line
/\Aa$\Z/   # Matches if a line is exactly "a" (uncommon)
/^a$/      # Matches if a line is exactly "a" (much more common)
\b refers to a word boundary.
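A minimal sketch of the difference \b makes, using hypothetical test strings: the pattern /\bcat\b/ accepts "cat" only when it stands as a whole word.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @hits;    # hypothetical helper: strings in which "cat" is a whole word

foreach my $text ("a cat sat", "concatenate", "cat") {
    # \b on both sides requires a word boundary before "c" and after "t"
    if ($text =~ /\bcat\b/) {
        push @hits, $text;
    }
}

print "$_\n" foreach @hits;
# prints:
#   a cat sat
#   cat
```

Without the anchors, /cat/ would also match inside "concatenate".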
Parentheses are also used to remember substring matches.
Backreferences can be used within the pattern to refer to already matched bits.
Memory variables can be used after the pattern has been matched against.
A backreference looks like \1, \2, etc.
It refers to an already matched memory reference.
Count the left parentheses to determine the back reference number.
/(a|b)\1/            # match "aa" or "bb"
/((a|b)c)\1/         # match "acac" or "bcbc"
/((a|b)c)\2/         # match "aca" or "bcb"
/(.)\1/              # match any doubled characters except newline
/\b(\w+)\s+\b\1\s/   # match any doubled words
/(['"])(.*)\1/       # match strings enclosed by single or double quotes
For example, consider the last backreference example:
$_ = "asfasdf 'asdlfkjasdf ' werklwerj'";
if(/(['"])(.*)\1/) {
    print "matches $2\n";
}
# yields matches asdlfkjasdf ' werklwerj
A memory variable has the form $1, $2, etc.
It indicates a match from a grouping operator, just as a backreference does, but it is used after the regular expression has been executed.
$_ = " the larder ";
if(/\s+(\w+)\s+/) {
    print "match = '$1'\n";
}
# yields match = 'the'
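With more than one group, $1, $2, and so on are numbered by counting left parentheses, just as backreferences are. A minimal sketch with a hypothetical key=value string follows; the values are copied into named variables right away, since a later successful match overwrites them.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input of the form key=value.
my $entry = "apple=red";
my ($key, $value) = ("", "");

if ($entry =~ /^(\w+)=(\w+)$/) {
    # $1 holds the first group, $2 the second
    ($key, $value) = ($1, $2);
}

print "$key --> $value\n";
# prints apple --> red
```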
Up to this point, we have considered only operations against $_. Any scalar can be tested against with the =~ and !~ operators.
"STRING" =~ /PATTERN/; "STRING" !~ /PATTERN/;
$line = "not an exit line";
if($line !~ /^exit$/) {
    print "$line\n";
}
# yields not an exit line

# skip over blank lines (when used inside a loop)...
if($line =~ /^$/) {
    next;
}
You don't necessarily have to use explicit backreferences and memory variables. Perl also gives you three default variables that can be used after the application of any regular expression; they refer to the portions of the string before, within, and after the text matched by the whole regular expression.
$`   # refers to the portion of the string before the match
$&   # refers to the match itself
$'   # refers to the portion of the string after the match
$_ = "this is a test";
/is/;
print "before: $` \n";
print "after: $' \n";
print "match: $& \n";
# yields
#   before: th
#   after:  a test
#   match: is
#!/usr/bin/perl -w
use strict;
while(<>) {
    /=/;
    print "$` =: $'\n";
}
You can use other delimiters (some are paired items) rather than just a slash, but you must use the "m" to indicate this. (See man perlop for a good discussion.)
# not so readable way to look for a URL reference
if ($s =~ /http:\/\//)
# better
if ($s =~ m^http://^ )
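A minimal sketch comparing a few delimiter choices follows. The URL is a hypothetical example, and @found exists only to record which patterns matched. Paired delimiters such as braces use the opening and closing characters together.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical test string; example.com is used only for illustration.
my $s = "see http://example.com/docs/ for details";
my @found;

push @found, 'slashes' if $s =~ /http:\/\//;   # default delimiter, slashes escaped
push @found, 'braces'  if $s =~ m{http://};    # paired delimiters
push @found, 'hash'    if $s =~ m#http://#;    # any punctuation works after m

print "@found\n";
# prints slashes braces hash
```

All three patterns are equivalent; the choice of delimiter only changes which characters need escaping.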
There are a number of modifiers that you can apply to your regular expression pattern: