Perl regular expressions

The term "regular expression" (now commonly abbreviated to "RegExp" or even "RE") refers to a pattern that follows the syntax rules outlined in the rest of this chapter. Regular expressions are not limited to Perl: Unix utilities such as sed and egrep use similar notation for finding patterns in text. In Perl, a regular expression behaves a little like a double-quoted string, in that variables and backslash escapes are interpolated. This lets us store patterns in variables and decide what to match when the program runs, rather than hard-coding the pattern.
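
As a minimal sketch of that interpolation, the snippet below keeps a pattern in a variable and applies it at run time. The names $pattern, @lines and @hits, and the sample data, are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical data: the pattern could just as well come from user input.
my $pattern = 'c.t';    # . matches any single character
my @lines   = ('The cat sat', 'A cut above', 'No match here');

# $pattern is interpolated into the regex when the match runs.
my @hits = grep { /$pattern/ } @lines;

print "$_\n" for @hits;
```

Only the first two strings are printed, because the third contains no "c", any character, "t" sequence.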

The basic method for applying a regular expression is to use the pattern binding operators =~ and !~. Perl has three operators that are used with them, although only the first two take a regular expression:

  1. Match - m//
  2. Substitute - s///
  3. Transliterate - tr/// (works on character lists, not regular expressions)
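
The binding operators can be sketched as follows: =~ is true when the pattern matches and !~ is true when it does not. The sample string and variable names are illustrative only.

```perl
use strict;
use warnings;

my $text = 'The cat sat on the mat';    # illustrative sample string

# =~ binds the string to the pattern and is true on a match.
my $has_cat = ($text =~ m/cat/) ? 1 : 0;

# !~ is the negated form: true when the pattern does NOT match.
my $no_dog = ($text !~ m/dog/) ? 1 : 0;

# s/// is bound the same way and modifies the bound variable in place.
(my $copy = $text) =~ s/cat/dog/;

print "has cat: $has_cat, no dog: $no_dog\n";
print "$copy\n";
```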

The following characters have a special meaning within a regular expression and must be preceded by a backslash when we want to match them literally:
.   *   ?   +   [   ]   (   )   {   }   ^   $   |   \
Other characters generally match themselves, except for the pattern delimiter (usually /), which must also be escaped.
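
A short sketch of escaping in practice, using a made-up sample string. The \Q...\E escape (equivalent to the built-in quotemeta function) is one way to avoid backslashing an interpolated string by hand.

```perl
use strict;
use warnings;

my $price = 'Total: $3.50 (approx.)';    # illustrative sample string

# Backslash each metacharacter that should be taken literally.
my $literal = ($price =~ /\$3\.50/) ? 1 : 0;

# Unescaped, the dot matches any character, so this also matches '3x50'.
my $loose = ('3x50' =~ /3.50/) ? 1 : 0;

# \Q...\E escapes every metacharacter in the interpolated string.
my $needle  = '(approx.)';
my $escaped = ($price =~ /\Q$needle\E/) ? 1 : 0;

print "literal=$literal loose=$loose escaped=$escaped\n";
```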

Match Operator

The match operator, m//, tests whether a string matches a regular expression. When the delimiters are slashes, the leading m may be omitted. The following code shows how to use this operator:

# The pattern may match anywhere in the string.
my $bar = "This is foo and again foo";
if ($bar =~ /foo/) {
   print "First time is matching\n";
} else {
   print "First time is not matching\n";
}

# It also matches when the string is exactly the pattern text.
$bar = "foo";
if ($bar =~ /foo/) {
   print "Second time is matching\n";
} else {
   print "Second time is not matching\n";
}

The above code produces the following output:

First time is matching
Second time is matching

The match operator supports its own set of modifiers. The following table lists the most commonly used ones.

 Modifier  Description
 i         Makes the match case insensitive.
 m         Treats the string as multiple lines: ^ and $ match at the start and end of each line (next to a newline), not only at the start and end of the string.
 o         Compiles the pattern only once, even if it contains interpolated variables.
 s         Allows . to match a newline character.
 x         Ignores white space and comments in the pattern, so it can be laid out for clarity.
 g         Globally finds all matches.
 gc        Keeps the current search position after a global match fails, so that a later match can continue from there.
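
The effect of several of these modifiers can be sketched as below. The sample string and variable names are assumptions chosen for illustration.

```perl
use strict;
use warnings;

my $text = "Foo one\nfoo two\nFOO three";    # illustrative sample string

# i: case-insensitive; g in list context returns every match.
my @all = $text =~ /foo/gi;

# m: ^ anchors at the start of each line, not just the start of the string.
my @line_starts = $text =~ /^foo/gm;

# s: . also matches a newline, so .+ can span lines.
my ($span) = $text =~ /one(.+)two/s;

# x: white space and comments inside the pattern are ignored.
my ($word) = $text =~ / foo \s+ (\w+)   # word after a lowercase foo
                      /x;

print scalar(@all), " matches, word after foo: $word\n";
```

Here @all holds three matches, @line_starts holds only the lowercase "foo" on the second line, and $span contains the newline between "one" and "two".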

Substitution Operator

The substitution operator, s///, is an extension of the match operator that replaces the matched text with new text. Without modifiers, only the first match is replaced. The following code shows how to use the substitution operator:

$string = "The cat sat on the mat";
$string =~ s/cat/dog/;

print "$string\n";

The above code produces the following output:

The dog sat on the mat

The following table lists the modifiers that can be used with the substitution operator:

 Modifier  Description
 i         Makes the match case insensitive.
 m         Treats the string as multiple lines: ^ and $ match at the start and end of each line (next to a newline), not only at the start and end of the string.
 o         Compiles the pattern only once, even if it contains interpolated variables.
 s         Allows . to match a newline character.
 x         Ignores white space and comments in the pattern, so it can be laid out for clarity.
 g         Replaces all occurrences of the found expression with the replacement text.
 e         Evaluates the replacement as a Perl expression and uses its return value as the replacement text.
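
A brief sketch of the g, i and e modifiers on the same sample sentence. The variable names are illustrative; the string is copied before each substitution so the original stays unchanged.

```perl
use strict;
use warnings;

my $text = 'The cat sat on the mat';    # illustrative sample string

# g: replace every occurrence; s/// returns the number of replacements.
my $all   = $text;
my $count = ($all =~ s/at/og/g);

# i: match regardless of case.
(my $lower = $text) =~ s/the/a/gi;

# e: the replacement is evaluated as Perl code; here each match is upper-cased.
(my $upper = $text) =~ s/(\w+at)/uc $1/ge;

print "$count replacements: $all\n$lower\n$upper\n";
```

With this input, $all becomes "The cog sog on the mog", $lower becomes "a cat sat on a mat", and $upper becomes "The CAT SAT on the MAT".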

Translation Operator

Translation (or transliteration) is similar to substitution, but it does not use regular expressions for its search or replacement values; it works on lists of individual characters instead. The translation operators are as follows:

tr/SEARCHLIST/REPLACEMENTLIST/cds
y/SEARCHLIST/REPLACEMENTLIST/cds

Translation replaces every occurrence of each character in SEARCHLIST with the corresponding character in REPLACEMENTLIST. The following example shows how the translation operator works:

$string = 'The cat sat on the mat';
$string =~ tr/a/o/;

print "$string\n";

The above code produces the following output:

The cot sot on the mot

The following table lists the modifiers that can be used with the translation operator:

 Modifier  Description
 c         Complements SEARCHLIST, so that every character not listed is matched.
 d         Deletes found but unreplaced characters.
 s         Squashes runs of the same replaced character into a single character.
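
The three modifiers can be sketched as follows. The sample strings and variable names are made up for illustration, and the a-z style ranges used in the lists are standard tr syntax that the text above does not cover.

```perl
use strict;
use warnings;

my $text = 'The cat sat on the mat';    # illustrative sample string

# tr returns the number of characters matched; with an empty
# replacement list the string is left unchanged, which counts characters.
my $count = ($text =~ tr/a//);

# d: delete characters that have no counterpart in the replacement list.
(my $no_vowels = $text) =~ tr/aeiou//d;

# s: squash runs of the same translated character into one.
(my $squeezed = 'bookkeeper') =~ tr/a-z//s;

# c: complement the search list; here every non-letter becomes '_'.
(my $ident = $text) =~ tr/a-zA-Z/_/c;

print "$count\n$no_vowels\n$squeezed\n$ident\n";
```

With this input, $count is 3, $no_vowels is "Th ct st n th mt", $squeezed is "bokeper", and $ident is "The_cat_sat_on_the_mat".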