Perl regular expressions (regex) are a fast, compact way to search for, replace, or delete strings in a document. The syntax is well proven; here we discuss the syntax and some examples of Perl regex.
Special characters - what they match
. any character except a newline (any character at all with the s modifier)
\w "word" character (alphanumeric and "_")
\W non-word characters
\s whitespace characters
\S non-whitespace characters
\d digit characters
\b word boundary
\B non-(word boundary)
\D non-digit characters
\t tab
\n newline
\r carriage return
\f formfeed
\a alarm (bell, beep, etc)
\e escape
\021 octal char (in this case 21 octal)
\xf0 hex char (in this case f0 hexadecimal)
^ start of string
$ end of string (also matches just before a trailing newline)
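A few of the special characters above can be tried out with a short script. This is only a minimal sketch; the sample string and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Sample text (invented for this sketch); \t is a tab
my $line = "Order 66 shipped\ton 2024-05-01";

# \d+ matches a run of digits; the parentheses capture it
my ($first_number) = $line =~ /(\d+)/;

# \b keeps "on" from matching inside a longer word
my $has_word_on = $line =~ /\bon\b/ ? 1 : 0;

# ^ and $ anchor the pattern to the start and end of the string
my $starts_with_order = $line =~ /^Order/ ? 1 : 0;
my $ends_with_digits  = $line =~ /\d\d$/  ? 1 : 0;

# \S+ is a run of non-whitespace, so spaces and the tab both separate fields
my @fields = $line =~ /(\S+)/g;

print "first number: $first_number\n";      # first number: 66
print "fields: ", scalar(@fields), "\n";    # fields: 5
```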
Brackets and their uses
{} occurrence control - quantifiers such as {2,4}
[] type control - character classes
() subexpression - grouping
[0-9] digits
[a-z] lowercase letters
[A-Z] uppercase letters
if a '+' follows a character it will match one or more occurrences of that character
if a '*' follows a character it will match zero or more occurrences of that character
if a '?' follows a character it will match zero or one occurrence of that character
| Alternation, stands for 'or'
\ escape character
[^ ] any character not in the class
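The brackets and quantifiers above can be combined as in the following sketch. The test strings and variable names are invented for illustration only.

```perl
use strict;
use warnings;

# {n} repeats exactly n times: three digits, a hyphen, four digits
my $phone_ok = "555-1234" =~ /^[0-9]{3}-[0-9]{4}$/ ? 1 : 0;

# + needs at least one "b", * accepts none, ? makes the "u" optional
my $plus_ok    = "abbb"  =~ /^ab+$/     ? 1 : 0;
my $plus_empty = "a"     =~ /^ab+$/     ? 1 : 0;
my $star_ok    = "a"     =~ /^ab*$/     ? 1 : 0;
my $opt_ok     = "color" =~ /^colou?r$/ ? 1 : 0;

# | chooses between alternatives; () groups them and captures the match
my ($animal) = "I have a dog" =~ /\b(cat|dog|bird)\b/;

# [^ ] negates a class: delete everything that is not a letter or digit
(my $clean = "a-b_c!1") =~ s/[^a-zA-Z0-9]//g;

print "$phone_ok $plus_ok $plus_empty $star_ok $opt_ok\n";  # 1 1 0 1 1
print "$animal $clean\n";                                   # dog abc1
```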
Search modifiers
i - Do case-insensitive pattern matching.
m - Treat string as multiple lines. That is, change "^" and "$" from matching only at the very start or end of the string to matching at the start or end of any line within the string.
s - Treat string as a single line. That is, change "." to match any character whatsoever, even a newline.
x - Extend your pattern's legibility by permitting whitespace and comments.
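The effect of each modifier is easiest to see by running the same pattern with and without it. A small sketch, using a made-up three-line string:

```perl
use strict;
use warnings;

# Sample text (invented for this sketch)
my $text = "first line\nSecond line\nthird line";

# i: "second" also matches "Second"
my $with_i = $text =~ /second/i ? 1 : 0;

# without m, ^ matches only at the very start of the string
my $no_m   = $text =~ /^Second/  ? 1 : 0;
my $with_m = $text =~ /^Second/m ? 1 : 0;

# without s, "." stops at a newline
my $no_s   = $text =~ /first.*third/  ? 1 : 0;
my $with_s = $text =~ /first.*third/s ? 1 : 0;

# x: whitespace and comments inside the pattern are ignored
my ($year, $month) = "2024-05" =~ /
    (\d{4})   # four-digit year
    -         # literal hyphen
    (\d{2})   # two-digit month
/x;

print "$with_i $no_m $with_m $no_s $with_s\n";  # 1 0 1 0 1
print "$year $month\n";                         # 2024 05
```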
Examples
$url=~/^http:\/\//
The above returns true if $url begins with http://
$url=~/^http:\/\/(([a-zA-Z0-9]+\-?)+\.)*\w+\.[a-z]{2,4}(\.[a-z]{2,})?/
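This returns true if $url begins with a well-formed http address made of a domain, optional subdomains and a tld. The pattern is not anchored at the end, so anything may follow the host name.
Explanation:
(([a-zA-Z0-9]+\-?)+\.)
the first subgroup matches one or more runs of digits and upper or lowercase letters, each optionally followed by a hyphen, ending with a '.'; the '*' after it in the full pattern allows any number of such subdomain parts, including none
\w+\. matches one or more word characters followed by a '.'
[a-z]{2,4} matches tlds like us, co, com, org, info etc.
(\.[a-z]{2,})? optionally matches a further suffix such as .us, .in or .uk (as in co.uk); it must have at least 2 characters.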
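To see how the url pattern behaves, it can be wrapped in a small sub and run against a few sample addresses. This is only a sketch: looks_like_http_url is a hypothetical helper name and the test urls are invented. The pattern is written with m!...! so the slashes need no backslashes.

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption): returns 1 if the string
# starts with something shaped like http://sub.domain.tld
sub looks_like_http_url {
    my ($url) = @_;
    return $url =~ m!^http://(([a-zA-Z0-9]+\-?)+\.)*\w+\.[a-z]{2,4}(\.[a-z]{2,})?! ? 1 : 0;
}

# Sample urls, invented for this sketch
my @samples = (
    'http://www.example.com',
    'http://example.co.uk',
    'ftp://example.com',     # wrong scheme
    'http://localhost',      # no dot, so no tld part
);

for my $url (@samples) {
    printf "%-24s %s\n", $url, looks_like_http_url($url) ? 'match' : 'no match';
}
```

The first two samples match and the last two do not. Because there is no '$' at the end of the pattern, a string such as 'http://example.com/any/path' also matches.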
$url=~s/http\:\/\/\.//i
deletes the first 'http://.' from the url; the match is case-insensitive
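$url=~s/\.htm$/.php/i
replaces a file name ending in .htm with one ending in .php; the replacement side is a plain string, so it needs no '\' before the '.' and no '$' anchor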
$str=~ s/sam/john/gi
replaces all occurrences of sam with john; the match is case-insensitive
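The substitutions can be checked with a short script. This is a sketch with invented sample strings; note that s/// returns the number of replacements it made, and that the r modifier (Perl 5.14 and later) returns a changed copy instead of modifying the original.

```perl
use strict;
use warnings;

# Sample url (invented for this sketch)
my $url = 'http://www.example.com/index.HTM';

# copy $url, then change the extension on the copy; i makes .HTM match too
(my $php_url = $url) =~ s/\.htm$/.php/i;

# g replaces every occurrence; the return value is the replacement count
my $str   = 'Sam met sam and SAM';
my $count = ($str =~ s/sam/john/gi);

# r leaves $orig untouched and returns the modified copy
my $orig = 'sam i am';
my $copy = $orig =~ s/\bsam\b/John/ir;

print "$php_url\n";        # http://www.example.com/index.php
print "$count: $str\n";    # 3: john met john and john
print "$orig -> $copy\n";  # sam i am -> John i am
```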