Python - RegEx


A RegEx, or regular expression, is a sequence of characters that defines a search pattern. It is used to check whether a string contains the specified search pattern. Consider the following search pattern:

^P....n$

The above search pattern checks whether a string consists of exactly six characters, starting with P and ending with n. Python has a built-in module, re, which must be imported to work with regular expressions.

Example: In the example below, the search pattern ^P....n$ is checked against a string called MyString.

import re

MyString = "Python"
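# re.search returns a match object if the pattern is found, otherwise None
x = re.search("^P....n$", MyString)
if x:
    print("Pattern found.")
else:
    print("Pattern not found.")

# This string has extra characters after the n, so the $ anchor fails
MyString = "Python!."
x = re.search("^P....n$", MyString)
if x:
    print("Pattern found.")
else:
    print("Pattern not found.")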


Output

Pattern found.

Pattern not found.
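
The second string fails only because of the ^ and $ anchors. As a minimal sketch of that effect, the snippet below compares the anchored pattern with an unanchored variant. The helper name is_found, the unanchored pattern, and the extra sample string "My Python" are illustrative assumptions, not part of the original example.

```python
import re

# Hypothetical helper (not from the original example): reports whether
# `pattern` is found anywhere in `text`.
def is_found(pattern, text):
    return re.search(pattern, text) is not None

anchored = "^P....n$"   # the whole string must be P, four characters, then n
unanchored = "P....n"   # the same shape, but allowed anywhere in the string

for text in ["Python", "Python!.", "My Python"]:
    print(text, "| anchored:", is_found(anchored, text),
          "| unanchored:", is_found(unanchored, text))
```

Without the anchors, re.search scans the whole string, so the pattern is also found inside longer strings.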


Metacharacters

Metacharacters are special characters that the RegEx engine interprets in a special way instead of matching them literally. The metacharacters are:

Character   Description                                                               Example
[]          Specifies a set of characters                                             "[a-z]"
.           Matches any character except a newline                                    "He..o"
^           Matches if the string starts with the specified character(s)              "^Hello"
$           Matches if the string ends with the specified character(s)                "World$"
*           Matches zero or more occurrences of the preceding character or group      "Helx*"
+           Matches one or more occurrences of the preceding character or group       "Helx+"
{}          Matches a given number of repeats of the preceding character or group     "Hel{2}"
?           Matches zero or one occurrence of the preceding character or group        "He?l"
|           Matches either the pattern on the left or the one on the right            "go|come"
()          Groups sub-patterns                                                       "(x|y|z)abc"
\           Escapes special characters, including all metacharacters                  r"\$"

Special Sequences

A special sequence is a backslash (\) followed by a character that gives the sequence a special meaning. The special sequences are:

Character   Description                                                                                 Example
\A          Matches if the specified characters are at the beginning of the string                      r"\AThe"
\b          Matches if the specified characters are at the beginning or end of a word                   r"\bain", r"ain\b"
\B          Matches if the specified characters are present, but NOT at the beginning or end of a word  r"\Bain", r"ain\B"
\d          Matches any digit (0-9)                                                                     r"\d"
\D          Matches any character that is not a digit                                                   r"\D"
\s          Matches any whitespace character                                                            r"\s"
\S          Matches any character that is not whitespace                                                r"\S"
\w          Matches any word character (a-z, A-Z, 0-9, and the underscore _)                            r"\w"
\W          Matches any character that is not a word character                                          r"\W"
\Z          Matches if the specified characters are at the end of the string                            r"Spain\Z"
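
The metacharacters and special sequences listed above can be tried out with a short script. The following is a minimal sketch: the patterns come from the Example entries, while the sample texts and the helper name is_found are assumptions chosen for illustration. Patterns are written as raw strings (r"...") so that backslashes reach the re module unchanged.

```python
import re

# Hypothetical helper (not from the original text): reports whether
# `pattern` is found anywhere in `text`.
def is_found(pattern, text):
    return re.search(pattern, text) is not None

# Each entry: (pattern, text that should match, text that should not match).
# The sample texts are illustrative assumptions.
samples = [
    (r"[a-z]",      "abc",           "123"),
    (r"He..o",      "Hello",         "Heo"),
    (r"^Hello",     "Hello World",   "Say Hello"),
    (r"World$",     "Hello World",   "World peace"),
    (r"Helx*",      "Help",          "Hey"),
    (r"Helx+",      "Helxx",         "Hello"),
    (r"Hel{2}",     "Hello",         "Help"),
    (r"He?l",       "Hl",            "Heel"),
    (r"go|come",    "welcome",       "stay"),
    (r"(x|y|z)abc", "yabc",          "abc"),
    (r"\$",         "cost: $5",      "cost: 5"),
    (r"\AThe",      "The rain",      "In The rain"),
    (r"\bain",      "ain't",         "rain"),
    (r"ain\b",      "rain",          "rainy"),
    (r"\Bain",      "rain",          "ain't"),
    (r"\d",         "room 101",      "no digits"),
    (r"\s",         "two words",     "oneword"),
    (r"Spain\Z",    "rain in Spain", "Spain is sunny"),
]

for pattern, hit, miss in samples:
    print(f"{pattern!r}: {hit!r} -> {is_found(pattern, hit)}, "
          f"{miss!r} -> {is_found(pattern, miss)}")
```

Each line of output shows one pattern with a string in which it is found and a string in which it is not, which makes the difference between, for example, \b and \B easy to see.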