A regular expression (regex) is a sequence of symbols that describes a set of strings. Regexes are built from ordinary characters, special characters, and special sequences. In Python, regular expressions are available through the re module.
In this shot, we will learn how to make a simple regex.
.: matches any character except a newline.
^: matches the start of the string.
$: matches the end of the string, or just before a newline at the end of the string.
*: matches zero or more repetitions of the preceding regex.
+: matches one or more repetitions of the preceding regex.
?: matches zero or one repetition of the preceding regex.
{x}: matches exactly x copies of the preceding regex. It can be extended to {x,y} to match from x to y repetitions, and followed by ? (as in {x,y}?) to make the match non-greedy, so that it takes as few repetitions as possible.
\: escapes special characters.
[]: matches a set of characters.
|: works as OR between two regexes.
(): matches whatever regex is inside it and groups it. It can be extended with ?, as in (?...), to change how the group behaves.
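To see how the special characters behave, we can try each one on a small input. The following is a minimal sketch; the sample strings and variable names are only illustrative.

```python
import re

# "." matches any single character except a newline
dot = re.findall(r"h.t", "hat hot h\nt")
print(dot)        # ['hat', 'hot']

# "^" and "$" anchor the pattern to the start and end of the string
starts = re.search(r"^Hello", "Hello world") is not None
ends = re.search(r"world$", "Hello world") is not None
print(starts, ends)   # True True

# "*", "+" and "?" repeat the preceding regex
star = re.findall(r"ab*", "a ab abb")       # zero or more "b"
plus = re.findall(r"ab+", "a ab abb")       # one or more "b"
optional = re.findall(r"ab?", "a ab abb")   # zero or one "b"
print(star, plus, optional)

# "{x}" and "{x,y}" bound the repetitions; a trailing "?" makes them non-greedy
exact = re.findall(r"a{2}", "a aa aaa")
greedy = re.findall(r"a{2,3}", "aaaa")
lazy = re.findall(r"a{2,3}?", "aaaa")
print(exact, greedy, lazy)   # ['aa', 'aa'] ['aaa'] ['aa', 'aa']

# "\" escapes a special character, "[]" is a set, "|" is OR, "()" is a group
escaped = re.findall(r"\.", "a.b.c")
vowels = re.findall(r"[aeiou]", "regex")
either = re.findall(r"cat|dog", "cat, dog, bird")
groups = re.search(r"(\w+)@(\w+)", "user@example").groups()
print(escaped, vowels, either, groups)
```

Note that the patterns are written as raw strings (with the r prefix), so that Python does not interpret the backslashes before the re module sees them.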
A special sequence consists of a \ followed by a character. For example, we can use:
\A: matches only at the start of the string.
\b: matches the empty string, but only at the beginning or end of a word.
\S: matches any character that is not a whitespace character.
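The special sequences can be tried out in the same way. The snippet below is a small sketch on an illustrative sample string; in Python, \b marks a word boundary and \S stands for any non-whitespace character.

```python
import re

text = "cat concat cats"

# \A matches only at the very start of the string
at_start = re.findall(r"\Acat", text)
print(at_start)     # ['cat']

# \b matches the empty string at a word boundary, so only the
# standalone word "cat" is found, not "concat" or "cats"
whole_word = re.findall(r"\bcat\b", text)
print(whole_word)   # ['cat']

# \S matches any non-whitespace character, so \S+ picks out each word
tokens = re.findall(r"\S+", text)
print(tokens)       # ['cat', 'concat', 'cats']
```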
The re module also provides different functions to work with strings, such as:
findall()
search()
match()
split()
escape()
More information about regex can be found in the official Python documentation for the re module.
import re

# find the first occurrence of "lo" and print the resulting Match object
x = re.search("lo", "Hello educative!")
print(x)

# split at every whitespace character
x = re.split(r"\s", "Hello educative! 2345aa 204 hellow")
print(x)
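The other functions of the re module can be explored in a similar way. The sketch below reuses the same sample string; the variable names are only illustrative.

```python
import re

text = "Hello educative! 2345aa 204 hellow"

# re.match() only matches at the beginning of the string
m1 = re.match(r"Hello", text)       # a Match object
m2 = re.match(r"educative", text)   # None, "educative" is not at the start
print(m1 is not None, m2)           # True None

# re.search() scans the whole string and returns the first match
found = re.search(r"\d+", text)
print(found.group(), found.span())  # 2345 (17, 21)

# re.findall() returns every non-overlapping match as a list of strings
numbers = re.findall(r"\d+", text)
print(numbers)                      # ['2345', '204']

# re.escape() escapes special characters so that a string is matched literally
pattern = re.escape("1+1=2?")
literal = re.search(pattern, "is 1+1=2? yes")
print(literal.group())              # 1+1=2?
```

search() and match() return None when nothing matches, so it is worth checking the result before calling group() on it.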