Using Regular Expressions in Python with the re Module
Python's built-in re module provides support for regular expressions (regex), a powerful tool for matching, searching, and manipulating text. Here’s a guide on how to use the re module in Python.
Importing the re module
import re
Here are the most commonly used functions in the re module:
1. re.match()
The re.match() function checks for a match only at the beginning of the string.
    import re

    pattern = r"hello"
    text = "hello world"
    match = re.match(pattern, text)
    if match:
        print("Match found:", match.group())
    else:
        print("No match")
Output:
Match found: hello
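Because re.match() only looks at the start of the string, the same pattern fails when the word appears later. A minimal sketch, with sample strings made up for illustration:

```python
import re

pattern = r"hello"

# The pattern sits at the start of the first string but not the second.
at_start = re.match(pattern, "hello world")
not_at_start = re.match(pattern, "say hello world")

print(at_start is not None)      # True
print(not_at_start is not None)  # False: re.match() returns None here
```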
2. re.search()
The re.search() function searches for the first occurrence of the pattern anywhere in the string.
    import re

    pattern = r"world"
    text = "hello world "
    match = re.search(pattern, text)
    if match:
        print("Found:", match.group())
    else:
        print("Not found")
Output:
Found: world
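The match object returned by re.search() also records where the match was found. A short sketch using its start() and end() methods on the same sample string:

```python
import re

text = "hello world"
match = re.search(r"world", text)

if match:
    # start() and end() give the slice boundaries of the match within text.
    start, end = match.start(), match.end()
    print(start, end)       # 6 11
    print(text[start:end])  # world
```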
3. re.findall()
The re.findall() function finds all non-overlapping occurrences of the pattern and returns them as a list of strings.
    import re

    pattern = r"\d+"
    text = "I have 2 apples and 3 bananas."
    matches = re.findall(pattern, text)
    print("Matches:", matches)
Output:
Matches: ['2', '3']
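When the pattern contains capturing groups, re.findall() returns the groups rather than the whole match. A small sketch on the same sentence; the two-group pattern is an illustrative assumption:

```python
import re

text = "I have 2 apples and 3 bananas."

# With two capturing groups, findall() returns (group1, group2) tuples.
pairs = re.findall(r"(\d+) (\w+)", text)
print(pairs)  # [('2', 'apples'), ('3', 'bananas')]
```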
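4. re.finditer()
The re.finditer() function returns an iterator of match objects, one for each occurrence of the pattern. Here \W+ matches runs of non-word characters, so the matches are the two spaces and the final " !"; repr() makes the whitespace visible.
    import re

    pattern = r"\W+"
    text = "python is fun !"
    matches = re.finditer(pattern, text)
    for match in matches:
        print("Match:", repr(match.group()))
Output:
Match: ' '
Match: ' '
Match: ' !'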
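Since re.finditer() yields match objects instead of plain strings, each result carries its position as well as its text. A sketch using \w+ (word characters), an illustrative variation on the sample above:

```python
import re

text = "python is fun !"

# Each match object exposes the matched text and its start/end offsets.
words = [(m.group(), m.start(), m.end()) for m in re.finditer(r"\w+", text)]
print(words)  # [('python', 0, 6), ('is', 7, 9), ('fun', 10, 13)]
```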
5. re.sub()
The re.sub() function replaces every occurrence of a pattern with a specified string. Here each run of whitespace is collapsed to a single space.
    import re

    pattern = r"\s+"
    text = "python   is    fun"
    result = re.sub(pattern, " ", text)
    print("Result:", result)
Output:
Result: python is fun
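The replacement string can refer back to captured groups, and the optional count argument limits how many substitutions are made. A minimal sketch with made-up sample strings:

```python
import re

# \1 and \2 refer to the first and second capturing groups.
swapped = re.sub(r"(\w+)@(\w+)", r"\2@\1", "user@host")
print(swapped)  # host@user

# count=1 limits the substitution to the first occurrence.
first_only = re.sub(r"\d", "#", "a1b2c3", count=1)
print(first_only)  # a#b2c3
```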
Pattern syntax
- .: Matches any character except a newline.
- ^: Matches the start of the string.
- $: Matches the end of the string.
- *: Matches 0 or more repetitions.
- +: Matches 1 or more repetitions.
- ?: Matches 0 or 1 repetition.
- {n,m}: Matches between n and m repetitions.
- []: Matches any single character inside the brackets.
- |: Alternation (logical OR) between two patterns.
- \: Escapes a special character so that it matches literally.
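The syntax elements above can be tried out in a few lines. This is only a sketch; the patterns and sample strings are chosen for illustration:

```python
import re

# ^ and $ anchor the pattern, [a-z] is a character class, {2,4} bounds the repetition.
lowercase_2_to_4 = r"^[a-z]{2,4}$"
fits = bool(re.search(lowercase_2_to_4, "abc"))
too_long = bool(re.search(lowercase_2_to_4, "abcde"))
print(fits, too_long)  # True False

# ? makes the preceding character optional; | chooses between alternatives.
spellings = re.findall(r"colou?r", "color colour")
pets = re.findall(r"cat|dog", "cat, dog, bird")
print(spellings)  # ['color', 'colour']
print(pets)       # ['cat', 'dog']

# + needs at least one digit, * allows zero; \. matches a literal dot.
decimals = re.findall(r"\d+\.\d*", "3.14 and 42. but not .5")
print(decimals)   # ['3.14', '42.']
```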
6. Flags
You can use flags to modify regex behavior:
- re.IGNORECASE or re.I: Case-insensitive matching.
- re.MULTILINE or re.M: Makes ^ and $ match at the start and end of each line, not only at the start and end of the whole string.
- re.DOTALL or re.S: Makes . match newline characters as well.
Example with a flag:
    import re

    pattern = r"python"
    text = "Python is amazing"
    match = re.search(pattern, text, re.IGNORECASE)
    if match:
        print("Match found:", match.group())
Output:
Match found: Python
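The other two flags are easiest to understand on a string that contains a newline. A short sketch, with a two-line sample string made up for illustration:

```python
import re

text = "first line\nsecond line"

# Without re.MULTILINE, ^ matches only at the start of the whole string.
default_starts = re.findall(r"^\w+", text)
# With re.MULTILINE, ^ also matches at the start of each line.
multiline_starts = re.findall(r"^\w+", text, re.MULTILINE)
print(default_starts)    # ['first']
print(multiline_starts)  # ['first', 'second']

# By default . stops at a newline; re.DOTALL lets it cross lines.
default_dot = re.search(r"first.*second", text)
dotall_dot = re.search(r"first.*second", text, re.DOTALL)
print(default_dot is None)     # True
print(dotall_dot is not None)  # True
```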
7. Compiling regular expressions
For efficiency, you can compile a pattern to reuse it.
    import re

    pattern = re.compile(r"\d+")
    text = "There are 12 eggs and 34 apples."
    matches = pattern.findall(text)
    print("Matches:", matches)
Output:
Matches: ['12', '34']
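Compiling pays off mainly when the same pattern is applied many times, for example once per line of input. A minimal sketch; the list of lines is a made-up stand-in for real input:

```python
import re

# Compile once, then reuse the pattern object for every line.
number = re.compile(r"\d+")

lines = ["There are 12 eggs", "and 34 apples", "no numbers here"]
counts = [number.findall(line) for line in lines]
print(counts)  # [['12'], ['34'], []]
```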
8. Practical Example
Extracting email addresses:
    import re

    text = "contact us at [email protected] or [email protected]"
    pattern = r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"
    emails = re.findall(pattern, text)
    print("Emails found:", emails)
Output:
Emails found: ['[email protected]', '[email protected]']
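A runnable variant of the same extraction is sketched below. The two addresses are hypothetical placeholders on the reserved example.com and example.org domains, assumed only for illustration, and the pattern is a simplified one that will not cover every valid email address:

```python
import re

# Placeholder addresses (assumptions for illustration only).
text = "contact us at support@example.com or sales@example.org"
pattern = r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"

emails = re.findall(pattern, text)
print(emails)  # ['support@example.com', 'sales@example.org']
```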