The re.sub() method in Python replaces parts of a string that match a given regular expression pattern with a new substring. Because the pattern can describe a whole family of matches rather than one fixed string, it is useful for tasks such as text processing and data cleaning.
Python
import re
a = "apple orange apple banana"
pattern = "apple"
repl = "grape"
# Replace all occurrences of "apple" with "grape"
result = re.sub(pattern, repl, a)
print(result)
Output
grape orange grape banana
Syntax of re.sub()
re.sub(pattern, repl, string, count=0, flags=0)
Parameters
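- pattern: The regular expression pattern we want to match.
- repl: The replacement for each match. This can be a string (which may contain backreferences such as \1) or a function that receives the match object and returns the replacement string.
- string: The string where replacements will be made.
- count (optional): The maximum number of matches to replace. The default, 0, replaces all matches.
- flags (optional): Regular expression flags, such as re.IGNORECASE, that modify how the pattern is matched. The default, 0, applies no flags.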
Return
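- re.sub() returns a new string with the replacements applied. The original string is not modified, and if the pattern does not match, the string is returned unchanged.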
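The flags parameter is not demonstrated elsewhere in this article, so here is a minimal sketch of it. The sample text is made up for illustration; re.IGNORECASE is a standard flag of the re module. Passing flags by keyword avoids accidentally supplying it in the count position.

```python
import re

# Sample text invented for this illustration
text = "Apple pie, apple juice, APPLE tart"

# Without flags, matching is case-sensitive: only lowercase "apple" is replaced
case_sensitive = re.sub("apple", "grape", text)

# With flags=re.IGNORECASE, "Apple" and "APPLE" are matched as well
case_insensitive = re.sub("apple", "grape", text, flags=re.IGNORECASE)

print(case_sensitive)    # Apple pie, grape juice, APPLE tart
print(case_insensitive)  # grape pie, grape juice, grape tart
```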
Using Groups in re.sub()
If the regular expression has capture groups (defined by parentheses), we can refer to those groups in the replacement string using \1, \2, etc., or process them with a replacement function.
Python
import re
a = "John 25, Jane 30, Jack 22"
# Match name and age
pattern = r"(\w+) (\d+)"
# Use age first, then name
repl = r"\2 years old, \1"
# Swap names and ages
result = re.sub(pattern, repl, a)
print(result)
Output
25 years old, John, 30 years old, Jane, 22 years old, Jack
Explanation:
- This code uses re.sub() to find all matches of a name followed by an age (e.g., "John 25") in the string and swaps the order, placing the age first followed by the name.
- The replacement string \2 years old, \1 uses the second capture group (age) and the first capture group (name) to format the output.
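The replacement-function form mentioned above can be sketched as follows. The function name next_year is hypothetical and chosen for this illustration; re.sub() calls it once per match with the match object and inserts whatever string it returns. This is useful when the replacement needs computation that a backreference string cannot express.

```python
import re

a = "John 25, Jane 30, Jack 22"
pattern = r"(\w+) (\d+)"

def next_year(match):
    # Hypothetical helper: group 1 is the name, group 2 is the age
    name = match.group(1)
    age = int(match.group(2)) + 1
    return f"{name} {age}"

# Pass the function itself (not a call) as the repl argument
result = re.sub(pattern, next_year, a)
print(result)  # John 26, Jane 31, Jack 23
```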
Limiting the Number of Replacements
To limit the number of replacements in re.sub(), use the count parameter. By default, count=0, meaning all occurrences are replaced. Specifying a positive integer limits the number of replacements to that value.
Python
import re
a = "apple orange apple banana"
pattern = "apple"
repl = "grape"
# Replace only the first occurrence of "apple"
result = re.sub(pattern, repl, a, count=1)
print(result)
Output
grape orange apple banana
Explanation:
- This code uses re.sub() to replace only the first occurrence of the word "apple" in the string a with "grape", as specified by the count=1 parameter.
- Subsequent occurrences of "apple" are left unchanged in the result.
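To compare different count values side by side, here is a small sketch using a sample string with three occurrences (the string and variable names are invented for this illustration). Matches are replaced from left to right until the limit is reached.

```python
import re

# Sample string with three occurrences of "apple"
a = "apple orange apple banana apple"

# count=0 (the default) replaces every match
all_replaced = re.sub("apple", "grape", a, count=0)

# count=2 replaces only the first two matches, scanning left to right
two_replaced = re.sub("apple", "grape", a, count=2)

print(all_replaced)  # grape orange grape banana grape
print(two_replaced)  # grape orange grape banana apple
```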