Sometimes, we want to remove punctuation from a text column in a Python Pandas DataFrame.
In this article, we’ll look at how to remove punctuation with Python Pandas.
How to remove punctuation with Python Pandas?
To remove punctuation with Python Pandas, we can call the str.replace method on a DataFrame column, which is a Series.
For instance, we write:
import pandas as pd
df = pd.DataFrame({'text': ['a..b?!??', '%hgh&12', 'abc123!!!', '$$$1234']})
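df['text'] = df['text'].str.replace(r'[^\w\s]+', '', regex=True)
print(df)
We call str.replace with a regex that matches every run of characters that are neither word characters (letters, digits, and underscores) nor whitespace, and replace each match with an empty string. We pass regex=True because pandas 2.0 and later treat the pattern as a literal string by default.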
str.replace returns a new Series instead of modifying the column in place, so we assign the result back to df['text'].
Therefore, df is:
     text
0      ab
1   hgh12
2  abc123
3    1234
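One detail of the pattern above is that \w counts the underscore as a word character, so underscores are not removed. If underscores should go too, one possible approach is to build the character class from Python's string.punctuation instead. The sketch below uses made-up sample data and illustrative variable names, and it only covers ASCII punctuation:

```python
import re
import string

import pandas as pd

# Hypothetical sample data; the second value contains an underscore.
df = pd.DataFrame({'text': ['a..b?!??', 'snake_case!', 'x-y z']})

# \w matches the underscore, so [^\w\s]+ leaves it in place.
kept_underscore = df['text'].str.replace(r'[^\w\s]+', '', regex=True)

# Build a character class from string.punctuation (ASCII only),
# escaping the characters that are special inside a regex.
ascii_punct = f'[{re.escape(string.punctuation)}]+'
no_ascii_punct = df['text'].str.replace(ascii_punct, '', regex=True)

print(kept_underscore.tolist())  # ['ab', 'snake_case', 'xy z']
print(no_ascii_punct.tolist())   # ['ab', 'snakecase', 'xy z']
```

Which pattern fits depends on the data: [^\w\s]+ also strips non-ASCII symbols, while the string.punctuation class leaves them untouched.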
Conclusion
To remove punctuation with Python Pandas, we can call the str.replace method on a DataFrame column with a regex pattern and regex=True.