What is remove_punct() in the clean-text library for Python?

In this shot, we will learn about the remove_punct() function from the clean-text library in Python.

The remove_punct() function is built into the clean-text library. It is not part of Python itself.

We can use it to clean text data that contains punctuation.

We need to install clean-text with pip before we can use it in our programs:

pip install clean-text

Syntax

from cleantext import clean

clean(text, no_punct=True, replace_with_punct="$")

Where:

  • clean is the function that is provided by the cleantext module.
  • To remove punctuation, we set the parameter no_punct to True. The clean function then calls the library's remove_punct() function internally.
  • If we want to replace punctuation instead of removing it, we can use the parameter replace_with_punct. Here, we pass the character or string that should replace each punctuation mark in the given text.

Let’s look at an example.

Code example

# Import the clean function
from cleantext import clean
text = "This, is a sample text! which [has) punctuations."
# Display the text after removing the punctuation
print(clean(text, no_punct=True))

Code explanation

  • Line 2: We import the clean function, which is provided by the cleantext module.
  • Line 3: We provide the text that contains punctuation.
  • Line 5: We remove the punctuation from the given text with the clean function and print the result.

We can also replace punctuation with a string instead of removing it. We do this by passing a character or string to the replace_with_punct parameter. In the following code, we replace each punctuation mark with the $ character:

# Import the clean function
from cleantext import clean
text = "This, is a sample text! which [has) punctuations."
# Display the text after replacing the punctuation with "$"
print(clean(text, no_punct=True, replace_with_punct="$"))
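
For intuition about what removing versus replacing punctuation means, here is a minimal sketch that uses only the standard library. It is not the clean-text implementation: the helper name strip_or_replace_punct is hypothetical, and we assume that "punctuation" means any character whose Unicode category starts with "P". The real clean function may also apply other normalization steps, which this sketch ignores.

```python
import unicodedata


def strip_or_replace_punct(text: str, replace_with: str = "") -> str:
    """Hypothetical helper: drop or replace every Unicode punctuation character.

    With the default empty string, punctuation is removed.
    With any other string, each punctuation character is replaced by it.
    """
    return "".join(
        # Unicode categories Pc, Pd, Ps, Pe, Pi, Pf, Po all start with "P"
        replace_with if unicodedata.category(ch).startswith("P") else ch
        for ch in text
    )


text = "This, is a sample text! which [has) punctuations."

# Remove the punctuation
print(strip_or_replace_punct(text))
# -> This is a sample text which has punctuations

# Replace each punctuation mark with "$"
print(strip_or_replace_punct(text, "$"))
# -> This$ is a sample text$ which $has$ punctuations$
```

The single replace_with argument plays both roles here: an empty string corresponds to removing punctuation, and a non-empty string corresponds to replacing it.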
