Content moderation is an important aspect of maintaining a safe and respectful online environment. OpenAI’s Moderation API provides a tool to automatically check whether content complies with OpenAI’s usage policies. In this Answer, we’ll explore how to implement content moderation using OpenAI’s API in Python.
OpenAI’s Moderation API is designed to identify content that violates specific categories, such as hate speech, harassment, self-harm, sexual content, and violence. The API classifies content into various categories, each with a specific description. For example:
Hate: Content that expresses or promotes hate based on race, gender, etc.
Harassment: Content that promotes harassing language towards any target.
Self-harm: Content that promotes or depicts acts of self-harm.
Sexual: Content meant to arouse sexual excitement.
Violence: Content that depicts death, violence, or physical injury.
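To see how these categories surface in code, the snippet below inspects a hypothetical moderation result (the payload shape mirrors the API's response, but the values are made up for illustration) and lists the categories that were triggered:

```python
# Illustrative (made-up) moderation result mirroring the
# API's category structure:
sample_result = {
    "flagged": True,
    "categories": {
        "hate": False,
        "harassment": True,
        "self-harm": False,
        "sexual": False,
        "violence": False,
    },
}

def violated_categories(result):
    """Return the names of categories the content was flagged for."""
    return [name for name, hit in result["categories"].items() if hit]

print(violated_categories(sample_result))  # ['harassment']
```

In a real application, `result` would come from the API response rather than a hard-coded dictionary.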
The moderation endpoint is free to use when monitoring the inputs and outputs of OpenAI APIs, and it’s continuously being improved for accuracy.
Before we start, you’ll need Python installed on your system, along with the OpenAI Python client library. You can install the latter using pip:
pip install openai
You’ll also need to obtain an API key from OpenAI, which will be used to authenticate your requests.
First, import the OpenAI library and set your API key.
import openai

openai.api_key = 'your-api-key'
You can create a moderation request by calling the Moderation.create method and passing the text you want to check.
response = openai.Moderation.create(input="Educative is a website for developers")
The response will contain information about whether the content is flagged and the categories it violates.
import openai
import os

openai.api_key = os.environ["SECRET_KEY"]

response = openai.Moderation.create(input="Input text goes here")

flagged = response['results'][0]['flagged']
categories = response['results'][0]['categories']
category_scores = response['results'][0]['category_scores']

print("Flagged:", flagged)
print("Categories:", categories)
print("Category Scores:", category_scores)
The flagged field will be set to true if the content violates OpenAI’s policies, and false otherwise. The categories dictionary contains binary flags for each category, and the category_scores dictionary contains the model’s confidence scores for each category.
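Because the category scores are raw confidence values between 0 and 1, you can also apply your own, stricter cutoffs than the API's default flags. A minimal sketch, using an illustrative (made-up) scores dictionary and an assumed threshold of 0.4:

```python
STRICT_THRESHOLD = 0.4  # assumption: stricter than relying on the default flags alone

sample_scores = {  # illustrative values, not real API output
    "hate": 0.02,
    "harassment": 0.55,
    "self-harm": 0.01,
    "sexual": 0.03,
    "violence": 0.41,
}

def strict_flags(category_scores, threshold=STRICT_THRESHOLD):
    """Flag any category whose confidence score meets the custom threshold."""
    return {name: score >= threshold for name, score in category_scores.items()}

print(strict_flags(sample_scores))
# harassment and violence exceed 0.4, so both come back True
```

The right threshold depends on your application; lower values flag more content at the cost of more false positives.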
For higher accuracy, especially with longer pieces of text, you may consider splitting the text into smaller chunks, each less than 2,000 characters.
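A minimal chunking sketch is shown below. The chunk_text helper is a hypothetical utility (not part of the OpenAI library) that splits on whitespace so words are not cut mid-way; each chunk can then be sent to the moderation endpoint separately:

```python
def chunk_text(text, max_len=2000):
    """Split text into whitespace-delimited chunks of at most max_len
    characters (a single word longer than max_len is kept whole)."""
    chunks, current = [], ""
    for word in text.split():
        candidate = f"{current} {word}" if current else word
        if len(candidate) > max_len and current:
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks

# Each chunk can then be moderated on its own, e.g.:
# for chunk in chunk_text(long_text):
#     response = openai.Moderation.create(input=chunk)
```

If any chunk comes back flagged, you would typically treat the whole text as flagged.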
OpenAI’s Moderation API offers an easy solution for content moderation, allowing developers to ensure that user-generated content aligns with community guidelines and usage policies. By integrating this API into your applications, you can create a safer and more respectful online environment.