Rule-based Profanity Detection

Overview

The Profanity Detection model detects unwanted, hateful, sexual and toxic content in any user-generated text: comments, messages, posts, reviews, usernames, etc.

This model is part of the Text Moderation API, along with other models such as the PII detection model. Each text item that is submitted to the API gets checked for profanity. You instantly receive a description of any toxic language found.

This model uses rule-based pattern matching. If you are looking for deep-learning models, head to the text classification models.

Categories of Profanity

For each detected profanity, the API returns the corresponding category to help you adapt actions to the type of content found.

Category	Description	Example
sexual	term or expression that refers to sexual acts, sexual organs, body parts or bodily fluids typically associated with sexual acts.	here's my d*ck
discriminatory	discriminatory and derogatory content. Mostly hate speech that instigates violence or hate against groups based on specific characteristics such as religion, national or ethnic origin, sexual orientation or gender identity.	he's a tard
insult	words or phrases that undermine the dignity or honor of an individual, that are signs of disrespect and are generally used to refer to someone	you fatso!
inappropriate	inappropriate language: swear words, slang, familiar/informal or socially inappropriate/unacceptable words or phrases to describe something, or to talk to someone	what the cr@p?
grawlix	string of typographical symbols that are typically used in place of obscenity or profanity	you #@$%!!
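As a rough illustration of how the returned category can drive different handling, the sketch below groups matches by category. It assumes each match is a dictionary with the type and match fields shown in the sample response in the "Use the model" section; the helper name group_by_category is hypothetical and not part of the API.

```python
from collections import defaultdict


def group_by_category(matches):
    """Group the matched strings by their profanity category ('type' field)."""
    grouped = defaultdict(list)
    for m in matches:
        grouped[m["type"]].append(m["match"])
    return dict(grouped)


# Matches copied from the sample response in this page.
sample_matches = [
    {"type": "sexual", "intensity": "medium", "match": "sex", "start": 5, "end": 9},
    {"type": "discriminatory", "intensity": "high", "match": "tard", "start": 19, "end": 22},
]

print(group_by_category(sample_matches))
# {'sexual': ['sex'], 'discriminatory': ['tard']}
```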

Profanity Intensity

To help you sort the bad from the very bad, the API also returns an intensity score for each profanity. The intensity is a way to rank toxic language from mild to extreme. As a rule of thumb, low intensity profanity might be acceptable in many contexts, whereas high intensity profanity will almost always be problematic.

Intensity	Description
high	The highest level of profanity, with words and expressions that are problematic in most if not all contexts.
medium	Medium level profanity. Might be acceptable in some adult-only circles, while being unacceptable in public or in front of children.
low	Lowest level of profanity. Mild language.
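One possible way to act on intensity is to rank the three levels and pick an action from the worst match found. This is only a sketch: the action names (allow, review, block) and the default thresholds are assumptions chosen for illustration, not values defined by the API.

```python
# Ranking of the three intensity levels returned by the API.
INTENSITY_RANK = {"low": 1, "medium": 2, "high": 3}


def choose_action(matches, review_at="medium", block_at="high"):
    """Return a hypothetical action based on the most intense match."""
    if not matches:
        return "allow"
    worst = max(INTENSITY_RANK[m["intensity"]] for m in matches)
    if worst >= INTENSITY_RANK[block_at]:
        return "block"
    if worst >= INTENSITY_RANK[review_at]:
        return "review"
    return "allow"


print(choose_action([{"intensity": "low"}]))                            # allow
print(choose_action([{"intensity": "low"}, {"intensity": "medium"}]))   # review
print(choose_action([{"intensity": "high"}]))                           # block
```

A stricter community could lower both thresholds, for example choose_action(matches, review_at="low", block_at="medium").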

Language support

English is the default language for text moderation: if you do not specify a language, the API assumes that the text is in English and processes it as such.

If you know with high confidence what language is used in the message, for instance because your users tend to mostly speak one language, you can set the language with the lang parameter. To do so, use the ISO 639-1 codes for languages:

Language	Code
English (default)	en
Chinese	zh
Danish	da
Dutch	nl
Finnish	fi
French	fr
German	de
Italian	it
Korean	ko
Norwegian	no
Polish	pl
Portuguese	pt
Russian	ru
Spanish	es
Swedish	sv
Tagalog / Filipino	tl
Turkish	tr

If you are unsure which language a user writes in, you can specify multiple languages as a comma-separated list, for instance en,fr,es for users who might write in English, French or Spanish. The API will then automatically detect the language and apply the corresponding rules. We recommend keeping the list as short as possible, as this yields better results in both speed and accuracy.

Other languages are available upon request. Please get in touch regarding your language needs.
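The comma-separated lang value can be assembled on the client side. The sketch below is a hypothetical helper (build_lang_param is not part of the API) that checks codes against the table above, removes duplicates and falls back to en. If additional languages have been enabled for your account, the hard-coded set would need to be extended.

```python
# Language codes listed in the table above.
SUPPORTED_LANGS = {
    "en", "zh", "da", "nl", "fi", "fr", "de", "it", "ko",
    "no", "pl", "pt", "ru", "es", "sv", "tl", "tr",
}


def build_lang_param(codes):
    """Build the value of the lang parameter from an iterable of ISO 639-1 codes."""
    cleaned = []
    for code in codes:
        c = code.strip().lower()
        if c not in SUPPORTED_LANGS:
            raise ValueError(f"unsupported language code: {code!r}")
        if c not in cleaned:  # keep the first occurrence, preserve order
            cleaned.append(c)
    # English is the documented default when nothing is specified.
    return ",".join(cleaned) if cleaned else "en"


print(build_lang_param(["en", "fr", "es"]))  # en,fr,es
```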

Detection strength

The Text Moderation API is a lot stronger than word-based filters. It uses advanced language analysis to detect objectionable content, even when users specifically attempt to circumvent your filters.

As an example, for each word, the API will be looking up millions of variations that might be used to evade filtering, while smartly ignoring all situations that might generate false positives. Here is a partial list of the situations that are covered:

Technique	Description	Example
Repetitions	Characters being repeated to avoid basic word filtering	biiiiitttch
Replacements	Replacement of characters with typographical symbols	$#!t
Insertions	Adding spaces, punctuation and more within words	B__* 0 __ 0 -- B__s
Obfuscation and special characters	Unusual non-ASCII characters used to evade basic word filters	ℙ🅤ᵴṨɏ
Spelling mistakes and phonetic variations	Changing word spellings while retaining their original meaning or pronunciation	phok yu
Leet speak	Replacing some alphabetical characters with a combination of punctuation, digits and letters	3as¯|¯AR|)
Smart embeddings	Catching profanity embedded inside longer strings, while ignoring potential false positives such as bassguitar, amass...	123FuckBlablah
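To make the first two evasion techniques concrete, here is a deliberately naive normalizer. It is not how the API works internally; the substitution table and the normalize function are assumptions made purely for illustration. It also shows why naive approaches produce errors: collapsing repeated letters damages legitimate words such as amass.

```python
import re

# Tiny, hypothetical substitution table; real-world coverage needs far more.
SUBSTITUTIONS = str.maketrans({
    "$": "s", "#": "h", "!": "i", "@": "a", "0": "o", "1": "i", "3": "e",
})


def normalize(token):
    """Undo simple replacements, insertions and repetitions in a token."""
    t = token.lower().translate(SUBSTITUTIONS)  # replacements
    t = re.sub(r"[^a-z]", "", t)                # inserted spaces or punctuation
    t = re.sub(r"(.)\1+", r"\1", t)             # repeated characters
    return t


print(normalize("biiiiitttch"))  # bitch
print(normalize("$#!t"))         # shit
print(normalize("amass"))        # amas  (a legitimate word is damaged)
```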

Use the model

Let's say you want to moderate the following text item:

Have s_*_x or be a t@rd

Simply send a POST request containing the UTF-8 encoded text along with the ISO 639-1 language code (such as en for English) and the mode set to rules. Here is an example:


curl -X POST 'https://api.sightengine.com/1.0/text/check.json' \
  -F 'text=Have s_*_x or be a t@rd' \
  -F 'lang=en' \
  -F 'mode=rules' \
  -F 'api_user={api_user}' \
  -F 'api_secret={api_secret}'


# this example uses requests
import requests
import json

data = {
  'text': 'Have s_*_x or be a t@rd',
  'mode': 'rules',
  'lang': 'en',
  'api_user': '{api_user}',
  'api_secret': '{api_secret}'
}
r = requests.post('https://api.sightengine.com/1.0/text/check.json', data=data)

output = json.loads(r.text)


$params = array(
  'text' => 'Have s_*_x or be a t@rd',
  'lang' => 'en',
  'mode' => 'rules',
  'api_user' => '{api_user}',
  'api_secret' => '{api_secret}',
);

// this example uses cURL
$ch = curl_init('https://api.sightengine.com/1.0/text/check.json');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $params);
$response = curl_exec($ch);
curl_close($ch);

$output = json_decode($response, true);


// this example uses axios and form-data
const axios = require('axios');
const FormData = require('form-data');

const data = new FormData();
data.append('text', 'Have s_*_x or be a t@rd');
data.append('lang', 'en');
data.append('mode', 'rules');
data.append('api_user', '{api_user}');
data.append('api_secret', '{api_secret}');

axios({
  url: 'https://api.sightengine.com/1.0/text/check.json',
  method:'post',
  data: data,
  headers: data.getHeaders()
})
.then(function (response) {
  // on success: handle response
  console.log(response.data);
})
.catch(function (error) {
  // handle error
  if (error.response) console.log(error.response.data);
  else console.log(error.message);
});

The request accepts the following parameters:

Parameter	Type	Description
text	string	UTF-8 encoded text to moderate
mode	string	comma-separated list of modes. Modes are rules for the rule-based model or ml for ML models
categories	string	comma-separated list of categories to check. Possible values: profanity, personal, link, drug, weapon, violence, self-harm, medical, extremism, spam, content-trade, money-transaction (optional)
lang	string	comma-separated list of target languages
opt_countries	string	comma-separated list of target countries for phone number detection (optional)
list	string	id of a custom list to be used for rule-based moderation (optional)
api_user	string	your API user id
api_secret	string	your API secret
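The parameters above can be collected into a form payload before sending. The following sketch uses a hypothetical build_payload helper that joins list-valued parameters with commas and leaves out the optional ones when they are not set. It only builds the dictionary; the result can be passed as data= to requests.post as in the Python example above.

```python
def build_payload(text, api_user, api_secret, modes=("rules",), langs=("en",),
                  categories=None, opt_countries=None, list_id=None):
    """Assemble the form fields for a text moderation request."""
    payload = {
        "text": text,
        "mode": ",".join(modes),
        "lang": ",".join(langs),
        "api_user": api_user,
        "api_secret": api_secret,
    }
    # Optional comma-separated parameters are only sent when provided.
    if categories:
        payload["categories"] = ",".join(categories)
    if opt_countries:
        payload["opt_countries"] = ",".join(opt_countries)
    # Optional id of a custom list for rule-based moderation.
    if list_id:
        payload["list"] = list_id
    return payload


payload = build_payload(
    "Have s_*_x or be a t@rd", "{api_user}", "{api_secret}",
    langs=("en", "fr"), categories=["profanity", "personal"],
)
print(payload["lang"], payload["categories"])  # en,fr profanity,personal
```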

The JSON response contains a description of profanities with positions within the text string.


{
  "status": "success",
  "request": {
    "id": "req_6cujQglQPgGApjI5odv0P",
    "timestamp": 1471947033.92,
    "operations": 1
  },
  "profanity": {
    "matches": [
      {
          "type": "sexual",
          "intensity": "medium",
          "match": "sex",
          "start": 5,
          "end": 9
      },
      {
          "type": "discriminatory",
          "intensity": "high",
          "match": "tard",
          "start": 19,
          "end": 22
      }
    ]
  },
  "personal": {
    "matches": []
  },
  "link": {
    "matches": []
  }
}
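The positions in the response can be used to redact the offending spans. The sketch below assumes, as the sample response suggests, that start and end are zero-based character offsets with end inclusive (s_*_x occupies offsets 5 to 9 in the sample text). The mask_matches helper is hypothetical.

```python
def mask_matches(text, matches, mask_char="*"):
    """Replace every matched span with mask_char, keeping the text length."""
    chars = list(text)
    for m in matches:
        # 'end' is treated as inclusive, an assumption based on the sample response.
        last = min(m["end"], len(chars) - 1)
        for i in range(m["start"], last + 1):
            chars[i] = mask_char
    return "".join(chars)


text = "Have s_*_x or be a t@rd"
profanity_matches = [
    {"type": "sexual", "intensity": "medium", "match": "sex", "start": 5, "end": 9},
    {"type": "discriminatory", "intensity": "high", "match": "tard", "start": 19, "end": 22},
]

print(mask_matches(text, profanity_matches))
# Have ***** or be a ****
```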

Any other needs?

See our full list of Text models for details on other filters and checks you can run on your text content. You might also want to check our Image & Video models to moderate images and videos. This includes moderation of text in images/videos.
