The Rule Based Detection model detects unwanted, hateful, sexual and toxic content in any user-generated text: comments, messages, posts, reviews, usernames etc.
This model uses advanced rules and pattern matching to detect unwanted content. It is designed to withstand obfuscation attempts by users. Texts are moderated with very low latency. The moderation results contain descriptions of any flagged words or expressions.
This model uses rule-based pattern matching. If you are looking for deep-learning models, head to the text classification models.
The rules are grouped into categories, to help you implement custom filters based on the type of flagged content.
Category | Description |
profanity | The profanity category contains the following types of terms and expressions:
|
personal (pii) | The personal category contains the following types of terms and expressions:
|
link | URLs to external websites and pages. We can flag domains known to host unsafe or unwanted content |
extremism | words, expressions or slogans related to extremist ideologies, people or events |
weapon | names or terms related to guns, rifles and firearms |
medical | names related to medical drugs |
drug | names related to recreational drugs |
self-harm | terms related to suicide and self-inflicted injuries |
violence | expressions of violence such as kicking, punching or harming someone, or threatening to do so |
spam | expressions commonly associated with spam or with circumvention, i.e. attempts to send or lure the user to another platform |
content-trade | requests or messages encouraging users to send, exchange or sell photos or videos of themselves |
money-transaction | requests or messages encouraging users to send money |
blacklist (custom) | custom list of terms and expressions |
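As an illustration of category-based filtering, here is a minimal Python sketch that maps flagged categories to an action. It assumes the response shape shown in the sample JSON response further down this page (one object per category, each with a matches array). The POLICY mapping, the action names and the decide_action helper are hypothetical and would need to be adapted to your own moderation rules.

```python
# Hypothetical policy: which action to take when a category has matches.
# Category names come from the table above; the actions are illustrative only.
POLICY = {
    "profanity": "review",
    "personal": "review",
    "link": "review",
    "spam": "reject",
    "violence": "reject",
    "extremism": "reject",
    "self-harm": "escalate",
}

# Actions ordered from least to most severe (an assumption of this sketch).
SEVERITY = ["accept", "review", "reject", "escalate"]


def decide_action(response: dict, policy: dict = POLICY) -> str:
    """Return the most severe action among the categories that have matches."""
    action = "accept"
    for category, wanted in policy.items():
        matches = response.get(category, {}).get("matches", [])
        if matches and SEVERITY.index(wanted) > SEVERITY.index(action):
            action = wanted
    return action
```

Categories that are absent from the policy are ignored here, so a stricter setup might prefer a default action for any unlisted category that has matches.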
To help you sort the bad from the very bad, the API also returns an intensity score for each profanity match. The intensity is a way to rank toxic language from mild to extreme. As a rule of thumb, low intensity profanity might be acceptable in many contexts, whereas high intensity profanity will almost always be problematic.
Intensity | Description |
high | The highest level of profanity, with words and expressions that are problematic in most if not all contexts. |
medium | Medium level profanity. Might be acceptable in some adult-only circles, while being unacceptable in public or in front of children. |
low | Lowest level of profanity. Mild language. |
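One possible way to act on intensity is to keep only the profanity matches at or above a chosen level. The sketch below assumes each match carries an intensity field with the values low, medium or high, as in the sample JSON response on this page. The helper name profanity_at_or_above and the choice to treat a missing intensity as high are assumptions of this example, not API behavior.

```python
# Rank the three intensity levels described in the table above.
INTENSITY_RANK = {"low": 0, "medium": 1, "high": 2}


def profanity_at_or_above(response: dict, minimum: str = "medium") -> list:
    """Return profanity matches whose intensity is at least `minimum`."""
    threshold = INTENSITY_RANK[minimum]
    matches = response.get("profanity", {}).get("matches", [])
    # A match with a missing or unknown intensity is treated as "high",
    # so that nothing slips through unnoticed (a choice of this sketch).
    return [
        m for m in matches
        if INTENSITY_RANK.get(m.get("intensity"), 2) >= threshold
    ]
```

For example, a community aimed at adults could call this with minimum set to "high", while a children's platform might use "low" to act on every match.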
English is the default language used for text moderation. This means that if you do not specify a language, the API engine will assume that the text is in English and will process it as such.
If you know with high confidence what language is used in the message, for instance because your users tend to mostly speak one language, you can set the language with the lang parameter. To do so, use the ISO 639-1 codes for languages:
Language | Code |
English (default) | en |
Chinese | zh |
Danish | da |
Dutch | nl |
Finnish | fi |
French | fr |
German | de |
Italian | it |
Norwegian | no |
Polish | pl |
Portuguese | pt |
Spanish | es |
Swedish | sv |
Tagalog / Filipino | tl |
Turkish | tr |
If you are unsure about the language used by a user, you can specify multiple languages as a comma-separated list. For instance en,fr,es for users that might write in English, French or Spanish. The API will then automatically detect the language and apply the corresponding rules. We recommend specifying the shortest possible list as this will yield better results both in speed and accuracy.
Other languages are available upon request. Please get in touch regarding your language needs.
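A small helper can build the lang value and catch typos before a request is sent. This is a sketch only: build_lang_param is a hypothetical name, and the set of codes simply mirrors the table above. If additional languages have been enabled for your account, pass your own set through the supported argument.

```python
# Codes copied from the language table above.
SUPPORTED_LANGS = {
    "en", "zh", "da", "nl", "fi", "fr", "de", "it",
    "no", "pl", "pt", "es", "sv", "tl", "tr",
}


def build_lang_param(codes, supported=SUPPORTED_LANGS) -> str:
    """Return a comma-separated lang value, without duplicates."""
    seen = []
    for code in codes:
        code = code.strip().lower()
        if code not in supported:
            raise ValueError(f"unsupported language code: {code!r}")
        if code not in seen:
            seen.append(code)
    # English is the documented default when nothing is specified.
    return ",".join(seen) if seen else "en"
```

Calling build_lang_param(["en", "fr", "es"]) gives the en,fr,es value used in the example above.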
The Text Moderation API is a lot stronger than word-based filters. It uses advanced language analysis to detect objectionable content, even when users specifically attempt to circumvent your filters.
As an example, for each word, the API will be looking up millions of variations that might be used to evade filtering, while smartly ignoring all situations that might generate false positives. Here is a partial list of the situations that are covered:
Characters being repeated to avoid basic word filtering
Replacement of characters with typographical symbols
Adding spaces, punctuation and more within words
Unusual non-ASCII characters used to evade basic word filters
Changing word spellings while retaining their original meaning or pronunciation
Replacing some alphabetical characters with a combination of punctuation, digits and letters
Catching profanity embedded within longer words, while smartly ignoring potential false positives such as bassguitar, amass...
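To make a few of these situations concrete, here is a deliberately naive Python normalizer covering repeated characters, inserted separators and accented look-alike characters. It is a toy illustration only and is not how the API works internally. The function name and the regular expressions are assumptions of this sketch.

```python
import re
import unicodedata


def naive_normalize(text: str) -> str:
    """Undo a few simple obfuscation tricks (toy example, not the API's logic)."""
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Remove separators that are commonly inserted inside words.
    stripped = re.sub(r"[_\-.*]+", "", stripped)
    # Collapse runs of three or more identical characters into one.
    collapsed = re.sub(r"(.)\1{2,}", r"\1", stripped)
    return collapsed.lower()
```

A hand-rolled approach like this quickly runs into limits: it misses look-alike characters that have no Unicode decomposition, it removes legitimate punctuation, and a plain substring check on its output would still flag harmless words such as bassguitar.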
Let's say you want to moderate the following text item:
You are ṣẗ_ȕ_ṕıď
Simply send a POST request containing the UTF-8 encoded text along with the ISO 639-1 language code (such as en for English). Here is an example:
curl -X POST 'https://api.sightengine.com/1.0/text/check.json' \
-F 'text=You are ṣẗ_ȕ_ṕıď' \
-F 'lang=en' \
-F 'categories=profanity,personal,link,drug,weapon,spam,content-trade,money-transaction,extremism,violence,self-harm,medical' \
-F 'mode=rules' \
-F 'api_user={api_user}' \
-F 'api_secret={api_secret}'
# this example uses requests
import requests
import json
data = {
    'text': 'You are ṣẗ_ȕ_ṕıď',
    'mode': 'rules',
    'lang': 'en',
    'categories': 'profanity,personal,link,drug,weapon,spam,content-trade,money-transaction,extremism,violence,self-harm,medical',
    'api_user': '{api_user}',
    'api_secret': '{api_secret}'
}
r = requests.post('https://api.sightengine.com/1.0/text/check.json', data=data)
output = json.loads(r.text)
$params = array(
  'text' => 'You are ṣẗ_ȕ_ṕıď',
  'lang' => 'en',
  'categories' => 'profanity,personal,link,drug,weapon,spam,content-trade,money-transaction,extremism,violence,self-harm,medical',
  'mode' => 'rules',
  'api_user' => '{api_user}',
  'api_secret' => '{api_secret}',
);
// this example uses cURL
$ch = curl_init('https://api.sightengine.com/1.0/text/check.json');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $params);
$response = curl_exec($ch);
curl_close($ch);
$output = json_decode($response, true);
// this example uses axios and form-data
const axios = require('axios');
const FormData = require('form-data');
const data = new FormData();
data.append('text', 'You are ṣẗ_ȕ_ṕıď');
data.append('lang', 'en');
data.append('categories', 'profanity,personal,link,drug,weapon,spam,content-trade,money-transaction,extremism,violence,self-harm,medical');
data.append('mode', 'rules');
data.append('api_user', '{api_user}');
data.append('api_secret', '{api_secret}');
axios({
  url: 'https://api.sightengine.com/1.0/text/check.json',
  method: 'post',
  data: data,
  headers: data.getHeaders()
})
.then(function (response) {
  // on success: handle response
  console.log(response.data);
})
.catch(function (error) {
  // handle error
  if (error.response) console.log(error.response.data);
  else console.log(error.message);
});
See request parameter description
Parameter | Type | Description |
text | string | UTF-8 encoded text to moderate |
mode | string | comma-separated list of modes. Modes are rules for the rule-based model or ml for ML models |
categories | string | comma-separated list of categories to check. Possible values: profanity, personal, link, drug, weapon, violence, self-harm, medical, extremism, spam, content-trade, money-transaction (optional) |
lang | string | comma-separated list of target languages |
opt_countries | string | comma-separated list of target countries for phone number detection (optional) |
list | string | id of a custom list to be used for rule-based moderation (optional) |
api_user | string | your API user id |
api_secret | string | your API secret |
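The parameters above can be assembled with a small helper that only includes the optional fields when they are set. This is a sketch under assumptions: build_payload is a hypothetical helper name, and the expected format of the opt_countries values is not stated on this page, so the helper simply joins whatever strings it is given.

```python
def build_payload(text, api_user, api_secret, lang="en", categories=None,
                  opt_countries=None, list_id=None, mode="rules"):
    """Build the form fields for a text/check.json request."""
    payload = {
        "text": text,
        "mode": mode,
        "lang": lang,
        "api_user": api_user,
        "api_secret": api_secret,
    }
    # Optional parameters are only sent when provided.
    if categories:
        payload["categories"] = ",".join(categories)
    if opt_countries:
        payload["opt_countries"] = ",".join(opt_countries)
    if list_id:
        payload["list"] = list_id
    return payload
```

The resulting dictionary can be passed as the data argument of requests.post, in the same way as the Python request example above.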
The JSON response contains a description of matches with positions within the text string.
{
"status": "success",
"request": {
"id": "req_gjEHkUh921AwUFKLWLllJ",
"timestamp": 1716557224.116726,
"operations": 1
},
"profanity": {
"matches": [
{
"type": "insult",
"intensity": "low",
"match": "stupid",
"start": 8,
"end": 15
}
]
},
"personal": {
"matches": []
},
"link": {
"matches": []
},
"medical": {
"matches": []
},
"weapon": {
"matches": []
},
"extremism": {
"matches": []
},
"drug": {
"matches": []
},
"self-harm": {
"matches": []
},
"violence": {
"matches": []
},
"content-trade": {
"matches": []
},
"money-transaction": {
"matches": []
},
"spam": {
"matches": []
}
}
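The start and end positions can be used to mask flagged expressions in the original text. The sketch below assumes that positions count characters from zero and that end is inclusive, which is consistent with the sample above (start 8 and end 15 cover the eight characters of the obfuscated word). Both points are assumptions drawn from that sample, and mask_matches is a hypothetical helper name.

```python
def mask_matches(text: str, response: dict, mask_char: str = "*") -> str:
    """Replace every flagged span in `text` with `mask_char`."""
    chars = list(text)
    for value in response.values():
        # Skip non-category entries such as "status".
        if not isinstance(value, dict):
            continue
        # Entries without a "matches" list (e.g. "request") yield nothing.
        for match in value.get("matches", []):
            start, end = match["start"], match["end"]
            # `end` is assumed inclusive; clamp it to the text length.
            for i in range(start, min(end, len(chars) - 1) + 1):
                chars[i] = mask_char
    return "".join(chars)
```

With a plain-ASCII input such as "You are stupid" and a match spanning positions 8 to 13, this returns "You are ******".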
See our full list of Text models for details on other filters and checks you can run on your text content. You might also want to check our Image & Video models to moderate images and videos. This includes moderation of text in images/videos.