Text Moderation - ML models BETA

Overview

By taking context into account, machine learning models can detect problematic content that rule-based models would miss, and can avoid flagging harmless content that rule-based models would flag incorrectly.

As an example, words such as dick, kill or failure will be understood in context:

this is my Dick ❌	I grew up reading Dick Tracy's comics ✔️
I read a book to kill my neighbor ❌	I read a book to kill time ✔️
you are such a failure ❌	I'm not used to failure ✔️

The ML models are part of the Text Moderation API, along with the pattern-matching model for Profanity Detection and other models such as PII Detection.

When submitting a text item to the API, you instantly receive a score for each available class. Scores are between 0 and 1, and they reflect how likely it is that someone would find the text problematic. Higher scores are therefore usually associated with more problematic content.

Available classes

Class availability depends on the language of the submitted text.

Class	Description
sexual	detects references to sexual acts, sexual organs or any other content typically associated with sexual activity
discriminatory	detects hate speech directed at individuals or groups because of specific characteristics of their identity (origin, religion, sexual orientation, gender, etc.)
insulting	detects insults undermining the dignity or honor of an individual, signs of disrespect towards someone
violent	detects threatening content, i.e. with an intention to harm / hurt, or expressing violence and brutality
toxic	detects whether a text is unacceptable, harmful, offensive, disrespectful or unpleasant

A single text can match several classes at once: for instance, a text can be both toxic and insulting, in which case the API returns a high score for each of those classes.
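Because each class gets its own score between 0 and 1, a common way to act on the results is to compare every score to a threshold and collect all classes that exceed it. The Python sketch below illustrates this. The function name `flag_classes`, the 0.5 threshold and the sample scores are assumptions made for illustration, not values prescribed by the API.

```python
# Hypothetical helper: the 0.5 default threshold is an illustrative
# assumption, not a value recommended by the API.
def flag_classes(scores, threshold=0.5):
    """Return the sorted names of all classes scoring at or above the threshold."""
    return sorted(name for name, score in scores.items() if score >= threshold)


# Made-up scores, for illustration only.
example_scores = {
    "sexual": 0.02,
    "discriminatory": 0.03,
    "insulting": 0.91,
    "violent": 0.04,
    "toxic": 0.87,
}

flagged = flag_classes(example_scores)
print(flagged)
```

With these sample scores, both insulting and toxic are flagged. A suitable threshold depends on how strict your moderation policy is, and may differ from class to class.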

Examples

Sexual class

Category	Examples
Sexual activity (sexual intercourse, masturbation, foreplay, etc.)	wanna lick that pussy you and me, fucking in your car his big knob was penetrating me she loves making love with me
Sex toys	put that toy inside her hole look at this dildo
Pornography	watching porn as usual I often go on pornhub
Sexual body parts, body fluids	what a beautiful ass here’s my cum
Nudity	he was completely naked in the bed please send nudes
Lingerie / underwear	she wears beautiful lingerie take those panties off
Sexualizing text expressing sexual desire	are you horny? she was so wet last night this girl is such a slut

The sexual class will also detect non-consensual sexual content related to topics such as rape, prostitution or pedophilia. For example:

I will rape her

gets a very high sexual score, along with high scores for the violent and toxic classes.

Discriminatory class

Category	Examples
Origin, nationality, skin color	I hate black people go away sand nigger
Religion	I really have an issue with muslims
Sexual orientation / gender	he is such a faggot gay people are so annoying women are all sluts
Physical appearance	fat people are not nice
Disability	disabled people are useless

Insulting class

Category	Examples
Intelligence	he is so dumb
Physical appearance	these girls are ugly
Popularity	stop talking to me, loser
Women	she is such a bitch
Family	I want you to leave, son of a bitch
General insults	go get a life shut up her boyfriend is a piece of shit

Violent class

Category	Examples
Direct threat	I will find you and kill you I hit him the last time I saw him
Threat encouragement	please stab this guy don't you think they should all be dead?
Description using violent language	the man shoot himself in the street

Toxic class

Toxic content might also match the description of one of the other classes.

Category	Examples
sexual	show me your ass
discriminatory	fat people are always complaining
insulting	what a bastard
violent	I wish you would be dead

Language support

The following languages are currently supported for Text Classification:

Language	Code
English	en
French	fr
Italian	it
Portuguese	pt
Spanish	es
Russian	ru
Turkish	tr

You are expected to specify the language or list of languages you want to support by setting their ISO 639-1 codes in the lang parameter.

If you are unsure which language a user writes in, or if users mix several languages in their messages, you can specify multiple languages as a comma-separated list.

For instance, set lang=en,fr,es for users that might write in English, French or Spanish. We strongly recommend specifying the shortest possible list, as this will yield better results in both speed and accuracy. If your users mostly write in English, specifying only en will yield better results.
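As a minimal sketch of how the lang value could be assembled on the client side, the Python helper below joins a list of codes into a comma-separated string and rejects codes that are not in the table above. The names `SUPPORTED_LANGS` and `build_lang_param` are hypothetical, and the hard-coded set would need updating if further languages are enabled for your account.

```python
# Codes copied from the language table above; this client-side check is an
# illustrative assumption, not part of the API itself.
SUPPORTED_LANGS = {"en", "fr", "it", "pt", "es", "ru", "tr"}


def build_lang_param(codes):
    """Join ISO 639-1 codes into a comma-separated string.

    Codes are normalized to lowercase, duplicates are dropped and the
    original order is kept.
    """
    unique = []
    for code in codes:
        code = code.strip().lower()
        if code not in SUPPORTED_LANGS:
            raise ValueError(f"unsupported language code: {code!r}")
        if code not in unique:
            unique.append(code)
    if not unique:
        raise ValueError("at least one language code is required")
    return ",".join(unique)


lang = build_lang_param(["en", "fr", "es"])
print(lang)  # en,fr,es
```

The resulting string can be passed as the lang field of the request.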

Other languages are available upon request. Please get in touch regarding your language needs.

Use the models

Let's say you want to moderate the following text item:

I love you so much

Simply send a POST request containing the UTF-8 encoded text along with the ISO 639-1 language code (such as en for English). Here is an example:


curl -X POST 'https://api.sightengine.com/1.0/text/check.json' \
  -F 'text=I love you so much' \
  -F 'lang=en' \
  -F 'mode=ml' \
  -F 'api_user={api_user}' \
  -F 'api_secret={api_secret}'


# this example uses requests
import requests
import json

data = {
  'text': 'I love you so much',
  'mode': 'ml',
  'lang': 'en',
  'api_user': '{api_user}',
  'api_secret': '{api_secret}'
}
r = requests.post('https://api.sightengine.com/1.0/text/check.json', data=data)

output = json.loads(r.text)


$params = array(
  'text' => 'I love you so much',
  'lang' => 'en',
  'mode' => 'ml',
  'api_user' => '{api_user}',
  'api_secret' => '{api_secret}',
);

// this example uses cURL
$ch = curl_init('https://api.sightengine.com/1.0/text/check.json');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $params);
$response = curl_exec($ch);
curl_close($ch);

$output = json_decode($response, true);


// this example uses axios and form-data
const axios = require('axios');
const FormData = require('form-data');

const data = new FormData();
data.append('text', 'I love you so much');
data.append('lang', 'en');
data.append('mode', 'ml');
data.append('api_user', '{api_user}');
data.append('api_secret', '{api_secret}');

axios({
  url: 'https://api.sightengine.com/1.0/text/check.json',
  method: 'post',
  data: data,
  headers: data.getHeaders()
})
.then(function (response) {
  // on success: handle response
  console.log(response.data);
})
.catch(function (error) {
  // handle error
  if (error.response) console.log(error.response.data);
  else console.log(error.message);
});

As an example, here is the JSON response that you would receive for the above request:


{
  "status": "success",
  "request": {
    "id": "req_6cujQglQPgGApjI5odv0P",
    "timestamp": 1471947033.92,
    "operations": 1
  },
  "moderation_classes": {
    "available": [
      "sexual",
      "discriminatory",
      "insulting",
      "violent",
      "toxic"
    ],
    "sexual": 0.01,
    "discriminatory": 0.01,
    "insulting": 0.01,
    "violent": 0.01,
    "toxic": 0.01
  }
}
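To show how such a response might be consumed, the Python sketch below reads the available list and builds a dictionary of class scores from it. The helper name `extract_scores` is hypothetical. Treating any status other than "success" as an error is also an assumption, since only the success case is shown here.

```python
import json

# Sample response body, copied from the example above.
raw_response = """
{
  "status": "success",
  "request": {
    "id": "req_6cujQglQPgGApjI5odv0P",
    "timestamp": 1471947033.92,
    "operations": 1
  },
  "moderation_classes": {
    "available": ["sexual", "discriminatory", "insulting", "violent", "toxic"],
    "sexual": 0.01,
    "discriminatory": 0.01,
    "insulting": 0.01,
    "violent": 0.01,
    "toxic": 0.01
  }
}
"""


def extract_scores(response):
    """Return {class_name: score} for every class listed under "available"."""
    # Assumption: anything other than "success" is treated as a failure.
    if response.get("status") != "success":
        raise ValueError("request did not succeed")
    classes = response.get("moderation_classes", {})
    return {
        name: classes[name]
        for name in classes.get("available", [])
        if name in classes
    }


scores = extract_scores(json.loads(raw_response))
print(scores)
```

Iterating over the available list, rather than a hard-coded list of class names, keeps the code working when class availability differs between languages.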
