
Text Moderation - ML models (Beta)

Overview

By taking context into account, machine learning models can detect problematic content in situations that would otherwise have been missed or incorrectly flagged by rule-based models.

As an example, words such as dick, kill or failure will be understood in context:

Flagged                               Not flagged
this is my Dick                       I grew up reading Dick Tracy's comics ✔️
I read a book to kill my neighbor     I read a book to kill time ✔️
you are such a failure                I'm not used to failure ✔️

The ML models are part of the Text Moderation API, along with the pattern-matching model for Profanity Detection and other models such as PII Detection.

When submitting a text item to the API, you instantly receive a score for each available class. Scores are between 0 and 1, and they reflect how likely it is that someone would find the text problematic. Higher scores are therefore usually associated with more problematic content.

Available classes

Class availability depends on the models chosen:

General model general

Class             Description
sexual            detects references to sexual acts, sexual organs or any other content typically associated with sexual activity
discriminatory    detects hate speech directed at individuals or groups because of specific characteristics of their identity (origin, religion, sexual orientation, gender, etc.)
insulting         detects insults undermining the dignity or honor of an individual, or signs of disrespect towards someone
violent           detects threatening content, i.e. content written with an intention to harm or hurt, or expressing violence and brutality
toxic             detects whether a text is unacceptable, harmful, offensive, disrespectful or unpleasant

Self-harm model self-harm

Class             Description
self-harm         detects whether a text contains mentions or references to self-harm

A given text may receive scores indicating a match for multiple classes: a text can, for instance, be both toxic and insulting at the same time.
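
Because each class is scored independently, acting on a text typically means comparing every score to a threshold of your own choosing. Below is a minimal sketch of that logic; the 0.5 threshold and the hard-coded scores dictionary are illustrative assumptions, not values prescribed by the API.


# Minimal sketch: collect every class whose score exceeds a chosen threshold.
# The threshold (0.5) and the scores below are illustrative assumptions.
THRESHOLD = 0.5

scores = {
    'sexual': 0.01,
    'discriminatory': 0.02,
    'insulting': 0.81,
    'violent': 0.05,
    'toxic': 0.76,
    'self-harm': 0.01,
}

matched = [name for name, score in scores.items() if score >= THRESHOLD]
print(matched)  # e.g. ['insulting', 'toxic']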

Language support

The following languages are currently supported for Text Classification:

Language      Code
English       en
French        fr
Italian       it
Portuguese    pt
Spanish       es
Russian       ru
Turkish       tr

You are expected to specify the language or list of languages you want to support by setting their ISO 639-1 codes in the lang parameter.

If you are unsure about the language used by a user, or if users mix different languages in their messages, you can specify multiple languages as a comma-separated list.

For instance, set lang=en,fr,es for users that might write in English, French or Spanish. We strongly recommend specifying the shortest possible list, as this improves both speed and accuracy: if your users mostly write in English, specifying just en will give better results.
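
As a quick illustration, here is how the lang parameter could be set for a community whose users write in English, French or Spanish. This is a sketch based on the full request example shown further below; the sample text is made up.


# Sketch: same request as the full example below, but with a
# comma-separated list of languages. The sample text is illustrative.
import requests

data = {
  'text': 'tu es vraiment nul',   # a user message that may be in en, fr or es
  'mode': 'ml',
  'lang': 'en,fr,es',             # shortest list covering your user base
  'models': 'general',
  'api_user': '{api_user}',
  'api_secret': '{api_secret}'
}
r = requests.post('https://api.sightengine.com/1.0/text/check.json', data=data)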

Other languages are available upon request. Please get in touch regarding your language needs.

Use the models

Let's say you want to moderate the following text item using both the general and self-harm models:

I love you so much

Simply send a POST request containing the UTF-8 encoded text along with the ISO 639-1 language code (such as en for English). Here is an example:


curl -X POST 'https://api.sightengine.com/1.0/text/check.json' \
  -F 'text=I love you so much' \
  -F 'lang=en' \
  -F 'models=general,self-harm' \
  -F 'mode=ml' \
  -F 'api_user={api_user}' \
  -F 'api_secret={api_secret}'


# this example uses requests
import requests
import json

data = {
  'text': 'I love you so much',
  'mode': 'ml',
  'lang': 'en',
  'models': 'general,self-harm',
  'api_user': '{api_user}',
  'api_secret': '{api_secret}'
}
r = requests.post('https://api.sightengine.com/1.0/text/check.json', data=data)

# parse the JSON response
output = json.loads(r.text)


$params = array(
  'text' => 'I love you so much',
  'lang' => 'en',
  'models' => 'general,self-harm',
  'mode' => 'ml',
  'api_user' => '{api_user}',
  'api_secret' => '{api_secret}',
);

// this example uses cURL
$ch = curl_init('https://api.sightengine.com/1.0/text/check.json');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $params);
$response = curl_exec($ch);
curl_close($ch);

$output = json_decode($response, true);


// this example uses axios and form-data
const axios = require('axios');
const FormData = require('form-data');

const data = new FormData();
data.append('text', 'I love you so much');
data.append('lang', 'en');
data.append('models', 'general,self-harm');
data.append('mode', 'ml');
data.append('api_user', '{api_user}');
data.append('api_secret', '{api_secret}');

axios({
  url: 'https://api.sightengine.com/1.0/text/check.json',
  method:'post',
  data: data,
  headers: data.getHeaders()
})
.then(function (response) {
  // on success: handle response
  console.log(response.data);
})
.catch(function (error) {
  // handle error
  if (error.response) console.log(error.response.data);
  else console.log(error.message);
});

Request parameters

Parameter     Type      Description
text          string    UTF-8 encoded text to moderate
mode          string    comma-separated list of modes; use rules for the rule-based model and ml for the ML models
models        string    comma-separated list of ML models to apply
lang          string    comma-separated list of target languages (ISO 639-1 codes)
api_user      string    your API user id
api_secret    string    your API secret

As an example, here is the JSON response that you would receive for the request above:


{
  "status": "success",
  "request": {
    "id": "req_6cujQglQPgGApjI5odv0P",
    "timestamp": 1471947033.92,
    "operations": 2
  },
  "moderation_classes": {
    "available": [
      "sexual",
      "discriminatory",
      "insulting",
      "violent",
      "toxic",
      "self-harm"
    ],
    "sexual": 0.01,
    "discriminatory": 0.01,
    "insulting": 0.01,
    "violent": 0.01,
    "toxic": 0.01,
    "self-harm": 0.01
  }
}
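
Your application would then typically check the request status and read the individual scores from the moderation_classes object. Here is a minimal sketch, continuing the Python example above; how you act on the scores is up to you.


# Sketch: inspect the parsed response from the Python example above.
# 'available' lists the classes returned for the models you requested.
if output['status'] == 'success':
    classes = output['moderation_classes']
    for name in classes['available']:
        print(name, classes[name])   # e.g. toxic 0.01
else:
    print('moderation request failed:', output)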
