Text Moderation - ML models BETA

Overview

By taking context into account, machine learning models can detect problematic content in situations that would otherwise have been missed or incorrectly flagged by rule-based models.

As an example, words such as dick, kill or failure will be understood in context:

Flagged                             Not flagged ✔️
this is my Dick                     I grew up reading Dick Tracy's comics
I read a book to kill my neighbor   I read a book to kill time
you are such a failure              I'm not used to failure

The ML models are part of the Text Moderation API, along with the pattern-matching model for Profanity Detection and other models such as PII Detection.

When submitting a text item to the API, you instantly receive a score for each available class. Scores are between 0 and 1, and they reflect how likely it is that someone would find the text problematic. Higher scores are therefore usually associated with more problematic content.

Available classes

Class availability depends on the models chosen:

General model (general)

Class            Description
sexual           detects references to sexual acts, sexual organs or any other content typically associated with sexual activity
discriminatory   detects hate speech directed at individuals or groups because of specific characteristics of their identity (origin, religion, sexual orientation, gender, etc.)
insulting        detects insults undermining the dignity or honor of an individual, and signs of disrespect towards someone
violent          detects threatening content, i.e. content written with an intention to harm or hurt, or expressing violence and brutality
toxic            detects whether a text is unacceptable, harmful, offensive, disrespectful or unpleasant

Self-harm model (self-harm)

Class       Description
self-harm   detects whether a text contains mentions or references to self-harm

A given text may receive scores indicating a match for several classes at once: a text can be both toxic and insulting, for instance.
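
As a sketch of how these scores might be consumed, the snippet below compares each class score against a single threshold and collects every class that matches. The scores and the 0.5 threshold are illustrative assumptions, not values returned by the API for any particular text:

# Sketch: a text can exceed the threshold for several classes at once.
# The scores and the 0.5 threshold below are illustrative assumptions.
scores = {"sexual": 0.02, "discriminatory": 0.03, "insulting": 0.91,
          "violent": 0.05, "toxic": 0.87, "self-harm": 0.01}

THRESHOLD = 0.5
matched = [name for name, score in scores.items() if score >= THRESHOLD]
print(matched)  # ['insulting', 'toxic'] -- the same text matches two classes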

Language support

The following languages are currently supported by the ML models:

Language     Code
English      en
French       fr
Italian      it
Portuguese   pt
Spanish      es
Russian      ru
Turkish      tr

Specify the language or list of languages you want to support by setting their ISO 639-1 codes in the lang parameter.

If you are unsure which language a user writes in, or if users mix several languages in their messages, you can specify multiple languages as a comma-separated list.

For instance, set lang=en,fr,es for users who might write in English, French or Spanish. We strongly recommend specifying the shortest possible list, as this yields better results in both speed and accuracy: if your users mostly write in English, specifying just en will perform better.
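
If you already know something about each user, you can often shorten the list per request. The sketch below assumes a hypothetical per-user locale string and falls back to a small multi-language list when the locale is unknown; both the SUPPORTED set and the fallback value are assumptions for illustration:

# Sketch: derive the shortest useful `lang` value for each request.
# SUPPORTED and the per-user locale lookup are illustrative assumptions.
SUPPORTED = {"en", "fr", "it", "pt", "es", "ru", "tr"}

def lang_for_user(user_locale=None, fallback="en,fr,es"):
    """Use the user's own language when known, otherwise a short fallback list."""
    if user_locale:
        code = user_locale.split("-")[0].lower()  # e.g. "fr-CA" -> "fr"
        if code in SUPPORTED:
            return code
    return fallback

print(lang_for_user("fr-CA"))  # fr
print(lang_for_user(None))     # en,fr,es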

Other languages are available upon request. Please get in touch regarding your language needs.

Use the models

Let's say you want to moderate the following text item using both the general and self-harm models:

I love you so much

Simply send a POST request containing the UTF-8 encoded text along with the ISO 639-1 language code (such as en for English). Here is an example:


curl -X POST 'https://api.sightengine.com/1.0/text/check.json' \
  -F 'text=I love you so much' \
  -F 'lang=en' \
  -F 'models=general,self-harm' \
  -F 'mode=ml' \
  -F 'api_user={api_user}' \
  -F 'api_secret={api_secret}'


# this example uses requests
import requests
import json

data = {
  'text': 'I love you so much',
  'mode': 'ml',
  'lang': 'en',
  'models': 'general,self-harm',
  'api_user': '{api_user}',
  'api_secret': '{api_secret}'
}
r = requests.post('https://api.sightengine.com/1.0/text/check.json', data=data)

output = json.loads(r.text)


$params = array(
  'text' => 'I love you so much',
  'lang' => 'en',
  'models' => 'general,self-harm',
  'mode' => 'ml',
  'api_user' => '{api_user}',
  'api_secret' => '{api_secret}',
);

// this example uses cURL
$ch = curl_init('https://api.sightengine.com/1.0/text/check.json');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $params);
$response = curl_exec($ch);
curl_close($ch);

$output = json_decode($response, true);


// this example uses axios and form-data
const axios = require('axios');
const FormData = require('form-data');

const data = new FormData();
data.append('text', 'I love you so much');
data.append('lang', 'en');
data.append('models', 'general,self-harm');
data.append('mode', 'ml');
data.append('api_user', '{api_user}');
data.append('api_secret', '{api_secret}');

axios({
  url: 'https://api.sightengine.com/1.0/text/check.json',
  method:'post',
  data: data,
  headers: data.getHeaders()
})
.then(function (response) {
  // on success: handle response
  console.log(response.data);
})
.catch(function (error) {
  // handle error
  if (error.response) console.log(error.response.data);
  else console.log(error.message);
});

Request parameters:

Parameter    Type     Description
text         string   UTF-8 encoded text to moderate
mode         string   comma-separated list of modes: rules for the rule-based models, ml for the ML models
models       string   comma-separated list of ML models to apply
lang         string   comma-separated list of target languages (ISO 639-1 codes)
api_user     string   your API user id
api_secret   string   your API secret
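
If you keep the models and languages as lists in your own code, a small helper can assemble the comma-separated parameters before sending the request. The build_payload helper below is a hypothetical sketch, not part of the API or its SDKs:

# Sketch: assemble the comma-separated request parameters from native lists.
# build_payload is a hypothetical helper, not part of the Sightengine API.
def build_payload(text, models, langs, api_user, api_secret, mode="ml"):
    return {
        "text": text,
        "mode": mode,
        "models": ",".join(models),   # e.g. "general,self-harm"
        "lang": ",".join(langs),      # e.g. "en,fr"
        "api_user": api_user,
        "api_secret": api_secret,
    }

payload = build_payload("I love you so much", ["general", "self-harm"], ["en"],
                        "{api_user}", "{api_secret}")
# payload can then be sent as the form data of the POST request shown above.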

As an example, here is the JSON response that you would receive for the above request:


{
  "status": "success",
  "request": {
    "id": "req_6cujQglQPgGApjI5odv0P",
    "timestamp": 1471947033.92,
    "operations": 2
  },
  "moderation_classes": {
    "available": [
      "sexual",
      "discriminatory",
      "insulting",
      "violent",
      "toxic",
      "self-harm"
    ],
    "sexual": 0.01,
    "discriminatory": 0.01,
    "insulting": 0.01,
    "violent": 0.01,
    "toxic": 0.01,
    "self-harm": 0.01
  }
}
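
Continuing the Python example above (where r holds the HTTP response), a minimal way to consume this payload is to check the status field and then compare each available class score against your own threshold. The 0.5 threshold below is an illustrative assumption to tune to your moderation policy:

# Sketch: consume the response from the Python example above (r = requests.post(...)).
# The 0.5 threshold is an illustrative assumption, not an API recommendation.
output = r.json()

if output.get("status") != "success":
    # The request did not succeed; inspect the payload for details.
    print("Moderation request failed:", output)
else:
    classes = output["moderation_classes"]
    flagged = {name: classes[name]
               for name in classes["available"]
               if classes[name] >= 0.5}
    if flagged:
        print("Review needed:", flagged)
    else:
        print("Text looks fine")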
