Extremism Detection

Overview

Extremism Detection is an optional capability available as part of Sightengine's Text Moderation APIs. It detects whether user-generated texts (comments, messages, posts, reviews, usernames, etc.) contain words related to extremist ideologies and used to promote hate, violence or acts of terror.

Other categories are also available through the Text Moderation API, for instance Drug Detection, Medical Term Detection and Weapon Detection.

Keep in mind that hate can also be expressed through discriminatory or hateful words and expressions. To detect such instances, you should use the ML-based hate detection model or the rule-based Profanity Detection in addition to Extremism Detection.

Detected Extremism

Words and phrases flagged as extremist-related or terrorist-related can be names of people, groups or movements, slogans, or known keywords. Here is an overview of the different types:

| Type | Description | Example |
|---|---|---|
| people | individuals linked to past or present extremist or terrorist organizations or events | mussolini, bin laden |
| group / movement | organizations known to promote or incite hate | al qaeda |
| keyword | words frequently used to promote hate or describe extremist practices or theories | holohoax, mein Kampf |
| slogan | catch phrases used by extremist people or organizations to promote hateful ideas | 6mwe, acab |

Detected content spans several ideologies that are considered extremist. These include, but are not limited to, the following:

  • islamic extremism: names of people or groups, keywords or slogans referring to extremist beliefs and practices of individuals and groups that are associated with the Islamic religion
  • white supremacism: names of people or groups, keywords or slogans that refer to beliefs considering that white or lighter-skinned people are naturally superior to other racial groups
  • antisemitism: names of people or groups, keywords or slogans that are related to discrimination and hostility against Jews
  • anti-government: names of people or groups, keywords or slogans related to opposition and resistance to governmental authority
  • left-wing extremism: names of people or groups, keywords or slogans referring to ideas based on social equality, as they are found in anarchist or communist ideologies
  • right-wing extremism: names of people or groups, keywords or slogans characterized by nationalist, antisemitic, racist or xenophobic ideas

Detection strength and language support

The API goes well beyond simple word-based filters. It catches not only exact extremism-related words but also millions of variations that might be used to evade filtering, while avoiding false positives.

Here are a few examples of the types of obfuscations that will be caught (not exhaustive):

| Obfuscation | Example |
|---|---|
| Repetitions | naaaaazzzziiii |
| Insertions | a_c*a - b |
| Obfuscation and special characters | ʰi͛tͭ£🄴® |
| Spelling mistakes and phonetic variations | ku kluks klan |
| Leet speak | /-\C/-\\|3 |
| Smart embeddings | lovehamas but not bahamas |
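
To see why a plain word list falls short, here is a small illustrative sketch of a naive substring filter. It is not how Sightengine's detection works; `BLOCKLIST` and `naive_flag` are hypothetical names made up for this illustration.

```python
# Hypothetical naive filter, for illustration only (not Sightengine's method).
BLOCKLIST = {"hamas", "nazi"}


def naive_flag(text: str) -> bool:
    """Return True if any blocklisted word appears as a substring of the text."""
    lowered = text.lower()
    return any(word in lowered for word in BLOCKLIST)


# False positive: an innocent word that happens to contain a listed term.
print(naive_flag("holiday in the bahamas"))  # True

# False negative: a repetition-based obfuscation slips through.
print(naive_flag("naaaaazzzziiii"))  # False
```

Both failure modes correspond to rows in the table above: the "Smart embeddings" row for the false positive and the "Repetitions" row for the miss.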

Language support

The Extremism category can be activated for all languages supported by the Text Moderation API. For most languages, detection focuses on international names, slogans and expressions. For English, additional language-specific words and expressions are also covered.

How to use this

The Extremism category can be activated as part of Standard Text Moderation and Username Moderation. To activate it, add an extra request parameter named categories. This parameter is a comma-separated list of the categories you want to activate. For extremism detection, its value would be extremism.
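
If you activate several categories at once, the value is a single comma-separated string. A minimal sketch, assuming you keep the category names in a Python list; `build_categories` is a hypothetical helper, not part of any Sightengine client:

```python
def build_categories(categories):
    """Join category names into one comma-separated string.

    Hypothetical helper: strips surrounding whitespace and skips empty entries.
    """
    cleaned = [name.strip() for name in categories if name.strip()]
    return ",".join(cleaned)


# Request parameters with extremism activated alongside two other categories.
params = {
    "categories": build_categories(["extremism", "weapon", "drug"]),
}
print(params["categories"])  # extremism,weapon,drug
```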

Code example

Let's say you want to detect extremist content in the following text item:

love hitler and the klu klux klan

Simply send a POST request containing the UTF-8 encoded text along with the comma-separated list of categories you want to detect and the ISO 639-1 language code (such as en for English). Here is an example:


curl -X POST 'https://api.sightengine.com/1.0/text/check.json' \
  -F 'text=love hitler and the klu klux klan' \
  -F 'lang=en' \
  -F 'categories=extremism' \
  -F 'mode=rules' \
  -F 'api_user={api_user}' \
  -F 'api_secret={api_secret}'


# this example uses requests
import requests

data = {
  'text': 'love hitler and the klu klux klan',
  'mode': 'rules',
  'lang': 'en',
  'categories': 'extremism',
  'api_user': '{api_user}',
  'api_secret': '{api_secret}'
}
r = requests.post('https://api.sightengine.com/1.0/text/check.json', data=data)

# decode the JSON response body into a dict
output = r.json()


$params = array(
  'text' => 'love hitler and the klu klux klan',
  'lang' => 'en',
  'categories' => 'extremism',
  'mode' => 'rules',
  'api_user' => '{api_user}',
  'api_secret' => '{api_secret}',
);

// this example uses cURL
$ch = curl_init('https://api.sightengine.com/1.0/text/check.json');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $params);
$response = curl_exec($ch);
curl_close($ch);

$output = json_decode($response, true);


// this example uses axios and form-data
const axios = require('axios');
const FormData = require('form-data');

const data = new FormData();
data.append('text', 'love hitler and the klu klux klan');
data.append('lang', 'en');
data.append('categories', 'extremism');
data.append('mode', 'rules');
data.append('api_user', '{api_user}');
data.append('api_secret', '{api_secret}');

axios({
  url: 'https://api.sightengine.com/1.0/text/check.json',
  method:'post',
  data: data,
  headers: data.getHeaders()
})
.then(function (response) {
  // on success: handle response
  console.log(response.data);
})
.catch(function (error) {
  // handle error
  if (error.response) console.log(error.response.data);
  else console.log(error.message);
});

Request parameter description

| Parameter | Type | Description |
|---|---|---|
| text | string | UTF-8 encoded text to moderate |
| mode | string | comma-separated list of modes. Modes are rules for the rule-based model or ml for ML models |
| categories | string | comma-separated list of categories to check. Possible values: profanity, personal, link, drug, weapon, violence, self-harm, medical, extremism, spam, content-trade, money-transaction (optional) |
| lang | string | comma-separated list of target languages |
| opt_countries | string | comma-separated list of target countries for phone number detection (optional) |
| list | string | id of a custom list to be used for rule-based moderation (optional) |
| api_user | string | your API user id |
| api_secret | string | your API secret |
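
The parameters above can be assembled into form data before sending the request. The following is a sketch only: `build_form_data` and `KNOWN_CATEGORIES` are hypothetical names, the defaults are arbitrary choices for this example, and the client-side category check is a convenience rather than something the API requires.

```python
from typing import Dict, Iterable, Optional

# Category names copied from the parameter description above.
KNOWN_CATEGORIES = {
    "profanity", "personal", "link", "drug", "weapon", "violence",
    "self-harm", "medical", "extremism", "spam", "content-trade",
    "money-transaction",
}


def build_form_data(
    text: str,
    api_user: str,
    api_secret: str,
    lang: str = "en",
    mode: str = "rules",
    categories: Optional[Iterable[str]] = None,
    opt_countries: Optional[str] = None,
    list_id: Optional[str] = None,
) -> Dict[str, str]:
    """Build the form fields for a text check; optional fields are omitted when unset."""
    data = {
        "text": text,
        "lang": lang,
        "mode": mode,
        "api_user": api_user,
        "api_secret": api_secret,
    }
    if categories is not None:
        names = list(categories)
        unknown = sorted(set(names) - KNOWN_CATEGORIES)
        if unknown:
            raise ValueError(f"unknown categories: {unknown}")
        data["categories"] = ",".join(names)
    if opt_countries is not None:
        data["opt_countries"] = opt_countries
    if list_id is not None:
        data["list"] = list_id
    return data


form = build_form_data(
    "love hitler and the klu klux klan",
    api_user="{api_user}",
    api_secret="{api_secret}",
    categories=["extremism"],
)
print(form["categories"])  # extremism
```

The resulting dict can be passed as the `data` argument of `requests.post`, as in the Python example above.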

The JSON response contains, for each category, the list of matches found along with their positions within the text string.


{
  "status": "success",
  "request": {
    "id": "req_6cujQglQPgGApjI5odv0P",
    "timestamp": 1471947033.92,
    "operations": 1
  },
  "profanity": {
    "matches": []
  },
  "personal": {
    "matches": []
  },
  "link": {
    "matches": []
  },
  "extremism": {
    "matches": [
      {
        "type": "extremism",
        "match": "hitler",
        "start": 5,
        "end": 10
      },
      {
        "type": "extremism",
        "match": "kukluxklan",
        "start": 20,
        "end": 31
      }
    ]
  }
}
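
Once the response is decoded, the extremism matches can be read from the `extremism.matches` array. A minimal sketch, using a trimmed copy of the sample response above as input; `extract_extremism_matches` is a hypothetical helper, and treating any status other than "success" as an error is an assumption of this example.

```python
# Trimmed copy of the sample response shown above.
sample_output = {
    "status": "success",
    "extremism": {
        "matches": [
            {"type": "extremism", "match": "hitler", "start": 5, "end": 10},
            {"type": "extremism", "match": "kukluxklan", "start": 20, "end": 31},
        ]
    },
}


def extract_extremism_matches(output: dict) -> list:
    """Return the extremism matches from a decoded response (empty list if none)."""
    # Assumption: any status other than "success" is treated as an error here.
    if output.get("status") != "success":
        raise RuntimeError(f"request did not succeed: {output}")
    return output.get("extremism", {}).get("matches", [])


matches = extract_extremism_matches(sample_output)
for m in matches:
    print(f'{m["match"]!r} at {m["start"]}-{m["end"]}')

# A simple moderation decision: flag the text if anything matched.
is_flagged = bool(matches)
print(is_flagged)  # True
```

Note that the `match` field holds the detected term rather than the raw substring of the input ("kukluxklan" for the spaced, misspelled input), so rely on `start` and `end` if you need to locate the original characters.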

Any other needs?

See our full list of Text models for details on other filters and checks you can run on your text content. You might also want to check our Image & Video models to moderate images and videos. This includes moderation of text in images/videos.
