Drug Detection in Texts

Overview

Drug Detection is an optional capability of Sightengine's Text Moderation APIs. It detects whether user-generated text (comments, messages, posts, reviews, usernames, etc.) contains words related to recreational drugs.

Other categories are also available through the Text Moderation API, for instance Extremism Detection, Medical Term Detection and Weapon Detection.

If you have images or videos, you might also want to use the Drug Detection API for Visual content.

Detected Drugs

Detected items include common drug names as well as informal terms such as nicknames and shortened variants.

Here are a few examples of the names of drugs that are detected:

| Commonly used name | Other detected variants |
| --- | --- |
| cannabis | weed, ganja, marijuana... |
| ecstasy | md, mdma, xtc... |
| tnt | poppers... |
| cocaine | coke, crack... |
| ... | ... |

Detection strength and language support

The API goes well beyond simple word-based filters. It catches exact drug-related words as well as the many variations (millions of them) that might be used to evade filtering, while avoiding false positives on innocuous words.

Here are a few examples of the types of obfuscations that will be caught (not exhaustive):

| Obfuscation | Example |
| --- | --- |
| Repetitions | vveeeeeeeeed |
| Insertions | x_t**c |
| Obfuscation and Special characters | ϲợḉầḭπ🄴 |
| Spelling mistakes and phonetic variations | ekstasy |
| Leet speak | vv33d |
| Smart embeddings | smokeweed but not tweed |
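To see why the obfuscations listed above defeat simple word-based filters, here is a minimal sketch of two naive filters. This is only an illustration of the problem, not how the Sightengine API works internally; the blocklist and both function names are assumptions made up for this example.

```python
import re

# Illustrative single-entry blocklist (an assumption for this sketch)
BLOCKLIST = ["weed"]

def substring_filter(text):
    """Flag the text if any blocklisted term appears anywhere in it."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKLIST)

def whole_word_filter(text):
    """Flag the text only if a blocklisted term appears as a standalone word."""
    lowered = text.lower()
    return any(
        re.search(r"\b" + re.escape(term) + r"\b", lowered) is not None
        for term in BLOCKLIST
    )

# Substring matching wrongly flags "tweed" and misses leet speak
print(substring_filter("a tweed jacket"))    # True (false positive)
print(substring_filter("got some vv33d?"))   # False (missed)

# Whole-word matching fixes "tweed" but now misses embedded words
print(whole_word_filter("a tweed jacket"))   # False
print(whole_word_filter("smokeweed"))        # False (missed)
```

Neither approach handles repetitions, insertions, leet speak or embedded words, which is the gap the Drug category is meant to cover.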

Language support

The Drug category can be activated for all languages supported by the Text Moderation API. For most languages, detection will focus on international names of drugs that are known worldwide, and additional language-specific words and expressions are added for English.

How to use this

The Drug category can be activated as part of Standard Text Moderation and Username Moderation. To activate it, add an extra request parameter named categories. This parameter is a comma-separated list of the categories you want to activate. For drug detection, set its value to drug.
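As a rough sketch of how the categories parameter can be assembled, the helper below joins a list of category names into the comma-separated string the API expects. The function name build_payload is hypothetical; the field names (text, mode, lang, categories, api_user, api_secret) are the ones used in the request examples on this page.

```python
def build_payload(text, categories, lang="en",
                  api_user="{api_user}", api_secret="{api_secret}"):
    """Assemble the form fields for a text moderation request.

    `categories` is a list of category names; it is sent as a single
    comma-separated string.
    """
    return {
        "text": text,
        "mode": "rules",
        "lang": lang,
        "categories": ",".join(categories),
        "api_user": api_user,
        "api_secret": api_secret,
    }

payload = build_payload("got some w33d?", ["drug"])
print(payload["categories"])  # drug
```

The resulting dictionary can be passed as the data argument of requests.post.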

Code example

Let's say you want to detect references to drugs in the following text item:

got some w33d?

Simply send a POST request containing the UTF-8 formatted text along with the comma-separated list of categories you want to detect and the ISO 639-1 language code (such as en for English). Here is an example:


curl -X POST 'https://api.sightengine.com/1.0/text/check.json' \
  -F 'text=got some w33d?' \
  -F 'lang=en' \
  -F 'categories=drug' \
  -F 'mode=rules' \
  -F 'api_user={api_user}' \
  -F 'api_secret={api_secret}'


# this example uses requests
import requests

# form fields sent with the POST request
data = {
  'text': 'got some w33d?',
  'mode': 'rules',
  'lang': 'en',
  'categories': 'drug',
  'api_user': '{api_user}',
  'api_secret': '{api_secret}'
}
r = requests.post('https://api.sightengine.com/1.0/text/check.json', data=data)

# decode the JSON body of the response
output = r.json()


$params = array(
  'text' => 'got some w33d?',
  'lang' => 'en',
  'categories' => 'drug',
  'mode' => 'rules',
  'api_user' => '{api_user}',
  'api_secret' => '{api_secret}',
);

// this example uses cURL
$ch = curl_init('https://api.sightengine.com/1.0/text/check.json');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $params);
$response = curl_exec($ch);
curl_close($ch);

$output = json_decode($response, true);


// this example uses axios and form-data
const axios = require('axios');
const FormData = require('form-data');

const data = new FormData();
data.append('text', 'got some w33d?');
data.append('lang', 'en');
data.append('categories', 'drug');
data.append('mode', 'rules');
data.append('api_user', '{api_user}');
data.append('api_secret', '{api_secret}');

axios({
  url: 'https://api.sightengine.com/1.0/text/check.json',
  method:'post',
  data: data,
  headers: data.getHeaders()
})
.then(function (response) {
  // on success: handle response
  console.log(response.data);
})
.catch(function (error) {
  // handle error
  if (error.response) console.log(error.response.data);
  else console.log(error.message);
});

The JSON response describes each detected match, with its position within the text string. Drug-related matches appear under the drug key.


{
    "status": "success",
    "request": {
        "id": "req_c235AxIQXY4LntE3dB1oh",
        "timestamp": 1655297706.600021,
        "operations": 1
    },
    "profanity": {
        "matches": []
    },
    "personal": {
        "matches": []
    },
    "link": {
        "matches": []
    },
    "drug": {
        "matches": [
            {
                "type": "drug",
                "match": "weed",
                "start": 9,
                "end": 12
            }
        ]
    }
}
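Here is a minimal sketch of how the drug matches could be read from such a response. The helper name drug_matches is hypothetical. Judging from the sample above, where the match for w33d starts at 9 and ends at 12, the end position appears to be inclusive; treat that as an assumption and verify it against your own responses.

```python
def drug_matches(response, text):
    """Return (normalized_match, original_span) pairs for each drug match."""
    matches = response.get("drug", {}).get("matches", [])
    # Assumption: "end" is an inclusive index, hence the + 1 when slicing
    return [(m["match"], text[m["start"]:m["end"] + 1]) for m in matches]

# Trimmed copy of the sample response shown above
sample = {
    "status": "success",
    "drug": {
        "matches": [
            {"type": "drug", "match": "weed", "start": 9, "end": 12}
        ]
    },
}

print(drug_matches(sample, "got some w33d?"))  # [('weed', 'w33d')]
```

Using .get with defaults means the helper returns an empty list when the response has no drug key, for instance when the category was not requested.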

Any other needs?

See our full list of Text models for details on other filters and checks you can run on your text content. You might also want to check our Image & Video models to moderate images and videos. This includes moderation of text in images/videos.
