
How does Sightengine's text moderation differ from keyword filtering? How do you prevent users from circumventing word filters?

Keyword filtering vs Rule-based Text Moderation

Our Rule-based Text Moderation is considerably more robust than word-based filters. It uses advanced language analysis to detect objectionable content, even when users deliberately try to circumvent your filters.

For example, for each word we look up millions of variations that might be used to evade filtering, while ignoring situations that would generate false positives. Here is a partial list of the situations we cover:

druuugggggggs

Repetitions

Characters being repeated to avoid basic word filtering

$#!t

Grawlix

Replacement of characters with typographical symbols

B__* 0 __ 0 -- B__s

Insertions

Adding spaces, punctuation and more within words

🅓rͬu̸🄶s̼

Obfuscation and Special characters

Unusual non-ASCII characters used to evade basic word filters

phok yu

Spelling mistakes & Phonetic Variations

Changing word spellings while retaining their original meaning or pronunciation

|)R|_|G5

Leet speak

Replacing some alphabetical characters with a combination of punctuation, digits and letters

123FuckBlablah

Smart embeddings

Catching profanity embedded inside longer strings, while ignoring potential false positives such as bassguitar, amass...
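
To make these situations concrete, here is a minimal Python sketch of how such evasions could be normalized before matching. It is an illustration only and not Sightengine's implementation: the word lists, the substitution tables and the names normalize and is_flagged are all assumptions made for this example. It covers repetitions, grawlix, insertions, simple leet speak, combining marks and embedded words with a tiny allowlist. Enclosed letters such as 🅓 would need an additional mapping table that is not included here.

```python
import re
import unicodedata

# Hypothetical word lists, for illustration only.
BLOCKLIST = ["drugs", "shit", "ass"]
ALLOWLIST = {"bassguitar", "amass"}  # benign words that embed a blocked term

# Assumed substitution tables; a real system would need far larger ones.
MULTI_CHAR = {"|_|": "u", "|)": "d"}  # multi-character leet sequences
SINGLE_CHAR = str.maketrans({
    "$": "s", "#": "h", "!": "i", "@": "a",
    "0": "o", "1": "i", "3": "e", "5": "s",
})


def normalize(token: str) -> str:
    """Reduce a single token to plain lowercase ASCII letters and digits."""
    # Decompose compatibility characters (e.g. fullwidth letters),
    # then drop the combining marks stacked on top of base letters.
    decomposed = unicodedata.normalize("NFKD", token)
    text = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    text = text.lower()
    # Undo leet speak and grawlix: multi-character sequences first.
    for sequence, letter in MULTI_CHAR.items():
        text = text.replace(sequence, letter)
    text = text.translate(SINGLE_CHAR)
    # Remove inserted spaces, punctuation and any remaining symbols.
    return re.sub(r"[^a-z0-9]", "", text)


# One pattern per blocked word; every letter may be repeated ("druuuggs").
PATTERNS = [
    re.compile("".join(re.escape(ch) + "+" for ch in word))
    for word in BLOCKLIST
]


def is_flagged(token: str) -> bool:
    """Return True if the normalized token contains a blocked word."""
    text = normalize(token)
    if text in ALLOWLIST:
        return False  # known benign word, skip the embedded match
    return any(pattern.search(text) for pattern in PATTERNS)


for sample in ["druuugggggggs", "$#!t", "d r_u-g.s", "|)R|_|G5",
               "123ShitBlablah", "bassguitar", "amass"]:
    print(sample, is_flagged(sample))
```

With these tables, the first five samples are flagged and the last two are not. A small allowlist like this one does not scale: an ordinary word that happens to contain a blocked term but is missing from the allowlist would still be flagged, which is why avoiding false positives takes far more than this sketch shows.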

Keyword filtering vs Text Classification models

Machine learning models are able to take context into account. This lets them detect problematic content that simple keyword filtering, and even Rule-based models, would have missed or incorrectly flagged.

For example, words such as dick, kill or failure are interpreted in context:

this is my Dick ❌	I grew up reading Dick Tracy's comics ✔️
I read a book to kill my neighbor ❌	I read a book to kill time ✔️
you are such a failure ❌	I'm not used to failure ✔️
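
The pairs above can be used to show why keyword matching alone is not enough. The following minimal Python sketch applies a naive keyword filter to both sentences of each pair. The keyword set and the function keyword_filter are assumptions made for this example and are not part of Sightengine's API.

```python
import re

# Hypothetical keyword list, taken from the examples above.
KEYWORDS = {"dick", "kill", "failure"}


def keyword_filter(sentence: str) -> bool:
    """Flag a sentence if any of its words is in the keyword list."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return any(word in KEYWORDS for word in words)


# (problematic sentence, acceptable sentence)
PAIRS = [
    ("this is my Dick", "I grew up reading Dick Tracy's comics"),
    ("I read a book to kill my neighbor", "I read a book to kill time"),
    ("you are such a failure", "I'm not used to failure"),
]

for problematic, acceptable in PAIRS:
    print(keyword_filter(problematic), keyword_filter(acceptable))
```

The filter returns True for all six sentences, because it only sees the word and not its context. Removing the words from the list would instead let the three problematic sentences through. Telling the two columns apart requires a model that looks at the surrounding words.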
