Text Moderation Principles


Sightengine's Text Moderation API is useful to moderate any type of text content: comments, messages, chats, posts and even usernames.

Sightengine offers two different approaches to Text Moderation:

  1. Text classification based on deep learning, to moderate text based on semantic and in-context meaning.
  2. Rule-based pattern-matching algorithms, to flag specific words or phrases.

The text classification models excel at interpreting full sentences and understanding linguistic subtleties. The pattern-matching algorithms excel at detecting specific words or phrases, and keep working even in the presence of heavy obfuscation by users trying to circumvent basic filters. Users can also add their own words and expressions to the pattern-matching algorithms.
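To make the two approaches concrete, here is a minimal sketch of how a request to each might be assembled. The endpoint URL, the `mode` parameter and its values (`ml` for classification, `standard` for rule-based matching), and the credential parameter names are assumptions for illustration; check the official API reference for the exact shape.

```python
# Sketch: building request parameters for the two moderation approaches.
# Endpoint and parameter names below are assumptions, not confirmed values.

API_ENDPOINT = "https://api.sightengine.com/1.0/text/check.json"  # assumed

def build_params(text, approach, api_user="YOUR_USER", api_secret="YOUR_SECRET"):
    """Return query parameters for one of the two moderation approaches."""
    if approach == "classification":
        mode = "ml"        # deep-learning text classification (assumed value)
    elif approach == "rules":
        mode = "standard"  # rule-based pattern matching (assumed value)
    else:
        raise ValueError(f"unknown approach: {approach}")
    return {
        "text": text,
        "lang": "en",
        "mode": mode,
        "api_user": api_user,
        "api_secret": api_secret,
    }
```

The parameters would then be sent as a GET or POST request with any HTTP client; keeping the two approaches behind a single helper makes it easy to run both on the same text and combine the results.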

Text classification models

The text classification models are multi-label models: they return one severity value per class, allowing customers to make granular moderation decisions. The following classes are available:

  - sexual: detects references to sexual acts, sexual organs or any other content typically associated with sexual activity
  - discriminatory: detects hate speech directed at individuals or groups because of specific characteristics of their identity (origin, religion, sexual orientation, gender, etc.)
  - insulting: detects insults undermining the dignity or honor of an individual, or signs of disrespect towards someone
  - violent: detects threatening content, i.e. content with an intention to harm or hurt, or expressing violence and brutality
  - toxic: detects whether a text is unacceptable, harmful, offensive, disrespectful or unpleasant
See how to use the Text classification models
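Because the models are multi-label, a moderation decision typically means comparing each class's severity score against a threshold. The sketch below assumes a response layout with one score per class under a `moderation_classes` key; the actual field names may differ, so treat this as illustrative.

```python
# Sketch: turning per-class severity scores into a moderation decision.
# The response layout ("moderation_classes" holding one score per class)
# is an assumption for illustration.

CLASSES = ["sexual", "discriminatory", "insulting", "violent", "toxic"]

def flagged_classes(response, threshold=0.5):
    """Return the classes whose severity meets or exceeds the threshold."""
    scores = response.get("moderation_classes", {})
    return [c for c in CLASSES if scores.get(c, 0.0) >= threshold]

# A hypothetical response for an insulting, toxic message:
sample = {"moderation_classes": {"sexual": 0.01, "discriminatory": 0.02,
                                 "insulting": 0.91, "violent": 0.10,
                                 "toxic": 0.84}}
# flagged_classes(sample) -> ["insulting", "toxic"]
```

Per-class thresholds (e.g. a stricter cutoff for discriminatory content than for mildly toxic content) are a natural extension, which is exactly the granularity the multi-label design is meant to enable.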

Rule-based pattern matching

The pattern-matching algorithms detect several predefined categories of content.

You can also define a custom whitelist to force the API to disregard any words or phrases that you feel shouldn't be flagged.
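The effect of a whitelist can be pictured as a post-filter over the matches the rules produce: any match whose text appears in the whitelist is dropped. The sketch below is a purely local illustration of that idea (the match structure shown is hypothetical), not the API's actual server-side mechanism.

```python
# Sketch: a local whitelist post-filter over rule-based matches.
# The match dictionaries ({"type": ..., "match": ...}) are a hypothetical
# structure used only to illustrate the whitelist behavior.

def apply_whitelist(matches, whitelist):
    """Drop flagged matches whose text is whitelisted (case-insensitive)."""
    allowed = {w.lower() for w in whitelist}
    return [m for m in matches if m["match"].lower() not in allowed]

# Classic false positive: a place name that contains a profane substring.
matches = [{"type": "profanity", "match": "Scunthorpe"},
           {"type": "profanity", "match": "badword"}]
# apply_whitelist(matches, ["scunthorpe"]) keeps only the real profanity.
```

In practice the whitelist lives on the API side, so the filtering happens before results are returned; the local version above is only meant to show what "disregard" means for flagged matches.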