
Text Moderation for Images


The Text Moderation API for Images is useful to determine if an image contains unwanted text such as profanity or personally identifiable information.

The API gives you fine-grained control over the moderation decision. It tells you what type of content has been found (phone number, email address, discriminatory content, sexual content...) along with a text extract of the content. You can then use this response to reject, flag or review the image on your end.
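
As a rough illustration of the reject, flag or review step, here is a minimal sketch that works on an already parsed response. The dictionary is modeled on the sample response shown later on this page. The helper names `flagged_items` and `moderation_decision`, the decision policy itself, and the assumption that `link` and `profanity` entries carry the same `type` and `match` fields as `personal` entries are all hypothetical, so adapt them to your own rules.

```python
# Parsed API response, modeled on the sample response from this page.
response = {
    "status": "success",
    "text": {
        "personal": [{"type": "phone_number_us", "match": "+1 800 222 2408"}],
        "link": [],
        "profanity": [],
        "ignore_text": False,
    },
}


def flagged_items(response):
    """Return (category, type, match) tuples for every detection.

    Assumption: entries in "link" and "profanity" have the same
    "type"/"match" shape as the "personal" entry shown in the sample.
    """
    text = response.get("text", {})
    items = []
    for category in ("profanity", "personal", "link"):
        for hit in text.get(category, []):
            items.append((category, hit.get("type"), hit.get("match")))
    return items


def moderation_decision(response):
    """Hypothetical policy: returns 'reject', 'review' or 'accept'."""
    if response.get("status") != "success":
        return "review"  # the call did not succeed, let a human look
    text = response.get("text", {})
    if text.get("profanity") or text.get("personal"):
        return "reject"
    if text.get("link") or text.get("ignore_text"):
        return "review"  # links, or dense text that was not analyzed
    return "accept"


print(flagged_items(response))
print(moderation_decision(response))
```

With the sample above, the phone number is the only detection, so this policy rejects the image. Sending links and ignored dense text to manual review is only one possible choice.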

Just like our other Image Moderation APIs, this API uses advanced AI to perform the analysis entirely automatically. There are no humans reviewing your image. This allows very fast turnaround times (typically a couple of seconds) and very high scalability.

boxes showing natural text found on a woman's shirt and artificial text added through post-processing
Image containing flagged profanity


The Text Moderation API for Images works in several steps:

  1. Detection of text items contained in the image
  2. Recognition of the text (converting the detected text regions into character strings)
  3. Analysis of the recognized text, through our text moderation engine


Typical use cases:

  • Prevent users from adding insults, profanity, racial slurs or sexually suggestive text in an image
  • Remove photos that contain PII such as an email address or phone number
  • Flag users who include links/URLs in their images

Profanity Detection in Images

Profanity Detection will enable you to detect insults, discriminatory content, sexual content or other inappropriate words and phrases in your images.

Profanity Detection is a lot stronger than word-based filters. It uses advanced language analysis to detect objectionable content, even when users specifically attempt to circumvent your filters. It covers obfuscation techniques such as repetitions, insertions, spelling mistakes, leet speak and more. Learn more on our Text Moderation Engine.
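
To see why plain word lists fall short, consider the following sketch of a naive exact-match filter. It is only an illustration of the limitation described above and says nothing about how the Text Moderation Engine works internally. The blocklist, the function name `naive_word_filter` and the sample strings are all made up for this example.

```python
# Hypothetical one-word blocklist, for illustration only.
BLOCKLIST = {"idiot"}


def naive_word_filter(text):
    """Flag text only when a whitespace-separated word is in the blocklist."""
    return any(word in BLOCKLIST for word in text.lower().split())


samples = [
    "you idiot",      # plain spelling
    "you idi0t",      # leet speak
    "you iiidiooot",  # repetitions
    "you i.d.i.o.t",  # insertions
    "you idoit",      # spelling mistake
]

for sample in samples:
    print(sample, "->", naive_word_filter(sample))
```

Only the plain spelling is caught. Each obfuscated variant slips through, which is the gap that language analysis is meant to close.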

boxes showing natural text found on a woman's shirt and artificial text added through post-processing
Image containing flagged profanity

Personal Information Detection in Images

Email addresses

Email addresses will be detected and flagged as such in the image.

Image with a flagged email address

Phone numbers

Phone numbers will be detected and flagged as such in the image. We currently support phone numbers from the following countries: United States, Canada, United Kingdom, France, India. Please reach out if you need support for other countries.

Image with a flagged US phone number

Link and URL Detection in Images

Links and URLs will be detected and flagged as such in the image.

Image with a flagged link to a Twitter handle

Languages and Recommendations


English is the default language used for the text recognition and profanity filtering.

Other languages (Spanish, French, German...) and non-Latin alphabets are available upon request. If you need another language, please get in touch.


  • Minimum text size: text that is too small to read may be ignored. Our recognition engine will analyze text that has a width or height of at least 4% of the image's max dimension.
  • Dense text: this model has been designed to work with photographs that contain short text items. It is not meant to analyze images with dense text such as PDFs, scans or photos of printed documents. If you submit an image containing dense text, the API will ignore the dense text rather than analyze it. To know whether text has been ignored, check the ignore_text flag in the response. It is set to true when dense text has been ignored.
  • Image rotation: make sure submitted images are correctly rotated or have proper rotation EXIF data. Text that is upside-down or rotated by more than 20 degrees might not be properly recognized.
  • Multi-frame processing: GIF images containing multiple frames will not be processed. If you need to review a multi-frame GIF image, we recommend submitting individual frames to the API.
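
The minimum text size rule can be turned into a quick pre-check. The sketch below assumes one reading of the rule: a text item qualifies when its width or its height reaches 4% of the larger image dimension. How the recognition engine actually measures text is not documented here, and the function names are hypothetical.

```python
def min_text_size(image_width, image_height, ratio=0.04):
    """Smallest text width or height (in pixels) expected to be analyzed.

    Assumption: the threshold is 4% of the larger image dimension.
    """
    return ratio * max(image_width, image_height)


def text_is_large_enough(text_width, text_height, image_width, image_height):
    """True when the text's width or height reaches the threshold."""
    threshold = min_text_size(image_width, image_height)
    return text_width >= threshold or text_height >= threshold


# For a 1000x800 image the threshold is about 40 pixels.
print(text_is_large_enough(120, 18, 1000, 800))  # wide caption
print(text_is_large_enough(30, 12, 1000, 800))   # tiny watermark
```

Under this reading the wide caption passes because its width exceeds the threshold, while the tiny watermark may be ignored.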

Use the model

If you haven't already, create an account to get your own API keys. You should then install the SDK that corresponds to your programming language. You can also implement your own logic to interact with our API if you prefer. Have a look at our API reference for more details.

# cURL: install curl if it is not already available on your system

# Python
pip install sightengine

# PHP
composer require sightengine/client-php

# Node.js
npm install sightengine --save

Detect unwanted text in an image

Let's say you want to moderate the following image:

photo with an embedded phone number

You can either send a public URL of the image or upload the raw binary image. Here's how to proceed if you choose to share the image's public URL:

curl -X GET -G '' \
    -d 'models=text-content' \
    -d 'api_user={api_user}&api_secret={api_secret}' \
    --data-urlencode 'url='

# if you haven't already, install the SDK with 'pip install sightengine'
from sightengine.client import SightengineClient
client = SightengineClient('{api_user}','{api_secret}')
output = client.check('text-content').set_url('')

// if you haven't already, install the SDK with 'composer require sightengine/client-php'
use \Sightengine\SightengineClient;
$client = new SightengineClient('{api_user}','{api_secret}');
$output = $client->check(['text-content'])->set_url('');

// if you haven't already, install the SDK with 'npm install sightengine --save'
var sightengine = require('sightengine')('{api_user}','{api_secret}');
sightengine.check(['text-content']).set_url('').then(function(result) {
    // The API response (result)
}).catch(function(err) {
    // Handle error
});

The API will then return a JSON response:

{
    "status": "success",
    "request": {
        "id": "req_22Qd0gUNmRH4GCYLvYtN6",
        "timestamp": 1512483673.1405,
        "operations": 1
    },
    "text": {
        "personal": [
            {
                "type": "phone_number_us",
                "match": "+1 800 222 2408"
            }
        ],
        "link": [],
        "profanity": [],
        "ignore_text": false
    },
    "media": {
        "id": "med_22Qdfb5s97w8EDuY7Yfjp",
        "uri": ""
    }
}

Any other needs?

See our full list of models for details on other filters and checks you can run on your images and videos.
