The text detection API can help you determine if an image contains natural text or artificial (embedded) text.
The text detection will not act upon the actual text content. If you need to analyze the text content itself (such as to detect profanity, email addresses, phone number etc..) please use the Text Moderation for Image model.
The text detection does not use any image meta-data to determine if text is present in an image. The file extension, the meta-data or the name will not influence the result. The classification is made using only the pixel content of the image.
The model works for a host of different alphabets: latin, chinese, japanese, korean, arabic, hebrew, hindi...
The returned value is between 0 and 1, images with a natural text value closer to 1 will contain natural text while images with a natural text value closer to 0 will not contain natural text.
The returned value is between 0 and 1, images with an artificial text value closer to 1 will contain artificial text while images with an artificial text value closer to 0 will not contain artificial text.
The values returned (x1, x2, y1, y2) help locate the texts present in the image.
For each box there you get a label that indicates the type of text (text-natural or text-artificial)
If you haven't already, create an account to get your own API keys.
Let's say you want to moderate the following image:
You can either upload a public URL to the image, or upload the raw binary image. Here's how to proceed if you choose to share the image's public URL:
curl -X GET -G 'https://api.sightengine.com/1.0/check.json' \
-d 'models=text' \
-d 'api_user={api_user}&api_secret={api_secret}' \
--data-urlencode 'url=https://sightengine.com/assets/img/examples/example2.jpg'
# this example uses requests
import requests
import json
params = {
'url': 'https://sightengine.com/assets/img/examples/example2.jpg',
'models': 'text',
'api_user': '{api_user}',
'api_secret': '{api_secret}'
}
r = requests.get('https://api.sightengine.com/1.0/check.json', params=params)
output = json.loads(r.text)
$params = array(
'url' => 'https://sightengine.com/assets/img/examples/example2.jpg',
'models' => 'text',
'api_user' => '{api_user}',
'api_secret' => '{api_secret}',
);
// this example uses cURL
$ch = curl_init('https://api.sightengine.com/1.0/check.json?'.http_build_query($params));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($ch);
curl_close($ch);
$output = json_decode($response, true);
// this example uses axios
const axios = require('axios');
axios.get('https://api.sightengine.com/1.0/check.json', {
params: {
'url': 'https://sightengine.com/assets/img/examples/example2.jpg',
'models': 'text',
'api_user': '{api_user}',
'api_secret': '{api_secret}',
}
})
.then(function (response) {
// on success: handle response
console.log(response.data);
})
.catch(function (error) {
// handle error
if (error.response) console.log(error.response.data);
else console.log(error.message);
});
The API will then return a JSON response:
{
"status": "success",
"request": {
"id": "req_22Qd0gUNmRH4GCYLvYtN6",
"timestamp": 1512483673.1405,
"operations": 1
},
"text": {
"has_artificial": 0.99932,
"has_natural": 0.19986,
"boxes": [
{
"x1": 0.18466,
"y1": 0.01757,
"x2": 0.8555,
"y2": 0.244,
"label": "text-artificial"
}
]
},
"media": {
"id": "med_22Qdfb5s97w8EDuY7Yfjp",
"uri": "https://sightengine.com/assets/img/examples/text1-1200.jpg"
}
}
See our full list of Image/Video models for details on other filters and checks you can run on your images and videos. You might also want to check our Text models to moderate text-based content: messages, reviews, comments, usernames...
Was this page helpful?