URL and Link Moderation

Overview

URL and Link Moderation can be used to detect and filter links wherever they appear, for example in a QR code printed on an item shown in an image.

Figure: Image containing a flagged QR code linking to a harmful website

Features

URL moderation works out of the box and automatically flags links and URLs from more than 5 million domain names. The domain lists are updated weekly.

The API detects a broad spectrum of unwanted or unsafe websites, across many categories. It also provides you with the following capabilities:

  • Smart handling of redirects and URL shorteners
  • Detection of deceptive/spoofing techniques such as punycode attacks
  • Detection of obfuscated URLs and links
  • Coverage of all your content: text, images and videos (embedded links, QR codes, etc.)
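
To illustrate what a punycode (homograph) attack looks like, here is a minimal Python sketch that uses only the standard library. It is not the API's detection logic, which is not described here; the helper names `to_ascii` and `has_punycode_label` and the sample domain are assumptions made for illustration.

```python
# Illustration only: shows why punycode domains can deceive readers.
# This is NOT how the API detects such links.

def to_ascii(domain: str) -> str:
    """Encode an internationalized domain name to its ASCII (punycode) form."""
    return domain.encode("idna").decode("ascii")


def has_punycode_label(domain: str) -> bool:
    """Return True if any label of the ASCII form carries the 'xn--' prefix."""
    labels = to_ascii(domain).lower().split(".")
    return any(label.startswith("xn--") for label in labels)


# The first letter is the Cyrillic 'а' (U+0430), not the Latin 'a',
# so the domain renders like a familiar name but is a different domain.
lookalike = "\u0430pple.com"

print(to_ascii(lookalike))                # ASCII form starting with 'xn--'
print(has_punycode_label(lookalike))      # True
print(has_punycode_label("apple.com"))    # False
```

A check like this only reveals that a domain contains non-ASCII labels. Many legitimate internationalized domains do too, so it is not a verdict on whether the link is deceptive.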

The link moderation model does not analyze the content of the target website in real time. Instead, it categorizes links using our internal databases of known domains and pages, which are updated weekly.

Detection categories

The API returns all links that have been detected. When applicable, the API also returns the category to which the link or URL belongs. Here is the list of categories that Sightengine will flag for you:

Category    Description
unsafe      sites presenting a risk for visitors, such as phishing, malware, scams
adult       sites containing porn, erotica, escort services
gambling    legal and illegal casinos, money games
drugs       sites promoting or selling recreational drugs
hate        extremist or hateful content
custom      your own custom disallow lists and allow lists

All categories apart from the custom one work out of the box. The categories cover links and URLs from more than 5 million domains known to host unwanted content. Our lists are updated weekly to reflect the ever-changing nature of the web.

Moderate URLs in Text Messages

Here is how you can detect and moderate URLs in text messages:


curl -X POST 'https://api.sightengine.com/1.0/text/check.json' \
  -F 'text=Come check this page: http://harmfulsiteexample.com' \
  -F 'lang=en' \
  -F 'mode=rules' \
  -F 'api_user={api_user}' \
  -F 'api_secret={api_secret}'


# this example uses requests
import requests
import json

data = {
  'text': 'Come check this page: http://harmfulsiteexample.com',
  'mode': 'rules',
  'lang': 'en',
  'api_user': '{api_user}',
  'api_secret': '{api_secret}'
}
r = requests.post('https://api.sightengine.com/1.0/text/check.json', data=data)

output = json.loads(r.text)


$params = array(
  'text' => 'Come check this page: http://harmfulsiteexample.com',
  'lang' => 'en',
  'mode' => 'rules',
  'api_user' => '{api_user}',
  'api_secret' => '{api_secret}',
);

// this example uses cURL
$ch = curl_init('https://api.sightengine.com/1.0/text/check.json');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $params);
$response = curl_exec($ch);
curl_close($ch);

$output = json_decode($response, true);


// this example uses axios and form-data
const axios = require('axios');
const FormData = require('form-data');

const data = new FormData();
data.append('text', 'Come check this page: http://harmfulsiteexample.com');
data.append('lang', 'en');
data.append('mode', 'rules');
data.append('api_user', '{api_user}');
data.append('api_secret', '{api_secret}');

axios({
  url: 'https://api.sightengine.com/1.0/text/check.json',
  method:'post',
  data: data,
  headers: data.getHeaders()
})
.then(function (response) {
  // on success: handle response
  console.log(response.data);
})
.catch(function (error) {
  // handle error
  if (error.response) console.log(error.response.data);
  else console.log(error.message);
});

Request parameter description:

Parameter      Type    Description
text           string  UTF-8 encoded text to moderate
mode           string  comma-separated list of modes: rules for the rule-based model, ml for ML models
categories     string  comma-separated list of categories to check. Possible values: profanity, personal, link, drug, weapon, violence, self-harm, medical, extremism, spam, content-trade, money-transaction (optional)
lang           string  comma-separated list of target languages
opt_countries  string  comma-separated list of target countries for phone number detection (optional)
list           string  id of a custom list to be used for rule-based moderation (optional)
api_user       string  your API user id
api_secret     string  your API secret
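
As a sketch of how the optional categories parameter could be used, the Python example below builds a request restricted to the link category and sends it with the standard library. The function names `build_link_check_payload` and `check_links` are hypothetical, and how the response changes when categories is set is not covered here, so treat this as an assumption to verify against the text moderation guide.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://api.sightengine.com/1.0/text/check.json"


def build_link_check_payload(text, api_user, api_secret, lang="en"):
    """Build the form fields for a rule-based check limited to links."""
    return {
        "text": text,
        "mode": "rules",
        "categories": "link",  # optional parameter: restrict the check to links
        "lang": lang,
        "api_user": api_user,
        "api_secret": api_secret,
    }


def check_links(text, api_user, api_secret, lang="en", timeout=10):
    """POST the form fields and return the decoded JSON response."""
    payload = build_link_check_payload(text, api_user, api_secret, lang)
    body = urllib.parse.urlencode(payload).encode("utf-8")
    request = urllib.request.Request(API_URL, data=body, method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `check_links("Come check this page: http://harmfulsiteexample.com", "{api_user}", "{api_secret}")` with real credentials would perform the request; the payload builder can be inspected without any network access.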

The JSON response describes the URLs that have been detected under the link key. The response also reports other detected elements such as profanity, personal information and grawlix. Check the text moderation guide to learn more about those capabilities.


{
  "status": "success",
  "request": {
    "id": "req_6cujQglQPgGApjI5odv0P",
    "timestamp": 1471947033.92,
    "operations": 1
  },
  "profanity": {
    "matches": []
  },
  "personal": {
    "matches": []
  },
  "link": {
    "matches": [
      {
        "type": "url",
        "category": "unsafe",
        "match": "http://harmfulsiteexample.com"
      }
    ]
  }
}
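
The following Python sketch shows one way to act on such a response: it keeps only the link matches whose category is in a blocked set. The function name `flagged_links` and the choice of blocked categories are assumptions for illustration, and the sample response is trimmed to the keys used here.

```python
# Categories treated as blocking in this sketch; adjust to your own policy.
BLOCKED_CATEGORIES = frozenset({"unsafe", "adult", "gambling", "drugs", "hate"})


def flagged_links(response, blocked=BLOCKED_CATEGORIES):
    """Return the link matches whose category is in the blocked set."""
    matches = response.get("link", {}).get("matches", [])
    # The category is only returned when applicable, so read it with .get().
    return [m for m in matches if m.get("category") in blocked]


# Sample response, trimmed to the keys used here.
sample_response = {
    "status": "success",
    "link": {
        "matches": [
            {"type": "url", "category": "unsafe", "match": "http://harmfulsiteexample.com"}
        ]
    },
}

for match in flagged_links(sample_response):
    print(match["category"], match["match"])
# prints: unsafe http://harmfulsiteexample.com
```

Links without a category are still present in the matches list, so a stricter policy could reject any message that contains a link at all.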

Moderate URLs in Images/Videos

Here is how you can detect and moderate URLs in images or videos:


curl -X GET -G 'https://api.sightengine.com/1.0/check.json' \
    -d 'models=text-content,qr-content' \
    -d 'api_user={api_user}&api_secret={api_secret}' \
    --data-urlencode 'url=https://sightengine.com/assets/img/examples/example-qr-600.jpg'


# this example uses requests
import requests
import json

params = {
  'url': 'https://sightengine.com/assets/img/examples/example-qr-600.jpg',
  'models': 'text-content,qr-content',
  'api_user': '{api_user}',
  'api_secret': '{api_secret}'
}
r = requests.get('https://api.sightengine.com/1.0/check.json', params=params)

output = json.loads(r.text)


$params = array(
  'url' =>  'https://sightengine.com/assets/img/examples/example-qr-600.jpg',
  'models' => 'text-content,qr-content',
  'api_user' => '{api_user}',
  'api_secret' => '{api_secret}',
);

// this example uses cURL
$ch = curl_init('https://api.sightengine.com/1.0/check.json?'.http_build_query($params));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($ch);
curl_close($ch);

$output = json_decode($response, true);


// this example uses axios
const axios = require('axios');

axios.get('https://api.sightengine.com/1.0/check.json', {
  params: {
    'url': 'https://sightengine.com/assets/img/examples/example-qr-600.jpg',
    'models': 'text-content,qr-content',
    'api_user': '{api_user}',
    'api_secret': '{api_secret}',
  }
})
.then(function (response) {
  // on success: handle response
  console.log(response.data);
})
.catch(function (error) {
  // handle error
  if (error.response) console.log(error.response.data);
  else console.log(error.message);
});

The JSON response contains a description of URLs that have been detected either as text under the text key, or as QR codes under the qr key.


{
  "status": "success",
  "request": {
    "id": "req_6cujQglQPgGApjI5odv0P",
    "timestamp": 1471947033.92,
    "operations": 2
  },
  "text": {
    "personal": [],
    "link": [],
    "social": [],
    "profanity": []
  },
  "qr": {
    "personal": [],
    "link": [
      {
        "type": "url",
        "category": "unsafe",
        "match": "http://harmfulsiteexample.com"
      }
    ],
    "social": [],
    "profanity": []
  }
}
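
A minimal Python sketch for reading this response is shown below. It collects the links found under both the text and qr keys and records where each one was found. The function name `links_by_source` is hypothetical, and the sketch assumes each entry of a link list is an object with type, category and match fields.

```python
def links_by_source(response, sources=("text", "qr")):
    """Collect detected links from each source key, tagging where each was found."""
    found = []
    for source in sources:
        # Assumes each entry in the link list is an object (dict).
        for link in response.get(source, {}).get("link", []):
            found.append({"source": source, **link})
    return found


# Sample response, trimmed to the keys used here.
sample_response = {
    "status": "success",
    "text": {"personal": [], "link": [], "social": [], "profanity": []},
    "qr": {
        "personal": [],
        "link": [
            {"type": "url", "category": "unsafe", "match": "http://harmfulsiteexample.com"}
        ],
        "social": [],
        "profanity": [],
    },
}

for link in links_by_source(sample_response):
    print(link["source"], link.get("category"), link["match"])
# prints: qr unsafe http://harmfulsiteexample.com
```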

Any other needs?

See our full list of Image/Video models for details on other filters and checks you can run on your images and videos. You might also want to check our Text models to moderate text-based content: messages, reviews, comments, usernames...
