Video Blacklists and Disallow lists

Introduction

This guide will tell you how to blacklist videos and prevent them from (re)appearing on your site or app. This is useful to make sure known copyrighted videos, illegal videos, or previously removed videos do not get re-uploaded to your properties, either in full or in part.

To do so, you will be creating a Video Disallow List, also known as a Blacklist, to fingerprint all videos or clips you want to prevent from appearing on your properties. Once you add a video to the list, any other video that has a common sequence with this video will be detected.

Here are the steps to set up and use a Video Disallow List:

  1. Create a Video Disallow List from your dashboard
  2. Add Videos to the disallow list (through the dashboard or through the API)
  3. Check new and existing videos against the disallow list

1. Create a Video Disallow List

Go to your Sightengine dashboard to create a new list.

Once you have created a new list, retrieve the list id (this is a string starting with vli_), as this will be useful to interact with your newly created list.

2. Add Videos to the disallow list

Any video that you want to disallow should be added to the blacklist. You can do so either from your Sightengine dashboard or through the API.

Option A: Add a Video through the Dashboard

Go to your Sightengine dashboard and click on the list you created. You can now add a video by clicking the "ADD VIDEO" button and manually uploading it.

Please keep in mind that large videos can take time to process. The status field will keep you informed and show once a video has been successfully added.

Option B: Add a Video through the API

Here is the code to add a video to a list:


curl -X POST 'https://api.sightengine.com/1.0/video/add-to-list.json' \
    -F 'media=@/path/to/video.mp4' \
    -F 'add_to_list={list_id}' \
    -F 'api_user={api_user}' \
    -F 'api_secret={api_secret}'


# this example uses requests
import requests

params = {
  'add_to_list': '{list_id}',
  'api_user': '{api_user}',
  'api_secret': '{api_secret}'
}

# open the file in a with-block so the handle is closed after the upload
with open('/path/to/video.mp4', 'rb') as media:
  r = requests.post('https://api.sightengine.com/1.0/video/add-to-list.json', files={'media': media}, data=params)

# decode the JSON response body
output = r.json()


$params = array(
  'media' => new CurlFile('/path/to/video.mp4'),
  'add_to_list' => '{list_id}',
  'api_user' => '{api_user}',
  'api_secret' => '{api_secret}',
);

// this example uses cURL
$ch = curl_init('https://api.sightengine.com/1.0/video/add-to-list.json');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $params);
$response = curl_exec($ch);
curl_close($ch);

$output = json_decode($response, true);


// this example uses axios and form-data
const axios = require('axios');
const FormData = require('form-data');
const fs = require('fs');

const data = new FormData();
data.append('media', fs.createReadStream('/path/to/video.mp4'));
data.append('add_to_list', '{list_id}');
data.append('api_user', '{api_user}');
data.append('api_secret', '{api_secret}');

axios({
  method: 'post',
  url:'https://api.sightengine.com/1.0/video/add-to-list.json',
  data: data,
  headers: data.getHeaders()
})
.then(function (response) {
  // on success: handle response
  console.log(response.data);
})
.catch(function (error) {
  // handle error
  if (error.response) console.log(error.response.data);
  else console.log(error.message);
});

The above call works for videos that are no larger than 50MB. If you need to add a larger video, you should first upload it separately to the Upload API, and then submit the media id to the API.
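As a rough illustration of the size limit, a client could check the file size before deciding whether to send the video directly. This is only a sketch: the helper name `fits_direct_upload` is hypothetical, and reading "50MB" as 50,000,000 bytes is a conservative assumption rather than a documented value.

```python
import os

# Assumption: "50MB" is interpreted conservatively as 50 * 1000 * 1000 bytes.
MAX_DIRECT_UPLOAD_BYTES = 50 * 1000 * 1000


def fits_direct_upload(path, limit=MAX_DIRECT_UPLOAD_BYTES):
    """Return True if the file at `path` is no larger than `limit` bytes."""
    return os.path.getsize(path) <= limit
```

If the function returns False, the video would go through the separate upload step instead of being attached to the request.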

The API will return a JSON response with the following structure:

            
            
{
  "status": "success",
  "request": {
    "id": "req_1SJJxJjUHnSVWreApx9fF",
    "timestamp": 1510153177.0043
  },
  "media": {
    "id": "med_1SJDfFuLAFj34TlAMfksaA",
    "uri": "video.mp4"
  }
}
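
A minimal sketch of how a response with this structure might be handled: check the `status` field and keep the returned media id, since the same id identifies the video when it later appears as a match. The function name `extract_media_id` is hypothetical, and treating any status other than "success" as a failure is an assumption.

```python
import json


def extract_media_id(response_text):
    """Parse an add-to-list response body and return the media id."""
    output = json.loads(response_text)
    # Assumption: any status other than "success" is treated as a failure.
    if output.get("status") != "success":
        raise RuntimeError("add-to-list request failed: %r" % (output,))
    return output["media"]["id"]


# Sample body copied from the response structure shown above.
sample = '''
{
  "status": "success",
  "request": {"id": "req_1SJJxJjUHnSVWreApx9fF", "timestamp": 1510153177.0043},
  "media": {"id": "med_1SJDfFuLAFj34TlAMfksaA", "uri": "video.mp4"}
}
'''
media_id = extract_media_id(sample)
print(media_id)  # prints med_1SJDfFuLAFj34TlAMfksaA
```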
            
        

If you have defined a callback URL, a callback will be sent to it as soon as the video has finished processing.

3. Check videos against the disallow list

Here is the code to check if a local video has a near-duplicate within a list:


curl -X POST 'https://api.sightengine.com/1.0/video/check-list.json' \
    -F 'media=@/path/to/video.mp4' \
    -F 'lists={list_id}' \
    -F 'api_user={api_user}' \
    -F 'api_secret={api_secret}'


# this example uses requests
import requests

params = {
  'lists': '{list_id}',
  'api_user': '{api_user}',
  'api_secret': '{api_secret}'
}

# open the file in a with-block so the handle is closed after the upload
with open('/path/to/video.mp4', 'rb') as media:
  r = requests.post('https://api.sightengine.com/1.0/video/check-list.json', files={'media': media}, data=params)

# decode the JSON response body
output = r.json()


$params = array(
  'media' => new CurlFile('/path/to/video.mp4'),
  'lists' => '{list_id}',
  'api_user' => '{api_user}',
  'api_secret' => '{api_secret}',
);

// this example uses cURL
$ch = curl_init('https://api.sightengine.com/1.0/video/check-list.json');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $params);
$response = curl_exec($ch);
curl_close($ch);

$output = json_decode($response, true);


// this example uses axios and form-data
const axios = require('axios');
const FormData = require('form-data');
const fs = require('fs');

const data = new FormData();
data.append('media', fs.createReadStream('/path/to/video.mp4'));
data.append('lists', '{list_id}');
data.append('api_user', '{api_user}');
data.append('api_secret', '{api_secret}');

axios({
  method: 'post',
  url:'https://api.sightengine.com/1.0/video/check-list.json',
  data: data,
  headers: data.getHeaders()
})
.then(function (response) {
  // on success: handle response
  console.log(response.data);
})
.catch(function (error) {
  // handle error
  if (error.response) console.log(error.response.data);
  else console.log(error.message);
});

Make sure that you have defined a callback URL in order to be notified once the analysis has finished. You can do so from your Sightengine dashboard, or through the `callback_url` parameter.
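
For instance, the `callback_url` parameter could be added to the form fields sent with the request. The sketch below only builds the fields and sends nothing; the helper name `build_check_params` and the example URL are placeholders, not part of the API.

```python
def build_check_params(list_id, api_user, api_secret, callback_url=None):
    """Build the form fields for a check-list request."""
    params = {
        'lists': list_id,
        'api_user': api_user,
        'api_secret': api_secret,
    }
    # Only send callback_url when one is given; otherwise the URL
    # configured in the dashboard (if any) applies.
    if callback_url is not None:
        params['callback_url'] = callback_url
    return params


params = build_check_params(
    '{list_id}', '{api_user}', '{api_secret}',
    callback_url='https://example.com/sightengine-callback',  # placeholder URL
)
```

The resulting dictionary can be passed as `data=params` to `requests.post`, in the same way as in the Python example above.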

The above call works for videos that are no larger than 50MB. If you need to check a larger video, you should first upload it separately to the Upload API, and then submit the media id to the API.

The analysis and retrieval of near-duplicates is an asynchronous process, meaning that it will happen outside of the API request cycle. Once the search has finished, you will receive a callback containing the search results. The callback will have the following structure:

            
            
{
  "callback_type": "video_list.query.finished",
  "media": {
    "id": "med_1SJJEFuLqeSedThQjhNoS",
    "uri": "video.mp4"
  },
  "similarity": [
    {
      "list": {
        "id": "{list_id}"
      },
      "matches": [
          {
            "id": "med_1SJDfFuLAFj34TlAMfksaA",
            "custom_id": null,
            "hit_ratio": 0.94,
            "hit_score": 0.75,
            "hit_periods": [
                {
                    "start_ms": 0,
                    "end_ms": 2200,
                    "score": 0.75
                }
            ],
            "query_ratio": 0.5,
            "query_score": 0.75,
            "query_periods": [
                {
                    "start_ms": 2000,
                    "end_ms": 4200,
                    "score": 0.75
                }
            ]
        }
      ]
    }
  ]
}
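
A sketch of how a callback body with this structure could be unpacked into individual matches. It assumes the callback body arrives as JSON text; `iter_matches` is a hypothetical helper name, and `vli_example` is a placeholder list id.

```python
import json


def iter_matches(payload):
    """Yield (list_id, match) pairs from a parsed callback payload."""
    # Ignore callbacks of any other type.
    if payload.get("callback_type") != "video_list.query.finished":
        return
    for entry in payload.get("similarity", []):
        list_id = entry["list"]["id"]
        for match in entry.get("matches", []):
            yield list_id, match


# Trimmed-down sample based on the structure shown above.
body = '''
{
  "callback_type": "video_list.query.finished",
  "media": {"id": "med_1SJJEFuLqeSedThQjhNoS", "uri": "video.mp4"},
  "similarity": [
    {
      "list": {"id": "vli_example"},
      "matches": [
        {"id": "med_1SJDfFuLAFj34TlAMfksaA", "hit_ratio": 0.94, "query_ratio": 0.5}
      ]
    }
  ]
}
'''
results = list(iter_matches(json.loads(body)))
for list_id, match in results:
    print(list_id, match["id"], match["hit_ratio"], match["query_ratio"])
```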
            
        

If matches were found, they will be returned under the matches array. For each match, the API will return the following information:

  • the id of the original video, as defined by Sightengine when the original video was added to the list.
  • the custom_id that you set for the original video. This is an optional field and will be null if no custom id was provided.
  • the hit_ratio: this is the share of the hit video that is in common with the submitted query video.
  • the hit_periods: this is a list of all the parts of the hit video that are in common with the submitted query video. Each part is composed of a similarity score, as well as a start and end time.
  • the hit_score: overall score reflecting how similar the hit_periods are to the corresponding query_periods.
  • the query_ratio: this is the share of the query video that is in common with the hit video.
  • the query_periods: this is a list of all the parts of the submitted query video that are in common with the hit video. Each part is composed of a similarity score, as well as a start and end time.
  • the query_score: overall score reflecting how similar the query_periods are to the corresponding hit_periods.
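
The ratio and score fields listed above can be combined into a simple decision rule. The sketch below flags a match when either video shares a large enough portion with the other and the corresponding score is high enough. The function names and the default thresholds are illustrative assumptions to be tuned for your use case, not values recommended by Sightengine.

```python
def is_significant_match(match, min_ratio=0.3, min_score=0.7):
    """Return True if the match covers enough of either video with a high enough score."""
    hit_ok = match["hit_ratio"] >= min_ratio and match["hit_score"] >= min_score
    query_ok = match["query_ratio"] >= min_ratio and match["query_score"] >= min_score
    return hit_ok or query_ok


def covered_ms(periods):
    """Total duration in milliseconds of a list of periods (assumes they do not overlap)."""
    return sum(p["end_ms"] - p["start_ms"] for p in periods)


# Values taken from the sample callback shown earlier.
match = {
    "id": "med_1SJDfFuLAFj34TlAMfksaA",
    "hit_ratio": 0.94,
    "hit_score": 0.75,
    "hit_periods": [{"start_ms": 0, "end_ms": 2200, "score": 0.75}],
    "query_ratio": 0.5,
    "query_score": 0.75,
    "query_periods": [{"start_ms": 2000, "end_ms": 4200, "score": 0.75}],
}

print(is_significant_match(match))          # prints True
print(covered_ms(match["query_periods"]))   # prints 2200
```

Checking both directions matters because a short disallowed clip embedded in a long upload gives a high `hit_ratio` but a low `query_ratio`, while a short excerpt of a long disallowed video gives the opposite.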
