The Near-Duplicate Detection model identifies images that are duplicates or near-duplicates of other images. It supports the following use cases:
Image blacklists and disallow lists: Blacklist images and prevent them from (re)appearing on your site or app, for instance copyrighted images, illegal images, or previously removed images.
Spam & theft prevention: Detect spammy and unwanted behaviors. Prevent users from submitting the same image multiple times, and from submitting other users' photos.
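To make the disallow-list workflow concrete, here is a minimal Python sketch. It is not the Near-Duplicate Detection model and does not call any real API: it swaps in a toy average hash over grayscale pixel grids, which only tolerates simple changes such as resizing and is far weaker than the model described here. The names `average_hash`, `hamming`, `DisallowList` and the `max_distance` threshold are all hypothetical and chosen for illustration only.

```python
from typing import List, Sequence

# Rows of 0-255 grayscale pixel values (illustrative input format, an assumption).
Gray = Sequence[Sequence[int]]


def average_hash(pixels: Gray, size: int = 8) -> int:
    """Toy perceptual hash: sample a size x size grid, threshold at the mean."""
    h, w = len(pixels), len(pixels[0])
    # Nearest-neighbour sampling, so resized copies produce similar grids.
    samples = [
        pixels[(r * h) // size][(c * w) // size]
        for r in range(size)
        for c in range(size)
    ]
    mean = sum(samples) / len(samples)
    bits = 0
    for value in samples:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


class DisallowList:
    """Hypothetical in-memory disallow list keyed by toy hashes."""

    def __init__(self, max_distance: int = 5) -> None:
        self.max_distance = max_distance  # assumed threshold, tune per use case
        self._hashes: List[int] = []

    def add(self, pixels: Gray) -> None:
        self._hashes.append(average_hash(pixels))

    def matches(self, pixels: Gray) -> bool:
        candidate = average_hash(pixels)
        return any(hamming(candidate, h) <= self.max_distance for h in self._hashes)


# Usage: block a previously removed image, then check new submissions.
original = [[r * 16 + c for c in range(16)] for r in range(16)]
upscaled = [[original[r // 2][c // 2] for c in range(32)] for r in range(32)]
unrelated = [[c * 16 + r for c in range(16)] for r in range(16)]

blocked = DisallowList()
blocked.add(original)
print(blocked.matches(upscaled))   # True: the 2x upscaled copy is caught
print(blocked.matches(unrelated))  # False: a different image passes
```

The same pattern applies to both use cases above: store a compact signature for each reference image, then compare every new submission against the stored signatures before accepting it.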
Duplicate detection works across all types of images: natural photos, drawings, screenshots, anime, etc.
Duplicates are detected across a wide range of transformations and modifications, many of which are typically used to try to evade duplicate detection. Examples:
Original image
Resolution, size and format changes
Downscaling and upscaling
DPI/Resolution changes
Re-encoding or format conversion (e.g. JPEG, PNG, WebP)
Text overlays
Text overlays and added captions
Stickers, logos, watermarks
Image overlays
Large image overlays obscuring parts of the original image
Emojis, shapes and other graphical overlays
Cropping and reframing
Tight crops, letterboxing, added borders/frames
Partial views of the original
Collage
When the source image appears inside a multi-image layout
Split-screen layouts
Blur
Strong Gaussian blur, motion blur, defocus...
Pixelation
Image mixing
When the source image is blended with other images or backgrounds