Image and Video Copy Detection Using Content-Based Fingerprinting
Published in Ling Guan, Yifeng He, Sun-Yuan Kung, Multimedia Image and Video Processing, 2012
Mehrdad Fatourechi, Xudong Lv, Mani Malek Esmaeili, Z. Jane Wang, Rabab K. Ward
As seen above, CF is able to find copies of a given multimedia content without altering its content. These algorithms have therefore become very popular in recent years for multimedia copy detection purposes [2,3]. As a side note, we should state that various other terms have also been used in the literature to refer to the above procedure, including multimedia hashing, perceptual hashing, content-based copy detection, content-based copy identification, and content-based digital fingerprints, among others. However, we believe that the term CF is more general and intuitive, and thus we use it throughout this chapter.
Processing Social Media Images by Combining Human and Machine Computing during Crises
Published in International Journal of Human–Computer Interaction, 2018
Firoj Alam, Ferda Ofli, Muhammad Imran
For the first experiment, we analyzed a perceptual hashing-based approach. The perceptual hashing technique extracts certain features from each image, computes a hash value (i.e., a binary string of length 49) for each image based on these features, and compares the resulting pair of hashes to decide the level of similarity between the images. During an event, the system maintains a list (i.e., an in-memory data structure) of hashes computed for the set of distinct images it receives from the Image Collector module (see the Image Collector subsection). To determine whether a newly arrived image is a duplicate of a previously seen image, the hash value of the new image is computed and compared against the list of stored hashes to calculate its distance from the existing image hashes. In our case, we use the Hamming distance to compare two hashes. If the distance between the hash of a newly arrived image and a hash in the list is smaller than a threshold d, we consider the newly arrived image a duplicate. We always keep the most recent 100K hashes in physical memory. This number depends on the amount of memory available in the system.
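The duplicate check described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes each 49-bit hash is already available as an integer (the feature extraction step is not specified in the text and is omitted here), and the names `DuplicateFilter`, `hamming_distance`, and the threshold values used are hypothetical.

```python
from collections import deque

HASH_BITS = 49        # hash length stated in the text
MAX_HASHES = 100_000  # "100K hashes" stated in the text


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two integer-encoded hashes differ."""
    return bin(a ^ b).count("1")


class DuplicateFilter:
    """Illustrative store of recent hashes (hypothetical helper class).

    A bounded deque drops the oldest hash once max_hashes is reached,
    which approximates keeping only the most recent hashes in memory.
    """

    def __init__(self, threshold: int, max_hashes: int = MAX_HASHES):
        self.threshold = threshold  # the threshold d; its value is an assumption
        self.hashes = deque(maxlen=max_hashes)

    def is_duplicate(self, new_hash: int) -> bool:
        """Return True if new_hash is closer than the threshold to a stored hash.

        Hashes judged distinct are stored; duplicates are not stored.
        """
        for seen in self.hashes:
            if hamming_distance(new_hash, seen) < self.threshold:
                return True
        self.hashes.append(new_hash)
        return False
```

Under these assumptions, `DuplicateFilter(threshold=3)` would treat two hashes that differ in one or two bits as the same image. The linear scan over stored hashes is the simplest reading of "compared against the list of stored hashes"; the text does not say whether the original system uses an index to speed this up.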