Theory, Practical Concepts, Strategies and Methods for Emotion Recognition
Published in Rashmi Gupta, Arun Kumar Rana, Sachin Dhawan, Korhan Cengiz, Advanced Sensing in Image Processing and IoT, 2022
Varsha K. Patil, Vijaya Pawar, Vaishnavi Vajirkar, Vedita Kharabe, Nimisha Gutte, Mustafa Sameer
In the preceding section, we learned about CNN-based emotion recognition. In this section, let us learn about one of the popular machine learning emotion recognition algorithms. DeepFace is a hybrid Python framework for face recognition and facial attribute analysis (age, gender, emotion, and race). Models such as VGG-Face, Google FaceNet, OpenFace, Facebook DeepFace, DeepID, ArcFace, and Dlib are the major components of the package, which is built on Keras and TensorFlow. DeepFace is a system that has closed most of the gap in facial recognition and approaches human-level accuracy. DeepFace is also the name of the Facebook research group's deep learning facial recognition technology, which was trained on a large dataset of faces and outperformed all previous systems. The whole process of emotion recognition happens in the background when we import DeepFace and call the required functions. We have tried to demonstrate the basic methodology in Figure 5.15.
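Since the pipeline runs behind a single call once DeepFace is imported, a minimal sketch with the open-source deepface package may help; the file name face.jpg is an assumption, and the exact return structure varies across package versions.

```python
# Minimal sketch of emotion recognition with the deepface package.
# Assumes `pip install deepface` and a local image file "face.jpg";
# the file name and printed keys are illustrative.
from deepface import DeepFace

# analyze() detects the face and runs the requested attribute models;
# restricting actions to "emotion" skips the age/gender/race models.
results = DeepFace.analyze(img_path="face.jpg", actions=["emotion"])

# Recent versions return a list with one entry per detected face
# (older versions returned a single dict).
for face in results:
    print(face["dominant_emotion"])  # e.g. "happy"
    print(face["emotion"])           # per-emotion confidence scores
```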
Application of Artificial Intelligence in Image Processing
Published in Nedunchezhian Raju, M. Rajalakshmi, Dinesh Goyal, S. Balamurugan, Ahmed A. Elngar, Bright Keswani, Empowering Artificial Intelligence Through Machine Learning, 2022
In 2014, DeepFace, a face recognition feature, was developed by the most widely used social networking site, Facebook [8]. It used neural networks to identify faces with an accuracy of about 97.35%, cutting the error rate of existing image recognition systems by more than 27% and approaching the human-level figure of about 97.5%. First, a template is created by going through the images of an individual, including the user's profile photo and images tagged by friends. When a new photo is uploaded, its patterns are matched against the existing templates, and friends are tagged automatically when a template matches. This feature also helps a user unlock a hacked account by identifying photos of friends. Seeing this technology grow, Google Photos has also adopted this type of program.
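The template-and-match workflow described above can be illustrated with face embeddings. The following is a minimal sketch using the open-source deepface package (not Facebook's internal system); the file names and the 0.7 similarity threshold are assumptions.

```python
# Illustrative sketch of the template-matching idea: build a "template"
# embedding from a user's known photos, then compare a new upload to it.
import numpy as np
from deepface import DeepFace

def embedding(path):
    # represent() returns a list with one dict per detected face;
    # the "embedding" field holds the face's feature vector.
    return np.array(
        DeepFace.represent(img_path=path, model_name="Facenet")[0]["embedding"]
    )

# Template: mean embedding of the user's known photos (file names assumed).
template = np.mean([embedding(p) for p in ["profile.jpg", "tagged_photo.jpg"]],
                   axis=0)

# Match a newly uploaded photo against the template via cosine similarity.
new = embedding("new_upload.jpg")
similarity = float(np.dot(template, new) /
                   (np.linalg.norm(template) * np.linalg.norm(new)))
print("suggest tag" if similarity > 0.7 else "no match")  # threshold assumed
```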
What’s Hiding in My Deep Features?
Published in Mayank Vatsa, Richa Singh, Angshul Majumdar, Deep Learning in Biometrics, 2018
Ethan M. Rudd, Manuel Günther, Akshay R. Dhamija, Faris A. Kateb, Terrance E. Boult
Face biometric systems have seen remarkable performance improvements across many tasks since the advent of deep convolutional neural networks. Taigman et al. [12] pioneered the application of modern deep convolutional neural networks to face-recognition tasks with DeepFace, the first network to reach near-human verification performance on the Labeled Faces in the Wild (LFW) benchmark [13]. In their work, they used an external image-preprocessing step to frontalize images and trained their network on a private data set of 4.4 million images of more than 4000 identities. Later, Oxford’s Visual Geometry Group (VGG) publicly released a face-recognition network [2] that omits the frontalization step while training the network with a relatively small data set containing 95% frontal and 5% profile faces. Parkhi et al. [2] also implemented a triplet-loss embedding and demonstrated performance comparable to [12] on LFW despite the smaller amount of training data. More recently, the IJB-A data set and challenge [10], which contain more profile faces, were proposed. Chen et al. [3] trained two networks on a small-scale private data set containing more profile faces than the DeepFace and VGG training sets. Using a combination of these two networks and a triplet-loss embedding optimized for comparing features with the dot product, they achieved the current state-of-the-art results on the IJB-A challenge. The combination of these deep features is the basis for our analysis in Section 7.3.
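A triplet-loss embedding, as used by Parkhi et al. [2], pulls an anchor embedding closer to a positive (same identity) than to a negative (different identity) by at least a margin. A minimal sketch follows; the margin value is chosen for illustration.

```python
# Toy triplet loss over embedding vectors; margin=0.2 is illustrative.
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    # Squared Euclidean distances between embeddings; the loss is zero
    # once the anchor-positive distance is smaller than the
    # anchor-negative distance by at least `margin`.
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(d_pos - d_neg + margin, 0.0)
```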
RFR-DLVT: a hybrid method for real-time face recognition using deep learning and visual tracking
Published in Enterprise Information Systems, 2020
Zhenfeng Lei, Xiaoying Zhang, Shuangyuan Yang, Zihan Ren, Olusegun F. Akindipe
Target tracking in surveillance videos has also become an important research direction. Prior to 2010, the most commonly used methods in the field of target tracking were classical tracking methods, such as Meanshift, particle filtering, Kalman filtering, and feature-based optical flow algorithms (Wu, Lim, and Yang 2013). Since these methods cannot handle complex motion variations in videos, classical tracking methods have been used less since the advent of tracking methods based on correlation filtering and DL. In 2012, Henriques et al. (Henriques, Rui, and Martins et al. 2012) proposed the circulant structure kernel tracking method (CSK), which uses circulant matrices to perform dense sampling and the Fourier transform to achieve rapid detection; a toy sketch of this Fourier-domain detection step appears at the end of this excerpt. Its processing speed can reach 320 frames per second (FPS), which laid the foundation for the real-time application of correlation filter methods. Henriques et al. (Yi, Chen, Wang et al. 2014) proposed the kernelized correlation filter (KCF) (Joao et al. 2015) and the dual correlation filter (DCF) to improve tracking accuracy while ensuring high-speed processing. In 2013, Wang and Yeung (Wang and Yeung 2013) proposed the ‘deep learning tracking’ method, which was the first tracking algorithm to apply a deep network to single-target tracking. The multi-domain network (MDNet) method proposed by (Nam and Han 2016) used multiple types of tracking sequences to pre-train the network and fine-tuned the model during online tracking. Eventually, this method won first prize in the VOT2015 challenge. In summary, although the CNN-based FR algorithms above achieved positive results in static-image FR tasks, the accuracy of most video-based FR tasks is not high. Only DeepFace (Taigman, Yang, and Ranzato et al. 2014), approaching human performance, derived better results (DeepFace: 97.35%; human: 97.53%). In addition, due to the high time complexity of those algorithms, they cannot meet the requirements of real-time FR in videos. For example, FaceNet (Schroff, Kalenichenko, and Philbin 2015) reaches a processing speed of only 2.5 frames per second on the YTF dataset. In this paper, our main contributions are as follows:
- We adopted the video grouping method to perform face recognition on video sequences in groups, which simplifies the model calculation.
- We designed a face feature network based on a residual block structure, named 32RBSNet, to improve the accuracy of FR; a generic sketch of such a block follows this list.
- We introduced the visual tracking method into the FR framework to speed up recognition.
- We proposed a real-time video FR method based on visual tracking and deep learning, called RFR-DLVT, and achieved better performance on common datasets.
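The internal structure of 32RBSNet is not specified in this excerpt; the following is a generic sketch of a basic residual block in Keras, with illustrative filter counts and kernel sizes (the identity shortcut assumes the input already has the same channel count).

```python
# Generic residual block sketch; filters=64 and the 3x3 kernels are
# illustrative, not taken from the 32RBSNet paper.
import tensorflow as tf
from tensorflow.keras import layers

def residual_block(x, filters=64):
    shortcut = x  # identity shortcut (assumes x already has `filters` channels)
    y = layers.Conv2D(filters, 3, padding="same")(x)
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    y = layers.BatchNormalization()(y)
    # The shortcut lets gradients bypass the convolutions, which is what
    # makes deep feature networks of this kind trainable.
    return layers.ReLU()(layers.Add()([shortcut, y]))

inputs = tf.keras.Input(shape=(112, 112, 64))  # illustrative input tensor
features = residual_block(inputs)
```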
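Finally, the speed of CSK/KCF-style trackers mentioned above comes from replacing a dense spatial search with element-wise products in the Fourier domain. A toy sketch of that detection step follows; it omits the cosine window, the kernel trick, and the filter-update rule.

```python
# Toy Fourier-domain detection step behind CSK/KCF-style trackers.
import numpy as np

def detect(filter_hat, patch):
    # Element-wise product with the conjugate filter in the Fourier domain
    # equals dense circular correlation over all cyclic shifts of the patch
    # (the circulant trick), so one FFT/inverse-FFT pair replaces an
    # exhaustive spatial search.
    response = np.real(np.fft.ifft2(np.conj(filter_hat) * np.fft.fft2(patch)))
    # The peak of the response map gives the target's new position.
    return np.unravel_index(np.argmax(response), response.shape)
```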