A Study on the Influence of Angular Signature of Landmark Induced Triangulation in Recognizing Changes in Human Emotion
Published in Computational Intelligence for Human Action Recognition (Sourav De and Paramartha Dutta, eds.), 2020
Md Nasir, Paramartha Dutta, Avishek Nandi
Recognition of temporal changes in human facial expressions has attracted substantial research interest in recent years. Of late, Affective Computing [3], which deals with the intelligent recognition of human emotion, has drawn considerable attention from the research community. Understanding human emotion is very difficult for a machine, but Affective Computing enables a machine to perceive, interpret, and integrate emotions. According to [4], facial expressions at different levels are regulated by varying facial activity: the action of facial muscles, individually or in groups, causes changes in facial behavior. Paul Ekman and Friesen [4] grouped human emotions into six basic expression labels: anger, disgust, fear, happiness, sadness, and surprise. They introduced the Facial Action Coding System (FACS), which describes the movement of facial landmarks in terms of the temporal profiles of action units (AUs). The authors in [5] observed that detecting accurate landmark points on the face is a harder task than facial expression classification itself. The Active Appearance Model (AAM) [2] combines texture and shape models to provide landmark points on the face. In [6], the authors note that a Facial Expression Recognition System (FERS) can be developed using two main approaches: a static approach, which extracts geometric features from still images, and a dynamic approach, which utilizes video frames. Tracking landmark points across video frames is more challenging than detecting them in static images due to the high dimensionality of the data. The authors in [7], [8], and [9] used static images for emotion recognition. Finding the best geometric representation of an image or image sequence alone is not sufficient to classify human emotion; an effective classifier also plays an important role. Approaches such as [10], [11], and [12] used the Support Vector Machine (SVM), Artificial Neural Network (ANN), and Naïve Bayes (NB) classifier, respectively, to discriminate facial expressions into the basic labels.
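As a minimal sketch of the pipeline this excerpt describes, and not the authors' implementation, the steps can be approximated as: take detected landmark points, form triangles over them, compute the interior angles of each triangle as an angular signature, and feed the resulting feature vector to an SVM. The landmark coordinates, triangle index choices, and scikit-learn usage below are illustrative assumptions.

```python
# Hypothetical sketch: angular signatures from landmark-induced triangles + SVM.
# Landmarks, triangle indices, and labels are illustrative, not the authors' data.
import numpy as np
from sklearn.svm import SVC

def triangle_angles(p1, p2, p3):
    """Interior angles (radians) of the triangle formed by three 2-D landmarks."""
    a = np.linalg.norm(p2 - p3)  # side opposite p1
    b = np.linalg.norm(p1 - p3)  # side opposite p2
    c = np.linalg.norm(p1 - p2)  # side opposite p3
    # Law of cosines at each vertex; clip guards against floating-point rounding.
    A = np.arccos(np.clip((b**2 + c**2 - a**2) / (2 * b * c), -1.0, 1.0))
    B = np.arccos(np.clip((a**2 + c**2 - b**2) / (2 * a * c), -1.0, 1.0))
    C = np.pi - A - B
    return A, B, C

def angular_signature(landmarks, triangles):
    """Concatenate the angles of every chosen triangle into one feature vector."""
    feats = []
    for i, j, k in triangles:
        feats.extend(triangle_angles(landmarks[i], landmarks[j], landmarks[k]))
    return np.asarray(feats)

# Toy usage: 68 random "landmarks" per face and a few arbitrary triangles.
rng = np.random.default_rng(0)
triangles = [(30, 36, 45), (48, 54, 8), (21, 22, 27)]  # illustrative indices
X = np.array([angular_signature(rng.random((68, 2)), triangles) for _ in range(40)])
y = rng.integers(0, 6, size=40)  # six basic-emotion labels

clf = SVC(kernel="rbf").fit(X, y)
print(clf.predict(X[:5]))
```

Angles are a natural choice of geometric feature here because they are invariant to the translation and scale of the face in the frame, which raw landmark coordinates are not.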
Integrating Feature Extractors for the Estimation of Human Facial Age
Published in Applied Artificial Intelligence, 2019
The first public-domain database used for age estimation is FG-NET (Lanitis, Taylor, and Cootes 2002), which contains 1002 images of 82 individuals whose ages range from 0 to 69 years. Many researchers have used the FG-NET database for age estimation and classification (Fu and Huang 2008; Izadpanahi and Toygar 2014; Mirzaei and Toygar 2011). A comprehensive survey of research on facial aging using the FG-NET aging database was published by Panis et al. in 2016 (Panis et al. 2016). One of the earliest steps in age estimation is extracting suitable visual feature descriptors. These features should be robust within the same age and discriminative across different ages; additionally, the features' dimensionality and computation time should be kept low. Some methods rely on flexible shape and appearance models such as the Active Appearance Model (AAM) and Active Shape Model (ASM). These statistical methods model aging patterns (Cootes, Edwards, and Taylor 1998; Panis et al. 2016; Scandrett, Solomon, and Gibson 2006) by capturing the fundamental modes of variation in intensity and shape found in a series of facial images and encoding face signatures based on these characteristics. Other approaches apply various feature extractors and then use a classification or regression method to estimate the age. For instance, the Bio-Inspired Features (BIF) scheme, which classifies images by mimicking the recognition process of the visual cortex, has frequently been applied to age estimation (Geng, Yin, and Zhou 2013; Riesenhuber and Poggio 1999). The BIF method is a feed-forward network structure consisting of several cascaded convolutional and pooling layers. In the convolutional layer, an input image is convolved with a bank of multi-scale, multi-orientation Gabor filters; in the pooling layer, the results are down-sampled using a MAX operation. Guo et al. (2009) constructed a simplified, two-layer version of BIF for age estimation and manually tuned the filter bank specifications. These extracted features are also used in their subsequent studies (Guo and Mu 2011).
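The following is a minimal sketch of one BIF-style layer as described above: a Gabor convolution stage followed by MAX pooling. The filter frequencies, orientations, and pooling window are illustrative assumptions, not the settings tuned by Guo et al. (2009).

```python
# Sketch of a BIF-style layer: multi-scale, multi-orientation Gabor convolution
# followed by MAX pooling. All parameter values are illustrative assumptions.
import numpy as np
from scipy.signal import convolve2d
from skimage.filters import gabor_kernel

def bif_features(image, frequencies=(0.1, 0.2), n_orientations=4, pool=4):
    """Return pooled Gabor responses concatenated into one feature vector."""
    feats = []
    for freq in frequencies:                     # multi-scale
        for t in range(n_orientations):          # multi-orientation
            theta = t * np.pi / n_orientations
            kernel = np.real(gabor_kernel(freq, theta=theta))
            resp = np.abs(convolve2d(image, kernel, mode="same", boundary="symm"))
            # MAX pooling: down-sample by taking the max over pool x pool blocks.
            h, w = resp.shape
            h, w = h - h % pool, w - w % pool
            pooled = (resp[:h, :w]
                      .reshape(h // pool, pool, w // pool, pool)
                      .max(axis=(1, 3)))
            feats.append(pooled.ravel())
    return np.concatenate(feats)

# Toy usage on a random 32x32 grayscale "face" crop.
face = np.random.default_rng(0).random((32, 32))
print(bif_features(face).shape)  # 2 scales x 4 orientations x (8x8) pooled maps
```

The MAX pooling stage is what gives BIF its tolerance to small shifts and scale changes: only the strongest local filter response within each block survives, so minor misalignments between faces do not alter the feature vector much.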