CAD of GI Diseases with Capsule Endoscopy
Published in de Azevedo-Marques Paulo Mazzoncini, Mencattini Arianna, Salmeri Marcello, Rangayyan Rangaraj M., Medical Image Analysis and Informatics: Computer-Aided Diagnosis and Therapy, 2018
Although the BoW model works well in some classification tasks, it has notable disadvantages: it is sensitive to synonymy among visual words and fails to capture the semantic relations between them. To overcome this shortcoming, a generative model named Probabilistic Latent Semantic Analysis (PLSA) [38] has been proposed to disambiguate visual words by introducing a latent topic layer between the image and its visual words. PLSA [39] was originally introduced by Hofmann for information retrieval and was later extended to unsupervised learning and document classification [38]. In image analysis, each image is assumed to consist of multiple topics, and the occurrences of visual words in an image are the result of a mixture of topics. In this way, the PLSA model reduces the dimensionality of the image representation, since the number of topics is smaller than the number of visual words. PLSA models have already been applied in medical image analysis [40–44] and yielded good performance. The method in [40] used PLSA to analyze image features and then classify images by organ. Reza et al. [41] applied PLSA to the extracted features to obtain a more stable representation of x-ray images.
Feature Engineering for Text Data
Published in Guozhu Dong, Huan Liu, Feature Engineering for Machine Learning and Data Analytics, 2018
Chase Geigle, Qiaozhu Mei, ChengXiang Zhai
To address these issues, a generative model called Probabilistic Latent Semantic Analysis (PLSA) was introduced [23]. PLSA seeks to uncover a latent semantic representation of documents just like LSA, but it does so by constructing a probability model for the corpus and representing documents under that probability model. Thus, document representations obtained by PLSA can be interpreted by inspection because the parameters of a PLSA model are parameters of well-formed probability distributions.
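To make the probability model concrete, here is a minimal NumPy sketch (the topic and word distributions below are invented purely for illustration, not taken from any corpus): a document's PLSA representation is its distribution over latent topics, and mixing the per-topic word distributions yields a proper word distribution for that document.

```python
import numpy as np

# Hypothetical PLSA parameters: 3 documents, 2 latent topics, 4 words.
p_z_d = np.array([[0.9, 0.1],    # P(z|d): each row is a distribution over topics
                  [0.5, 0.5],
                  [0.1, 0.9]])
p_w_z = np.array([[0.4, 0.4, 0.1, 0.1],   # P(w|z): each row is a distribution
                  [0.1, 0.1, 0.4, 0.4]])  # over the 4-word vocabulary

# Word distribution of each document under the model:
# P(w|d) = sum_z P(z|d) * P(w|z)
p_w_d = p_z_d @ p_w_z
```

Each row of `p_w_d` sums to one, and inspecting `p_z_d` directly gives the interpretable document representation the excerpt refers to: every parameter is a probability in a well-formed distribution.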
Finding Clusters
Published in Wendy L. Martinez, Angel R. Martinez, Jeffrey L. Solka, Exploratory Data Analysis with MATLAB®, 2017
Wendy L. Martinez, Angel R. Martinez, Jeffrey L. Solka
Probabilistic latent semantic analysis (PLSA or PLSI) is one approach to cast the identification of latent structure of a document collection within a statistical framework. We follow Hofmann (1999a, 1999b), the originator of PLSA, in our discussions.
Factors influencing crowdsourcing riders’ satisfaction based on online comments on real-time logistics platform
Published in Transportation Letters, 2023
Yi Zhang, Xiaomin Shi, Zalia Abdul-Hamid, Dan Li, Xinle Zhang, Zhiyuan Shen
The topic model is a common machine learning technique, usually applied to text classification. Early on, some scholars proposed the latent semantic analysis (LSA) model based on singular value decomposition (SVD), which performs dimensionality reduction and information extraction on documents. Later, to deal with the polysemy of words that the LSA model faces, the Probabilistic Latent Semantic Analysis (PLSA) model, which introduces probability statistics, was put forward (Hofmann 1999). On this basis, some research introduced a three-level probabilistic topic model (Blei, Ng, and Jordan 2003), known as Latent Dirichlet Allocation (LDA), which can automatically identify the latent topics of a text and give the word probabilities of each topic. Owing to its modular structure, LDA has a distinct advantage in text-similarity processing. Some scholars use probabilistic generative models to improve the performance of sentiment analysis (Almars, Li, and Zhao 2019); others use the LDA model to uncover the factors that affect customer satisfaction (Jelodar et al. 2019). In addition, some scholars have proposed a weakly supervised joint topic-sentiment model for short text documents with sparse features to improve topic recognition and sentiment analysis (Fu et al. 2018).
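The SVD-based dimensionality reduction that the excerpt attributes to LSA can be sketched in a few lines of NumPy (the document-term counts below are invented for illustration): truncating the SVD to the top-k singular directions maps each document to a k-dimensional vector in which documents sharing vocabulary end up close together.

```python
import numpy as np

# Toy document-term count matrix (rows: documents, columns: terms).
# Documents 0-1 and documents 2-3 use disjoint vocabularies.
X = np.array([
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 1.0],
    [0.0, 0.0, 1.0, 3.0],
])

# LSA: keep only the top-k singular directions of the SVD.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
docs_k = U[:, :k] * s[:k]   # reduced k-dimensional document vectors

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
```

In the reduced space, `cosine(docs_k[0], docs_k[1])` is near 1 while `cosine(docs_k[0], docs_k[2])` is near 0: the truncation merges documents that use related terms, which is the information-extraction effect the excerpt describes.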
Artificial intelligence methods to support the research of destination image in tourism. A systematic review
Published in Journal of Experimental & Theoretical Artificial Intelligence, 2022
Angel Diaz-Pacheco, Miguel Á. Álvarez-Carmona, Rafael Guerrero-Rodríguez, Luz Angélica Ceballos Chávez, Ansel Y. Rodríguez-González, Juan Pablo Ramírez-Silva, Ramón Aranda
Although in many studies topic modelling has been used to discover dimensions of DIT such as 'Cultural attractions', 'Climate', and 'Quality of service', in many more, aspect-based sentiment analysis has been employed as a procedure for measuring the polarity of feelings in pre-established dimensions. We are confident that these dimensions were discovered through sound research methodology. However, we believe that conducting a topic modelling analysis could provide valuable information about these dimensions, or even lead to the discovery of new ones such as 'IT infrastructure' or 'Sanitisation', a very important dimension that was rarely considered in the pre-pandemic world. In a previous section of this work, we briefly described a popular algorithm for topic modelling (LDA). However, a wide range of tools is available for this task. According to Chehal et al. (2020), there are three major approaches to topic modelling: Latent Semantic Analysis (LSA), Probabilistic Latent Semantic Analysis (PLSA), and Latent Dirichlet Allocation (LDA). In LSA, the singular value decomposition technique is applied to reduce the dimensionality of a document-term matrix, and topics are found from the identified terms. PLSA is a partially generative model that identifies the different contexts in which a word is used and determines the probability of a word-document occurrence. Regarding LDA, there are several versions of this algorithm, but we would like to draw attention to the approach proposed by Ozyurt and Akcayol (2021), which has been specifically designed to perform topic discovery in short user reviews.
Review and Implementation of Topic Modeling in Hindi
Published in Applied Artificial Intelligence, 2019
Santosh Kumar Ray, Amir Ahmad, Ch. Aswani Kumar
Introduced by Thomas Hofmann in 1999, PLSA is a topic modeling method that improves on LSA (Hofmann 1999). While LSA is based on linear algebra, the PLSA model takes a more principled approach, based on a mixture decomposition derived from a latent class model called the aspect model (Hofmann, Puzicha, and Jordan 1999). In this model, each document is treated as an unordered set of words, and topics are associated with document-word pairs; words and documents are assumed to be conditionally independent given a topic. Likelihood estimation and model fitting are carried out with the Expectation-Maximization (EM) algorithm (Dempster, Laird, and Rubin 1977). PLSA identifies and distinguishes different contexts of word usage without referring to a dictionary, allowing it to disambiguate words and detect similarities by grouping words that share common contexts. Another way to present PLSA is as a matrix factorization approach, in which linear-algebra-based algorithms factorize a high-dimensional sparse document-term matrix into low-dimensional matrices with non-negative coefficients. The PLSA algorithm has been used for auto-annotation of images (Monay and Gatica-Perez 2004) and object categorization (Sivic et al. 2005).
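The EM fitting loop described above can be sketched as a toy NumPy implementation under the aspect-model parameterization P(z|d), P(w|z). This is a simplified illustration, not Hofmann's original code, and the document-term counts are invented: the E-step computes the topic posterior for each document-word pair, and the M-step re-estimates both distributions from the expected counts.

```python
import numpy as np

def plsa(counts, n_topics, n_iter=100, seed=0):
    """Fit a toy PLSA model by EM on a (docs x vocab) count matrix."""
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    # Random initialisation of P(z|d) and P(w|z), row-normalised.
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step: posterior P(z|d,w) ∝ P(z|d) P(w|z), shape (docs, vocab, topics)
        joint = p_z_d[:, None, :] * p_w_z.T[None, :, :]
        post = joint / joint.sum(axis=2, keepdims=True).clip(1e-12)
        # M-step: re-estimate parameters from expected counts n(d,w) P(z|d,w)
        weighted = counts[:, :, None] * post
        p_w_z = weighted.sum(axis=0).T
        p_w_z /= p_w_z.sum(axis=1, keepdims=True)
        p_z_d = weighted.sum(axis=1)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    return p_z_d, p_w_z

# Toy corpus: two clearly separated word blocks.
counts = np.array([[5, 4, 0, 0],
                   [4, 6, 1, 0],
                   [0, 0, 5, 6],
                   [0, 1, 4, 5]], dtype=float)
p_z_d, p_w_z = plsa(counts, n_topics=2)
```

Note the connection to the matrix-factorization view mentioned in the excerpt: the fitted product of P(z|d) and P(w|z) is a low-rank, non-negative approximation of the normalised document-term matrix, much like non-negative matrix factorization.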