Role of Dimensionality Reduction Techniques for Plant Disease Prediction
Published in Utku Kose, V. B. Surya Prasath, M. Rubaiyat Hossain Mondal, Prajoy Podder, Subrato Bharati, Artificial Intelligence and Smart Agriculture Technology, 2022
Muhammad Kashif Hanif, Shaeela Ayesha, Ramzan Talib
Singular value decomposition (SVD) is a linear approach that transforms high-dimensional data into lower dimensions using matrix decomposition and linear transformation functions. SVD can be applied to almost any dataset that can be processed as a matrix (Wang & Zhu, 2017; Wang et al., 2021). Its limitation is that it can reduce the dimensions of data only through linear projections (Zhang et al., 2018). Singular values (SVs) represent stable, rotation-invariant, and ratio-invariant features of an image used for disease recognition. However, the SVs of leaf images represent algebraic features, which may limit the ability of ML models to make precise predictions (Wang et al., 2021). SVD can preserve the essential features of an image, which offers good performance for object recognition. Several extensions of SVD have been proposed to improve its efficiency (Zhang & Wang, 2016; Wang et al., 2021).
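To make the idea concrete, here is a minimal sketch (not from the chapter; the image array is a random placeholder, and the feature length k is an arbitrary choice) of extracting the leading singular values of a grayscale leaf image as a compact feature vector:

import numpy as np

def sv_features(image, k=32):
    """Return the top-k singular values of a grayscale image as a feature vector."""
    # compute_uv=False: only the singular values are needed, not U and V
    svs = np.linalg.svd(image.astype(np.float64), compute_uv=False)[:k]
    # Normalize so the features are robust to overall intensity scaling
    return svs / (np.linalg.norm(svs) + 1e-12)

# Placeholder standing in for a real grayscale leaf image
leaf = np.random.rand(256, 256)
features = sv_features(leaf)
print(features.shape)  # (32,)

Because singular values are unchanged by rotation of the image matrix and scale together under intensity changes, such a vector can serve as the stable algebraic feature the excerpt describes.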
Principal Component Analysis
Published in Bogdan M. Wilamowski, J. David Irwin, Intelligent Systems, 2018
Anastasios Tefas, Ioannis Pitas
In many cases, PCA is implemented using the singular value decomposition (SVD) of the data matrix. SVD is an important tool for factorizing an arbitrary real or complex matrix, with many applications in various research areas, such as signal processing and statistics. SVD is widely used for computing the pseudoinverse of a matrix, for least-squares data fitting, for low-rank matrix approximation, and for rank and null-space calculation. SVD is closely related to PCA, since it gives a general solution to matrix decomposition and, in many cases, is more numerically stable than an eigendecomposition of the covariance matrix. Let X be an arbitrary N × M matrix and C = X^T X be a rank-R, square, symmetric M × M matrix. The objective of SVD is to find a decomposition of the matrix X into three matrices U, S, V of dimensions N × M, M × M, and M × M, respectively, such that X = USV^T.
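The equivalence between the two routes can be checked numerically. The following sketch (not from the chapter; a minimal NumPy illustration on random data) confirms that the eigenvectors of C = X^T X match the right singular vectors of X up to sign, and that the eigenvalues equal the squared singular values:

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
X -= X.mean(axis=0)  # center the data, as PCA assumes

# Route 1: eigendecomposition of C = X^T X
C = X.T @ X
eigvals, eigvecs = np.linalg.eigh(C)              # ascending order
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]  # flip to descending

# Route 2: SVD of X itself, X = U S V^T
U, S, Vt = np.linalg.svd(X, full_matrices=False)

# Squared singular values are the eigenvalues of C;
# rows of Vt match the eigenvectors of C up to sign.
assert np.allclose(S**2, eigvals)
assert np.allclose(np.abs(Vt), np.abs(eigvecs.T))

In practice the SVD route is preferred because forming C explicitly squares the condition number of X, which is the numerical-stability point made above.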
Feature Engineering for Text Data
Published in Guozhu Dong, Huan Liu, Feature Engineering for Machine Learning and Data Analytics, 2018
Chase Geigle, Qiaozhu Mei, ChengXiang Zhai
The first major method is called Latent Semantic Analysis (LSA) [11]. The central idea in LSA is to perform dimensionality reduction on the term-document matrix (often with weights transformed using TF-IDF weighting) to obtain a representation of documents in a lower-dimensional semantic vector space, rather than staying in the original word-based vector space. This is achieved via the application of truncated singular value decomposition to the term-document matrix (for which efficient randomized algorithms exist [22]). The goal of SVD is to find a decomposition of the input matrix M into a product of three matrices UΣV^T, where U is the matrix of left singular vectors, Σ is a diagonal matrix of singular values, and V is the matrix of right singular vectors. If we stop the computation after finding only the first k singular values, we have the truncated SVD. The example input and output of the truncated SVD for LSA is given in Figure 2.5. Geometrically, we can think of the left singular vectors (multiplied by the singular values matrix) as mapping each term into a reduced k-dimensional space. Similarly, we can think of the right singular vectors (multiplied by the singular values matrix) as mapping each document into a reduced k-dimensional space. Each dimension in this reduced space can be thought of as representing a latent concept.
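A minimal sketch of this pipeline (not from the chapter; the toy corpus is invented, and scikit-learn's TfidfVectorizer and TruncatedSVD stand in for the weighting and randomized truncated SVD described above):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

docs = [
    "the cat sat on the mat",
    "dogs and cats are pets",
    "stock markets fell sharply today",
    "investors sold shares as markets dropped",
]

# TF-IDF-weighted document-term matrix (sklearn's row/column convention)
tfidf = TfidfVectorizer()
M = tfidf.fit_transform(docs)

# Truncated SVD with k = 2 latent concepts, computed by a randomized algorithm
lsa = TruncatedSVD(n_components=2, algorithm="randomized", random_state=0)
doc_vectors = lsa.fit_transform(M)  # each document mapped into the k-dim space

print(doc_vectors.shape)  # (4, 2): one 2-d concept vector per document

With real corpora, the two concept dimensions here would typically separate the "pets" documents from the "markets" documents, illustrating how each latent dimension captures a topic-like concept.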
Real-time anomaly detection methodology in the electric submersible pump systems
Published in Petroleum Science and Technology, 2022
Long Peng, Guoqing Han, Arnold Landjobo Pagou
The PCA model proposed in this paper contains two processes: the first is model construction and the other is model prediction. The model construction process aims to create a new principal component space (PCs) with which to reassess the original ESP system. More than 170,000 historical data points from 110 stably operating ESP wells were used to construct a robust PCA model. Missing information and zero values among the selected stable operating data were removed. Moreover, all the stable operating data were normalized using z-scores. The singular value decomposition (SVD) is a factorization of a real matrix that generalizes the eigendecomposition of a square matrix. SVD was applied to compute the eigendecomposition of the coefficient matrix of the obtained datasets. Each principal component has a specific eigenvalue, and each explains a different percentage of the total variance. Figure 1 shows the construction of the principal component space (PCs). Seven principal components captured over 87% of the total variance of the previously selected parameters, and the top-ranked principal component had the largest eigenvalue. Monitoring these seven principal components provides a better way to detect ESP trips or failures in a lower dimension. Once the principal component space (PCs) is constructed, each principal component can be represented by a corresponding eigenvector combined with the variable parameters.
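A minimal sketch of the construction step under these assumptions (z-score normalization followed by SVD-based PCA, retaining enough components to reach a variance target; this is not the authors' code, and the data array is a random placeholder for the ESP sensor parameters):

import numpy as np

def build_pca_space(X, variance_target=0.87):
    """Construct a PC space capturing at least `variance_target` of total variance."""
    # z-score normalization of each parameter column
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # SVD of the normalized data; eigenvalues of the coefficient matrix are s**2/(n-1)
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    eigvals = s**2 / (len(X) - 1)
    explained = np.cumsum(eigvals) / eigvals.sum()
    k = int(np.searchsorted(explained, variance_target)) + 1
    return Vt[:k], explained[k - 1]  # eigenvectors defining the PC space, coverage

# Placeholder: rows are stable-operation samples, columns are ESP parameters
X = np.random.rand(1000, 12)
components, coverage = build_pca_space(X)
print(components.shape, f"{coverage:.0%} variance captured")

New operating data would then be projected onto these eigenvectors, and deviations in the reduced space monitored for trips or failures.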
Automated categorization of student's query
Published in International Journal of Computers and Applications, 2022
Naveen Kumar, Hare Krishna, Shashi Shubham, Prabhu Padarbind Rout
The proposed platform's core has a model that categorizes students' queries. Text classification is not a new area; the authors of [6–17] have already explored it. This article uses text classification for the classification of students' queries. Query categorization can be divided into a series of tasks such as data collection, text preprocessing (data filtering, tokenization, stemming, stop word removal, and vectorization), feature reduction (dimensionality reduction or feature selection), classification, and performance evaluation. There are many dimensionality reduction approaches, e.g. Principal Component Analysis (PCA), Singular Value Decomposition (SVD), Linear Discriminant Analysis, Isomap Embedding, and Locally Linear Embedding. SVD is better suited for sparse data (data with many zeros) [18]. Therefore, the proposed platform uses Singular Value Decomposition (SVD) for dimensionality reduction. Five different machine learning approaches, i.e. the Naïve Bayes (NB) classifier [19], Multi-Layer Perceptron with Back Propagation (MLP with BP) [20], K-Nearest Neighbours (KNN) [21], Support Vector Machine (SVM) [22], and Random Forest (RF) classifier [23], are used for categorizing the queries. These classifiers are used because they have different natures; this helps determine which class of classifiers is suitable for query categorization. Ten-fold cross-validation is used to evaluate the performance of the classifiers on four different metrics, i.e. Accuracy, Precision, Recall, and F1-Measure. The results for various dimensions and folds are shown using box plots.
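A minimal sketch of such a pipeline (not from the article; the corpus, labels, and reduced dimension are placeholders, and the tiny corpus is replicated only so that ten-fold cross-validation can run), pairing SVD-based reduction with one of the listed classifiers:

from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.svm import SVC
from sklearn.model_selection import cross_validate

# Tiny placeholder corpus, replicated so each class has 10 samples
queries = ["how do I reset my password", "exam schedule for semester two",
           "library book renewal request", "hostel fee payment deadline"] * 10
labels = ["it", "academics", "library", "accounts"] * 10

# TF-IDF yields a sparse matrix, for which TruncatedSVD is well suited
pipeline = make_pipeline(
    TfidfVectorizer(stop_words="english"),
    TruncatedSVD(n_components=2, random_state=0),  # placeholder dimension
    SVC(kernel="linear"),
)

# Ten-fold cross-validation over the four metrics used in the article
scores = cross_validate(pipeline, queries, labels, cv=10,
                        scoring=["accuracy", "precision_macro",
                                 "recall_macro", "f1_macro"])
print(scores["test_accuracy"].mean())

Swapping SVC for the NB, MLP, KNN, or RF estimators would reproduce the comparison across classifier families described above.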
When contextual information meets recommender systems: extended SVD++ models
Published in International Journal of Computers and Applications, 2022
Maryam Jallouli, Sonia Lajmi, Ikram Amous
With the advent of Web 2.0, the Internet has been widely used, which has changed the human lifestyle. When a user surfs the net or opens an application, recommendation lists that may be of interest to him or her appear. These recommendations are based on historical browsing data, evaluation information, etc. Systematic surveys of recommender systems are provided in [1] and [2]. In fact, the Singular Value Decomposition (SVD) model [3] is the most popular method among collaborative filtering approaches; it provides recommendations by considering both user and item bias information. Moreover, the SVD++ model [3] is a derivative of SVD that achieves better recommendation accuracy by also taking implicit feedback information into account.
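For reference, a minimal sketch of the SVD++ prediction rule as it is commonly stated (parameter names follow the usual notation for this model, not necessarily that of [3]; the toy factor vectors are random placeholders):

import numpy as np

def svdpp_predict(mu, b_u, b_i, q_i, p_u, Y_u):
    """SVD++: r_hat = mu + b_u + b_i + q_i . (p_u + |N(u)|^-1/2 * sum_j y_j).

    mu  -- global mean rating
    b_u -- user bias; b_i -- item bias
    q_i -- latent factor vector of the item
    p_u -- explicit latent factor vector of the user
    Y_u -- implicit-feedback factor vectors y_j for items the user interacted with
    """
    implicit = Y_u.sum(axis=0) / np.sqrt(len(Y_u)) if len(Y_u) else 0.0
    return mu + b_u + b_i + q_i @ (p_u + implicit)

# Toy example: 3 latent factors, 2 implicitly rated items
rng = np.random.default_rng(1)
print(svdpp_predict(3.5, 0.1, -0.2,
                    rng.normal(size=3), rng.normal(size=3),
                    rng.normal(size=(2, 3))))

The implicit-feedback sum is what distinguishes SVD++ from plain SVD: a user's representation is shifted by the items they have interacted with, even when no explicit rating was given.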