Implementation of Data-Driven Approaches for Condition Assessment of Structures and Analyzing Complex Data
Published in M.Z. Naser, Leveraging Artificial Intelligence in Engineering, Management, and Safety of Infrastructure, 2023
Vafa Soltangharaei, Li Ai, Paul Ziehl
Principal component analysis (PCA) is a method for reducing the dimensionality of a data set. Many features can be extracted from AE signals, such as duration, counts, amplitude, peak frequency, and energy. However, working with all of the features and finding the relations among them is difficult. PCA reduces the dimensionality of a data set by projecting the data onto a new set of coordinate axes. The input to PCA is a matrix whose columns are features (variables) and whose rows are observations (hits). PCA first computes the covariance matrix of the input matrix. Eigenvalue analysis is then conducted on the covariance matrix, yielding eigenvalues and eigenvectors. The number of eigenvalues and eigenvectors equals the number of features in the input matrix, and each eigenvector has as many components as there are features. The eigenvalues, together with their corresponding eigenvectors, are sorted from largest to smallest. The original input matrix is then transformed into the new space by multiplying it by the matrix whose columns are the eigenvectors. Based on the eigenvalues, the least important principal components can be discarded without losing a significant amount of information.
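The procedure above maps directly onto a few lines of NumPy. The sketch below assumes a hypothetical AE feature matrix X (rows are hits, columns are features such as duration, counts, and amplitude); the data and the 95% variance cutoff are illustrative, not taken from the chapter.

```python
import numpy as np

# Placeholder AE feature matrix: 500 hits x 5 features (duration, counts,
# amplitude, peak frequency, energy). Real AE data would be loaded instead.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))

Xc = X - X.mean(axis=0)                 # center each feature
cov = np.cov(Xc, rowvar=False)          # covariance matrix of the input
eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalue analysis (symmetric matrix)

order = np.argsort(eigvals)[::-1]       # sort from largest to smallest eigenvalue
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

scores = Xc @ eigvecs                   # project the hits onto the principal axes

# Discard the least important components, keeping enough to explain ~95% of the variance.
explained = np.cumsum(eigvals) / eigvals.sum()
k = int(np.searchsorted(explained, 0.95) + 1)
reduced = scores[:, :k]
print(f"kept {k} of {X.shape[1]} components ({explained[k - 1]:.1%} of the variance)")
```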
Fog Computing and Machine Learning
Published in Ravi Tomar, Avita Katal, Susheela Dahiya, Niharika Singh, Tanupriya Choudhury, Fog Computing, 2023
Kaustubh Lohani, Prajwal Bhardwaj, Ravi Tomar
Implementing ML in fog computing also saves the human effort required to deliver and maintain IoT services, because ML models can operate with little to no human intervention. They can also detect or predict potential malfunctions and alert the system administrator before a malfunction affects any service. Ability to perform tasks without explicit programming: Fog nodes collect a large variety of data, and useful information needs to be extracted from it. Extraction reduces the data size, which matters because storage and network bandwidth are limited in a fog system. A rule-based feature extraction system can be implemented to discard features that are not relevant to the analysis. However, given the variety of data collected in an IoT system, a customized rule engine must be designed for each type of collected data to filter out the valuable information. PCA, an unsupervised ML technique, can instead be used for dimensionality reduction of the collected data. PCA constructs a set of new features called principal components (PCs). Usually, a small subset of principal components is sufficient to explain a large share of the variance in the original data. Unlike the rule-based method, PCA performs this feature extraction without being explicitly programmed for each data type.
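As a rough illustration of that last point, the sketch below uses scikit-learn's PCA to compress a synthetic block of fog-node readings down to the components that explain 90% of the variance; the array, its shape, and the 90% target are assumptions for the example, not values from the chapter.

```python
import numpy as np
from sklearn.decomposition import PCA

# Placeholder for readings collected at a fog node: 1000 samples x 20 raw features,
# generated from a few latent signals plus noise so the data are compressible.
rng = np.random.default_rng(1)
latent = rng.normal(size=(1000, 3))
readings = latent @ rng.normal(size=(3, 20)) + 0.1 * rng.normal(size=(1000, 20))

# Passing a fraction keeps just enough principal components to explain 90% of
# the variance: no hand-written rule engine per data type is needed.
pca = PCA(n_components=0.90)
compressed = pca.fit_transform(readings)

print(compressed.shape)                               # far fewer columns to store and transmit
print(f"{pca.explained_variance_ratio_.sum():.1%} of the variance retained")
```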
Machine Learning
Published in Seyedeh Leili Mirtaheri, Reza Shahbazian, Machine Learning Theory to Applications, 2022
Seyedeh Leili Mirtaheri, Reza Shahbazian
Principal Component Analysis (PCA) is a dimensionality reduction method that is often used on large datasets, transforming a large set of variables into a smaller one that still contains most of the information in the original set. Reducing the number of variables of a data set naturally comes at the expense of accuracy, but the trick in dimensionality reduction is to trade a little accuracy for simplicity. Because smaller datasets are easier to explore and visualize, and there are no extraneous variables to process, machine learning algorithms can analyze the data faster and more easily. In a nutshell, the idea of principal component analysis is to reduce the number of variables of a dataset while preserving as much information as possible.
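The accuracy-for-simplicity trade-off can be made concrete with a small scikit-learn sketch; the synthetic dataset, its size, and the component counts below are illustrative assumptions, not taken from the text.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic dataset with correlated variables: 300 observations x 10 variables.
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 10)) @ rng.normal(size=(10, 10))

for k in (2, 5, 10):
    pca = PCA(n_components=k).fit(X)
    X_approx = pca.inverse_transform(pca.transform(X))       # rebuild X from k components
    loss = np.linalg.norm(X - X_approx) / np.linalg.norm(X)  # relative reconstruction error
    print(f"{k:2d} components: {pca.explained_variance_ratio_.sum():.1%} variance kept, "
          f"{loss:.1%} information lost")
```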
Spatio-temporal variations in the ecological vulnerability of the Upper Mzingwane sub-catchment of Zimbabwe
Published in Geomatics, Natural Hazards and Risk, 2023
Bright Chisadza, France Ncube, Margaret Macherera, Tsitsi Bangira, Onalenna Gwate
PCA is a data pre-processing method that reduces the indicator data to a few principal components, thereby replacing the original variables with a smaller set (Wei et al. 2020). To simplify handling of the indicator data and to reveal the relationships between the different indicators, the PCA approach retains as much of the information contained in the analysed data as possible. The approach identifies a small number of principal components that are more comprehensive than the original indicators. Since a small number of principal components represent most of the information in the primary data, the primary forces influencing ecological vulnerability can be identified. PCA was conducted separately for the indicators in each subcategory, following a procedure similar to that of Dossou et al. (2021) and Cao et al. (2022). For example, the six indicators in the hydro-climate category were subjected to PCA. Four principal components were then selected to calculate the hydro-climate vulnerability, with a cumulative contribution of the principal components exceeding 90% adopted as the default standard for this study, as indicated by Cao et al. (2022). The same approach was used for the socio-economic, land resource, and topographic indicators; in each of these categories, all indicators were included in the PCA. Table 3 shows the eigenvalues and cumulative contributions of the principal components for the hydro-climate indicators for 1990–2020.
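The selection rule described above (run PCA per indicator category and keep the leading components whose cumulative contribution exceeds 90%) can be sketched as follows; the function name, the placeholder hydro-climate matrix, and its dimensions are hypothetical and do not reproduce the study's data.

```python
import numpy as np
from sklearn.decomposition import PCA

def select_components(indicator_matrix, threshold=0.90):
    """Keep the smallest set of leading PCs whose cumulative contribution exceeds `threshold`."""
    pca = PCA().fit(indicator_matrix)
    cumulative = np.cumsum(pca.explained_variance_ratio_)
    n_keep = int(np.searchsorted(cumulative, threshold) + 1)
    scores = pca.transform(indicator_matrix)[:, :n_keep]
    return scores, pca.explained_variance_[:n_keep], cumulative[n_keep - 1]

# Placeholder hydro-climate matrix: 31 yearly observations x 6 indicators.
rng = np.random.default_rng(3)
hydro_climate = rng.normal(size=(31, 6))

scores, eigenvalues, kept = select_components(hydro_climate)
print(scores.shape, eigenvalues.round(2), f"{kept:.1%} cumulative contribution")
```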
Fluoride and Metals in the Agricultural Soils of Mica Mining Areas of Jharkhand, India: Assessing the Ecological and Human Health Risk
Published in Soil and Sediment Contamination: An International Journal, 2023
PCA and CA are two important tools used for statistical source apportionment of contaminants in various environmental matrices. PCA reduces the dimensionality of the data by converting the measured variables into a smaller number of artificial variables, the principal components (PCs). The extracted components with eigenvalues greater than unity are retained for analysis, since they account for the maximum data variance, and each PC is then linked to a source of the variables (contaminants) (Kolsi et al. 2013). To decrease the role of insignificant variables, varimax rotation is applied to the extracted PCs (Closs and Nichol 1975). Each of the considered variables has an associated weight factor, called the loading, which is the correlation between the component and the variable.
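A hedged sketch of this workflow, not the authors' own code: extract the PCs with eigenvalues greater than unity from standardised contaminant data and apply a varimax rotation to the loadings. The varimax routine is a standard iterative implementation, and the data matrix and its dimensions are placeholders.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def varimax(loadings, gamma=1.0, max_iter=100, tol=1e-6):
    """Standard iterative varimax rotation of a loadings matrix."""
    p, k = loadings.shape
    R = np.eye(k)
    d_old = 0.0
    for _ in range(max_iter):
        L = loadings @ R
        u, s, vt = np.linalg.svd(
            loadings.T @ (L**3 - (gamma / p) * L @ np.diag((L**2).sum(axis=0))))
        R = u @ vt
        if s.sum() < d_old * (1 + tol):
            break
        d_old = s.sum()
    return loadings @ R

# Placeholder: 60 soil samples x 8 measured variables (metals, fluoride, ...).
rng = np.random.default_rng(7)
X = rng.normal(size=(60, 8))
Z = StandardScaler().fit_transform(X)          # standardise, i.e. PCA on the correlation matrix

pca = PCA().fit(Z)
keep = pca.explained_variance_ > 1.0           # retain components with eigenvalues greater than unity
loadings = pca.components_[keep].T * np.sqrt(pca.explained_variance_[keep])
rotated = varimax(loadings)                    # sharpen high vs. low loadings for each PC
print(rotated.round(2))                        # weights linking variables to putative sources
```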
Study of hourly site-pair correlations of the clear-sky index and its predictor variables for long-term resource planning of solar cities
Published in International Journal of Ambient Energy, 2022
Correlation values are calculated for each site pair in each of the eight cities. The correlation values for the predictor variables, i.e. dm,n, ρWSm,n, ρRHm,n, and ρTm,n, are fed to the SPSS 16.0 software for principal component analysis. Principal component analysis (PCA) is a data reduction technique used to reduce the predictor variables to a smaller number of components that account for most of the variance in the original variables. In the principal component analysis, the Kaiser-Meyer-Olkin (KMO) test value is examined first. If the KMO value is less than 0.6, the variables cannot be reduced into underlying components; if it is greater than 0.6, the variables can be reduced into a smaller number of principal components. The number of principal components is determined either from the bend in the scree plot or from the total-variance-explained table. Components that together explain more than 85% of the variance are extracted. Next comes the component matrix, which shows the loadings of the variables on the extracted principal components; a component contributes more to a variable when the absolute value of the loading is larger. Finally, scores for the principal components are calculated.
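A rough Python equivalent of this SPSS workflow is sketched below, assuming the third-party factor_analyzer package for the KMO statistic and scikit-learn for the PCA itself; the placeholder matrix of site-pair correlation values, its shape, and the variable layout are assumptions for illustration.

```python
import numpy as np
from factor_analyzer.factor_analyzer import calculate_kmo
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Placeholder: 120 site pairs x 4 predictor-variable correlations (d, WS, RH, T),
# generated from two latent signals so the columns are correlated.
rng = np.random.default_rng(11)
latent = rng.normal(size=(120, 2))
corr_pairs = latent @ rng.normal(size=(2, 4)) + 0.3 * rng.normal(size=(120, 4))

_, kmo_total = calculate_kmo(corr_pairs)
if kmo_total < 0.6:
    print(f"KMO = {kmo_total:.2f}: variables cannot be reduced into underlying components")
else:
    Z = StandardScaler().fit_transform(corr_pairs)
    pca = PCA().fit(Z)
    cumulative = np.cumsum(pca.explained_variance_ratio_)
    n_keep = int(np.searchsorted(cumulative, 0.85) + 1)   # explain more than 85% of variance
    # Component matrix: loadings of each variable on the extracted components.
    loadings = pca.components_[:n_keep].T * np.sqrt(pca.explained_variance_[:n_keep])
    scores = pca.transform(Z)[:, :n_keep]                 # principal component scores
    print(f"KMO = {kmo_total:.2f}, {n_keep} components extracted")
    print(loadings.round(2))
```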