Dimensionality Reduction
Published in Harry G. Perros, An Introduction to IoT Analytics, 2021
Discriminant analysis is a classification technique, whereby an unlabeled data point is assigned a label given a labeled data set. Specifically, we have a data set that is already grouped into different classes, i.e., each data point has been assigned a class id or a label. Discriminant analysis can be used to assign a label to a new unlabeled data point.
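As a rough illustration of assigning a label to a new point, the sketch below fits a two-class linear discriminant on two features in plain Python. The data points, the class names "A" and "B", and the helper names (`fit_discriminant`, `classify`) are assumptions made up for this example and do not come from the excerpt. It uses the pooled within-class covariance and a midpoint threshold, which is one common textbook formulation rather than the only one.

```python
# Two-class linear discriminant sketch (pure Python, two features).
# All data and names are hypothetical and for illustration only.

CLASS_A = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.0, 1.5)]
CLASS_B = [(6.0, 5.0), (7.0, 8.0), (8.0, 7.0), (7.0, 6.0)]


def mean_vector(points):
    """Return the per-feature mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]


def scatter_matrix(points, mean):
    """Return the 2x2 sum of outer products of the centered points."""
    sxx = sum((p[0] - mean[0]) ** 2 for p in points)
    syy = sum((p[1] - mean[1]) ** 2 for p in points)
    sxy = sum((p[0] - mean[0]) * (p[1] - mean[1]) for p in points)
    return [[sxx, sxy], [sxy, syy]]


def fit_discriminant(class_a, class_b):
    """Return (weights, threshold) of the linear discriminant function."""
    mean_a, mean_b = mean_vector(class_a), mean_vector(class_b)
    sa, sb = scatter_matrix(class_a, mean_a), scatter_matrix(class_b, mean_b)
    dof = len(class_a) + len(class_b) - 2
    # Pooled within-class covariance (assumes both classes share it).
    s = [[(sa[i][j] + sb[i][j]) / dof for j in range(2)] for i in range(2)]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    diff = [mean_a[0] - mean_b[0], mean_a[1] - mean_b[1]]
    # Weights: inverse pooled covariance times the difference of class means.
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Threshold: discriminant score of the midpoint between the class means.
    mid = [(mean_a[0] + mean_b[0]) / 2, (mean_a[1] + mean_b[1]) / 2]
    threshold = w[0] * mid[0] + w[1] * mid[1]
    return w, threshold


def classify(point, w, threshold):
    """Assign 'A' if the discriminant score exceeds the threshold, else 'B'."""
    score = w[0] * point[0] + w[1] * point[1]
    return "A" if score > threshold else "B"


w, threshold = fit_discriminant(CLASS_A, CLASS_B)
print(classify((2.0, 2.0), w, threshold))
print(classify((7.0, 7.0), w, threshold))
```

The midpoint threshold implicitly treats the two classes as equally likely; with unequal prior probabilities the threshold would shift accordingly.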
An applied credit scoring model
Published in Noura Metawa, Mohamed Elhoseny, Aboul Ella Hassanien, M. Kabir Hassan, Expert Systems in Finance, 2019
Esther Castro, M. Kabir Hassan, Mark Rosa
Two commonly used methodologies when working with probability of default are discriminant analysis and logistic regression. Linear discriminant analysis assumes that the explanatory variables are normally distributed. Logistic regression, however, makes no distributional assumptions about the independent variables and is thus more general.
Potential Applications of Multivariate Analysis for Modeling the Reliability of Repairable Systems—Examples Tested
Published in Mangey Ram, Modeling and Simulation Based Analysis in Reliability Engineering, 2018
Miguel Angel Navas, Carlos Sancho, Jose Carpio
The discriminant analysis procedure is designed to help distinguish between two or more data groups based on a group of p observed quantitative variables. It does so by constructing discriminant functions that are linear combinations of variables. The purpose of such an analysis is usually one or both of the following:
Integrated GIS and multivariate statistical approach for spatial and temporal variability analysis for lake water quality index
Published in Cogent Engineering, 2023
Poornasuthra Subramaniam, Ali Najah Ahmed, Chow Ming Fai, Marlinda Abdul Malek, Pavitra Kumar, Yuk Feng Huang, Mohsen Sherif, Ahmed Elshafie
Spatial discriminant analysis was performed on the water quality outcomes for the three clusters obtained from cluster analysis. Discriminant analysis aims to identify the major variables associated with the spatial variability between the three clusters. Cluster membership is the predicted variable; every water quality indicator is a predictor. Discriminant function coefficients (DFs) and classification matrices (CMs) produced by the conventional and stepwise discriminant analysis approaches are listed in Tables 8, 9 and 10. The classification function coefficients show that some indicators, such as COD and NH3-N, have negative values. Conventional discriminant analysis yields a classification with 74.2% accuracy. The stepwise approach relies on three indicators (pH, temperature and TSS), and its classification matrix has 69.7% accuracy. The discriminant analysis of the spatial water quality variables suggested that the significant variables strongly affect water quality in different areas of Kenyir Lake. The outcomes offer better insight into water catchment management and support the design of better inspection approaches to safeguard Kenyir Lake’s water quality.
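To show how a classification matrix leads to an accuracy figure of the kind quoted above, here is a minimal sketch in plain Python. The cluster labels and the actual and predicted assignments are invented for illustration; they are not the Kenyir Lake data, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical cluster assignments, for illustration only.
LABELS = [1, 2, 3]
ACTUAL = [1, 1, 1, 2, 2, 2, 3, 3, 3, 3]
PREDICTED = [1, 1, 2, 2, 2, 3, 3, 3, 3, 1]


def classification_matrix(actual, predicted, labels):
    """Rows are actual groups, columns are predicted groups."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


def accuracy(matrix):
    """Share of cases on the diagonal, i.e. correctly classified."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


matrix = classification_matrix(ACTUAL, PREDICTED, LABELS)
for row in matrix:
    print(row)
print(f"accuracy = {accuracy(matrix):.1%}")
```

Off-diagonal cells show which groups are confused with one another, which a single accuracy percentage cannot convey.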
Basketball performance is affected by the schedule congestion: NBA back-to-backs under the microscope
Published in European Journal of Sport Science, 2021
Pedro T. Esteves, Kazimierz Mikolajec, Xavier Schelling, Jaime Sampaio
Stage 3: Discriminatory power of game-related statistics between fixture congestion cycles. A discriminant analysis was used to identify the game-related statistics that best discriminate between the different fixture congestion cycles: back-to-back games; 1 day rest; 2 days rest; 3 or more days rest. We considered a cut-off of ± 0.30 for the structure coefficients (SC) to interpret the discriminant function, meaning that variables with higher absolute values were considered to better differentiate the fixture congestion cycles (Pedhazur, 1982). Validation of discriminant models was conducted by applying the leave-one-out method of cross-validation (Norusis, 2004). Cross-validation analysis evaluated the usefulness of discriminant functions when classifying new data. This method implies generating the discriminant function on all but one of the participants (n−1) and then testing for group membership on that participant. The process is repeated for each participant (n times) and the percentage of correct classifications is computed as the mean for the n trials.
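The leave-one-out procedure described above can be sketched in a few lines of plain Python. To keep the example short, a nearest-centroid classifier stands in for the discriminant function; that substitution, the sample data, and the group names "g1" and "g2" are assumptions for illustration and not part of the study.

```python
# Leave-one-out cross-validation sketch (pure Python).
# A nearest-centroid rule is used as a simple stand-in for a discriminant
# function; data and group names are hypothetical.

SAMPLES = [
    ((1.0, 1.0), "g1"), ((1.5, 2.0), "g1"), ((2.0, 1.0), "g1"),
    ((5.5, 5.5), "g1"),  # deliberately atypical member of g1
    ((6.0, 6.0), "g2"), ((7.0, 5.5), "g2"), ((6.5, 7.0), "g2"),
]


def centroid(rows):
    """Per-feature mean of a list of feature tuples."""
    n = len(rows)
    return [sum(r[k] for r in rows) / n for k in range(len(rows[0]))]


def fit_centroids(samples):
    """Return a mapping from group label to that group's centroid."""
    groups = {}
    for features, label in samples:
        groups.setdefault(label, []).append(features)
    return {label: centroid(rows) for label, rows in groups.items()}


def predict(centroids, features):
    """Return the label of the closest centroid (squared Euclidean)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(features, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def leave_one_out_accuracy(samples):
    """Fit on n-1 cases, test on the held-out case, repeat n times."""
    hits = 0
    for i, (features, label) in enumerate(samples):
        training = samples[:i] + samples[i + 1:]
        model = fit_centroids(training)
        if predict(model, features) == label:
            hits += 1
    return hits / len(samples)


loo_accuracy = leave_one_out_accuracy(SAMPLES)
print(f"leave-one-out accuracy = {loo_accuracy:.3f}")
```

Because each case is classified by a model that never saw it, the resulting percentage is a less optimistic estimate than the accuracy measured on the training data itself.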
Biomarkers of exposure and effect in a working population exposed to lead, manganese and arsenic
Published in Journal of Toxicology and Environmental Health, Part A, 2018
Daniela C Serrazina, Vanda Lopes De Andrade, Madalena Cota, Maria Luísa Mateus, Michael Aschner, Ana Paula Marreilha dos Santos
Statistical analysis was performed using IBM SPSS Statistics, version 23, for Windows. Data were expressed as mean ± SD. Kolmogorov–Smirnov and Levene’s tests revealed, respectively, lack of data normality and homoscedasticity. Therefore, for all BMs, the groups were compared by Mann–Whitney tests. Correlation analysis was performed by determining Spearman’s coefficients to assess correlations between exposure BMs and BMs of effect. Discriminant analysis was used to assess whether a set of urinary BMs might provide more reliable results than blood BMs regardless of the type of exposure of each subject (occupational or not occupational). This statistical tool enabled identification of the variables that better discriminate two or more different groups of individuals. Discriminant analysis is known to be very robust to violations of its data assumptions, but only if two conditions are met: the size of the smallest group needs to be greater than the number of predictor variables, and for each variable, the means in each group cannot be proportional to its variances (Marôco 2014). After verifying that these two conditions were met, several combinations of urinary or blood BMs were tested, achieving a compromise between the use of a minimum number of BMs and maximization of the predictive performance of the model, which was evaluated through examination of classification results tables. Results were considered significant when p values were less than 0.05.