Statistical classification – Knowledge and References

A Problem Solving System for Data Analysis, Pattern Classification and Recognition

Published in Abraham Kandel, Gideon Langholz, Lotfi A. Zadeh, Hybrid Architectures for Intelligent Systems, 2020

The main objective in classification problems is to generate a classification rule, or classifier, that systematically predicts the class to which a case belongs [6, 7]. A classifier is a partition of the measurement space χ, which contains all possible measurement vectors, into J disjoint subsets A1, A2, ..., AJ, with χ = ∪j Aj, such that for every measurement vector x ∈ Aj the predicted class is j. Classifiers are constructed from the measurement data on N cases observed in the past, together with their actual classifications. Two major approaches are currently used: classical pattern recognition methods based on statistics, and artificial neural networks. In this section we review the three classical statistical classification methods used in our system, namely Bayes' rule, Fisher's discriminant function, and the K-nearest-neighbor rule.
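As an illustration of one of the three methods named above, the following is a minimal sketch of the K-nearest-neighbor rule, assuming Euclidean distance and a simple majority vote. The function name `knn_predict`, the toy training cases, and the choice k = 3 are illustrative assumptions and are not taken from the chapter.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, x, k=3):
    """Predict the class of x by majority vote among its k nearest past cases.

    Assumes Euclidean distance; train_x holds measurement vectors and
    train_y the actual classes observed for those cases.
    """
    # Pair each past case's distance to x with its known class,
    # then sort by distance only.
    by_distance = sorted(
        ((math.dist(xi, x), yi) for xi, yi in zip(train_x, train_y)),
        key=lambda pair: pair[0],
    )
    # Count the classes of the k closest cases and return the most common one.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: N = 6 past cases in a 2-D measurement space, J = 2 classes.
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = [1, 1, 1, 2, 2, 2]

print(knn_predict(train_x, train_y, (0.5, 0.5)))  # nearest cases all belong to class 1
print(knn_predict(train_x, train_y, (5.5, 5.5)))  # nearest cases all belong to class 2
```

In the partition view described above, the set Aj is simply the collection of all points x for which `knn_predict` returns j.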

Automated Biventricular Cardiovascular Modelling from MRI for Big Heart Data Analysis

Published in Ervin Sejdić, Tiago H. Falk, Signal Processing and Machine Learning for Biomedical Big Data, 2018

Kathleen Gilbert, Xingyu Zhang, Beau Pontré, Avan Suinesiaputra, Pau Medrano-Gracia, Alistair Young

A logistic regression model [52] was fitted after the PCA was complete. The model identified which shape modes from the PCA were most strongly associated with the differences between MI patients and the asymptomatic patients. The weights of the PCA components (those accounting for up to 90% of the total variability) were used as predictors for classification. In statistics, logistic regression is a type of probabilistic statistical classification model used to predict a binary response from continuous, binary, or categorical variables. MESA cases (non-patients) were assigned a label of zero, whereas DETERMINE cases (patients) were assigned a label of one. These labels were used to obtain the coefficients of the regression models. Thus, the following equation can be used to calculate the probability that a new case belongs to the patient class [53]: $$P = \frac{1}{1 + \exp\left(-\left(\beta_0 + \sum_i \beta_i X_i\right)\right)},$$
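The logistic probability for a new case can be computed as in the sketch below. It assumes the standard logistic form, with intercept β0 and one coefficient βi per predictor Xi. The function name `patient_probability` and the coefficient values are hypothetical placeholders, not the fitted values from the study.

```python
import math


def patient_probability(beta0, betas, x):
    """Return P = 1 / (1 + exp(-(beta0 + sum_i beta_i * x_i)))."""
    z = beta0 + sum(b * xi for b, xi in zip(betas, x))
    # Two algebraically equivalent forms, chosen so that exp() never
    # receives a large positive argument and overflows.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical coefficients and PCA-mode weights for one new case.
beta0 = 0.0
betas = [1.0, -1.0]
x_new = [2.0, 2.0]

p = patient_probability(beta0, betas, x_new)
print(p)  # 0.5, because the linear predictor is exactly zero here
```

With the labelling described above (zero for non-patients, one for patients), a value of P above a chosen threshold such as 0.5 would assign the case to the patient class. The threshold is itself an assumption, not something stated in the excerpt.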

Statistical Discriminant Functions

Published in Sing-Tze Bow, Pattern Recognition and Image Preprocessing, 2002

Sing-Tze Bow

So far, the formulation of the statistical classification problem and the optimum discriminant function for normally distributed patterns have been discussed. The next problem of interest is how to determine the unknown probability density function. One way of doing this is by functional approximation. Assume that we wish to approximate $p(x|\omega_i)$ by a set of functions $\phi_k(x)$, $k = 1, \ldots, K$: $$\hat{p}(x|\omega_i) = \sum_{k=1}^{K} c_{ik}\,\phi_k(x)$$
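To make the expansion concrete, the sketch below evaluates the weighted sum of basis functions for a single class. The excerpt does not say which basis functions are used or how the coefficients are obtained, so the monomial basis and the coefficient values here are purely hypothetical.

```python
def approx_density(x, coeffs, basis):
    """Evaluate p_hat(x | w_i) = sum_k c_ik * phi_k(x) for one class i."""
    return sum(c * phi(x) for c, phi in zip(coeffs, basis))


# Hypothetical basis of K = 3 functions: phi_1(x) = 1, phi_2(x) = x, phi_3(x) = x^2.
basis = [lambda x: 1.0, lambda x: x, lambda x: x * x]

# Hypothetical coefficients c_i1, c_i2, c_i3 for class i.
coeffs_i = [0.5, 0.0, -0.125]

print(approx_density(0.0, coeffs_i, basis))  # 0.5
print(approx_density(2.0, coeffs_i, basis))  # 0.0
```

A finite expansion of this kind is not automatically non-negative or normalized, so in practice the basis and coefficients have to be chosen so that the result behaves like a density over the region of interest.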

A Comparison of Lucene Search Queries Evolved as Text Classifiers

Published in Applied Artificial Intelligence, 2018

Laurence Hirsch, Teresa Brunsdon

Such a method, sometimes referred to as "knowledge engineering," provides accurate rules and has the additional benefit of being human-understandable: the definition of the category is meaningful to a human, which opens up additional uses of the rule, including category verification. The disadvantage, however, is that constructing the required rules demands significant human input from people with knowledge of both the domain and rule construction (Apté et al. 1994). Since the 1990s, the machine learning approach to text categorization has become dominant, requiring only a set of pre-classified training documents and an automated classifier. A wide variety of statistical classification systems have been developed, for example Naive Bayes, k-nearest neighbor, support vector machines (SVMs), and neural networks (Baharudin, Lee, and Khan 2010). Probabilistic classifiers based on numerical models often require hundreds or thousands of features and are not open to human interpretation or maintenance.