Explore chapters and articles related to this topic
Multivariate Normal Distribution
Published in Jhareswar Maiti, Multivariate Statistical Modeling in Engineering and Management, 2023
Statistical distance plays a significant role in multivariate data analysis. To understand it fully, we start with Euclidean distance and then gradually work toward the deeper meaning of statistical distance. The Euclidean distance is a measure of the distance between two points in a Cartesian coordinate system. For example, the distance between two cities can be measured with two-dimensional Cartesian coordinates. With its axes denoted by X1 and X2, the Cartesian coordinate system can be represented as in Figure 5.1. The distance between the origin O (0,0) and a point P (x1, x2) can be written as: $d(OP)=\sqrt{x_1^2+x_2^2}$
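As a quick check of the origin-to-point distance, the square root of the sum of the squared coordinates, consider a hypothetical point P (3, 4), chosen here only for illustration:

```latex
d(OP) = \sqrt{x_1^2 + x_2^2} = \sqrt{3^2 + 4^2} = \sqrt{25} = 5
```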
Clustering
Published in Jan Žižka, František Dařena, Arnošt Svoboda, Text Mining with Machine Learning, 2019
Jan Žižka, František Dařena, Arnošt Svoboda
The Euclidean distance is a standard geometric distance measuring the distance between two points in an n-dimensional space (in a two- or three-dimensional space, it can easily be measured with a ruler). It is the implicit distance for the k-means algorithm [120]. $d_{\text{Euclidean}}(x_1,x_2)=\sqrt{\sum_{i=1}^{m}(x_{1i}-x_{2i})^2}$
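The general definition, the square root of the summed squared coordinate differences, can be sketched in a few lines of Python. The function name `euclidean_distance` and the sample points are illustrative assumptions, not taken from the book.

```python
import math


def euclidean_distance(x1, x2):
    """Euclidean distance between two points given as equal-length sequences."""
    if len(x1) != len(x2):
        raise ValueError("both points must have the same number of coordinates")
    # Sum the squared coordinate differences, then take the square root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x1, x2)))


# Hypothetical points chosen for illustration.
print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))
```

In Python 3.8 and later, the standard library function `math.dist` performs the same calculation.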
Chemometric techniques: Theoretical postulations
Published in Madhusree Kundu, Palash Kumar Kundu, Seshu Kumar Damarla, Chemometric Monitoring: Product Quality Assessment, Process Fault Detection, and Applications, 2017
Madhusree Kundu, Palash Kumar Kundu, Seshu Kumar Damarla
Similarity of newly collected plant data (via an online measuring system) to the data sets pertaining to various operating conditions can be expressed in terms of Euclidean and Mahalanobis distances. The Euclidean distance or Euclidean metric is the “ordinary” distance (i.e., straight line) between two points in Euclidean space. With this distance, Euclidean space becomes a metric space. The associated norm is called the Euclidean norm. Older literature refers to the metric as the Pythagorean metric. The Mahalanobis distance is a measure of the distance between a point P and a distribution D, introduced by P.C. Mahalanobis in 1936 [15]. It expresses how many standard deviations the point P lies from the mean of D. The distance is zero if P is at the mean of D, and grows as P moves away from the mean. Apart from these, some new criteria have been proposed to determine the similarity or dissimilarity between the incoming process data and the process historical database, with the aim of detecting normal or faulty process conditions.
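The contrast between the two distances can be illustrated with a minimal sketch restricted to two dimensions, so that the covariance matrix can be inverted by hand. The function name `mahalanobis_2d`, the mean, and the covariance values are assumptions made for illustration; a real monitoring system would estimate them from historical data.

```python
import math


def mahalanobis_2d(p, mean, cov):
    """Mahalanobis distance of a 2-D point p from a distribution with
    the given mean and 2x2 covariance matrix (nested sequences)."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    # Closed-form inverse of a 2x2 matrix.
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx, dy = p[0] - mean[0], p[1] - mean[1]
    # Quadratic form (p - mean)^T * inv * (p - mean).
    q = dx * (inv[0][0] * dx + inv[0][1] * dy) + dy * (inv[1][0] * dx + inv[1][1] * dy)
    return math.sqrt(max(q, 0.0))  # guard against tiny negative rounding error


# Hypothetical distribution: standard deviation 2 along X1 and 1 along X2.
mean = (0.0, 0.0)
cov = ((4.0, 0.0), (0.0, 1.0))
print(mahalanobis_2d((2.0, 0.0), mean, cov))  # one standard deviation along X1
print(mahalanobis_2d((0.0, 2.0), mean, cov))  # two standard deviations along X2
```

Both sample points lie at Euclidean distance 2 from the mean, yet their Mahalanobis distances differ because the spread of the distribution differs along each axis.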
A Manhattan metric based perturb and observe maximum power point tracking algorithm for photovoltaic systems
Published in Energy Sources, Part A: Recovery, Utilization, and Environmental Effects, 2022
Euclidean distance is one of the well-known distance metrics, and it uses the Pythagorean theorem to find the distance between two points in the search space. From this point of view, the adaptive P&O algorithm proposed by Loukriz et al. is a distance-based MPPT algorithm. In that work, the Pythagorean-based P&O method is compared to the conventional P&O algorithm. The performance of the system was verified using both computer simulations and laboratory experiments. Mahmod et al. have also proposed a Pythagorean theorem-based P&O algorithm that aims to reach the MPP in the presence of partial shading conditions (Mahmod Mohammad et al. 2020). Although computer simulation results demonstrate the efficiency of the proposed algorithm, that work does not include experimental verification. A considerable amount of literature has been published on solving the drift problem. In the study conducted by Belkaid, Colak, and Kayisli (2020), the incremental conductivity value was observed to detect solar insolation changes. Although these methods solve the drift problem, they use a constant ΔD value, which causes oscillation around the MPP.
Adaptive Convex Clustering of Generalized Linear Models With Application in Purchase Likelihood Prediction
Published in Technometrics, 2021
Shuyu Chu, Huijing Jiang, Zhengliang Xue, Xinwei Deng
The simulated data contain N observations with a binary response at each observation. Specifically, three continuous features, U1, U2, and U3, are simulated for each observation, where U1 and U2 are used as clustering features to calculate the prespecified weight wjk. The set of observation pairs is formed by connecting each observation to its five nearest neighbors (Hallac, Leskovec, and Boyd 2015). The Euclidean distance is used to define the nearest neighbors. Thus, each observation has at least five connected observations. For simplicity, only X1 = U3 is included in the logistic regression model in the simulation study.
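The construction of the pair set can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `knn_pairs` and the sample points are hypothetical, and it assumes that the distance is computed on the clustering features (U1, U2).

```python
import math


def knn_pairs(points, k=5):
    """Return the set of unordered index pairs (j, i), j < i, obtained by
    connecting every point to its k nearest neighbors (Euclidean distance)."""
    pairs = set()
    for j, p in enumerate(points):
        # Indices of all other points, sorted by distance to point j.
        others = sorted(
            (i for i in range(len(points)) if i != j),
            key=lambda i: math.dist(p, points[i]),
        )
        for i in others[:k]:
            pairs.add((min(i, j), max(i, j)))
    return pairs


# Hypothetical (U1, U2) values for eight observations.
features = [(0.1, 0.2), (0.4, 0.1), (0.9, 0.8), (0.5, 0.5),
            (0.3, 0.7), (0.8, 0.2), (0.6, 0.9), (0.2, 0.4)]
pairs = knn_pairs(features, k=5)
print(len(pairs))
```

Because a pair is kept whenever either observation counts the other among its five nearest neighbors, an observation can end up with more than five connections, which is consistent with the "at least five" statement.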
An evaluation of the efficiency of similarity functions in density-based clustering of spatial trajectories
Published in Annals of GIS, 2019
A. Moayedi, R. Ali Abbaspour, A. Chehreghan
The Euclidean distance is a traditional distance that is employed in various areas of data mining, such as determining trajectory similarity. Its advantages as a distance function (DF) include simplicity of implementation, low computation time, and no need for an initial parameter. However, its result is considerably sensitive to the presence of noise and outliers. Moreover, it requires trajectories to be of the same length and to be sampled at fixed time intervals (Pfeifer and Deutsch 1980; Priestley 1980). To overcome these drawbacks, other DFs have been defined, which will be discussed later. Given two trajectories Li and Lj of length n in p dimensions, the Euclidean distance between them, DE (Li, Lj), is defined as Formula 2.
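Formula 2 is not reproduced in this excerpt, so the sketch below assumes one common definition: the average of the point-wise Euclidean distances between corresponding samples. The article's exact formula may differ, for example by using a sum instead of a mean. The function name `trajectory_euclidean` and the sample trajectories are hypothetical.

```python
import math


def trajectory_euclidean(traj_a, traj_b):
    """Mean point-wise Euclidean distance between two trajectories.

    Assumed definition (not taken from the source). Each trajectory is a
    sequence of p-dimensional points sampled at the same time steps.
    """
    if len(traj_a) != len(traj_b):
        # The measure is only defined for trajectories of equal length.
        raise ValueError("trajectories must have the same number of points")
    if not traj_a:
        raise ValueError("trajectories must not be empty")
    total = sum(math.dist(a, b) for a, b in zip(traj_a, traj_b))
    return total / len(traj_a)


# Two hypothetical parallel trajectories, 3 units apart at every sample.
track_1 = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
track_2 = [(0.0, 3.0), (1.0, 3.0), (2.0, 3.0)]
print(trajectory_euclidean(track_1, track_2))
```

The length check makes the stated limitation explicit: trajectories of different lengths or sampling rates would need resampling before this measure could be applied.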