Unsupervised Learning
Published in Peter Wlodarczak, Machine Learning and its Applications, 2019
There are other intrinsic cluster evaluation methods as well. The Dunn index and the Davies-Bouldin index both belong to this class of cluster evaluation algorithms. Like the Silhouette index, the Dunn index evaluates a clustering using only the data itself, not any external ground truth. The aim is the same as with the Silhouette index: to obtain clusters that are compact and well separated, so that the instances within a cluster have little variance. The higher the Dunn index, the better the clustering. Both the Silhouette index and the Dunn index are internal measures, since they use only information present in the data set. Cluster evaluation against external data, such as a gold standard, is also possible. An example is entropy, which measures how well the clustering matches the external data.
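The excerpt does not say how entropy is computed. A minimal sketch, assuming one common formulation (the size-weighted average, over clusters, of the Shannon entropy of the gold-standard classes found in each cluster), could look as follows. The function name `cluster_entropy` and the toy labels are illustrative only and do not come from the source.

```python
import math
from collections import Counter


def cluster_entropy(cluster_labels, reference_labels):
    """Size-weighted entropy (in bits) of reference classes within each cluster.

    0.0 means every cluster contains a single reference class; larger
    values mean the clusters mix several reference classes.
    """
    n = len(cluster_labels)
    if n == 0 or n != len(reference_labels):
        raise ValueError("label sequences must be non-empty and of equal length")

    # Collect the reference labels that fall into each cluster.
    members = {}
    for cluster, reference in zip(cluster_labels, reference_labels):
        members.setdefault(cluster, []).append(reference)

    total = 0.0
    for refs in members.values():
        size = len(refs)
        counts = Counter(refs)
        # Shannon entropy of the class distribution inside this cluster.
        h = -sum((c / size) * math.log2(c / size) for c in counts.values())
        total += (size / n) * h
    return total
```

With this formulation, a clustering that reproduces the gold standard scores 0, and two clusters that each contain an even mix of two classes score 1 bit, so lower values indicate a better match.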
Benchmarking the performance of urban rail transit systems: a machine learning application
Published in Transportmetrica A: Transport Science, 2023
Farah A. Awad, Daniel J. Graham, Laila AitBihiOuali, Ramandeep Singh, Alexander Barron
There are many indices and statistical tests that can be used to determine the optimal number of clusters for different algorithms, e.g. the gap statistic (Tibshirani, Walther, and Hastie 2001) and the KL index (Krzanowski and Lai 1988). In this study, the R package 'NbClust' (Charrad et al. 2014) is used to find the optimal number of clusters based on 30 different indices; the number of clusters selected by the majority of indices is chosen as the optimal number. To choose the optimal clustering algorithm, the results are then compared on two indices that measure the compactness and the separation of clusters: the Silhouette coefficient and the Dunn index. The Silhouette width is a measure of cluster separation which estimates the average distance between clusters, where a higher value indicates more variation between clusters. The Dunn index measures both cluster compactness and separation: it is the ratio of the minimum distance between objects in different clusters to the maximum distance between objects in the same cluster. A higher Dunn index value indicates better clustering results.
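The majority rule described above can be sketched in a few lines of Python. This is not the NbClust implementation: the function name `majority_k`, the example votes, and the tie-breaking rule (prefer the smaller number of clusters) are all assumptions made for illustration.

```python
from collections import Counter


def majority_k(suggested_k):
    """Return the number of clusters proposed by the most indices.

    `suggested_k` maps an index name to the number of clusters it suggests.
    Ties are broken in favour of the smaller k (an assumption, not
    something the source specifies).
    """
    if not suggested_k:
        raise ValueError("at least one index is required")
    votes = Counter(suggested_k.values())
    top = max(votes.values())
    return min(k for k, count in votes.items() if count == top)


# Hypothetical votes from five indices; the values are made up.
example_votes = {"kl": 3, "ch": 3, "silhouette": 2, "dunn": 4, "gap": 3}
chosen_k = majority_k(example_votes)
```

Here three of the five hypothetical indices suggest three clusters, so `chosen_k` is 3.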
Evolutionary multi-objective customer segmentation approach based on descriptive and predictive behaviour of customers: application to the banking sector
Published in Journal of Experimental & Theoretical Artificial Intelligence, 2022
Chiheb-Eddine Ben Ncir, Mohamed Ben Mzoughia, Alaa Qaffas, Waad Bouaguel
The Dunn index is the ratio between the separation and the compactness of clusters. Separation is measured as the minimum distance between any pair of customers belonging to different clusters, while compactness is measured as the maximum distance between any pair of customers belonging to the same cluster. If the partitioning contains well-separated clusters, distances between clusters are usually large and the internal distances between customers in the same cluster are expected to be small. Therefore, a higher Dunn index indicates a potentially better segmentation. We note that the Dunn index is chosen as an example of a possible internal validation index, given its simplicity and popularity. The model is generic and remains valid for other existing validation criteria, which can be used as the optimisation equation for the third axis. However, care is needed as to whether the validation criterion is to be maximised (consider ) or minimised (consider ).
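The definition above translates directly into code. The following is a minimal sketch, assuming points are numeric tuples compared by Euclidean distance and that the brute-force pairwise computation is acceptable for small data sets. The function name `dunn_index` is illustrative and is not taken from the paper.

```python
import math
from itertools import combinations


def dunn_index(points, labels):
    """Dunn index: minimum between-cluster distance / maximum within-cluster distance."""
    if len(points) != len(labels):
        raise ValueError("points and labels must have the same length")

    # Group points by cluster label.
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("at least two clusters are required")

    # Compactness: largest distance between two points of the same cluster.
    max_diameter = max(
        (
            math.dist(a, b)
            for members in clusters.values()
            for a, b in combinations(members, 2)
        ),
        default=0.0,  # every cluster is a singleton
    )

    # Separation: smallest distance between two points of different clusters.
    min_separation = min(
        math.dist(a, b)
        for first, second in combinations(clusters.values(), 2)
        for a in first
        for b in second
    )

    if max_diameter == 0.0:
        return math.inf  # all clusters are single points or exact duplicates
    return min_separation / max_diameter
```

For the four points `(0, 0)`, `(0, 1)`, `(5, 0)`, `(5, 1)`, grouping the two left-hand points against the two right-hand points gives a separation of 5 and a maximum diameter of 1, so the index is 5. Grouping the bottom pair against the top pair gives 1/5 = 0.2, which illustrates why a higher value is preferred.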
Machine learning for lumbar and pelvis kinematics clustering
Published in Computer Methods in Biomechanics and Biomedical Engineering, 2023
Seth Higgins, Sandipan Dutta, Rumit Singh Kakar
The Dunn index is a clustering evaluation algorithm developed in 1974 (Dunn 1974). It is defined by Equation (8), where the min is the minimum distance between any two time-series that belong to different clusters, and the max is the maximum distance between two time-series in the same cluster (Dunn 1973). δ(Ci,Cj) is the inter-cluster distance between clusters Ci and Cj, and Δ(Ck) is the intra-cluster distance within cluster Ck. A higher Dunn index value indicates a better clustering, so the index can be used both to select the number of clusters for a given data set and to compare the solutions produced by different clustering algorithms (Kryszczuk and Hurley 2010).
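Equation (8) is not reproduced in this excerpt. Assuming it follows the usual definition of the Dunn index, and writing m for the number of clusters (a symbol introduced here, not taken from the source), it would take a form such as:

```latex
\[
  DI \;=\; \frac{\displaystyle \min_{1 \le i < j \le m} \delta(C_i, C_j)}
                {\displaystyle \max_{1 \le k \le m} \Delta(C_k)}
\]
```

The numerator is the smallest inter-cluster distance and the denominator is the largest intra-cluster distance, so the ratio grows as clusters become better separated and more compact.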