Finding Clusters
Published in Exploratory Data Analysis with MATLAB®, 2017
Wendy L. Martinez, Angel R. Martinez, Jeffrey L. Solka
We start with the cophenetic matrix, H. The ij-th element of H contains the fusion value at which objects i and j were first clustered together. We only need the upper triangular entries of this matrix (i.e., those elements above the diagonal). Say we want to compare this partition to the interpoint distances. Then the cophenetic correlation is given by the product moment correlation between the values (upper triangular only) in H and their corresponding entries in the interpoint distance matrix. This has the same properties as the product moment correlation coefficient. Values close to one indicate a higher degree of correlation between the fusion levels and the distances. To compare two hierarchical clusterings, one would compare the upper triangular elements of the cophenetic matrix for each one.
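The computation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses single linkage (the excerpt does not name a linkage method), a naive merge loop rather than an efficient algorithm, and function names (`cophenetic_matrix`, `upper_triangle`, `pearson`, `cophenetic_correlation`) that are hypothetical and not taken from the book or from MATLAB.

```python
from itertools import combinations
from math import sqrt


def cophenetic_matrix(dist):
    """Build the cophenetic matrix H from a symmetric distance matrix.

    Assumption: single linkage is used, so the fusion value of two clusters
    is the smallest distance between any of their members.
    """
    n = len(dist)
    clusters = [[i] for i in range(n)]
    H = [[0.0] * n for _ in range(n)]
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest single-linkage distance.
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        # Objects in different clusters are joined for the first time here,
        # so the fusion value d becomes their cophenetic entry.
        for i in clusters[a]:
            for j in clusters[b]:
                H[i][j] = H[j][i] = d
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return H


def upper_triangle(m):
    """Return the entries strictly above the diagonal, row by row."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Product moment correlation (undefined if either input is constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cophenetic_correlation(dist):
    """Correlate the upper triangles of the distance matrix and of H."""
    H = cophenetic_matrix(dist)
    return pearson(upper_triangle(dist), upper_triangle(H))
```

To compare two hierarchical clusterings in the same spirit, one could pass the upper triangles of their two cophenetic matrices to `pearson` instead.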
Mallards Anas platyrhynchos shot in Eastern Poland: ecological risk evaluated by analysis of trace elements in liver
Published in Human and Ecological Risk Assessment: An International Journal, 2019
Agnieszka Sujak, Dariusz Wiącek, Dariusz Jakubas, Andrzej Komosa, Ignacy Kitowski
Common sources of origin of the elements were determined with (1) Spearman rank correlation, where the strength of correlation was classified according to Hinkle et al. (2003): strong correlation (|r| = 0.90–1.00), high correlation (|r| = 0.70–0.90), moderate correlation (|r| = 0.50–0.70); and (2) Hierarchical Cluster Analysis (HACA), performed using Bray-Curtis similarity as the distance measure and the paired group method as the linkage method. For each cluster obtained, the bootstrap probability (BP) was calculated via multiscale bootstrap resampling. The BP of a cluster takes a value between 0 and 1, indicating how strongly the cluster is supported by the data. To determine how well the generated clusters represent the dissimilarities between objects, the cophenetic correlation coefficient was calculated, with values close to zero indicating poor clustering and values close to 1 indicating good clustering. Only clusters with BP ≥ 95% were considered.
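As a rough illustration of the first step, the sketch below computes a Spearman rank correlation (Pearson correlation of average ranks) and maps it onto the strength categories quoted above. The function names and the "below moderate" label for |r| < 0.50 are assumptions made here for illustration; the excerpt does not say how weaker correlations were labeled, nor which software was used.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / sqrt(sxx * syy)


def strength(r):
    """Classify |r| using the thresholds quoted in the text."""
    a = abs(r)
    if a >= 0.90:
        return "strong"
    if a >= 0.70:
        return "high"
    if a >= 0.50:
        return "moderate"
    return "below moderate"  # hypothetical label, not from the source
```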
Groundwater quality monitoring of the Serra Geral aquifer in Toledo, Brazil
Published in Journal of Environmental Science and Health, Part A, 2018
Silvia Priscila Dias Monte Blanco, Aparecido Nivaldo Módenes, Fabiano Bisinella Scheufele, Pricila Marin, Karise Schneider, Fernando Rodolfo Espinoza-Quiñones, Paulo Roberto Paraíso, Rosângela Bergamasco
Multi-variate statistical analyses (cluster analysis and principal component analysis, PCA) were performed in order to show the similarities among the groundwater composition data collected from the set of 10 tube wells. In order to increase the variance percentage between the data, some variables were selected for the multi-variate analysis: K, Ca, Fe, temperature, pH, dissolved oxygen, electrical conductivity, alkalinity, nitrate and turbidity. In order to avoid the effect of the measurement scales, the number of variables and the correlations between them, the data set was standardized by applying a linear transformation (Z) so that the means were equal to zero and the standard deviations equal to one unit, as shown in Eq. (1). For the parameters that presented values below the detection limit of the equipment, half of the value of the detection limit was used.[33] After the linear transformation of the data, the multi-variate analyses were performed using Statistica® software. To build the dendrogram in the cluster analysis, the Euclidean distance was used as the measure of dissimilarity, together with the centroid linkage method; the reliability of the results generated by this analysis was verified by the cophenetic correlation coefficient.

$$Z = \frac{x - \bar{x}}{s} \tag{1}$$

where $Z$ is the transformed variable, $x$ is the value of the real variable, $\bar{x}$ is the mean of the real variable and $s$ is the standard deviation of the real variable.
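The preprocessing described here (half the detection limit for censored readings, followed by standardization to zero mean and unit standard deviation) can be sketched as follows. Several details are assumptions for illustration only: readings below the detection limit are encoded as `None`, the sample standard deviation (n - 1 denominator) is used, and the function names are hypothetical. The study itself used Statistica®, not this code.

```python
from statistics import mean, stdev


def substitute_below_limit(values, detection_limit):
    """Replace below-detection readings (encoded as None, an assumption)
    with half of the detection limit."""
    return [detection_limit / 2 if v is None else v for v in values]


def standardize(values):
    """Apply Z = (x - mean) / sd to one variable.

    Assumption: the sample standard deviation (n - 1 denominator) is used;
    the source does not state which form was applied.
    """
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]
```

Each variable (column) would be passed through both functions separately before computing Euclidean distances, so that no variable dominates the distance because of its measurement scale.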