Multimodal Imaging Radiomics and Machine Learning
Published in Ayman El-Baz, Jasjit S. Suri, Big Data in Multimodal Medical Imaging, 2019
Gengbo Liu, Youngho Seo, Debasis Mitra, Benjamin L. Franc
Clustering algorithms were used to reveal the cluster patterns of radiomic features without prior knowledge of patient outcomes. In one widely cited publication, Aerts et al. [18] applied a hierarchical clustering method to reveal clusters of patients with similar expression patterns across 440 features. When compared with clinical parameters, primary tumor stage, overall stage, and tumor histology were observed to have significant associations with the clusters. In another study [42], Parmar et al. used a consensus clustering method and showed correlations between radiomic features and clinical parameters for two types of cancer. The term “consensus clustering algorithm” refers to finding a single (consensus) clustering that best fits the input dataset given the existing clusterings. The overlap in features between the feature clusters of the lung and head-and-neck cancer cohorts was then assessed with a Jaccard index matrix. In another study, a hierarchical clustering algorithm was used to identify nonredundant features, and the method was shown to be reproducible across multiple CT machines [45].
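The two analysis steps described above can be illustrated with a short sketch: hierarchical clustering of patients by their radiomic features, followed by a Jaccard index between feature clusters from two cohorts. The matrix shape, cluster counts, and feature-cluster contents below are illustrative assumptions, not data from the cited studies.

```python
# Sketch of the two steps above: hierarchical clustering of patients by
# radiomic features, then a Jaccard index between feature clusters from two
# cohorts. Data, cluster counts, and feature sets are illustrative assumptions.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 440))  # hypothetical patients x radiomic features

# Cluster patients with average-linkage hierarchical clustering, as in [18].
Z = linkage(X, method="average", metric="correlation")
patient_clusters = fcluster(Z, t=3, criterion="maxclust")

def jaccard(a, b):
    """Jaccard index between two sets of feature indices."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

# Hypothetical feature clusters from two cohorts (lung vs. head-and-neck).
lung_cluster = [1, 2, 5, 8]
hn_cluster = [2, 5, 9]
print(jaccard(lung_cluster, hn_cluster))  # overlap assessed as in [42]
```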
Forecasting Air Quality in India through an Ensemble Clustering Technique
Published in Himansu Das, Jitendra Kumar Rout, Suresh Chandra Moharana, Nilanjan Dey, Applied Intelligent Decision Making in Machine Learning, 2020
J. Anuradha, S. Vandhana, Sasya I. Reddi
A subspace clustering can also be viewed as an array of weighted clusters, in which each cluster encodes the importance of its features with a weight vector. In a subspace clustering ensemble, the clusters and the weight vectors produced by the base subspace clusterings are used in the consensus function. The final consensus clustering can be improved by evaluating the relevance of each base clustering (and assigning weights accordingly), as shown by Li, Ding, and Jordan (2007). To the best of our knowledge, Nock and Nielsen (2006) were the first to explore the use of object-related weights in ensemble clustering.
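As a hedged illustration of such a weighted consensus function, the sketch below scales each base clustering's contribution to a co-association matrix by a relevance weight, in the spirit of Li, Ding, and Jordan (2007); the label arrays and weight values are assumptions made for the example, not from the cited work.

```python
# Illustrative weighted consensus function: each base clustering contributes
# to the co-association matrix in proportion to a relevance weight.
import numpy as np

def weighted_coassociation(base_labels, weights):
    """base_labels: list of label arrays, one per base (subspace) clustering.
    weights: one relevance weight per base clustering (assumed given)."""
    n = len(base_labels[0])
    S = np.zeros((n, n))
    for labels, w in zip(base_labels, weights):
        S += w * (labels[:, None] == labels[None, :])
    return S / np.sum(weights)  # entry (i, j): weighted co-membership rate

# Three hypothetical base clusterings of six objects with unequal relevance.
base = [np.array([0, 0, 1, 1, 2, 2]),
        np.array([0, 0, 0, 1, 1, 1]),
        np.array([0, 1, 1, 1, 2, 2])]
S = weighted_coassociation(base, weights=[0.5, 0.3, 0.2])
```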
Introduction
Published in Sugato Basu, Ian Davidson, Kiri L. Wagstaff, Constrained Clustering, 2008
Sugato Basu, Ian Davidson, Kiri L. Wagstaff
In CONSENSUS CLUSTERING, we are given several candidate clusterings and asked to produce a clustering that combines the candidates in a reasonable way. Gionis et al. [14] note several sources of motivation for CONSENSUS CLUSTERING, including identifying the correct number of clusters and improving clustering robustness.
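One common way to make this combination concrete is the co-association (evidence accumulation) approach: count how often each pair of objects is co-clustered across the candidates, then cut a hierarchical clustering of the resulting similarity matrix. The candidate labels and target cluster count below are illustrative assumptions.

```python
# Minimal co-association consensus: several candidate clusterings are combined
# into a pairwise similarity matrix, and a final clustering is cut from it.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

candidates = [np.array([0, 0, 1, 1, 2, 2]),
              np.array([0, 0, 1, 1, 1, 2]),
              np.array([1, 1, 0, 0, 2, 2])]

# Fraction of candidate clusterings in which each pair is co-clustered.
coassoc = sum((c[:, None] == c[None, :]).astype(float) for c in candidates)
coassoc /= len(candidates)

# Treat disagreement as a distance and cut the dendrogram at k = 3 clusters.
D = 1.0 - coassoc
np.fill_diagonal(D, 0.0)
Z = linkage(squareform(D, checks=False), method="average")
consensus = fcluster(Z, t=3, criterion="maxclust")
```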
A new method for weighted ensemble clustering and coupled ensemble selection
Published in Connection Science, 2021
Arko Banerjee, Arun K. Pujari, Chhabi Rani Panigrahi, Bibudhendu Pati, Suvendu Chandan Nayak, Tien-Hsiung Weng
In Zhou and Tang (2006), Alizadeh et al. (2014), Huang et al. (2015), Rouba and Bahloul (2017) and Yousefnezhad et al. (2018), the weight of a clustering is computed from functions of clustering similarity measures such as Normalised Mutual Information (NMI) (Strehl & Ghosh, 2002), the Rand Index (Jain & Dubes, 1988) or pure entropy values over all base clusterings. Huang et al. (2017) proposed a weight for each cluster, defined as an entropy measure of its level of agreement with the other clusters in the ensemble. Using the same definition of cluster-level weight, Huang et al. (2018) proposed two more ensemble clustering algorithms. In most of the above cases, the consensus clustering is obtained by applying agglomerative clustering or bipartite clustering (graph partitioning) (D. Xu & Tian, 2015) to a cluster-level or clustering-level weighted similarity matrix derived from the ensemble.
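A minimal sketch of one such clustering-level weighting is given below, assuming the common choice of weighting each base clustering by its average NMI with the rest of the ensemble; the resulting weights would then scale a weighted co-association matrix (as in the earlier sketch) before agglomerative clustering is applied. The label arrays are illustrative.

```python
# NMI-based clustering-level weights: each base clustering is weighted by its
# average agreement (NMI) with all other clusterings in the ensemble.
import numpy as np
from sklearn.metrics import normalized_mutual_info_score as nmi

def nmi_weights(base_labels):
    """Average NMI of each clustering against the rest of the ensemble."""
    m = len(base_labels)
    w = np.array([np.mean([nmi(base_labels[i], base_labels[j])
                           for j in range(m) if j != i])
                  for i in range(m)])
    return w / w.sum()

base = [np.array([0, 0, 1, 1, 2, 2]),
        np.array([0, 0, 1, 1, 1, 2]),
        np.array([2, 2, 0, 1, 1, 1])]
w = nmi_weights(base)  # higher weight -> clustering agrees more with ensemble
```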
A kernel-induced weighted object-cluster association-based ensemble method for educational data clustering
Published in Journal of Information and Telecommunication, 2020
Chau Thi Ngoc Vo, Phung Hua Nguyen
For the first question, kWOCA has much better NMI values than its base clustering method. This reflects the well-known advantage of an ensemble model over a single model. While the results of k-means vary from run to run owing to the randomness of its initialization, those of kWOCA are more stable on all the data sets used. Nevertheless, the simplicity and efficiency of k-means make it a suitable choice for generating the base clusterings. Consensus clustering can then combine them and produce a more effective final clustering.
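The run-to-run variation of k-means that motivates this design can be demonstrated directly: repeated runs with different random initializations disagree with one another, which shows up as pairwise NMI below 1.0. The synthetic data, cluster count, and number of runs below are illustrative assumptions, not the paper's setup.

```python
# Demonstrating k-means instability across random initializations, which is
# what a consensus over many cheap k-means runs is meant to smooth out.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import normalized_mutual_info_score as nmi

X, _ = make_blobs(n_samples=300, centers=4, cluster_std=3.0, random_state=0)

# Ten base clusterings, each from a single random initialization.
runs = [KMeans(n_clusters=4, n_init=1, random_state=s).fit_predict(X)
        for s in range(10)]

# Pairwise NMI between runs: values below 1.0 show run-to-run variation.
pairwise = [nmi(runs[i], runs[j])
            for i in range(len(runs)) for j in range(i + 1, len(runs))]
print(f"mean pairwise NMI across runs: {np.mean(pairwise):.3f}")
```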
GAN-based clustering solution generation and fusion of diffusion
Published in Systems Science & Control Engineering, 2022
Wenming Cao, Zhiwen Yu, Hau-San Wong
In this section, we present and analyse comparative experiments on the above data sets against related methods including K-means, spectral clustering, consensus clustering, multi-task clustering, deep clustering and adversarial multi-source domain adaptation (MDAN) (Zhao et al., 2018). We choose LSSMTC (Gu & Zhou, 2009) and deep embedding clustering (DEC) (Xie et al., 2016) as the multi-task clustering and deep clustering baselines, respectively. The comparison results are shown in Tables 3 and 4. From these tables, we have the following observations:
- Consensus clustering achieves better performance than K-means and spectral clustering, since it can combine and fuse the multi-view informative knowledge in multiple clustering solutions. These solutions are obtained by applying random subspace tricks or random transformations to the original features (see the sketch after this list).
- LSSMTC performs better than consensus clustering, as knowledge is explored and transferred among tasks. This indicates that, compared with the informative knowledge in multiple solutions, the knowledge discovered from other tasks can play a more important role in performance improvement.
- DEC achieves better or comparable performance relative to LSSMTC, although it uses neither multiple solutions nor multiple tasks. This indicates that a neural network with an appropriate structure has enough capacity to learn information beneficial to downstream tasks.
- Both MDAN-hard and MDAN-soft significantly outperform LSSMTC and DEC, which indicates that combining the knowledge from other tasks with the powerful learning capability of neural networks can further enrich the discriminative and complementary nature of the features.
- Our proposed method achieves better performance than MDAN-hard and MDAN-soft in the majority of cases. The advantage of our method over MDAN-hard (soft) can be attributed to the clustering-guided feature extractor for learning discriminative information, the exploration of common knowledge across related tasks, and the fusion of the clustering solutions generated by the GAN.
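As a rough illustration of how such a pool of solutions can be generated, the sketch below runs k-means on random feature subsets; the function name, parameters, and defaults are assumptions made for the example, not the paper's implementation.

```python
# "Random subspace trick" for generating multiple clustering solutions:
# each base solution is k-means run on a random subset of the features.
import numpy as np
from sklearn.cluster import KMeans

def random_subspace_solutions(X, n_solutions=10, subspace_frac=0.5, k=4, seed=0):
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    d = max(1, int(subspace_frac * n_features))
    solutions = []
    for _ in range(n_solutions):
        cols = rng.choice(n_features, size=d, replace=False)
        labels = KMeans(n_clusters=k, n_init=10).fit_predict(X[:, cols])
        solutions.append(labels)
    return solutions  # to be fused afterwards by a consensus function
```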