Explore chapters and articles related to this topic
Clustering Biological Data
Published in Charu C. Aggarwal, Chandan K. Reddy, Data Clustering, 2018
Chandan K. Reddy, Mohammad Al Hasan, Mohammed J. Zaki
Hierarchical clustering algorithms (discussed in detail in Chapter 4) first create a dendrogram for the genes, where each node represents a gene cluster and is merged or split using a similarity measure. Two categories of hierarchical clustering are studied in the context of gene expression analysis.
Agglomerative clustering (bottom-up approach): This approach starts with each gene as an individual cluster, and at each step of the algorithm the closest pair of clusters is merged, until all of the genes are grouped into one cluster. Eisen et al. [33] applied an agglomerative clustering algorithm called UPGMA (Unweighted Pair Group Method with Arithmetic Mean). Using this approach, each cell of the gene expression matrix is colored, and the rows of the matrix are reordered based on the hierarchical dendrogram structure and a consistent node-ordering rule. An illustration of a simple dendrogram for gene expression data is shown in Figure 16.1.
Divisive clustering (top-down approach): This approach starts with a single cluster that contains all the genes; the clusters are then split repeatedly until each cluster contains one gene. Based on a popular deterministic annealing algorithm, the authors of [3] proposed a divisive approach to obtain gene clusters. The algorithm first chooses two random initial centroids. An iterative Expectation-Maximization algorithm is then applied to probabilistically assign each gene to one of the clusters. The entire dataset is recursively split until each cluster contains only one gene.
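The agglomerative (UPGMA) procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used by Eisen et al.: the function names `euclidean` and `upgma` are hypothetical, and Euclidean distance is an assumption made here for simplicity (the excerpt only says "a similarity measure").

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two expression profiles (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def upgma(profiles):
    """Average-linkage agglomerative clustering (UPGMA).

    profiles: list of equal-length expression vectors, one per gene.
    Returns the merge history as (members_a, members_b, distance) tuples,
    where members are tuples of gene indices.
    """
    # Start with each gene as its own cluster, keyed by a cluster id.
    clusters = {i: (i,) for i in range(len(profiles))}
    # Distances between every pair of current clusters.
    dist = {
        frozenset((i, j)): euclidean(profiles[i], profiles[j])
        for i, j in combinations(clusters, 2)
    }
    merges = []
    next_id = len(profiles)

    while len(clusters) > 1:
        # Pick the closest pair of clusters.
        pair = min(dist, key=dist.get)
        a, b = sorted(pair)
        merges.append((clusters[a], clusters[b], dist[pair]))

        # Distance from the merged cluster to any other cluster is the
        # size-weighted mean of the two old distances, which equals the
        # mean over all gene pairs between the two clusters.
        size_a, size_b = len(clusters[a]), len(clusters[b])
        new_dist = {}
        for c in clusters:
            if c in (a, b):
                continue
            d_ac = dist[frozenset((a, c))]
            d_bc = dist[frozenset((b, c))]
            new_dist[frozenset((next_id, c))] = (
                size_a * d_ac + size_b * d_bc
            ) / (size_a + size_b)

        # Drop distances involving the two merged clusters, add the new ones.
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        dist.update(new_dist)

        merged = clusters[a] + clusters[b]
        del clusters[a], clusters[b]
        clusters[next_id] = merged
        next_id += 1

    return merges
```

Calling `upgma([[0.0, 0.0], [0.0, 1.0], [5.0, 0.0], [5.0, 1.0]])` returns three merges; the merge history is what a dendrogram draws, with the merge distance as the node height.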
Afforestation may influence changes in tailing heaps in a long time
Published in International Journal of Phytoremediation, 2021
Rogelio Carrillo-González, Ma del Carmen A. González-Chávez
The following tests were performed on the data: one-way ANOVA to detect differences in soil properties (pH, EC, sulfates, phosphorus, OM, LOI, carbonates, ions, and metals) between soil layers and plots, with Tukey's test for mean comparisons. The Friedman test, run in SAS, was used for the statistical analysis of metal concentrations. Student's t-test was used to compare the chlorophyll indexes with the reference values. Cluster analysis was performed by UPGMA (Unweighted Pair Group Method with Arithmetic Mean), a simple agglomerative method. Principal component analysis (PCA) was performed, after log transformation of the data, using MVSP software to choose properties that indicate changes in the Technosols.
Ionic adjustments do not alter plankton composition in low salinity Penaeus vannamei intensive nursery with synbiotic system
Published in Chemistry and Ecology, 2023
Otávio Augusto Lacerda Ferreira Pimentel, Ng Haig They, Rildo José Vasconcelos de Andrade, Valdemir Queiroz de Oliveira, André Megali Amado, Alfredo Olivera Gálvez, Luis Otavio Brito
To inspect the grouping of phytoplankton and zooplankton community samples, an exploratory cluster analysis based on the Bray-Curtis index was performed; the index was computed with the vegdist function of the vegan package [32]. Clustering used the UPGMA algorithm (unweighted pair group method with arithmetic mean). The raw abundance data were log10(x + 1) transformed.
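The transformation and dissimilarity step can be illustrated with a short Python sketch. The study itself used the vegdist function of the R package vegan; the code below is not that implementation, and the helper names (`log_transform`, `bray_curtis`, `dissimilarity_matrix`) are hypothetical. It assumes one abundance vector per sample, with taxa in the same order in every vector.

```python
import math


def log_transform(counts):
    """Apply the log10(x + 1) transformation to raw abundances."""
    return [math.log10(x + 1) for x in counts]


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum|u_i - v_i| / sum(u_i + v_i)."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    # Assumption: two all-zero samples are treated as identical (0.0);
    # other tools may report this case as undefined.
    return num / den if den else 0.0


def dissimilarity_matrix(samples):
    """Pairwise Bray-Curtis dissimilarities on log10(x + 1) abundances."""
    transformed = [log_transform(s) for s in samples]
    n = len(transformed)
    return [
        [bray_curtis(transformed[i], transformed[j]) for j in range(n)]
        for i in range(n)
    ]
```

The resulting symmetric matrix (0 for identical samples, 1 for samples sharing no taxa) is the kind of input an average-linkage (UPGMA) routine would then cluster.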