Explore chapters and articles related to this topic
Clustering Biological Data
Published in Charu C. Aggarwal, Chandan K. Reddy, Data Clustering, 2018
Chandan K. Reddy, Mohammad Al Hasan, Mohammed J. Zaki
In some scenarios, a gene may not be correlated with other genes at the same time point, yet it may regulate another gene after a certain delay. To identify such time-lagged coregulated gene clusters, [53] proposes a clustering algorithm for extracting time-lagged clusters. The algorithm first generates complete time-lagged information for gene clusters by processing several genes simultaneously. Instead of considering lags over the entire sequence, it considers only the small, interesting parts (subsequences) over which the genes are coregulated, since the remaining parts may show no distinct relationship. It thereby identifies localized time-lagged coregulation between genes and/or gene clusters. Its mechanism extracts clusters (referred to as q-clusters) of time-lagged coregulated genes over a subset of consecutive conditions. Each such cluster contains genes that have similar expression patterns over a set of consecutive conditions. More recently, the authors of [113] extended the concept of time-lagged clustering to three-dimensional clustering.
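The idea of a localized, time-lagged relationship can be illustrated with a small sketch. This is not the q-cluster algorithm of [53]; it is a brute-force search, under the assumption that Pearson correlation over a fixed-length window is an acceptable similarity measure. The function names, the `window` and `max_lag` parameters, and the toy expression values are all hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length lists; None if either is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / sqrt(var_a * var_b)


def best_lagged_window(x, y, window, max_lag):
    """Find the subsequence of x and the lag at which y follows it most closely.

    Compares x[start:start+window] with y[start+lag:start+lag+window] for
    every valid start and every lag in 0..max_lag.
    Returns (correlation, start, lag), or None if no window could be scored.
    Assumes x and y have the same length.
    """
    best = None
    for lag in range(max_lag + 1):
        for start in range(len(x) - window - lag + 1):
            r = pearson(x[start:start + window],
                        y[start + lag:start + lag + window])
            if r is not None and (best is None or r > best[0]):
                best = (r, start, lag)
    return best


# Toy profiles (made-up values): gene_b repeats the first five
# conditions of gene_a two time points later.
gene_a = [1, 3, 2, 5, 4, 1, 1, 1]
gene_b = [0, 0, 1, 3, 2, 5, 4, 0]
result = best_lagged_window(gene_a, gene_b, window=5, max_lag=3)
```

Here the search only scores pairs of genes over one window length. A clustering method would additionally have to group genes that share the same window and lag, which this sketch does not attempt.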
Biological Data Mining:
Published in Wahiba Ben Abdessalem Karaa, Nilanjan Dey, Mining Multimedia Documents, 2017
Amira S. Ashour, Nilanjan Dey, Dac-Nhuong Le
Both bioinformatics and data mining are fast-growing and closely related research areas. It is imperative to examine the significant research topics in bioinformatics and to develop innovative data mining techniques for effective and scalable biological analysis. Given the problems of biological data mining and analysis, bioinformatics scientists can consider the following computational challenges for future study:
Improving sequence-pattern discovery algorithms.
Developing new approaches for bootstrapping learning algorithms from biological data.
Developing machine learning algorithms for very large sequence sources.
Incorporating multiple information sources into an integrated learning and data mining system.
Improving the accuracy and speed of probabilistic-reasoning systems.
Including optimization algorithms such as the genetic algorithm, particle swarm optimization, and the cuckoo search algorithm for enhanced data mining systems. For example, genetic algorithms can be applied to association and classification methods.
Employing techniques that discover associations among similar gene clusters, genes, and protein sequences, and using decision trees for gene classification.
Developing approaches for intelligent selection of the correct set of states from the numerous Markov models, which is an open research area.
Considering efficient classifiers, in addition to sequence relations, for the analysis and processing of biological sequences. Information about the relative positions of the different shared features should be taken into account. One future objective is to develop features that can exploit position-specific information.
In the biomedical domain, massive datasets are accessible. Numerous algorithms for finding common patterns in biological sequences are used to predict cancer.
Some models use an efficient frequent-pattern procedure to mine the most recurrent patterns from the specified input dataset, with the aim of finding the dominant amino acid sequence in the clustered protein sequences that could block the growth of cancer cells. The amino acids identified in this way could be valuable in developing medicine for treating lung cancer. Consequently, current cancer research is investigating several protein sequences, including tyrosine kinase, ALK, Ral protein, and histone deacetylase sequences, which can be used to block the growth of cancer cells.
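As a rough illustration of frequent-pattern mining on sequences, the sketch below counts how many sequences contain each fixed-length substring (k-mer) and keeps those that meet a support threshold. It is a simplified stand-in, not the procedure used by any model mentioned above; the function name, the `k` and `min_support` parameters, and the toy sequences (which are not real proteins) are assumptions made for the example.

```python
from collections import Counter


def frequent_kmers(sequences, k, min_support):
    """Return {k-mer: support} for k-mers found in at least min_support sequences.

    Support is the number of sequences containing the k-mer at least once,
    so repeats inside a single sequence are counted only once.
    """
    support = Counter()
    for seq in sequences:
        # Collect the distinct k-mers of this sequence before counting.
        seen = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        support.update(seen)
    return {kmer: count for kmer, count in support.items()
            if count >= min_support}


# Made-up amino acid strings, for illustration only.
toy_sequences = ["MKVLAG", "AKVLTT", "GGKVLA"]
shared_by_all = frequent_kmers(toy_sequences, k=3, min_support=3)
shared_by_two = frequent_kmers(toy_sequences, k=3, min_support=2)
```

With these toy inputs, `KVL` is the only 3-mer present in all three sequences, and lowering the threshold to two also admits `VLA`. Practical frequent-pattern miners typically handle variable-length and gapped patterns, which this sketch leaves out.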
Overview of methodologies for the culturing, recovery and detection of Campylobacter
Published in International Journal of Environmental Health Research, 2023
Marcela Soto-Beltrán, Bertram G. Lee, Bianca A. Amézquita-López, Beatriz Quiñones
Broad-spectrum tetracyclines and beta-lactams have been used for treating gastrointestinal infections. Resistance to tetracycline in Campylobacter is moderate to high and is generally mediated by the tet(O) gene, commonly found on the pTet plasmid but also on a genomic island. Beta-lactam antibiotics act by binding to penicillin-binding proteins and disrupting peptidoglycan cross-linking during cell wall synthesis. Resistance through the beta-lactamase gene blaOXA-61 is widespread in C. jejuni and C. coli. The Campylobacter multidrug efflux pump CmeABC also works synergistically to provide resistance to beta-lactams as well as tetracyclines, macrolides and fluoroquinolones (Whitehouse et al. 2018). Campylobacter infections that are resistant to less toxic antibiotics may be treated with aminoglycosides, such as gentamicin (Fair and Tor 2014). Aminoglycosides bind to prokaryotic ribosomes, impairing protein synthesis, and over 24 genes encoding aminoglycoside-modifying enzymes have been identified in Campylobacter. The gene cluster aadE-sat4-aphA-3 confers multidrug resistance, including resistance to aminoglycosides, and has been found in C. jejuni and C. coli recovered from food and humans. This gene cluster has been detected both on a plasmid and integrated in the chromosome (Zhao et al. 2016). Finally, intrinsic resistance to penicillin, older cephalosporins, trimethoprim, sulfamethoxazole, rifampicin, and vancomycin has been described in some C. jejuni and C. coli isolates (Fitzgerald et al. 2008).
Identification of disease genes and assessment of eye-related diseases caused by disease genes using JMFC and GDLNN
Published in Computer Methods in Biomechanics and Biomedical Engineering, 2022
Samar Jyoti Saikia, S. R. Nirmala
This article proposes a method that uses JMFC and GDLNN to identify particular disease genes and to assess the eye-related diseases caused by those genes. The proposed work performs five steps. First, the input data are gathered from the dataset containing the normal and disease gene expression. Second, the SS is evaluated for the input genes based on the gene ontology (GO), and each gene is then clustered as a normal gene or a disease gene using the JMFC algorithm. Third, certain features of the disease gene cluster are extracted. Fourth, the most important features are selected using the LCM-CSO algorithm. Lastly, the selected features are delivered to the GDLNN classifier as input. The GDLNN classifies the 10 possible eye-related diseases that arise from the affected genes, such as cataract, AMD, glaucoma, Marfan syndrome, inherited optic neuropathies, retinitis pigmentosa, PCV, uveal melanoma, and Stargardt's disease. The proposed work is summarized in the block diagram shown in Figure 1.
Response of microcystin biosynthesis and its biosynthesis gene cluster transcription in Microcystis aeruginosa on electrochemical oxidation
Published in Environmental Technology, 2019
Yu Gao, Kazuya Shimizu, Chie Amano, Xin Wang, Thanh Luu Pham, Norio Sugiura, Motoo Utsumi
Biochemical and genetic studies have suggested that a mixed polyketide synthase (PKS)/nonribosomal peptide synthetase (NRPS) is responsible for the production of MCs [13]. Tillett et al. [14] identified and sequenced the mcy operon, the gene cluster that encodes the MC synthetase in M. aeruginosa PCC7806. This 55-kb sequence consists of 10 open reading frames bidirectionally transcribed from a central 732-bp intergenic region between mcyA and mcyD. The description of the mcy operon facilitates new experimental approaches to studying the production of MCs. Several studies have demonstrated the up-regulation of mcy transcript levels under high light intensity or in the presence of anthracene [15–17]. Sevilla et al. [18] pointed out that iron deficiency slightly affected mcyD transcription, which was correlated with an increase in MC-LR levels in the cells. Transcriptional analysis is therefore a useful tool for understanding physiological responses. Real-time reverse transcription polymerase chain reaction (RT-qPCR) meets the evident requirement for quantitative data analysis in various fields and is the method of choice for quantifying mRNA [19,20]. RT-qPCR has been used to analyse the relationship between the transcription level of mcy genes and environmental factors [21–23].
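For readers unfamiliar with how RT-qPCR readings become relative transcript levels, the sketch below applies the widely used comparative Ct calculation (fold change = 2^-ΔΔCt). The excerpt does not say which quantification model the authors used, so this choice is an assumption, as are the function name, the reference gene, the 100% amplification efficiency, and the Ct values.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target transcript (treated vs. control) by 2^-ddCt.

    Each Ct is the threshold cycle of a target gene or of a reference
    (housekeeping) gene. Assumes both amplicons double every cycle.
    """
    # Normalize the target to the reference gene within each sample.
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # Compare the normalized values between the two conditions.
    delta_delta = delta_treated - delta_control
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values: the target crosses the threshold two cycles
# earlier in the treated sample, while the reference gene is unchanged.
fold_change = relative_expression(
    ct_target_treated=22.0, ct_ref_treated=18.0,
    ct_target_control=24.0, ct_ref_control=18.0,
)
```

A lower Ct means more starting template, so a two-cycle advantage corresponds to a fourfold increase under the doubling assumption. When amplification efficiencies differ from 100%, an efficiency-corrected model would be needed instead.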