Sequence alignment – Knowledge and References

Genome Editing Tools

Published in Vineet Kumar, Vinod Kumar Garg, Sunil Kumar, Jayanta Kumar Biswas, Omics for Environmental Engineering and Microbiology Systems, 2023

Madhumita Barooah, Dibya Jyoti Hazarika

Access to large whole-genome sequencing repositories, together with an understanding of previously decoded natural metabolic pathways, allows pathways for the remediation of toxic contaminants to be redesigned by comparison with previously elucidated metabolic networks. Enzymes produced by different organisms are identified, and the genes encoding those enzymes are assembled to construct novel metabolic pathways. Databases such as BRENDA (Placzek et al., 2017), KEGG (Kanehisa et al., 2017), MetaCyc (Caspi et al., 2016), and Rhea (Morgat et al., 2015) provide information on the enzymes needed to redesign these pathways. Existing pathways can thus be improved by adding enzymatic reactions that fill their gaps. These collective pathways are also called reference pathways and are very helpful for comparing the metabolic models of different organisms. BLAST (Altschul et al., 1990), a sequence alignment program, compares sequences by reporting statistically significant similarities between the query sequences (nucleotide or protein) and the sequences in the target database. This approach identifies enzymes on the premise that proteins with higher sequence similarity are likely to perform similar functions.

An Analysis of Protein Interaction and Its Methods, Metabolite Pathway and Drug Discovery

Published in Ayodeji Olalekan Salau, Shruti Jain, Meenakshi Sood, Computational Intelligence and Data Sciences, 2022

P. Lakshmi, D. Ramyachitra

Protein sequence alignment is used to detect homology, to predict various features of a protein, and to identify homologous structures. The alignment helps to predict how a structure differs from the template for its sequence. In sequence alignment, BLAST and FASTA are the basic tools. The operations to be performed, in order, are sequence identification, database searching, homology detection, sequence alignment, and updating of the structural information. Figure 13.4 shows the categorization of sequence alignment. Recent instrument-based experiments using NMR and X-ray crystallography are used to record the information about the isolation. These data are input to various algorithms to align the sequences effectively. Three types of alignment are available: single, pairwise, and multiple sequence alignment.

Miscellaneous Applications

Published in Nirupam Chakraborti, Data-Driven Evolutionary Modeling in Materials Technology, 2023

Nirupam Chakraborti

The success of sequence alignment algorithms depends on their capability to accurately detect or predict the evolutionary relationships among the sequences being aligned. A robust algorithm should be able to detect every possible form of DNA rearrangement, including mutations. These rearrangements and mutations are often spontaneous, but they can also be introduced by environmental factors associated with the individual. The different types of mutations broadly fall into two categories: (i) point mutations, which change or delete a single nucleotide; and (ii) chromosomal mutations, which occur through inversions, translocations, insertions, and deletions. Jangam and Chakraborti (2007) tested their algorithm on custom-designed sequences that were in accord with these two categories. The sequences they tested fell into three major size classes: (i) short sequences containing 100–200 base pairs (bp); (ii) medium sequences containing 200–500 bp; and (iii) large sequences containing 500–2000 bp. As mentioned before, they compared their runs with similar runs conducted using both BLAST and Clustal W. The sequences were evaluated for two major features: (i) their ability to accurately predict the existing evolutionary relationships, as discussed above; and (ii) their capability to accurately detect the existing evolutionary relationships.
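
The two mutation categories above can be illustrated with a few string operations. The following Python sketch is only an illustration for building small test sequences; it is not the procedure used by Jangam and Chakraborti (2007), and all function names are hypothetical. Inversion is modelled here as the reverse complement of a segment, which is a simplifying assumption.

```python
# Illustrative helpers for generating mutated DNA test sequences.
# All names are hypothetical; positions are 0-based, end index exclusive.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def substitute(seq: str, pos: int, base: str) -> str:
    """Point mutation: replace the nucleotide at pos with base."""
    return seq[:pos] + base + seq[pos + 1:]


def delete(seq: str, pos: int, length: int = 1) -> str:
    """Delete length nucleotides starting at pos (length=1 is a point deletion)."""
    return seq[:pos] + seq[pos + length:]


def insert(seq: str, pos: int, fragment: str) -> str:
    """Insert fragment before position pos."""
    return seq[:pos] + fragment + seq[pos:]


def invert(seq: str, start: int, end: int) -> str:
    """Inversion (assumption): replace seq[start:end] by its reverse complement."""
    segment = seq[start:end].translate(_COMPLEMENT)[::-1]
    return seq[:start] + segment + seq[end:]


def translocate(seq: str, start: int, end: int, new_pos: int) -> str:
    """Translocation: cut seq[start:end] and re-insert it at new_pos of the remainder."""
    segment = seq[start:end]
    rest = seq[:start] + seq[end:]
    return rest[:new_pos] + segment + rest[new_pos:]


if __name__ == "__main__":
    original = "ACGTACGT"
    print(substitute(original, 2, "T"))
    print(translocate(original, 0, 2, 6))
```

Pairs of original and mutated sequences produced this way could then be passed to an alignment routine to check whether the introduced change is recovered.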

Extracting process hierarchies by multi-sequence alignment adaptations

Published in Enterprise Information Systems, 2022

Eren Esgin, Pinar Karagoz

Once the dominant behaviours of processes are available, sequence alignment of process variants appears as a feasible solution for measuring cross-organisational process similarities. Sequence alignment is a basic technique used in various research areas, including bioinformatics, where it is used for comparing biological processes, predicting the biological function of a gene, and finding the evolutionary neighbourhood in homologous genomes (Sung 2010). In Esgin and Karagoz (2013b), sequence alignment is utilised to exploit the similarities among process variants. Unlike the atomic true/false notion of equivalence used by most approaches, it measures the degree of process similarity by adapting the Needleman-Wunsch (NW) algorithm with a confidence-enhanced cost function. Edit operations are dynamically valued according to the confidence values extracted from the event log, as introduced in Esgin and Karagoz (2013a).
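
To make the idea of a confidence-weighted edit cost concrete, the sketch below runs a Needleman-Wunsch style dynamic programme over two process variants, with the cost of each edit operation supplied by caller-defined functions. It is a minimal illustration under stated assumptions: the `confidence` table, the activity names, and the particular cost formulas are hypothetical and are not the cost function defined by Esgin and Karagoz.

```python
from typing import Callable, Sequence


def nw_cost(
    a: Sequence[str],
    b: Sequence[str],
    sub_cost: Callable[[str, str], float],
    gap_cost: Callable[[str], float],
) -> float:
    """Minimum total edit cost of globally aligning a with b (Needleman-Wunsch)."""
    n, m = len(a), len(b)
    # dp[i][j] = minimum cost of aligning a[:i] with b[:j]
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + gap_cost(a[i - 1])
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + gap_cost(b[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = min(
                dp[i - 1][j - 1] + sub_cost(a[i - 1], b[j - 1]),  # match or substitute
                dp[i - 1][j] + gap_cost(a[i - 1]),                # activity only in a
                dp[i][j - 1] + gap_cost(b[j - 1]),                # activity only in b
            )
    return dp[n][m]


# Hypothetical confidence values per activity, e.g. derived from an event log.
confidence = {"receive": 1.0, "check": 0.5, "approve": 1.0, "ship": 1.0}


def gap(activity: str) -> float:
    # Assumption: skipping a high-confidence activity is penalised more.
    return confidence[activity]


def substitution(x: str, y: str) -> float:
    # Assumption: identical activities cost nothing; otherwise use the larger confidence.
    return 0.0 if x == y else max(confidence[x], confidence[y])


if __name__ == "__main__":
    variant_a = ["receive", "check", "approve", "ship"]
    variant_b = ["receive", "approve", "ship"]
    print(nw_cost(variant_a, variant_b, substitution, gap))
```

Because the costs are passed in as functions, the same routine yields a graded dissimilarity rather than a true/false answer, and other confidence schemes can be swapped in without changing the dynamic programme.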

A Hardware-Based Memory-Efficient Solution for Pair-Wise Compact Sequence Alignment

Published in IETE Journal of Research, 2023

Ardhendu Sarkar, Surajeet Ghosh, Sanchita Saha Ray

Biological sequence alignment is a contemporary research thrust in bioinformatics, aiming to align sequences at high speed with low memory utilization. Sequence alignment is the strategy of arranging protein, DNA, or RNA sequences to identify regions of similarity that may reflect structural, functional, or evolutionary relationships between the biological sequences [1–4]. The sequencing process derives the biological features in a DNA chain and converts them into a sequence of symbols. A DNA sequence is made up of four symbols (also known as nitrogenous bases or nucleotides): A (adenine), C (cytosine), T (thymine), and G (guanine). Alignment procedures can be categorised into two broad groups: (i) pair-wise sequence alignment (PSA) and (ii) multiple sequence alignment (MSA). PSA is used to compute the extent to which a specific pair of proteins or genes is identical, or to recognise the relatedness of a specific pair of sequences of interest (DNA, RNA, or protein). PSA is generally carried out using the dynamic programming paradigm. Two well-known PSA techniques are local alignment and global alignment [1,2]. Global sequence alignment is a form of global optimization that spans the entire length of the sequences, and the Needleman–Wunsch algorithm [2] is widely used for it. A local alignment does not need to begin at the ends of the sequences, and the most common local alignment method is the Smith–Waterman algorithm [1]. MSA aligns more than two biological sequences and normally assumes an evolutionary relationship between the query sequences. There also exist a few search tools that combine exact pattern matching with dynamic programming to obtain solutions faster, namely BLAST (Basic Local Alignment Search Tool) [5] and FASTA (FAST All) [6]. These search tools are used to obtain sub-optimal PSAs that indicate probable homologues for a query sequence in large sequence databases [7].
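
The difference between global and local alignment can be seen in a short dynamic-programming sketch. The Python code below computes only the optimal scores (no traceback) and keeps two rows of the score matrix at a time. It is a minimal software illustration, not the hardware design described in the article, and the scoring values (match +1, mismatch -1, linear gap -1) are assumptions chosen for simplicity.

```python
def global_score(s: str, t: str, match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Needleman-Wunsch score: both sequences are aligned end to end."""
    prev = [j * gap for j in range(len(t) + 1)]  # row 0: t prefix aligned to gaps
    for i in range(1, len(s) + 1):
        curr = [i * gap]  # column 0: s prefix aligned to gaps
        for j in range(1, len(t) + 1):
            diag = prev[j - 1] + (match if s[i - 1] == t[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def local_score(s: str, t: str, match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Smith-Waterman score: best-scoring pair of substrings of s and t."""
    best = 0
    prev = [0] * (len(t) + 1)
    for i in range(1, len(s) + 1):
        curr = [0]
        for j in range(1, len(t) + 1):
            diag = prev[j - 1] + (match if s[i - 1] == t[j - 1] else mismatch)
            # Clamping at 0 lets an alignment start anywhere, not only at the ends.
            cell = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


if __name__ == "__main__":
    query = "TTTGATTACATTT"
    target = "CCGATTACACC"
    print("global:", global_score(query, target))
    print("local: ", local_score(query, target))
```

For two sequences that share a conserved core but differ at their flanks, the local score counts only the shared region, whereas the global score is pulled down by the mismatches and gaps needed to cover the full lengths.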