The Journey through the Gene: a Focus on Plant Anti-pathogenic Agents Mining in the Omics Era
Published in Mahendra Rai, Chistiane M. Feitosa, Eco-Friendly Biobased Products Used in Microbial Diseases, 2022
José Ribamar Costa Ferreira-Neto, Éderson Akio Kido, Flávia Figueira Aburjaile, Manassés Daniel da Silva, Marislane Carvalho Paz de Souza, Ana Maria Benko-Iseppon
According to Santos-Junior et al. (2020), two significant computational difficulties affect antimicrobial peptide (AMP) prospecting in sequence-derived data: (i) predicting small genes in DNA/RNA sequences; and (ii) predicting AMP activity for those small genes. Regarding the first difficulty, most gene prediction approaches exclude small ORFs (Open Reading Frames): the regular tools retain only the longest ORF and treat it, by convention, as the coding sequence for a given protein. Thus, specific peptide mining approaches have been developed (e.g., Ramada et al. 2017; Lin et al. 2019). Additionally, a few recent surveys have shown that methods designed to identify long ORFs can also detect small ORFs if their results are subsequently filtered, and have revealed that small ORFs are biologically active across a range of functions (Miravet-Verde et al. 2019; Sberro et al. 2019).
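The point about small ORFs being dropped can be illustrated with a minimal sketch. It is not the method of any of the cited tools: the function name `find_orfs` and the `min_codons`/`max_codons` cutoffs are hypothetical, and only the forward strand is scanned. Instead of keeping the single longest ORF, it reports every ATG-to-stop ORF whose length falls within a chosen window.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=10, max_codons=100):
    """Return (start, end, frame) for each forward-strand ORF.

    An ORF runs from the first ATG in a frame to the next in-frame stop codon.
    Its length in codons (start codon included, stop codon excluded) must lie
    in [min_codons, max_codons]. Both cutoffs are illustrative assumptions.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the currently open start codon, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if min_codons <= n_codons <= max_codons:
                    orfs.append((start, i + 3, frame))  # end includes the stop
                start = None
    return orfs

# A 13-codon ORF is kept; a 201-codon ORF exceeds the small-ORF window.
small = "ATG" + "GCT" * 12 + "TAA"
large = "ATG" + "GCT" * 200 + "TAA"
print(find_orfs(small))
print(find_orfs(large))
```

Raising `max_codons` turns the same scan into a conventional ORF finder whose output can be filtered by length afterwards, which mirrors the filtering strategy described above.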
RNA-seq Analysis
Published in Altuna Akalin, Computational Genomics with R, 2020
RNA-seq generates valuable data that contains information not only at the gene level but also at the level of exons and transcripts. Moreover, the kind of information that we can extract from RNA-seq is not limited to expression quantification. It is possible to detect alternative splicing events such as novel isoforms (Trapnell et al., 2010), and differential usage of exons (Anders et al., 2012). It is also possible to observe sequence variants (substitutions, insertions, deletions, RNA-editing) that may change the translated protein product (McKenna et al., 2010). In the context of cancer genomes, gene-fusion events can be detected with RNA-seq (McPherson et al., 2011). Finally, for the purposes of gene prediction or improving existing gene predictions, RNA-seq is a valuable method (Stanke and Morgenstern, 2005). In order to learn more about how to implement these, it is recommended that you go through the tutorials of the cited tools.
Impact of Integrated Omics Technologies for Identification of Key Genes and Enhanced Artemisinin Production in Artemisia annua L.
Published in Tariq Aftab, M. Naeem, M. Masroor, A. Khan, Artemisia annua, 2017
Shashi Pandey-Rai, Neha Pandey, Anjana Kumari, Deepika Tripathi, Sanjay Kumar Rai
Genomics is the systematic study of an organism’s genome with the help of molecular tools. Traditionally, genes have been analyzed individually, but microarray technology has advanced substantially in recent years. Genome analysis involves several steps: (1) genome sequencing, (2) identification of repetitive as well as unique sequences, (3) gene prediction, (4) identification of functional expressed sequence tags (ESTs) and complementary DNA (cDNA) sequences, and (5) genome annotation and gene location/gene mapping. Recently, DNA microarray techniques have evolved into a powerful tool with the potential to measure both differences in DNA sequence between individuals and the expression of thousands of genes simultaneously.
Interleukin-1β secretion induced by mucosa-associated gut commensal bacteria promotes intestinal barrier repair
Published in Gut Microbes, 2022
Wan-Jung H. Wu, Myunghoo Kim, Lin-Chun Chang, Adrien Assie, Fatima B. Saldana-Morales, Daniel F. Zegarra-Ruiz, Kendra Norwood, Buck S. Samuel, Gretchen E. Diehl
Fecal pellets from AVMN-treated mice were resuspended in PBS to 100 mg/ml and dilutions were plated on blood agar plates (Fisher) and cultured overnight at 37°C under normal or anaerobic conditions using BD GasPak (Fisher). For sequencing, genomic DNA was extracted with phenol chloroform, and DNA was sheared to 15 kb using Covaris g-TUBE® devices, allowing for sizes 5 kb and larger. The library preparation was carried out using SMRTbell Template kit 1.0 Exo VII protocol and the sample was barcoded with PacBio Adaptor. Genome sequencing was performed using the Pacific Biosciences Sequel sequencing platform. Long reads were assembled de novo into two contigs (main chromosome and one plasmid) using Canu (v. 1.6) [73]. Gene prediction and annotation were carried out using the webservice PATRIC [74]. Genomic visualization was performed using Circos v0.69–9 [75]. Genomic comparison was done using the PATRIC webservice and phylogenomic reconstructions were done using the GToTree pipeline and its associated dependencies [76–80]. Sequencing reads and the genome assembly were submitted to NCBI under the bioproject PRJNA725420.
Peptidomics and proteogenomics: background, challenges and future needs
Published in Expert Review of Proteomics, 2021
Rui Vitorino, Manisha Choudhury, Sofia Guedes, Rita Ferreira, Visith Thongboonkerd, Lakshya Sharma, Francisco Amado, Sanjeeva Srivastava
Another method for peptide identification is ab initio prediction, which identifies the protein-coding region based on the structure and signals of the sequence [45]. This method uses signal sensors to identify splice sites, start and stop codons, and branch points, and content sensors to detect exons and exon-intron boundaries. Various algorithms, such as dynamic programming, hidden Markov models, and neural networks, have been used in ab initio prediction tools, of which the hidden Markov model is the most commonly used [45]. Gene prediction in the eukaryotic genome is a more difficult task than in the prokaryotic genome due to the presence of introns and post-translational modifications. This method of gene prediction is useful not only for finding protein-coding genes but also for non-coding RNAs (ncRNAs), such as microRNAs (miRNAs) and long non-coding RNAs (lncRNAs), which account for a significant proportion of new genes.
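As a rough illustration of how a hidden Markov model can act as a content sensor, the sketch below decodes a sequence into "coding" and "noncoding" stretches with the Viterbi algorithm. It is a toy under stated assumptions, not the model of any real gene finder: the two states, the transition and emission probabilities (a GC-rich "coding" state and an AT-rich "noncoding" state), and the function name `viterbi` are all invented for illustration. Real ab initio tools use far richer state structures, including codon position and splice-site signals.

```python
import math

STATES = ("noncoding", "coding")

# All probabilities below are illustrative assumptions, not trained values.
START = {"noncoding": 0.5, "coding": 0.5}
TRANS = {
    "noncoding": {"noncoding": 0.9, "coding": 0.1},
    "coding": {"noncoding": 0.1, "coding": 0.9},
}
EMIT = {
    "noncoding": {"A": 0.35, "C": 0.15, "G": 0.15, "T": 0.35},
    "coding": {"A": 0.15, "C": 0.35, "G": 0.35, "T": 0.15},
}

def viterbi(seq):
    """Return the most probable state path for a DNA string of A/C/G/T."""
    seq = seq.upper()
    if not seq:
        return []
    # Log-probability of the best path ending in each state at position 0.
    score = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    back = []  # back[i][s] = best predecessor of state s at position i + 1
    for base in seq[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            new_score[s] = (score[prev] + math.log(TRANS[prev][s])
                            + math.log(EMIT[s][base]))
            pointers[s] = prev
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

# An AT-rich flank, a GC-rich core, and another AT-rich flank.
path = viterbi("AT" * 10 + "GC" * 10 + "AT" * 10)
print(path[19], path[20], path[39], path[40])
```

Working in log space avoids numerical underflow on long sequences, which matters once the input is longer than a few hundred bases.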
Challenges and promise at the interface of metaproteomics and genomics: an overview of recent progress in metaproteogenomic data analysis
Published in Expert Review of Proteomics, 2019
Henning Schiebenhoefer, Tim Van Den Bossche, Stephan Fuchs, Bernhard Y. Renard, Thilo Muth, Lennart Martens
The unbiased translation of sequencing data to protein sequences by using all possible reading frames is also a viable option. If information about the coding strand is available, sequencing data is translated in three reading frames (three-frame translation), otherwise in all six reading frames (six-frame translation). For the human genome, a six-frame translation would lead to a search database that is 70 times larger than the reference proteome from UniProtKB [24]. Related to that, proteogenomic databases often contain many obsolete sequences from reading frames that are not transcribed [63]. In the case of prokaryotes, the increase in database size will most likely be smaller, due to the greater gene density found in these organisms compared to humans. Still, the translated database would largely consist of nonsense protein sequences that are not present in the sample. The performance of searches on databases constructed this way has been shown to be comparable to, yet slightly worse than, databases based on gene prediction tools [33].
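Three-frame and six-frame translation as described above can be sketched in a few lines. The function names and the `stranded` flag are hypothetical, and the standard genetic code (NCBI translation table 1) is assumed; codons containing characters other than A, C, G, or T are rendered as `X`.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over "TCAG".
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(seq):
    """Translate complete codons only; '*' marks a stop, 'X' an unknown codon."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))

def frame_translations(seq, stranded=False):
    """Return 3 translations if the coding strand is known, otherwise 6.

    The first three entries are forward frames 0 to 2; when stranded is False,
    the last three are frames 0 to 2 of the reverse complement.
    """
    seq = seq.upper()
    strands = [seq] if stranded else [seq, reverse_complement(seq)]
    return [translate(s[f:]) for s in strands for f in range(3)]

print(frame_translations("ATGGCCTAA"))
# ['MA*', 'WP', 'GL', 'LGH', '*A', 'RP']
```

Only the first frame of this example contains a plausible peptide, which shows on a small scale why most entries in a six-frame search database do not correspond to real proteins.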