Consensus sequences – Knowledge and References

Advances in Non-Invasive Diagnosis of Single-Gene Disorders and Fetal Exome Sequencing

Published in Carlos Simón, Carmen Rubio, Handbook of Genetic Diagnostic Technologies in Reproductive Medicine, 2022

Liesbeth Vossaert, Roni Zemet, Ignatia B. Van den Veyver

NGS cannot cover all sequence variants.57,65,74 It performs well for single-nucleotide variants and small insertions/deletions, but regions with high GC content are more difficult to capture. Because the consensus sequence is built from the alignment of overlapping short fragments, regions with high homology to other sequences within the genome are challenging. These include duplicated genes or exons, repetitive sequences, short repeat expansions, pseudogenes, and highly homologous gene families. Structural chromosomal abnormalities or aneuploidy can be detected by genome sequencing (GS), but not currently as effectively by exome sequencing (ES). Low-level mosaic variants are also challenging, but can be identified provided that the sequencing depth is adequate. Haplotype information can aid in detecting uniparental disomy.
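The dependence of mosaic variant detection on sequencing depth can be illustrated with a simple binomial calculation. The sketch below is only an illustration and is not taken from the chapter: it assumes that reads sample the variant allele independently, ignores sequencing error, and uses a hypothetical calling rule (at least 5 supporting reads) and a hypothetical variant allele fraction of 5%.

```python
from math import comb


def prob_at_least(depth: int, allele_fraction: float, min_alt_reads: int) -> float:
    """Return P(X >= min_alt_reads) for X ~ Binomial(depth, allele_fraction).

    Assumption: each read carries the variant independently with
    probability `allele_fraction`; sequencing errors are ignored.
    """
    p_fewer = sum(
        comb(depth, k) * allele_fraction**k * (1 - allele_fraction) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_fewer


# Hypothetical scenario: 5% mosaic variant, caller requires >= 5 supporting reads.
for depth in (30, 100, 500):
    print(depth, round(prob_at_least(depth, 0.05, 5), 3))
```

Under these assumptions the probability of seeing enough supporting reads rises steeply with depth, which is the sense in which adequate depth makes low-level mosaicism detectable.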

Transcriptionally Regulatory Sequences of Phylogenetic Significance

Published in S. K. Dutta, DNA Systematics, 2019

P. C. Huang

Genes whose products share related functions are often expressed coordinately. Coordinately inducible genes, such as those coding for amino acid pathway enzymes or growth hormones, apparently share short repeating sequences unique to each gene family.148 Consensus sequences (… TGACTC …) that have been deduced are repeatedly located 5′ to the coding sequence and are distinguishable from promoters. In addition, as noted by Cheung et al.,149 the trinucleotide GTG is present frequently and symmetrically in many gene sequences of prokaryotic as well as eukaryotic DNA, ranging from the consensus E. coli lac repressor binding site to the transcriptionally critical sequences (upstream −87) of the human β-globin gene. This trinucleotide is also present in LTRs and in the VDJ joints of immunoglobulin genes. Potentially, such a sequence, albeit short, is important in the B to Z transition of DNA; hence it may be involved in transcriptional regulation.

Next-Generation Sequencing (NGS) for Companion Diagnostics (CDx) and Precision Medicine

Published in Il-Jin Kim, Companion Diagnostics (CDx) in Precision Medicine, 2019

Il-Jin Kim, Mendez Pedro, David Jablons

The PacBio system (developed by Nanofluidics) uses single-molecule real-time (SMRT) long-read sequencing. The average read length is above 1 kb and can reach 10–40 kb,24,26 which makes de novo assembly, structural variant detection, and mRNA transcriptome analysis much easier.1,26,38,39 SMRT sequencing fixes a single DNA polymerase at the bottom of a nanophotonic chamber called a zero-mode waveguide (ZMW).26,40,41 Unlike other sequencing systems (i.e., SBS methods) where the DNA is fixed, here the DNA is mobile and can access the immobilized DNA polymerase in the ZMW chamber, where the sequencing takes place.25 Fluorescence-labeled dNTPs are incorporated, visualized, and recorded by a laser and camera in real time.18 This system is known to have high error rates (up to 11%), especially for indels.26 To overcome the high error rate, the same site must be sequenced multiple times to form circular consensus sequences.42–44 SMRT sequencing is less sensitive to GC content than other sequencing platforms45 and would be ideal for detecting structural variants or for de novo assembly.1,39
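The idea behind a circular consensus sequence can be sketched as a per-position majority vote over repeated passes of the same molecule. This is a deliberately simplified illustration, not the vendor's algorithm: it assumes the passes have already been aligned to equal length with "-" marking gaps, whereas production tools rely on more elaborate alignment and statistical models. The function name and the example reads are hypothetical.

```python
from collections import Counter


def majority_consensus(passes: list[str]) -> str:
    """Majority-vote consensus over pre-aligned passes of one molecule.

    Assumption: every pass has the same length and uses '-' for gaps.
    Ties are resolved in favour of the base seen first in the column.
    """
    if not passes:
        raise ValueError("need at least one pass")
    length = len(passes[0])
    if any(len(p) != length for p in passes):
        raise ValueError("passes must be pre-aligned to the same length")

    consensus = []
    for column in zip(*passes):
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    # Columns where a gap wins are dropped from the final sequence.
    return "".join(consensus).replace("-", "")


# Five noisy passes, each with at most one random error (substitution or deletion).
passes = ["ACGTTGCA", "ACGATGCA", "ACGTTGCA", "AC-TTGCA", "ACGTTGGA"]
print(majority_consensus(passes))  # prints ACGTTGCA
```

Because the errors in individual passes fall at different positions, each is outvoted by the other passes; this is why random errors are reduced by repeated sequencing of the same site.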

An adapted consensus protein design strategy for identifying globally optimal biotherapeutics

Published in mAbs, 2022

Yanyun Liu, Kenny Tsang, Michelle Mays, Gale Hansen, Jeffrey Chiecko, Maureen Crames, Yangjie Wei, Weijie Zhou, Chase Fredrick, James Hu, Dongmei Liu, Douglas Gebhard, Zhong-Fu Huang, Akshita Datar, Anthony Kronkaitis, Kristina Gueneva-Boucheva, Daniel Seeliger, Fei Han, Saurabh Sen, Srinath Kasturirangan, Justin M. Scheer, Andrew E. Nixon, Tadas Panavas, Michael S. Marlow, Sandeep Kumar

Unlike previous Consensus Protein Design studies that benefited from numerous sequences to derive a single consensus sequence, the limited number of anti-CD3 sequences resulted in a library of 21 consensus variants with single-point mutations, which were found to be functional in cell-based assays. Here, we present data on these variants from different developability perspectives: biophysical, computational, and pharmacological (serum stability). We then used data science methods to better understand the molecular origins of beneficial attributes and determined a globally optimal sequence. Our results indicate that this adapted Consensus Protein Design approach may have unique value for biotherapeutic discovery. Rather than a limited characterization of a large number of variants, this rapid iterative process takes advantage of greater characterization data on fewer molecules to generate a second cohort of optimized variants in a data-driven manner. In doing so, a deeper understanding of sequence–structure–property relationships is obtained with a reduced risk of unanticipated detrimental consequences of otherwise beneficial substitutions.

Next-generation sequencing for the diagnosis of hepatitis B: current status and future prospects

Published in Expert Review of Molecular Diagnostics, 2021

Selene Garcia-Garcia, Maria Francesca Cortese, Francisco Rodríguez-Algarra, David Tabernero, Ariadna Rando-Segura, Josep Quer, Maria Buti, Francisco Rodríguez-Frías

Despite its usefulness, Sanger sequencing is limited to detecting variants at relatively high frequencies (>15-20%) [53], without quantifying variant abundance in viral quasispecies. It is often used to obtain a consensus sequence through overall population sequencing. Alternatively, Sanger sequencing may be performed after molecular cloning. In this case, however, it is extremely difficult to process hundreds of clones, and the number of sequences does not represent the magnitude and complexity of the quasispecies population. Sanger sequencing is thus unsuitable for detecting minor circulating variants in viral quasispecies, as well as low-frequency novel mutations selected throughout the infection [54,55,56], which affect disease progression and/or antiviral response. Although this technology remains useful for applications where high throughput is not required [41], the study of quasispecies variability requires sequencing methods that can analyze a higher number of individual sequences.
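To make the limitation concrete, the sketch below computes a population consensus from a set of individual sequences and lists the minor variants whose frequency falls below a detection threshold. It is an illustration only: the reads are toy data, they are assumed to be pre-aligned and of equal length, and the 20% cutoff is simply the upper end of the range quoted above, used here as a hypothetical parameter.

```python
from collections import Counter


def consensus_and_minor_variants(reads, detection_threshold=0.20):
    """Return the per-position consensus and the variants below the threshold.

    Assumption: `reads` are pre-aligned strings of equal length. Each minor
    variant is reported as (1-based position, base, frequency).
    """
    consensus = []
    hidden = []
    for pos, column in enumerate(zip(*reads), start=1):
        counts = Counter(column)
        major, _count = counts.most_common(1)[0]
        consensus.append(major)
        for base, n in counts.items():
            freq = n / len(column)
            if base != major and freq < detection_threshold:
                hidden.append((pos, base, freq))
    return "".join(consensus), hidden


# Toy quasispecies: 9 sequences carry G at position 3, 1 carries A (10%).
reads = ["ACGT"] * 9 + ["ACAT"]
consensus, hidden = consensus_and_minor_variants(reads)
print(consensus)  # prints ACGT
print(hidden)     # prints [(3, 'A', 0.1)]
```

The consensus reports only the dominant base at each position, so the 10% variant is invisible unless individual sequences are counted, which is what higher-throughput methods make possible.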

Genetic characterisation of the North-West Indian populations: analysis of mitochondrial DNA control region variations

Published in Annals of Human Biology, 2021

Gagandeep Singh, Srinivas Yellapu, Harkirat Singh Sandhu, Indu Sharma, Varun Sharma, A. J. S. Bhanwer

Forward and reverse sequences were aligned using BioEdit v 7.2.5 (Hall 1999) to create a consensus sequence for each sample, and the consensus sequences were deposited in GenBank under accession numbers MT312521-MT312729. The consensus sequences were compared with the revised Cambridge Reference Sequence (rCRS) (Andrews et al. 1999). The quality of the sequences was examined and confirmed by two independent researchers. The haplotype classification was carried out following the nomenclature guidelines for mtDNA typing (Bandelt and Parson 2008; Parson et al. 2014) using the mtDNA profiler web tool (http://mtprofiler.yonsei.ac.kr/index.php?cat=1). Haplogroups were assigned using Mitotool (Fan and Yao 2011) and mtDNAmanager (Lee et al. 2008) based on PhyloTree builds 16 and 17, and heterozygous variants were excluded while naming haplogroups. All the samples were checked manually against mtDNA tree Build 17 for the variants and for the haplogroups assigned to the individual samples on the basis of those variants (Van Oven and Kayser 2009).
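The general workflow of merging forward and reverse reads into a consensus and then listing differences from a reference can be sketched as follows. This is not the BioEdit procedure used by the authors: it assumes the two reads cover exactly the same region with no indels, so no real alignment is performed, and it uses short toy sequences rather than the rCRS. All function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_strands(forward: str, reverse: str) -> str:
    """Build a consensus from a forward read and a reverse-strand read.

    Assumption: both reads span the same region with no indels.
    Positions where the strands disagree are marked 'N'.
    """
    rc = reverse_complement(reverse)
    if len(rc) != len(forward):
        raise ValueError("reads must cover the same region")
    return "".join(f if f == r else "N" for f, r in zip(forward, rc))


def variants_vs_reference(consensus: str, reference: str, offset: int = 1):
    """List (position, reference base, observed base) for confident differences."""
    return [
        (offset + i, ref, alt)
        for i, (ref, alt) in enumerate(zip(reference, consensus))
        if alt != ref and alt != "N"
    ]


# Toy data: both strands agree, and the sample differs from the reference at position 4.
consensus = merge_strands("ACGTAC", "GTACGT")
print(consensus)                                   # prints ACGTAC
print(variants_vs_reference(consensus, "ACGAAC"))  # prints [(4, 'A', 'T')]
```

Marking strand disagreements as "N" rather than picking one base keeps ambiguous positions out of the variant list, which mirrors the manual quality check described above.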