ChIP-seq analysis
Published in Altuna Akalin, Computational Genomics with R, 2020
There are multiple sources of genomic annotation. The UCSC, GenBank, and Ensembl databases are stable resources from which annotation can be easily obtained.
An approach to pathogen discovery for viral infections of the nervous system
Published in Avindra Nath, Joseph R. Berger, Clinical Neurovirology, 2020
Prashanth S. Ramachandran, Michael R. Wilson
Numerous bioinformatic algorithms are available, and the programs used will depend on the study question, whether de novo assembly is required, the read length and whether data will be aligned to a reference genome [43]. For metagenomics, a key early step is alignment to the human genome so that host sequences can be removed. Host sequences constitute the vast majority of reads from a human sample such as CSF, which typically has a very low pathogen load. Other key early steps are quality control measures that remove low-quality, low-complexity and redundant reads. The remaining data are aligned against the National Center for Biotechnology Information (NCBI) GenBank database (https://www.ncbi.nlm.nih.gov/) or, in other pipelines, against more curated databases, for example of known human pathogens. The NCBI GenBank database holds the publicly deposited sequences of known, sequenced organisms. Reads are matched at the nucleotide (nt) level, and the organism with the best alignment is reported. Alignment at the protein level (non-redundant [nr]) is performed concomitantly, because novel pathogens that are highly divergent may have similar amino acid sequences despite significant genetic variation [34,44].
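The quality control step described above can be illustrated with a minimal Python sketch. It is not the pipeline used by the authors. The function names, the Phred+33 quality encoding, the use of Shannon entropy as the low-complexity measure, exact-sequence matching as the redundancy criterion, and both thresholds are assumptions chosen only for illustration. Host removal by alignment to the human genome is not shown.

```python
from collections import Counter
from math import log2


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a FASTQ quality string (Phred+33 assumed)."""
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def shannon_entropy(seq: str) -> float:
    """Shannon entropy (bits) of the base composition; low values
    indicate low-complexity reads such as homopolymer runs."""
    n = len(seq)
    return -sum((c / n) * log2(c / n) for c in Counter(seq).values())


def filter_reads(reads, min_quality=20.0, min_entropy=1.0):
    """Keep reads that pass three illustrative checks.

    reads: iterable of (name, sequence, quality) tuples.
    The thresholds are hypothetical defaults, not recommended values.
    """
    seen = set()
    kept = []
    for name, seq, qual in reads:
        if not seq or len(seq) != len(qual):
            continue  # malformed record
        if mean_phred(qual) < min_quality:
            continue  # low-quality read
        if shannon_entropy(seq) < min_entropy:
            continue  # low-complexity read
        if seq in seen:
            continue  # redundant read (exact duplicate sequence)
        seen.add(seq)
        kept.append((name, seq, qual))
    return kept


example_reads = [
    ("r1", "ACGTACGTAC", "IIIIIIIIII"),  # passes all checks
    ("r2", "ACGTACGTAC", "IIIIIIIIII"),  # duplicate of r1
    ("r3", "AAAAAAAAAA", "IIIIIIIIII"),  # homopolymer, entropy 0
    ("r4", "ACGTTGCAAC", "!!!!!!!!!!"),  # Phred 0 at every base
]
passed = filter_reads(example_reads)
```

In this toy input only `r1` survives. Production pipelines typically rely on dedicated tools for each of these steps, so the sketch is meant only to make the three filtering criteria concrete.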
The Role of the Computer in Estimates of DNA Nucleotide Sequence Divergence
Published in S. K. Dutta, DNA Systematics, 2019
Eventually, Bolt Beranek and Newman, Inc. (BBN), a private corporation located in Cambridge, Mass., with expertise in computer communications, won a contract to develop a national sequence database [33], now called GenBank™, the Genetic Sequence Data Bank. GenBank, a trademark of the NIH, is a U.S. government-sponsored, internationally available repository of all reported nucleic acid sequences greater than 50 nucleotides in length, cataloged, annotated for sites of biological interest, and checked for accuracy. GenBank™ was created by the National Institute of General Medical Sciences of NIH in 1982. Co-sponsors include the National Cancer Institute, the National Institute of Allergy and Infectious Diseases, the Division of Research Resources of NIH, the National Institute of Arthritis, Diabetes and Digestive and Kidney Diseases, as well as the National Science Foundation, the Department of Energy, and the Department of Defense.
Recent trends in next generation immunoinformatics harnessed for universal coronavirus vaccine design
Published in Pathogens and Global Health, 2023
Chin Peng Lim, Boon Hui Kok, Hui Ting Lim, Candy Chuah, Badarulhisam Abdul Rahman, Abu Bakar Abdul Majeed, Michelle Wykes, Chiuan Herng Leow, Chiuan Yee Leow
GenBank serves as a public database of genetic sequences, focusing on the expansion and dissemination of information. The repository relies on sequence data submitted by authors, on whole-genome shotgun (WGS) and other high-throughput data from sequencing centres, and on issued patents from the U.S. Patent and Trademark Office. GenBank is a partner of the International Nucleotide Sequence Database Collaboration (INSDC), along with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ); data are exchanged daily so that a systematic collection of sequence information is accessible worldwide. GenBank also collects and stores amino acid sequences from databases such as SWISS-PROT, Protein Research Foundation (PRF) and Protein Data Bank (PDB) [94]. GISAID has gained a reputation as a trustworthy means for international sharing of all influenza virus data, including genetic sequences and metadata [95]. In response to the COVID-19 pandemic, related data have also been shared through this platform. InterPro is a unified resource resulting from the integration of protein signature databases including PROSITE, PRINTS, ProDom, Pfam, SMART, TIGRFAMs, PIRSF, SUPERFAMILY, Gene3D and PANTHER. Its major application is the annotation and functional classification of uncharacterized sequences. Based on sequence positions and protein coverage, protein signatures that fall into the same family or functional domain are grouped into a single entry with the respective annotation and cross-references [96–98].
Absence of significant genetic alterations in the VSX1, SOD1, TIMP3, and LOX genes in Brazilian patients with Keratoconus
Published in Ophthalmic Genetics, 2022
Alessandro Garcia Lopes, Gildásio Castello de Almeida Jr, Marcos Paulo Miola, Ronan Marques Teixeira, Francielly Camilla Bazilio Laurindo Pires, Rodolfo Andrade Miani, Luiz Carlos de Mattos, Cinara Cássia Brandão, Lilian Castiglioni
For the mutational screen, the forward and reverse strands were sequenced with a DNA sequencing kit (BigDye Terminator v.3.1 Cycle Sequencing Kit, Foster City, California/USA) following the manufacturer’s instructions, using the ABI 3500 automated sequencer (Applied Biosystems). Nucleotide sequences were first analyzed with Sequence Scanner software 2 v2.0 (Applied Biosystems) and then aligned and edited with BioEdit software v. 7.0.5 (Tom Hall, Ibis Therapeutics, Carlsbad, California/USA). For reference analyses, the sequences were checked against the National Center for Biotechnology Information GenBank database. The reference sequences used are registered under the accession numbers NG_008689.1 (SOD1), NG_008101.2 (VSX1), NG_009117.1 (TIMP3), and NG_008722.1 (LOX).
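The accession numbers above follow the accession.version form, in which the number after the dot identifies the sequence version. As a hedged illustration, and not a description of how the authors retrieved their references, the Python sketch below splits such identifiers and builds a request URL for the NCBI E-utilities `efetch` endpoint. The helper names and the simplified regular expression are assumptions made for this example, and the code only constructs URLs without contacting NCBI.

```python
import re
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities efetch service.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

# Simplified pattern for accession.version identifiers (an assumption
# for this sketch; it does not cover every GenBank accession format).
ACCESSION_RE = re.compile(r"^([A-Z]{1,6}_?\d+)\.(\d+)$")


def split_accession(accession: str) -> tuple[str, int]:
    """Split 'NG_008689.1' into ('NG_008689', 1)."""
    match = ACCESSION_RE.match(accession)
    if match is None:
        raise ValueError(f"not an accession.version string: {accession!r}")
    return match.group(1), int(match.group(2))


def efetch_url(accession: str, rettype: str = "gb") -> str:
    """Build an efetch URL requesting a nucleotide record as text.

    rettype 'gb' asks for the GenBank flat file; 'fasta' for FASTA.
    """
    split_accession(accession)  # validate before building the URL
    query = urlencode(
        {"db": "nuccore", "id": accession, "rettype": rettype, "retmode": "text"}
    )
    return f"{EFETCH_URL}?{query}"


# Reference sequences listed in the excerpt, keyed by gene symbol.
references = {
    "SOD1": "NG_008689.1",
    "VSX1": "NG_008101.2",
    "TIMP3": "NG_009117.1",
    "LOX": "NG_008722.1",
}
urls = {gene: efetch_url(acc) for gene, acc in references.items()}
```

Requesting a specific version rather than the bare accession keeps an analysis tied to the exact reference sequence that was reported.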
Current vaccine approaches and emerging strategies against herpes simplex virus (HSV)
Published in Expert Review of Vaccines, 2021
Vindya Nilakshi Wijesinghe, Isra Ahmad Farouk, Nur Zawanah Zabidi, Ashwini Puniyamurti, Wee Sim Choo, Sunil Kumar Lal
Whereas other methods of vaccine development take into consideration the genomics of a specific virus, the host immune response and components that enhance the desired effects or reduce adverse effects, bioinformatic approaches include all of these factors as part of the in silico design of a candidate. Preliminary bioinformatic resources include the National Center for Biotechnology Information (NCBI) and the European Molecular Biology Laboratory (EMBL), which host databases of extensive biological information. These resources also provide connections to GenBank, Protein Data Bank (PDB), and PubChem, which contain essential information on genes, proteins and chemical structures, respectively. Next, online tools are used to aid comparative and homology research. Examples of such tools include Basic Local Alignment Search Tool (BLAST), ClustalW and Cn3D [137]. Reverse vaccinology, a common model, uses diverse online tools to screen whole viral genomes and narrow down genes that may yield robust epitopes for better-quality vaccines. Epitope prediction, followed by epitope accessibility prediction and 3D structure accessibility analysis, is recommended before a peptide segment can be put forward as a candidate [137].