Metabolomics and Proteomics
Published in Crystal D. Karakochuk, Kyly C. Whitfield, Tim J. Green, Klaus Kraemer, The Biology of the First 1,000 Days, 2017
Richard D. Semba, Marta Gonzalez-Freire
Bioinformatics has played a vital role in accelerating proteomics and metabolomics. Raw MS data from proteomic analyses can be analyzed using open-source search engines such as X!Tandem and OMSSA, or proprietary search engines such as Mascot and SEQUEST. The software first assigns peptide sequences to the spectra and then infers protein identifications from the identified peptides. Authoritative and comprehensive protein databases include neXtProt for human proteins [5]. Annotated databases such as Gene Ontology (GO) [17] and pathway resources such as the Kyoto Encyclopedia of Genes and Genomes (KEGG) [18] and the Database for Annotation, Visualization and Integrated Discovery (DAVID) [19] are particularly useful for identifying biological pathways in data from proteomic and metabolomic investigations. Online resources and databases of metabolites include Metabolomics Workbench, METLIN, and BiGG [20].
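The second step described above, going from identified peptides to protein identifications, can be illustrated with a minimal sketch. It assumes the peptide sequences have already been assigned by a search engine, uses a toy in-memory protein database with made-up accessions and sequences, and applies one simple rule (a protein is reported only if at least one peptide maps to it alone). Real search engines use more elaborate scoring and grouping; the function name `infer_proteins` is hypothetical.

```python
from collections import defaultdict


def infer_proteins(peptides, protein_db):
    """Return {accession: [unique peptides]} for proteins with unique peptide support.

    peptides   -- peptide sequences already identified from the spectra
    protein_db -- mapping of accession -> full protein sequence
    """
    support = defaultdict(list)
    for pep in peptides:
        # Find every protein whose sequence contains this peptide.
        matches = [acc for acc, seq in protein_db.items() if pep in seq]
        # Only peptides that map to exactly one protein count as evidence here;
        # shared peptides are ignored in this simplified rule.
        if len(matches) == 1:
            support[matches[0]].append(pep)
    return dict(support)


# Toy database and peptide list (illustrative values, not real proteins).
protein_db = {
    "PROT_A": "MKTAYIAKQRLSDEGHK",
    "PROT_B": "MKTAYIAKGGWPLNR",
}
peptides = ["TAYIAK", "LSDEGHK", "GGWPLNR"]

result = infer_proteins(peptides, protein_db)
print(result)
```

Here `TAYIAK` occurs in both toy sequences and so does not discriminate between them, while the other two peptides each support a single protein.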
Combining human platelet proteomes and transcriptomes: possibilities and challenges
Published in Platelets, 2023
Jingnan Huang, Johan W.M. Heemskerk, Frauke Swieringa
According to neXtProt, the curated knowledgebase of human proteins and their functions in disease, the annotated human genome and transcriptome predict the presence of over 20 380 protein-coding genes, although many of the predicted proteins are not, or only scarcely, detected in biological samples [9]. In the Human Proteome Project, which aims to generate a strict blueprint of the human proteome, about 18 000 gene-linked protein entries have been confirmed at the highest confidence level (protein existence level 1, PE1) [10], whereas additional proteins at PE2–4 levels are only predicted from transcripts or from their presence in gene models [11]. This level of protein evidence is reflected as an annotation score in the UniProtKB database, which is used to look up proteins from peptides identified in mass spectrometry analyses. UniProtKB contains two sections, referred to as “UniProtKB/Swiss-Prot” (reviewed, manually annotated) and “UniProtKB/TrEMBL” (unreviewed, automatically annotated) [12]. The Swiss-Prot section includes common (human) genetic variants and global information on the intracellular location and function of an annotated protein per gene.
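The distinction between PE1 entries and PE2–4 entries, and between reviewed and unreviewed sections, can be sketched as a small tally. The record type `Entry`, the function `summarize`, and the example accessions are all hypothetical; the sketch only assumes that each entry carries a protein existence level and a reviewed flag.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    accession: str   # made-up identifier for illustration
    pe_level: int    # protein existence level, 1 = confirmed at protein level
    reviewed: bool   # True = Swiss-Prot (reviewed), False = TrEMBL (unreviewed)


def summarize(entries):
    """Count confirmed (PE1) versus predicted-only (PE2-4) entries per section."""
    by_pe = Counter(e.pe_level for e in entries)
    by_section = Counter("Swiss-Prot" if e.reviewed else "TrEMBL" for e in entries)
    return {
        "PE1": by_pe[1],
        "PE2-4": sum(by_pe[level] for level in (2, 3, 4)),
        "Swiss-Prot": by_section["Swiss-Prot"],
        "TrEMBL": by_section["TrEMBL"],
    }


# Illustrative records only; these are not real database entries.
entries = [
    Entry("X0001", 1, True),
    Entry("X0002", 1, True),
    Entry("X0003", 2, True),
    Entry("X0004", 4, False),
]

summary = summarize(entries)
print(summary)
```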
Current status of clinical proteogenomics in lung cancer
Published in Expert Review of Proteomics, 2019
Toshihide Nishimura, Haruhiko Nakamura, Ákos Végvári, György Marko-Varga, Naoki Furuya, Hisashi Saji
Identifying proteins expressed as both canonical and non-canonical proteoforms remains challenging because proteins are functionally dynamic and occur as various proteoforms arising from post-translational modifications (PTMs), truncations, variants, mutations, and rearrangements. The UniProt Knowledgebase (https://www.uniprot.org/) contains 559,229 entries (Swiss-Prot) that are manually annotated and reviewed, and 146,106,279 entries (TrEMBL) that are automatically annotated and not reviewed [47]. The neXtProt database (https://www.nextprot.org/; data release: 2019-01-11) has 20,399 protein entries, 42,410 splice isoforms, 190,921 PTMs, and 6,019,871 variants, including disease-related mutations [48].
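To put these counts in proportion, the short calculation below derives a few ratios from the figures quoted in the excerpt. The variable names are illustrative, and the ratios are simple averages over all entries, so they say nothing about how isoforms or variants are distributed across individual proteins.

```python
# Counts quoted in the excerpt above.
swissprot_entries = 559_229        # reviewed, manually annotated
trembl_entries = 146_106_279       # unreviewed, automatically annotated
nextprot_entries = 20_399
nextprot_isoforms = 42_410
nextprot_variants = 6_019_871

# Share of UniProtKB entries that are manually reviewed.
reviewed_fraction = swissprot_entries / (swissprot_entries + trembl_entries)

# Average number of splice isoforms and variants per neXtProt entry.
isoforms_per_entry = nextprot_isoforms / nextprot_entries
variants_per_entry = nextprot_variants / nextprot_entries

print(f"Reviewed share of UniProtKB: {reviewed_fraction:.2%}")
print(f"Isoforms per neXtProt entry: {isoforms_per_entry:.2f}")
print(f"Variants per neXtProt entry: {variants_per_entry:.0f}")
```

On these figures, well under 1% of UniProtKB entries are manually reviewed, and each neXtProt entry has on average about two splice isoforms.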
Proteogenomics in the context of the Human Proteome Project (HPP)
Published in Expert Review of Proteomics, 2019
José González-Gomariz, Elizabeth Guruceaga, Macarena López-Sánchez, Victor Segura
The international research groups that are part of the HPP have invested considerable resources in the chromosome-centric initiative, with the aim of characterizing and validating all the proteins of the human proteome. However, some of these proteins remain without experimental evidence despite the new experimental data generated, the re-analysis of huge amounts of public datasets, and the implementation of more robust methodologies, including proteogenomic pipelines. This set of proteins, known as the missing proteins (MPs), is defined by the neXtProt database, which provides biological and experimental information about human proteins, including their experimental evidence. The evolution of the database since 2011 shows the advances of the HPP in characterizing the human proteome (Figure 1). The number of missing proteins has clearly decreased in recent years, but new biological, experimental, and analytical approaches are needed to reach the final goal of having experimental evidence for the whole human proteome. Initiatives such as the ‘50 Missing Proteins Challenge’ and the annual HPP special issue of the Journal of Proteome Research (JPR) have been created to accelerate the experimental validation of these proteins. In the next 2 or 3 years, we will be able to evaluate the results of these initiatives by examining the rate at which MPs are converted into PE1 proteins, and to estimate the time remaining until the human proteome is completely characterized. Meanwhile, we will increase our knowledge about these proteins and the reasons why we are not able to detect them in a biological matrix. Targeted experiments or other innovative experimental designs could then be developed to detect the most challenging MPs.
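One simple way to turn a conversion rate into a time estimate is a linear extrapolation of the missing-protein count per release year, sketched below. This is an assumption made for illustration, not the authors' method: the function name `estimate_completion_year` is hypothetical, the counts are placeholders rather than neXtProt figures, and a linear trend is likely optimistic because the remaining MPs are the hardest to detect.

```python
def estimate_completion_year(years, missing_counts):
    """Fit a least-squares line to (year, missing count) and return the year
    at which the fitted line reaches zero, or None if counts are not falling."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(missing_counts) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, missing_counts))
    slope = sxy / sxx            # change in missing proteins per year
    if slope >= 0:
        return None              # no downward trend to extrapolate
    intercept = mean_y - slope * mean_x
    return -intercept / slope    # year where the fitted line crosses zero


# Placeholder data for illustration only (not real neXtProt counts).
example_years = [2015, 2016, 2017]
example_missing = [300, 200, 100]

estimate = estimate_completion_year(example_years, example_missing)
print(estimate)
```

With real release data, the per-year counts would replace the placeholder lists, and the estimate should be read as a rough lower bound on the time required.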