On Using Samples of Known Protein Content to Assess the Statistical Calibration of Scores Assigned to Peptide-Spectrum Matches in Shotgun Proteomics
Stockholm University, Faculty of Science, Department of Biochemistry and Biophysics.
Stockholm University, Faculty of Science, Department of Biochemistry and Biophysics.
2011 (English) In: Journal of Proteome Research, ISSN 1535-3893, E-ISSN 1535-3907, Vol. 10, no 5, 2671-2678 p.
Article in journal (Refereed) Published
Abstract [en]

In shotgun proteomics, the quality of a hypothesized match between an observed spectrum and a peptide sequence is quantified by a score function. Because the score function lies at the heart of any peptide identification pipeline, this function greatly affects the final results of a proteomics assay. Consequently, valid statistical methods for assessing the quality of a given score function are extremely important. Previously, several research groups have used samples of known protein composition to assess the quality of a given score function. We demonstrate that this approach is problematic, because the outcome can depend on factors other than the score function itself. We then propose an alternative use of the same type of data to validate a score function. The central idea of our approach is that database matches that are not explained by any protein in the purified sample comprise a robust representation of incorrect matches. We apply our alternative assessment scheme to several commonly used score functions, and we show that our approach generates a reproducible measure of the calibration of a given peptide identification method. Furthermore, we show how our quality test can be useful in the development of novel score functions.
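The central idea of the abstract can be illustrated with a minimal sketch. It assumes that each match carries a p value and that the peptides explained by the purified sample are known in advance. Matches to any other peptide are treated as incorrect, and for a well-calibrated score their p values should be roughly uniform on [0, 1]. The function names, the input layout, and the Kolmogorov-Smirnov-style distance are illustrative assumptions; the abstract does not say which statistic the authors use.

```python
from typing import Iterable, List, Set, Tuple


def incorrect_match_pvalues(
    psms: Iterable[Tuple[str, float]], sample_peptides: Set[str]
) -> List[float]:
    """Keep the p values of matches whose peptide is NOT explained by the sample.

    `psms` is a hypothetical list of (peptide, p_value) pairs.
    """
    return [p for peptide, p in psms if peptide not in sample_peptides]


def calibration_table(
    pvalues: List[float], thresholds: Iterable[float]
) -> List[Tuple[float, float]]:
    """For each threshold t, report the observed fraction of p values <= t.

    Under good calibration the observed fraction should be close to t.
    """
    n = len(pvalues)
    if n == 0:
        raise ValueError("no incorrect matches to assess")
    return [(t, sum(p <= t for p in pvalues) / n) for t in thresholds]


def max_deviation(pvalues: List[float]) -> float:
    """Largest gap between the empirical CDF and the Uniform(0, 1) CDF."""
    xs = sorted(pvalues)
    n = len(xs)
    if n == 0:
        raise ValueError("no incorrect matches to assess")
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))


# Toy data (made up for illustration, not taken from the paper).
psms = [
    ("SAMPLEPEPTIDE", 0.001),  # explained by the purified sample, so excluded
    ("OTHERPEPTIDEA", 0.25),
    ("OTHERPEPTIDEB", 0.5),
    ("OTHERPEPTIDEC", 0.75),
    ("OTHERPEPTIDED", 1.0),
]
null_pvalues = incorrect_match_pvalues(psms, {"SAMPLEPEPTIDE"})
print(calibration_table(null_pvalues, [0.25, 0.5, 0.75]))
print(max_deviation(null_pvalues))
```

A small maximum deviation suggests that the reported p values behave as expected on incorrect matches; a large one suggests that the score is miscalibrated.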

Place, publisher, year, edition, pages
2011. Vol. 10, no 5, 2671-2678 p.
Keyword [en]
shotgun proteomics, peptide identification, calibration, p value, database search software, standard protein mix
National Category
Bioinformatics (Computational Biology)
Research subject
Biochemistry towards Bioinformatics
Identifiers
URN: urn:nbn:se:su:diva-68512
DOI: 10.1021/pr1012619
ISI: 000290234800047
OAI: oai:DiVA.org:su-68512
DiVA: diva2:473693
Note

authorCount: 3

Available from: 2012-01-07 Created: 2012-01-04 Last updated: 2017-12-08 Bibliographically approved
In thesis
1. The accuracy of statistical confidence estimates in shotgun proteomics
2014 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

High-throughput techniques are currently some of the most promising methods to study molecular biology, with the potential to improve medicine and enable new biological applications. In proteomics, the large-scale study of proteins, the leading method is mass spectrometry. At present, researchers can routinely identify and quantify thousands of proteins in a single experiment with the technique called shotgun proteomics.

A challenge of these experiments is the computational analysis and the interpretation of the mass spectra. A shotgun proteomics experiment easily generates tens of thousands of spectra, each thought to represent a peptide from a protein. Due to the immense biological and technical complexity, however, our computational tools often misinterpret these spectra and derive incorrect peptides. As a consequence, the biological interpretation of the experiment relies heavily on the statistical confidence that we estimate for the identifications.

In this thesis, I have included four articles from my research on the accuracy of statistical confidence estimates in shotgun proteomics, and on how to achieve and evaluate that accuracy. The first two papers present a new method that uses pre-characterized protein samples to evaluate this accuracy. The third paper deals with how to avoid statistical inaccuracies when using machine learning techniques to analyze the data. In the fourth paper, we present a new tool for analyzing shotgun proteomics results, and evaluate the accuracy of its statistical estimates using the method from the first two papers.

The work I have included here can facilitate the development of new and accurate computational tools in mass spectrometry-based proteomics. Such tools will help make the interpretation of the spectra and the downstream biological conclusions more reliable.

Place, publisher, year, edition, pages
Stockholm: Department of Biochemistry and Biophysics, Stockholm University, 2014. 40 p.
Keyword
Proteomics, Peptides, Statistics, Mass spectrometry, Tandem mass spectrometry
National Category
Bioinformatics (Computational Biology)
Research subject
Biochemistry towards Bioinformatics
Identifiers
urn:nbn:se:su:diva-100769 (URN)
978-91-7447-787-0 (ISBN)
Public defence
2014-04-04, Magnélisalen, Kemiska övningslaboratoriet, Svante Arrhenius väg 16 B, Stockholm, 09:30 (English)
Opponent
Supervisors
Available from: 2014-03-13 Created: 2014-02-12 Last updated: 2014-02-14 Bibliographically approved

By author/editor
Granholm, Viktor