On Using Samples of Known Protein Content to Assess the Statistical Calibration of Scores Assigned to Peptide-Spectrum Matches in Shotgun Proteomics
2011 (English) In: Journal of Proteome Research, ISSN 1535-3893, E-ISSN 1535-3907, Vol. 10, no 5, 2671-2678 p. Article in journal (Refereed) Published
In shotgun proteomics, the quality of a hypothesized match between an observed spectrum and a peptide sequence is quantified by a score function. Because the score function lies at the heart of any peptide identification pipeline, this function greatly affects the final results of a proteomics assay. Consequently, valid statistical methods for assessing the quality of a given score function are extremely important. Previously, several research groups have used samples of known protein composition to assess the quality of a given score function. We demonstrate that this approach is problematic, because the outcome can depend on factors other than the score function itself. We then propose an alternative use of the same type of data to validate a score function. The central idea of our approach is that database matches that are not explained by any protein in the purified sample comprise a robust representation of incorrect matches. We apply our alternative assessment scheme to several commonly used score functions, and we show that our approach generates a reproducible measure of the calibration of a given peptide identification method. Furthermore, we show how our quality test can be useful in the development of novel score functions.
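The central idea of the abstract can be illustrated with a minimal sketch. It assumes each peptide-spectrum match is reduced to a hypothetical `(protein_id, p_value)` pair and that the known sample content is given as a set of protein identifiers. Matches to proteins outside that set are treated as incorrect matches, and for a well-calibrated score their p values should look roughly uniform on [0, 1]. The Kolmogorov-Smirnov distance used here is an illustrative choice, not necessarily the measure used in the article, and all function names are assumptions.

```python
from typing import Iterable, List, Set, Tuple


def null_p_values(psms: Iterable[Tuple[str, float]],
                  sample_proteins: Set[str]) -> List[float]:
    """Keep p values of matches NOT explained by any protein in the sample.

    These matches serve as a stand-in for incorrect matches.
    (Hypothetical helper; the input layout is an assumption.)
    """
    return [p for protein, p in psms if protein not in sample_proteins]


def ks_distance_from_uniform(p_values: List[float]) -> float:
    """Largest gap between the empirical CDF of p_values and Uniform(0, 1).

    A small distance suggests calibrated p values; a large one suggests
    the score's p values are too optimistic or too conservative.
    """
    if not p_values:
        raise ValueError("need at least one p value")
    xs = sorted(p_values)
    n = len(xs)
    distance = 0.0
    for i, x in enumerate(xs):
        # Compare the uniform CDF (which is x itself) with the
        # empirical CDF just after and just before the i-th point.
        distance = max(distance, (i + 1) / n - x, x - i / n)
    return distance


# Toy data: two matches to a sample protein, four to other database entries.
psms = [
    ("SAMPLE_A", 0.001), ("SAMPLE_A", 0.002),
    ("OTHER_1", 0.125), ("OTHER_2", 0.375),
    ("OTHER_3", 0.625), ("OTHER_4", 0.875),
]
nulls = null_p_values(psms, {"SAMPLE_A"})
distance = ks_distance_from_uniform(nulls)
```

In practice the distance would be computed over many thousands of matches; with only a handful of points, as in the toy data, it says little about calibration.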
Keywords: shotgun proteomics, peptide identification, calibration, p value, database search software, standard protein mix
Bioinformatics (Computational Biology)
Research subject Biochemistry towards Bioinformatics
Identifiers
URN: urn:nbn:se:su:diva-68512
DOI: 10.1021/pr1012619
ISI: 000290234800047
OAI: oai:DiVA.org:su-68512
DiVA: diva2:473693
Author count: 3
2012-01-07, 2012-01-04, 2014-02-14
Bibliographically approved