Improved protein model quality assessments by changing the target function
Stockholm University, Faculty of Science, Department of Biochemistry and Biophysics. Stockholm University, Science for Life Laboratory (SciLifeLab). ORCID iD: 0000-0003-2232-3006
Stockholm University, Faculty of Science, Department of Biochemistry and Biophysics. Stockholm University, Science for Life Laboratory (SciLifeLab). ORCID iD: 0000-0003-3534-2986
Stockholm University, Faculty of Science, Department of Biochemistry and Biophysics. Stockholm University, Science for Life Laboratory (SciLifeLab).
Number of authors: 5
2018 (English). In: Proteins: Structure, Function, and Bioinformatics, ISSN 0887-3585, E-ISSN 1097-0134, Vol. 86, no. 6, p. 654-663. Article in journal (Refereed). Published
Abstract [en]

Protein modeling quality is an important part of protein structure prediction. For more than a decade, we have developed a set of methods for this problem. We have used various types of description of the protein and different machine learning methodologies. However, common to all these methods has been the target function used for training. The target function in ProQ describes the local quality of a residue in a protein model. In all versions of ProQ the target function has been the S-score. However, other quality estimation functions also exist, which can be divided into superposition- and contact-based methods. The superposition-based methods, such as S-score, are based on a rigid body superposition of a protein model and the native structure, while the contact-based methods compare the local environment of each residue. Here, we examine the effects of retraining our latest predictor, ProQ3D, using identical inputs but different target functions. We find that the contact-based methods are easier to predict and that predictors trained on these measures provide some advantages when it comes to identifying the best model. One possible reason for this is that contact-based methods are better at estimating the quality of multi-domain targets. However, training on the S-score gives the best correlation with the GDT_TS score, which is commonly used in CASP to score the global model quality. To take advantage of both of these features, we provide an updated version of ProQ3D that predicts local and global model quality estimates based on different quality estimates.
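The difference between the two families of target functions can be illustrated with a minimal Python sketch. It is not the ProQ3D code. The S-score is written in its commonly used form, S_i = 1 / (1 + (d_i / d0)^2), where d_i is the distance between residue i in the model and in the native structure after superposition; the value d0 = 3.0 Å is an assumption here and is not stated in this record. The function contact_score is a hypothetical, deliberately simplified contact-based measure (fraction of native contacts preserved within an assumed 8.0 Å cutoff) and does not correspond to any specific published score.

```python
import math

def s_score(distances, d0=3.0):
    """Superposition-based local score.

    distances: per-residue distances (in angstrom) between model and native
    positions, measured after a rigid-body superposition.
    d0: distance scale; 3.0 is an assumed value, not taken from the record.
    """
    return [1.0 / (1.0 + (d / d0) ** 2) for d in distances]

def contact_score(model_coords, native_coords, cutoff=8.0):
    """Hypothetical simplified contact-based local score.

    For each residue, returns the fraction of its native contacts
    (other residues closer than `cutoff`) that are also contacts in the model.
    No superposition is needed: only intra-structure distances are compared.
    """
    n = len(native_coords)
    scores = []
    for i in range(n):
        native_contacts = [
            j for j in range(n)
            if j != i and math.dist(native_coords[i], native_coords[j]) < cutoff
        ]
        if not native_contacts:
            # No native contacts to preserve; scored 0.0 by arbitrary convention.
            scores.append(0.0)
            continue
        kept = sum(
            1 for j in native_contacts
            if math.dist(model_coords[i], model_coords[j]) < cutoff
        )
        scores.append(kept / len(native_contacts))
    return scores

# Toy example: three residues on a line; the model displaces the last one.
native = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0)]
model = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
local_contact = contact_score(model, native)
local_s = s_score([0.0, 0.0, 12.0])  # distances assume residues 1-2 were superposed
```

In this sketch the contact-based score depends only on each residue's local environment, whereas the S-score depends on which part of the structure the superposition happens to align, which is one way to read the abstract's remark about multi-domain targets.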

Place, publisher, year, edition, pages
2018. Vol. 86, no. 6, p. 654-663
Keywords [en]
CASP, deep learning, estimation of model accuracy, model quality assessments, protein structure prediction
National category
Biological Sciences
Research subject
Biochemistry with a focus on bioinformatics
Identifiers
URN: urn:nbn:se:su:diva-156779
DOI: 10.1002/prot.25492
ISI: 000431734800006
PubMedID: 29524250
OAI: oai:DiVA.org:su-156779
DiVA, id: diva2:1213127
Available from: 2018-06-04. Created: 2018-06-04. Last updated: 2022-02-26. Bibliographically approved
In thesis
1. Protein Model Quality Assessment: A Machine Learning Approach
2017 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Many protein structure prediction programs exist and they can efficiently generate a number of protein models of varying quality. One of the problems is that it is difficult to know which model is the best one for a given target sequence. Selecting the best model is one of the major tasks of Model Quality Assessment Programs (MQAPs). These programs are able to predict model accuracy before the native structure is determined. The accuracy estimation can be divided into two parts: global (the whole model accuracy) and local (the accuracy of each residue). ProQ2 is one of the most successful MQAPs for prediction of both local and global model accuracy and is based on a Machine Learning approach.
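The split between local and global accuracy, and its use for model selection, can be sketched as follows. This is an illustration under stated assumptions, not the ProQ implementation: the global score is assumed to be the sum of predicted per-residue scores divided by the target length (so residues missing from a model contribute zero), and the names global_score and select_best_model are hypothetical.

```python
def global_score(local_scores, target_length=None):
    """Aggregate per-residue (local) scores into one whole-model (global) score.

    Assumption: sum of local scores divided by the target length. If
    target_length is omitted, the number of scored residues is used instead.
    """
    n = target_length if target_length is not None else len(local_scores)
    if n <= 0:
        raise ValueError("target length must be positive")
    return sum(local_scores) / n

def select_best_model(local_scores_by_model, target_length=None):
    """Return the name of the model with the highest global score."""
    return max(
        local_scores_by_model,
        key=lambda name: global_score(local_scores_by_model[name], target_length),
    )

# Toy example: model "a" covers only 2 of 3 target residues but scores them well.
predicted = {
    "a": [0.9, 0.8],
    "b": [0.5, 0.4, 0.3],
}
best = select_best_model(predicted, target_length=3)
```

Dividing by the target length rather than by the number of modelled residues penalises incomplete models; other aggregation rules are possible and would change the ranking in some cases.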

In this thesis, I present my own contribution to Model Quality Assessment (MQA) and the newest developments of the ProQ program series. Firstly, I describe a new ProQ2 implementation in the protein modelling software package Rosetta. This new implementation allows use of ProQ2 as a scoring function for conformational sampling inside Rosetta, which was not possible before. Moreover, I present two new methods, ProQ3 and ProQ3D, that both outperform their predecessor. ProQ3 introduces new training features that are calculated from Rosetta energy functions, and ProQ3D introduces a new machine learning approach based on deep learning. The ProQ3 program participated in the 12th Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP12) and was one of the best methods in the MQA category. Finally, an important issue in model quality assessment is how to select a target function that the predictor is trying to learn. In the fourth manuscript, I show that MQA results can be improved by selecting a contact-based target function instead of more conventional superposition-based functions.

Place, publisher, year, edition, pages
Stockholm: Department of Biochemistry and Biophysics, Stockholm University, 2017. p. 46
Keywords
Protein Model Quality Assessment, structural bioinformatics, machine learning, deep learning, support vector machine, proq, Artificial Neural Network, protein structure prediction
National category
Bioinformatics and Computational Biology
Research subject
Biochemistry with a focus on bioinformatics
Identifiers
urn:nbn:se:su:diva-137695 (URN)
978-91-7649-633-6 (ISBN)
978-91-7649-634-3 (ISBN)
Public defence
2017-02-10, Magnélisalen, Kemiska övningslaboratoriet, Svante Arrhenius väg 16 B, Stockholm, 14:00 (English)
Opponent
Supervisors
Funder
Swedish Research Council (Vetenskapsrådet), VR-NT 2012-5046
Note

At the time of the doctoral defense, the following paper was unpublished and had a status as follows: Paper 3: Manuscript.

Available from: 2017-01-18. Created: 2017-01-10. Last updated: 2025-02-07. Bibliographically approved

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text
PubMed

Authors

Uziela, Karolis; Menéndez Hurtado, David; Shu, Nanjiang; Wallner, Björn; Elofsson, Arne
