Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Estimation of model accuracy in CASP13
Stockholm University, Faculty of Science, Department of Biochemistry and Biophysics. Stockholm University, Science for Life Laboratory (SciLifeLab).
Show others and affiliations
2019 (English)In: Proteins: Structure, Function, and Bioinformatics, ISSN 0887-3585, E-ISSN 1097-0134Article in journal (Refereed) Epub ahead of print
Abstract [en]

Methods to reliably estimate the accuracy of 3D models of proteins are both a fundamental part of most protein folding pipelines and important for reliable identification of the best models when multiple pipelines are used. Here, we describe the progress made from CASP12 to CASP13 in the field of estimation of model accuracy (EMA) as seen from the progress of the most successful methods in CASP13. We show small but clear progress, that is, several methods perform better than the best methods from CASP12 when tested on CASP13 EMA targets. Some progress is driven by applying deep learning and residue‐residue contacts to model accuracy prediction. We show that the best EMA methods select better models than the best servers in CASP13, but that there exists a great potential to improve this further. Also, according to the evaluation criteria based on local similarities, such as lDDT and CAD, it is now clear that single model accuracy methods perform relatively better than consensus‐based methods.

Place, publisher, year, edition, pages
2019.
National Category
Bioinformatics and Systems Biology
Research subject
Biochemistry towards Bioinformatics
Identifiers
URN: urn:nbn:se:su:diva-172394DOI: 10.1002/prot.25767ISI: 000476102200001OAI: oai:DiVA.org:su-172394DiVA, id: diva2:1346725
Available from: 2019-08-28 Created: 2019-08-28 Last updated: 2019-09-10
In thesis
1. Structured Learning for Structural Bioinformatics: Applications of Deep Learning to Protein Structure Prediction
Open this publication in new window or tab >>Structured Learning for Structural Bioinformatics: Applications of Deep Learning to Protein Structure Prediction
2019 (English)Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Proteins are the basic molecular machines of the cell, performing a broad range of tasks, from structural support to catalysisof chemical reactions. Their function is determined by their 3D structure, which in turn is dictated by the order of their components, the amino acids.

This thesis is dedicated to applications of machine learning to the problems of contact prediction, ab-initio, and model quality assessment. In particular, my research has been focused on developing methods that are both effective, and easy to use.

In the first paper, we improved the already state-of-the-art model quality assessment (MQA) program ProQ3 replacing the underlying machine learning algorithm from svm to Deep Learning, baptised ProQ3D. The correlation between predicted and true scores was improved from 0.85 to 0.90, using the same training data and features.

The second paper joined several programs into a single pipeline for ab-initio structure prediction: contact prediction,folding, and model selection. We attempted to predict the structures of all 6379 PFAM families with unknown structure, ofwhich 558 we believe to be accurate. Of these, 415 had not been reported before.

The third paper uses advances in machine learning to build a contact predictor, PconsC4, that is fast and easy to deployin large-scale studies, since it requires a single Multiple Sequence Alignment (MSA), and no external dependencies. The predictions are state-of-the-art, yielding a 12% improvement in precision over PconsC3, and 244 times faster.

With ProQ4, in the fourth paper, we introduce a novel way of training deep networks for MQA in a way that minimises the bias of the training data, and emphasises model ranking, and demonstrate its viability with a minimal description ofthe protein. The ranking correlation was improved with respect to ProQ3D from 0.82 to 0.90.

Lastly, in the fifth paper, weshow the results of ProQ3D and ProQ4 in a completely blind test: CASP13.

Place, publisher, year, edition, pages
Stockholm: Department of Biochemistry and Biophysics, Stockholm University, 2019. p. 63
National Category
Bioinformatics and Systems Biology
Research subject
Biochemistry towards Bioinformatics
Identifiers
urn:nbn:se:su:diva-172395 (URN)978-91-7797-797-1 (ISBN)978-91-7797-798-8 (ISBN)
Public defence
2019-10-11, Magnélisalen, Kemiska övningslaboratoriet, Svante Arrhenius väg 16 B, Stockholm, 13:00 (English)
Opponent
Supervisors
Note

At the time of the doctoral defense, the following paper was unpublished and had a status as follows: Paper 4: Manuscript.

Available from: 2019-09-18 Created: 2019-08-28 Last updated: 2019-09-12Bibliographically approved

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text

Search in DiVA

By author/editor
Elofsson, ArneMenéndez-Hurtado, DavidUziela, Karolis
By organisation
Department of Biochemistry and BiophysicsScience for Life Laboratory (SciLifeLab)
In the same journal
Proteins: Structure, Function, and Bioinformatics
Bioinformatics and Systems Biology

Search outside of DiVA

GoogleGoogle Scholar

doi
urn-nbn

Altmetric score

doi
urn-nbn
Total: 14 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf