PconsFam: An Interactive Database of Structure Predictions of Pfam Families
Stockholm University, Faculty of Science, Department of Biochemistry and Biophysics. Stockholm University, Science for Life Laboratory (SciLifeLab).
Stockholm University, Faculty of Science, Department of Biochemistry and Biophysics. Stockholm University, Science for Life Laboratory (SciLifeLab).
Stockholm University, Faculty of Science, Department of Biochemistry and Biophysics. Stockholm University, Science for Life Laboratory (SciLifeLab). ORCID iD: 0000-0003-3534-2986
Number of Authors: 6
2019 (English)
In: Journal of Molecular Biology, ISSN 0022-2836, E-ISSN 1089-8638, Vol. 431, no 13, p. 2442-2448
Article in journal (Refereed) Published
Abstract [en]

At present, about half of the protein domain families lack a structural representative. However, in the last decade, predicting contact maps and using these to model the tertiary structure of these protein families has become an alternative approach to gaining structural insight. Reliable models for several hundred protein families have been created using this approach. To increase its use, we present PconsFam, an intuitive and interactive database of predicted contact maps and tertiary structure models for the entire Pfam database. By modeling all possible families, both with and without a representative structure, using the PconsFold2 pipeline, and running a quality assessment estimator on the models, we estimate how reliable the contact maps and structures are for each family.

Place, publisher, year, edition, pages
2019. Vol. 431, no 13, p. 2442-2448
Keywords [en]
contact maps, structure prediction, folding pipeline, Pfam
National Category
Biological Sciences
Identifiers
URN: urn:nbn:se:su:diva-172037
DOI: 10.1016/j.jmb.2019.01.047
ISI: 000474675300006
PubMedID: 30796988
OAI: oai:DiVA.org:su-172037
DiVA, id: diva2:1346198
Available from: 2019-08-27 Created: 2019-08-27 Last updated: 2022-02-26 Bibliographically approved
In thesis
1. Transmembrane Proteins and Protein Structure Prediction: What we can learn from Computational Methods
2021 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

A protein’s 3D structure is essential to understanding how proteins function and interact and how biochemical processes proceed in organic life. Despite advances in experimental methods, it remains expensive and time-consuming to determine protein structure experimentally. There have been significant advances in machine learning and computational methods, and in many cases models of protein structure can now be determined to a high level of quality. Computational methods help predict protein 3D structure and are often used as a complement to experimental methods to give better insight into and understanding of biological processes.

This thesis presents studies focusing on the simplicity and transparency of the 3D-structure pipeline. This is done with a new interactive database with full access to the pipeline’s data and code together with tools to analyse and compare models and structures. 

I present a new module for the last step in this pipeline, the final folding of the protein chain, which both simplifies the current pipeline and uses new input data based on current research. This module predicts better models than its predecessor and produces models more than an order of magnitude faster than the current state-of-the-art tools. It also contains a novel way of both folding and docking dimers in a single step.

There are many examples of machine learning models that contain biases originating in biased training data, which results in models that do not generalise well. I present a study where experts collaborate to create a high-quality database of Intrinsically Disordered Proteins. Through manual annotation and quality protocols, high-quality training data has been produced that is well suited for machine learning tasks and protein disorder analysis. In this thesis, I also present computational methods pertaining to transmembrane proteins and how they can increase our insight into membrane protein structure. In one study, we use computational methods together with experimental methods to investigate how differently charged residue pairs that form salt bridges inside the membrane of membrane proteins change the insertion potential. We show that amino acid pairs that form salt bridges in this setting contribute 0.5-0.7 kcal/mol to the apparent free energy of membrane insertion. This gives new insight into how insertion is calculated and can lead to better membrane protein topology predictors. In the final study, we investigate the CPA/AT-transporter family of transmembrane proteins and create a new integrated topology annotation method and structural classification, resulting in new insight into how this family evolved through time. The entire pipeline is published as an interactive database with complete transparency for both the method and data used. The study shows how this family has evolved by duplicating internal regions and how this has caused a structural symmetry in the family.

This thesis, therefore, contributes to a more accessible and more transparent path of using computational methods to give a more extensive insight into protein structure prediction and how these structures pertain to biochemical processes.

Place, publisher, year, edition, pages
Stockholm: Department of Biochemistry and Biophysics, Stockholm University, 2021. p. 57
Keywords
protein structure prediction, contact prediction, transmembrane protein, topology prediction, machine learning
National Category
Bioinformatics (Computational Biology)
Research subject
Biochemistry towards Bioinformatics
Identifiers
urn:nbn:se:su:diva-191211 (URN)
978-91-7911-456-5 (ISBN)
978-91-7911-457-2 (ISBN)
Public defence
2021-04-30, Magnélisalen, Kemiska övningslaboratoriet, Svante Arrhenius väg 16 B, and online via Zoom, public link is available at the department website, Stockholm, 10:00 (English)
Available from: 2021-04-07 Created: 2021-03-12 Last updated: 2022-02-25 Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records

Lamb, John; Jarmolinska, Aleksandra; Michel, Mirco; Menéndez-Hurtado, David; Elofsson, Arne
