Generalization of Malaria Incidence Prediction Models by Correcting Sample Selection Bias
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences; Eduardo Mondlane University, Mozambique.
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
2013 (English). In: Advanced Data Mining and Applications: Proceedings, Part II / [ed] Hiroshi Motoda et al., Springer Berlin/Heidelberg, 2013, 189-200 p. Conference paper, Published paper (Refereed)
Abstract [en]

Performance measurements obtained by dividing a single sample into training and test sets, e.g. by cross-validation, may not accurately reflect how a model developed from the sample will perform on the examples to which it is eventually applied. Such measurements may be misleading when the training sample and the examples encountered in use are drawn from different distributions. In this study, two support vector machine models for predicting malaria incidence, developed from certain regions and time periods in Mozambique, are evaluated on data from novel regions and time periods, and the use of selection bias correction is investigated. Significant reductions in the prediction error are observed when the correction is applied, strongly suggesting that techniques of this kind should be employed whenever test data can be expected to be drawn from a distribution other than that of the training data.
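
The abstract does not say which correction technique was used. As an illustration only, the sketch below uses importance weighting, one common generic way to correct sample selection bias: each training example is weighted by an estimate of p_test(x) / p_train(x), here obtained from simple histograms over a single covariate. The covariate values, the per-example errors, and all function names are hypothetical and are not taken from the paper.

```python
from collections import Counter


def bin_index(x, lo, hi, n_bins):
    """Map x to a histogram bin in [0, n_bins - 1], clamping edge values."""
    if hi <= lo:
        return 0
    i = int((x - lo) / (hi - lo) * n_bins)
    return min(max(i, 0), n_bins - 1)


def importance_weights(train_x, test_x, n_bins=5):
    """Estimate w(x) = p_test(x) / p_train(x) for every training example."""
    lo = min(min(train_x), min(test_x))
    hi = max(max(train_x), max(test_x))
    train_bins = [bin_index(x, lo, hi, n_bins) for x in train_x]
    train_counts = Counter(train_bins)
    test_counts = Counter(bin_index(x, lo, hi, n_bins) for x in test_x)
    weights = []
    for b in train_bins:
        p_train = train_counts[b] / len(train_x)  # always > 0 for a training bin
        p_test = test_counts[b] / len(test_x)     # 0 if no test example falls here
        weights.append(p_test / p_train)
    return weights


def weighted_mean(values, weights):
    """Weighted average of values, normalised by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical data: the training sample is dominated by low covariate values,
# while the region where the model is applied is dominated by high values.
train_x = [1, 1, 1, 1, 1, 1, 9, 9]
test_x = [1, 9, 9, 9]
# Hypothetical absolute errors of some model on held-out training examples:
# small for low covariate values, large for high ones.
train_errors = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 5.0, 5.0]

weights = importance_weights(train_x, test_x, n_bins=2)
naive = sum(train_errors) / len(train_errors)     # about 2.0
corrected = weighted_mean(train_errors, weights)  # about 4.0
print(naive, corrected)
```

In this toy setting the unweighted estimate understates the error the model would incur on the shifted sample, while the weighted estimate matches it. The same weights could also be passed to a learner that accepts per-example weights during training; whether the paper does this is not stated in the abstract.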

Place, publisher, year, edition, pages
Springer Berlin/Heidelberg, 2013. 189-200 p.
Series
Lecture Notes in Computer Science, ISSN 0302-9743 ; 8347
Keyword [en]
prediction, generalization, sample selection bias, malaria incidence
National Category
Information Systems
Research subject
Computer and Systems Sciences
Identifiers
URN: urn:nbn:se:su:diva-97729
DOI: 10.1007/978-3-642-53917-6_17
ISBN: 978-3-642-53916-9 (print)
ISBN: 978-3-642-53917-6 (print)
OAI: oai:DiVA.org:su-97729
DiVA: diva2:679959
Conference
9th International Conference, ADMA 2013, Hangzhou, China, December 14-16, 2013
Available from: 2013-12-17 Created: 2013-12-17 Last updated: 2015-11-09 Bibliographically approved
In thesis
1. Mining Mozambique Health Data: The Case of Malaria: From Bayesian Incidence Risk to Incidence Case Predictions
2015 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

The health sector in Mozambique holds large volumes of data, including records of major public health diseases such as malaria and cholera. Scrutinizing such a mass of health data for useful information is challenging but essential for health authorities and professionals. Statistical learning and inferential approaches, here Bayesian predictive techniques and data mining, can provide health decision makers with appropriate tools for disease diagnosis and assessment. The purpose of this thesis is to investigate how predictive data mining and Bayesian regression methods can be used effectively to extract useful knowledge from reported malaria health data in support of decision making and management.

In summary, effective Bayesian predictive methods based on spatial and space-time reported cases of malaria have been derived, allowing the extraction of the main risk factors for malaria. Predictive models that combine consecutive temporal connections within the analysis of the space-time variations of the disease have been found to be relevant when the explicit modeling of seasonality is not required or is even unfeasible.

Several regression methods were investigated to find the most effective way to derive numerical predictive models. The conclusion is that new cases of the disease can be predicted effectively by training support vector machines with a time-window approach, in which training sets span different numbers of years and the window is shortened towards the test set. The best performance is obtained with a smaller time window. Another contribution of this thesis is determining the importance of predictors of malaria incidence by applying the permutation accuracy strategy (from the random forests method) to the test set. A further contribution is a significant reduction in predictive error, obtained by employing a sample selection bias correction strategy when testing the predictive models in regions other than those where they were initially developed.
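
The permutation strategy mentioned above can be sketched roughly as follows: shuffle one predictor column in the test set, re-measure the error, and treat the average increase over the unshuffled baseline as that predictor's importance. This is a minimal sketch under stated assumptions, not the thesis implementation: the model, the predictors, the data, and the use of mean absolute error are all hypothetical.

```python
import random


def mean_abs_error(model, rows, targets):
    """Mean absolute error of model(row) against the targets."""
    return sum(abs(model(r) - t) for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, n_features, n_repeats=20, seed=0):
    """Average increase in test error when each feature column is shuffled."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    baseline = mean_abs_error(model, rows, targets)
    importances = []
    for j in range(n_features):
        increase = 0.0
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild each row with only feature j replaced by a shuffled value.
            permuted = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            increase += mean_abs_error(model, permuted, targets) - baseline
        importances.append(increase / n_repeats)
    return importances


# Hypothetical fitted model: depends on the first predictor only.
def toy_model(row):
    return 2.0 * row[0]


# Hypothetical test set with two predictors per example.
test_rows = [(1.0, 5.0), (2.0, 3.0), (3.0, 8.0), (4.0, 1.0)]
test_targets = [2.0, 4.0, 6.0, 8.0]

importances = permutation_importance(toy_model, test_rows, test_targets, n_features=2)
print(importances)
```

Because the toy model ignores the second predictor, shuffling it leaves the error unchanged and its importance is zero, while shuffling the first predictor raises the error. Any fitted regressor, such as a support vector machine, could stand in for `toy_model`.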

Place, publisher, year, edition, pages
Stockholm: Department of Computer and Systems Sciences, Stockholm University, 2015. 93 p.
Series
Report Series / Department of Computer & Systems Sciences, ISSN 1101-8526 ; 15-020
National Category
Public Health, Global Health, Social Medicine and Epidemiology
Research subject
Computer and Systems Sciences
Identifiers
URN: urn:nbn:se:su:diva-122672
ISBN: 978-91-7649-304-5
Public defence
2015-12-16, Aula NOD, NOD-huset, Borgarfjordsgatan 12, Kista, 13:00 (English)
Available from: 2015-11-24 Created: 2015-11-08 Last updated: 2015-12-14 Bibliographically approved

Authors
Zacarias, Orlando P.; Boström, Henrik