Comparing Support Vector Regression and Random Forests for Predicting Malaria Incidence in Mozambique
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences. Eduardo Mondlane University, Mozambique.
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
2013 (English) In: 2013 International Conference on Advances in ICT for Emerging Regions (ICTer), IEEE Computer Society, 2013, 217-221 p. Conference paper, Published paper (Refereed)
Abstract [en]

Accurate prediction of malaria incidence is essential for the management of several activities in the ministry of health in Mozambique. This study compares support vector machines (SVMs) and random forests (RFs) for this purpose. A dataset with records of malaria cases covering the period 1999-2008 was used to evaluate predictive models on the last year, with models developed from one up to nine years of historical data. Mean squared error (MSE) was used as the performance metric. The scheme for estimating variable importance commonly employed for RFs was also adopted for SVMs. SVMs developed from two years of historical data obtained the best prediction accuracy. Hence, if the aim is to predict the actual number of malaria cases, the support vector machine model should be chosen. In the analysis of variable importance, Indoor Residual Spray (IRS), the districts of Manhiça and Matola, and the month of January turned out to be the most important predictors in both the SVM and RF models.
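The evaluation protocol described in the abstract (train on the k most recent years before the test year, for k = 1 up to 9, and score each model by MSE on the test year) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the record layout, the function names, and the `MeanBaseline` stand-in model are all assumptions, and the data are synthetic. Any regressor exposing `fit` and `predict` (for example an SVR or random forest implementation) could be passed in place of the stand-in.

```python
from statistics import mean


class MeanBaseline:
    """Hypothetical stand-in model: predicts the training mean (not an SVM or RF)."""

    def fit(self, X, y):
        self.mean_ = mean(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def time_window_mse(records, test_year, make_model, max_window):
    """Return {window_length: MSE on test_year}.

    records is a list of (year, feature_list, target) tuples; for each
    window length k the model is trained on the k years before test_year.
    """
    X_test = [x for yr, x, _ in records if yr == test_year]
    y_test = [t for yr, _, t in records if yr == test_year]
    results = {}
    for k in range(1, max_window + 1):
        years = range(test_year - k, test_year)
        X_train = [x for yr, x, _ in records if yr in years]
        y_train = [t for yr, _, t in records if yr in years]
        model = make_model().fit(X_train, y_train)
        results[k] = mse(y_test, model.predict(X_test))
    return results


# Synthetic toy data (not the malaria dataset): one record per month, 1999-2008.
records = [(year, [month], float(10 * (year - 1999) + month))
           for year in range(1999, 2009) for month in range(1, 13)]

scores = time_window_mse(records, 2008, MeanBaseline, 9)
best_window = min(scores, key=scores.get)
print(best_window, scores[best_window])
```

Keeping the model behind a `make_model` factory means the same loop can compare several learners on identical windows, which is the kind of comparison the abstract reports.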

Place, publisher, year, edition, pages
IEEE Computer Society, 2013. 217-221 p.
National Category
Information Systems
Research subject
Computer and Systems Sciences
Identifiers
URN: urn:nbn:se:su:diva-97714; DOI: 10.1109/ICTer.2013.6761181; ISBN: 978-1-4799-1274-2 (print); OAI: oai:DiVA.org:su-97714; DiVA: diva2:679944
Conference
2013 International Conference on Advances in ICT for Emerging Regions (ICTer), 11-15 December 2013, Colombo (Sri Lanka)
Available from: 2013-12-17 Created: 2013-12-17 Last updated: 2015-11-09 Bibliographically approved
In thesis
1. Mining Mozambique Health Data: The Case of Malaria: From Bayesian Incidence Risk to Incidence Case Predictions
2015 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

The health sector in Mozambique holds large volumes of data, including records of major public health diseases such as malaria and cholera. Scrutinizing such a mass of health data for useful information is challenging but essential for health authorities and professionals. Statistical learning and inferential approaches, here Bayesian predictive techniques and data mining, can provide health decision makers with appropriate tools for disease diagnosis and assessment. The purpose of this thesis is to investigate how predictive data mining and Bayesian regression methods can be used effectively to extract useful knowledge from reported malaria health data in support of decision making and management.

In summary, effective Bayesian predictive methods based on spatial and space-time reported cases of malaria have been derived, allowing the extraction of the main risk factors for malaria. Predictive models that combine consecutive temporal connections within the analysis of the space-time variations of the disease have been found to be relevant when the explicit modeling of seasonality is not required or is even unfeasible.

The most effective ways to derive numerical predictive models were investigated using several regression methods. The conclusion is that effective numerical prediction of new cases of the disease can be achieved by training support vector machines with a time-window approach, in which the training sets cover different numbers of years and the window shrinks towards the test set. The best performance is obtained with a smaller time window. Another contribution of this thesis is determining the importance of predictors of malaria incidence by adopting the permutation accuracy strategy (from the random forests method) using the test set. A further contribution is a significant reduction in predictive error, obtained by employing a sample correction bias strategy when testing the predictive models in regions other than those where they were initially developed.
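The permutation strategy mentioned above can be illustrated with a model-agnostic sketch: shuffle one predictor column of the test set at a time and record how much the MSE grows. This is only an assumed, generic variant; the record does not give the exact procedure used in the thesis, and the function names, the toy model, and the synthetic data below are hypothetical.

```python
import random
from statistics import mean


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def permutation_importance(predict, X_test, y_test, n_repeats=10, seed=0):
    """Average increase in test-set MSE when each feature column is shuffled.

    predict is any callable mapping a list of feature rows to predictions,
    so the same routine works for an SVM, a random forest, or another model.
    """
    rng = random.Random(seed)
    baseline = mse(y_test, predict(X_test))
    importances = []
    for j in range(len(X_test[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X_test]
            rng.shuffle(column)  # break the link between feature j and the target
            X_perm = [row[:j] + [v] + row[j + 1:]
                      for row, v in zip(X_test, column)]
            increases.append(mse(y_test, predict(X_perm)) - baseline)
        importances.append(mean(increases))
    return importances


def toy_predict(X):
    """Hypothetical fitted model: strong on feature 0, weak on 1, ignores 2."""
    return [5.0 * a + 0.5 * b for a, b, _ in X]


# Synthetic test set (not the malaria data); targets come from the toy model.
data_rng = random.Random(1)
X_test = [[data_rng.random() for _ in range(3)] for _ in range(200)]
y_test = toy_predict(X_test)

importances = permutation_importance(toy_predict, X_test, y_test)
print(importances)
```

Because the routine only needs a `predict` callable, it does not rely on anything specific to random forests, which is what makes it possible to reuse the idea for SVMs.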

Place, publisher, year, edition, pages
Stockholm: Department of Computer and Systems Sciences, Stockholm University, 2015. 93 p.
Series
Report Series / Department of Computer & Systems Sciences, ISSN 1101-8526 ; 15-020
National Category
Public Health, Global Health, Social Medicine and Epidemiology
Research subject
Computer and Systems Sciences
Identifiers
urn:nbn:se:su:diva-122672 (URN); 978-91-7649-304-5 (ISBN)
Public defence
2015-12-16, Aula NOD, NOD-huset, Borgarfjordsgatan 12, Kista, 13:00 (English)
Available from: 2015-11-24 Created: 2015-11-08 Last updated: 2015-12-14 Bibliographically approved

Open Access in DiVA

No full text

Other links

Publisher's full text

Search in DiVA

Authors
Zacarias, Orlando P.; Boström, Henrik