Mining Mozambique Health Data: The Case of Malaria: From Bayesian Incidence Risk to Incidence Case Predictions
2015 (English) Doctoral thesis, comprehensive summary (Other academic)
The health sector in Mozambique holds large volumes of data, including records of major public health diseases such as malaria and cholera. Scrutinizing such a mass of health data for useful information is challenging but essential for health authorities and professionals. Statistical learning and inferential approaches, applied through Bayesian predictive techniques and data mining, can provide health decision makers with appropriate tools for disease diagnosis and assessment. The purpose of this thesis is to investigate how predictive data mining and Bayesian regression methods can be used effectively to extract useful knowledge from reported malaria health data in support of decision making and management.
In summary, effective Bayesian predictive methods based on spatial and space-time reported cases of malaria have been derived, allowing the extraction of the main risk factors for malaria. Predictive models that combine consecutive temporal connections within the analysis of the space-time variations of the disease have been found to be relevant when explicit modeling of seasonality is not required or is even infeasible.
Several regression methods were compared to identify the most effective way to derive numerical predictive models. The conclusion is that new cases of the disease can be predicted effectively by training support vector machines with a time-window approach, in which the training sets span different numbers of years and the window is progressively shortened towards the test set. The best performance is obtained with a smaller time window. Another contribution of this thesis is determining the importance of predictors for the incidence of malaria, by applying the permutation accuracy strategy (from the random forests method) to the test set. A further contribution is a significant reduction in predictive error, obtained by employing a sample bias correction strategy when testing the predictive models in regions other than those where they were initially developed.
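As a rough illustration of the time-window idea and of permutation importance measured on the test set, the following minimal Python sketch may help. It is not the thesis's implementation. The data are synthetic and deliberately built so that recent years resemble the test year. The predictor names (`rainfall`, `other`), the years, and all function names are hypothetical. A small least-squares model stands in for the support vector machines used in the thesis, so that the sketch runs with the standard library only.

```python
import random
from statistics import mean

def make_rows(seed=0):
    """Synthetic monthly records (assumption): the rainfall-to-cases slope drifts upward each year."""
    rng = random.Random(seed)
    rows = []
    for year in range(2006, 2014):
        slope = 1.0 + 0.5 * (year - 2006)
        for _ in range(12):
            rainfall = rng.uniform(0.0, 10.0)
            other = rng.uniform(0.0, 10.0)  # hypothetical predictor unrelated to cases
            rows.append({"year": year, "x": [rainfall, other], "y": slope * rainfall})
    return rows

def time_window(rows, test_year, window):
    """Training rows from the `window` years immediately preceding `test_year`."""
    return [r for r in rows if test_year - window <= r["year"] < test_year]

def fit_linear(rows):
    """Two-predictor least squares without intercept (stand-in for an SVM regressor)."""
    a = sum(r["x"][0] ** 2 for r in rows)
    b = sum(r["x"][0] * r["x"][1] for r in rows)
    c = sum(r["x"][1] ** 2 for r in rows)
    d = sum(r["x"][0] * r["y"] for r in rows)
    e = sum(r["x"][1] * r["y"] for r in rows)
    det = a * c - b * b
    w0, w1 = (c * d - b * e) / det, (a * e - b * d) / det
    return lambda x: w0 * x[0] + w1 * x[1]

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred)) ** 0.5

def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean increase in test RMSE when a single predictor column is shuffled."""
    rng = random.Random(seed)
    base = rmse(y, [predict(x) for x in X])
    scores = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            col = [x[j] for x in X]
            rng.shuffle(col)
            X_perm = [x[:j] + [v] + x[j + 1:] for x, v in zip(X, col)]
            increases.append(rmse(y, [predict(x) for x in X_perm]) - base)
        scores.append(mean(increases))
    return scores

rows = make_rows()
test = [r for r in rows if r["year"] == 2013]
X_test = [r["x"] for r in test]
y_test = [r["y"] for r in test]

# Compare a long and a short training window, both ending just before the test year.
errors = {}
for window in (7, 2):
    model = fit_linear(time_window(rows, 2013, window))
    errors[window] = rmse(y_test, [model(x) for x in X_test])

# Rank predictors by how much shuffling each one degrades test-set error.
recent_model = fit_linear(time_window(rows, 2013, 2))
importance = permutation_importance(recent_model, X_test, y_test)
```

On this constructed data the shorter window gives the lower test error only because the simulated relationship drifts over time; the sketch shows the mechanics, not evidence for the thesis's findings. With scikit-learn available, `fit_linear` could presumably be replaced by an `sklearn.svm.SVR` model without changing the windowing or the importance calculation.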
Place, publisher, year, edition, pages
Stockholm: Department of Computer and Systems Sciences, Stockholm University, 2015. 93 p.
Report Series / Department of Computer & Systems Sciences, ISSN 1101-8526 ; 15-020
Public Health, Global Health, Social Medicine and Epidemiology
Research subject Computer and Systems Sciences
Identifiers
URN: urn:nbn:se:su:diva-122672
ISBN: 978-91-7649-304-5
OAI: oai:DiVA.org:su-122672
DiVA: diva2:867936
2015-12-16, Aula NOD, NOD-huset, Borgarfjordsgatan 12, Kista, 13:00 (English)
Lavesson, Niklas, Professor
Boström, Henrik, Professor
Danielson, Mats, Professor
List of papers