Announcement - Defense No. 259

Student: Igor Vitor Teixeira

Title: “Predicting congenital syphilis cases: a performance evaluation of different machine learning models”

Advisor: Patricia Takako Endo - (PPGEC)

External Examiner: Carmen Simone Grilo Diniz - (USP)

Internal Examiner: Wellington Pinheiro Santos - (PPGEC)

Date and time: October 24, 2022, at 14:00.
Location: Remote (https://meet.google.com/tgw-qbaz-yex)


Background: Communicable diseases represent a huge economic burden for healthcare systems and for society. Sexually transmitted infections (STIs) are a particular concern in developing and underdeveloped countries, where environmental factors and other determinants of health contribute to their rapid spread. In light of this situation, machine learning techniques have been explored to assess the incidence of syphilis and to support epidemiological surveillance.

Objective: The main goal of this work is to evaluate the performance of different machine learning models in predicting undesirable outcomes of congenital syphilis.

Method: We use clinical and sociodemographic data from pregnant women who were assisted by a social program in Pernambuco, Brazil, named Mãe Coruja Pernambucana Program (PMCP). Following a rigorous methodology, we propose six experiments that use three feature selection techniques to select the most relevant attributes; we pre-process and clean the data, apply hyperparameter optimization to tune the machine learning models, and train and test the models to allow a fair evaluation and discussion.

Results: The AdaBoost-BODS-Expert model, an Adaptive Boosting (AdaBoost) model from the Balanced with One-hot Encoding Data Set (BODS) experiment that used attributes selected by health experts, presented the best results in terms of the evaluated metrics, interpretability, and acceptance by health experts from PMCP. This can increase confidence in the model and support its adoption in daily practice to classify possible outcomes of congenital syphilis from clinical and sociodemographic data.
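
As an illustration of what a "balanced with one-hot encoding" data set could look like, the sketch below one-hot encodes a categorical attribute and then balances the classes. It is not the procedure used in the thesis: the announcement does not say how balancing was done, so random undersampling is an assumption, and the records, attribute names (age, schooling, outcome) and helper functions are hypothetical.

```python
import random
from collections import defaultdict


def one_hot_encode(rows, categorical_keys):
    """Replace each categorical attribute with 0/1 indicator columns."""
    # Collect the sorted set of observed values for every categorical attribute.
    categories = {k: sorted({row[k] for row in rows}) for k in categorical_keys}
    encoded = []
    for row in rows:
        # Keep non-categorical attributes unchanged.
        out = {k: v for k, v in row.items() if k not in categorical_keys}
        for k in categorical_keys:
            for value in categories[k]:
                out[f"{k}={value}"] = 1 if row[k] == value else 0
        encoded.append(out)
    return encoded


def undersample(rows, label_key, seed=0):
    """Balance classes by randomly downsampling each class to the minority size.

    Assumption: undersampling stands in for whatever balancing the thesis used.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[label_key]].append(row)
    n_min = min(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, n_min))
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy records; they are not drawn from the PMCP data.
records = [
    {"age": 19, "schooling": "primary", "outcome": 1},
    {"age": 25, "schooling": "secondary", "outcome": 0},
    {"age": 31, "schooling": "higher", "outcome": 0},
    {"age": 22, "schooling": "primary", "outcome": 0},
    {"age": 28, "schooling": "secondary", "outcome": 1},
    {"age": 35, "schooling": "secondary", "outcome": 0},
]

bods = undersample(one_hot_encode(records, ["schooling"]), "outcome")
```

A boosted classifier such as AdaBoost would then be trained on the numeric columns of a data set prepared in this way, with the outcome column as the label.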

