Events

DOCTORAL THESIS DEFENSE No. 29

Student: Arthur Flor de Sousa Neto

Title: "Overcoming Handwritten Data Scarcity: Synthesis and Recognition Using Residual Gated Convolution"

Advisor: Byron Leite Dantas Bezerra

Co-advisor: Alejandro Héctor Toselli (NEU)

External Examiner: Thierry Paquet (URN)

External Examiner: Nina Sumiko Tomita Hirata (USP)

External Examiner: Cleber Zanchettin (UFPE)

Internal Examiner: Bruno José Torres Fernandes

Date and time: April 15, 2025, at 9:00 a.m.

Location: Remote format - Google Meet.



Abstract:

         "Offline Handwritten Text Recognition (HTR) systems involve the automatic process of recognizing and transcribing handwritten text from scanned images into digital formats. The field has gained importance due to the increasing need for document digitization and the automation of data entry across various industrial sectors. However, achieving satisfactory recognition performance requires large and varied datasets for training optical models. The process of collecting and labeling such datasets is often time-consuming and impractical in many scenarios. To address this challenge, data augmentation is commonly applied; yet traditional augmentation methods may lead to model overfitting and performance degradation when data are scarce. Therefore, this work proposes integrating Conditional Generative Adversarial Networks (CGANs) for data synthesis into the optical model training to enhance handwriting recognition performance in data-scarce scenarios. To validate our proposal, we conducted a study that included: (i) a systematic literature review to identify gaps and trends in data augmentation for HTR; (ii) an exploration to establish an optimal configuration for traditional data augmentation; and (iii) extensive experiments using seven datasets. In addition, these datasets were partitioned into training subsets to simulate different data-scarce scenarios. The results indicate that, on average, data synthesis achieved a reduction of 40.1% in CER and 30.3% in WER, followed by transfer learning with reductions of 28.1% in CER and 21.4% in WER. In comparison, traditional data augmentation provided lower improvements, with average reductions of 19.3% in CER and 16.3% in WER. These findings demonstrate that both data synthesis and transfer learning enhance the performance of offline HTR systems, particularly in scenarios with limited training data. "


MASTER'S DISSERTATION DEFENSE No. 322

Student: Maria Eduarda Ferro de Mello

Title: "Evaluating Predictive Models for Stillbirth Using Sociodemographic and Maternal Data"

Advisor: Patricia Takako Endo

Co-advisor: Elisson da Silva Rocha (dotLAB)

External Examiner: Ana Carla Silva Alexandre (IFPE)

Internal Examiner: Cleyton Mário de Oliveira Rodrigues

Date and time: March 31, 2025, at 9:30 a.m.

Location: In-person format - Mini-auditorium (PPGEC)



Abstract:

         "According to the World Health Organization (WHO), stillbirth or fetal death is defined as the death of a fetus during pregnancy, including babies who die from the 22nd week of gestation before complete expulsion or extraction from the mother's body. Stillbirth is considered potentially preventable with appropriate treatment; however, it is important to identify risk factors early. In this regard, machine learning models, due to their predictive potential, can be used to assist medical teams in decision-making processes for early diagnosis and monitoring. The objective of this dissertation is to evaluate tree-based machine learning models for the early identification of stillbirth cases, trained with data from pregnant women assisted by the Sistema Único de Saúde (SUS) of the state of Pernambuco, within the Mãe Coruja Pernambucana Program (PMCP). The PMCP dataset was used for the period from 2008 to 2022. Initially, the dataset consisted of 231,505 records and 71 attributes, including information about pregnancy and maternal health. After data analysis and understanding of the population characteristics, in collaboration with healthcare professionals, the dataset was reduced to 20 attributes identified as the most important for predicting stillbirth. The data were split into training and testing sets, with 70% and 30%, respectively. Due to the significant data imbalance issue, we used the Hybrid Undersampling 2x technique (H2X scenario) and the Random Undersampling technique (RU scenario) to address this problem. Finally, we selected four tree-based machine learning models: Decision Tree, Random Forest, AdaBoost, and XGBoost. In the H2X scenario, the models exhibited the highest specificity values, with XGBoost standing out in most evaluated metrics, except for sensitivity. In the RU scenario, the models demonstrated greater sensitivity compared to the H2X scenario, with AdaBoost excelling in terms of precision, specificity, and accuracy.
After evaluating model performance, we analyzed the importance of each attribute in the learning process. Attributes related to maternal education, pregnancy risk, the first week of prenatal care, race, and maternal age were the most impactful for the models. By analyzing sociodemographic, clinical, and family health history data, the models aim to identify negative outcomes, such as stillbirth, providing early alerts and enabling timely interventions. Thus, these insights should guide future research aimed at improving the predictive accuracy of machine learning models in preventing stillbirth. "


DOCTORAL THESIS DEFENSE No. 28

Student: João Luiz Vilar Dias

Title: "Towards Self-Aware Machines"

Advisor: Fernando Buarque de Lima Neto

External Examiner: Tshilidzi Marwala (UNU)

External Examiner: Daniel Corrêa Mograbi (PUC-Rio)

Internal Examiner: Wellington Pinheiro dos Santos

Date and time: March 29, 2025, at 8:30 a.m.

Location: Remote format - Google Meet.



Abstract:

         "The emergence of disruptive technologies and the principles of Industry 4.0, encompassing Control Systems, Artificial Intelligence, Data Science, and Intelligent Decision-Making, have introduced a novel paradigm in assessing efficiency and quality objectives within production processes. Extensive discussions have revolved around enhancing machine productivity and refining decision-making through the use of intelligent decision agents, whether involving humans or purely algorithmic. The various tiers of decision-making (operational, managerial, and strategic) inevitably highlight the increasing necessity for seamless integration among such novel intelligent systems. In this regard, research on animal cognition suggests that more advanced species, beyond merely possessing reasoning capabilities to some extent, also exhibit attributes categorized as or related to self-awareness, which endow them with enhanced efficiency in decision-making and consequent actions. These sometimes-limited expressions of self-awareness allow them to perceive their position/role within an environment, both in isolation and in relation to other entities. Available opportunities and threats are important subjects tackled by such sentient and reasonable beings. By considering expressions of self-awareness on a smaller scale, e.g., intelligent industrial systems, one could likewise expect greater efficiency. Such extended capabilities could also facilitate improved planning and operational decisions in complex scenarios, particularly in real-time contexts. This consideration raises the fundamental question of this thesis: how can intelligent computing systems be designed to incorporate self-awareness? This research explores the emergence of self-aware mechanisms in biological entities of varying cognitive complexity and their pragmatic approach for possible adoption in intelligent industrial systems.
We assume that self-awareness is a progressively adaptive response to environmental pressures, which organisms encounter when operating initially in isolation and later within intricate ecosystems. Even in its most rudimentary forms, self-awareness is employed by living organisms to guide decision-making, particularly through internal simulations that aid in navigating unpredictable circumstances. We argue that within an artificial framework, intelligent machines may derive significant benefits from self-awareness attributes, mainly because these enable them to recognize their own needs, roles, contributions, and interactions with others. From an evolutionary standpoint, self-awareness can be classified into three primary components: self-recognition (or bodily awareness), meta-cognition (the capacity to reflect on one’s own knowledge), and theory of mind (the ability to infer desires, beliefs, and knowledge in others). Despite its relevance, this perspective has not yet been comprehensively considered for incorporation into intelligent computer system architectures. The present study aims to fully establish the concept of self-awareness in computational systems, with a focus on industrial machines. This thesis proposes a new classification hierarchy of levels (layers) for self-aware machines, introduces accompanying new Machine Learning methodologies based on Computational Semiotics to achieve deeper levels of self-awareness in machines, and provides simulations and arguments to support these propositions, which ultimately aim (for now) at improved industrial performance. "


