DOCTORAL THESIS DEFENSE No. 16

Student: Luiz Felipe Vieira Verçosa

Title: “Handling Complexity in Event Logs with Network Science and Clustering”

Advisor: Byron Leite Dantas Bezerra - (PPGEC)

Co-advisor: Carmelo José Albanez Bastos Filho - (PPGEC)

External Examiner: Marcelo Fantinato - (USP)

External Examiner: Hugo Valadares Siqueira - (UTFPR)

Internal Examiner: Fernando Buarque de Lima Neto - (PPGEC)

Internal Examiner: Diego Marconi Pinheiro Ferreira Silva - (PPGEC)

Date and time: March 26, 2024, at 14:00.
Venue: Remote format - Google Meet.


Abstract:

Real event logs from businesses are usually complex, with many variants and activities. This can lead to so-called "spaghetti" process models that are hard to comprehend and from which it is challenging to extract insights. In the literature, there are measures that quantify complexity in event logs by considering aspects such as size, distances, variance, and entropy. In addition, clustering techniques may be applied to disentangle complex processes into multiple simpler models. However, there is still room for improvement given the increasing data volume and complexity of businesses. In this work, we propose techniques to identify complexity in event logs, predict values of conformance checking metrics, and cluster complex processes. Complexity is identified through a Markovian abstraction derived from the event log. Characteristics of the resulting Markovian graphs are then extracted with network science metrics that capture the centrality, clustering, and density of the nodes. We identified correlations between these metrics and the quality dimensions of fitness, precision, simplicity, and generalization of the respective discovered model. In some cases, the metrics derived from the Markovian models outperform metrics from the literature. We also propose sequential clustering to discover more comprehensible models from the event log. It relies on identifying a large number of well-behaved clusters in a first stage, followed by an agglomerative step able to identify families of process variants. We show the advantages of our approach in identifying processes in a real-world case from the legal domain, instead of only splitting variants based on the traces' attributes.
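
As a rough illustration of the idea described in the abstract, the sketch below builds a first-order Markovian abstraction (a directly-follows graph) from a toy event log and computes two simple network measures on it. It is not the thesis's implementation: the toy log, the artificial `<start>`/`<end>` markers, the function names, and the choice of a first-order abstraction with graph density and degree centrality are all assumptions made for illustration.

```python
from collections import Counter


def first_order_graph(traces):
    """Count directly-follows transitions in a list of traces.

    Each trace is a list of activity names. The artificial "<start>" and
    "<end>" markers are an assumption of this sketch.
    """
    edges = Counter()
    for trace in traces:
        path = ["<start>", *trace, "<end>"]
        edges.update(zip(path, path[1:]))
    return edges


def nodes_of(edges):
    """Return the set of nodes that appear in any edge."""
    return {node for edge in edges for node in edge}


def density(edges):
    """Directed graph density: distinct non-loop edges over n * (n - 1)."""
    n = len(nodes_of(edges))
    m = sum(1 for a, b in edges if a != b)
    return m / (n * (n - 1)) if n > 1 else 0.0


def degree_centrality(edges):
    """In-degree plus out-degree of each node, normalized by n - 1."""
    nodes = nodes_of(edges)
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    scale = max(len(nodes) - 1, 1)
    return {node: degree[node] / scale for node in nodes}


# Hypothetical toy log with three traces and two variants.
log = [["a", "b", "c"], ["a", "c", "b"], ["a", "b", "c"]]
graph = first_order_graph(log)
graph_density = density(graph)
centrality = degree_centrality(graph)
```

On a real log, a denser graph with many highly connected activities would be one plausible signal of the "spaghetti" behavior mentioned above; which measures actually correlate with fitness, precision, simplicity, and generalization is what the thesis investigates.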
