Announcement - Defense No. 248

Student: Thiego Buenos Aires de Carvalho

Title: “MapView - Exploring Datasets via Unsupervised View Recommendation”

Advisor: Fernando Buarque de Lima Neto - (PPGEC)

Co-advisor: Denis Mayr Lima Martins

External Examiner: Ana Carolina B. Salgado - (CESAR School)

Internal Examiner: João Fausto Lorenzato de Oliveira - (PPGEC)

Date and time: 13 June 2022, 9:00 AM
Location: Remote (http://meet.google.com/czp-vucp-xcq)


Abstract:

Data are valuable assets for industries, government agencies, and research institutes. All of these entities have a growing need to analyze large volumes of data generated from a variety of sources, whether to help users communicate or to support their decision-making. Exploring even a simple database is not a trivial task, since it requires technical knowledge that many new and non-technical data users do not have. The task includes writing SQL queries to retrieve data sets from large databases in order to uncover insights, patterns, and points of interest. Furthermore, in large volumes of data, finding valuable data that matches a given user’s purpose is challenging, especially under restrictive budget or time constraints, and the task is typically manual, ad hoc, and time-consuming. To address these challenges, researchers have proposed tools to support data exploration tasks, especially by means of View Recommendation. In this research stream, a view can be seen as a visual representation of the result of a query over a database. Instead of being shown as a table, as SQL presents it, the result set is plotted as a histogram or bar chart. Systems that use this approach start by creating all possible views, filter out non-informative candidates, and recommend the most interesting views according to some objective function. The goal of these solutions is to improve data exploration by guiding the user toward the next best view to explore, enabling users to quickly understand the data and find insights. View Recommendation is especially challenging in the context of Data Marketplaces, since every data interaction incurs a monetary cost. For this reason, instead of an iterative process of querying and analyzing unrelated views, each of which the user must pay for, a more suitable approach is to recommend bundles of related views.
In this work, we propose and implement a new approach for View Recommendation called MapView, which is based on Self-Organizing Maps (SOM) and helps non-technical users, for whom both technical expertise and time are limited, in exploratory data tasks. Our proposed approach employs SOM as a clustering mechanism to group and recommend exploratory data views to users. This recommendation process can also be personalized to help meet the user’s intention in an interactive manner. To address View Recommendation in Data Marketplaces, we introduce the problem of recommending view bundles. In particular, we focus on cases where the data consumer’s budget to interact with the marketplace is limited. We investigate data exploration tasks that require several iterations to uncover valuable insights in the data, where view bundle recommendation allows for a multi-perspective view of the target data without exceeding the user’s budget. We also investigate how SOM and a Genetic Algorithm can be combined to recommend near-optimal view bundles while taking a specified cost limit into account. The experimental results show that MapView is effective in recommending valuable views and is therefore of aid in data exploration tasks. Complementary views are recommended according to the user’s interest, even within a tight budget.
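
As an illustration only, the generic view-recommendation pipeline mentioned in the abstract (create candidate views, score them, recommend the most interesting ones) might be sketched as below. This is not MapView itself: it uses no SOM and no Genetic Algorithm, and the toy table, the column names, the helper functions, and the choice of "deviation from a uniform distribution" as the objective function are all assumptions made for the example. It also assumes non-negative measure values.

```python
from collections import defaultdict
from itertools import product
from math import log

# Hypothetical toy table; column names are illustrative only.
ROWS = [
    {"region": "north", "product": "A", "sales": 90.0, "units": 1.0},
    {"region": "north", "product": "B", "sales": 90.0, "units": 1.0},
    {"region": "south", "product": "A", "sales": 10.0, "units": 1.0},
    {"region": "south", "product": "B", "sales": 10.0, "units": 1.0},
]

DIMENSIONS = ["region", "product"]   # columns to group by
MEASURES = ["sales", "units"]        # columns to aggregate
AGGREGATES = {
    "sum": sum,
    "avg": lambda values: sum(values) / len(values),
}


def build_view(rows, dimension, measure, agg_name):
    """Group rows by `dimension` and aggregate `measure` (one bar per group)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[dimension]].append(row[measure])
    aggregate = AGGREGATES[agg_name]
    return {key: aggregate(values) for key, values in groups.items()}


def deviation_from_uniform(view):
    """KL divergence between the normalized bars and a uniform distribution.

    Assumed objective function: flat bar charts score 0, skewed ones score higher.
    """
    total = sum(view.values())
    if total <= 0 or len(view) < 2:
        return 0.0
    uniform = 1.0 / len(view)
    score = 0.0
    for value in view.values():
        p = value / total
        if p > 0:
            score += p * log(p / uniform)
    return score


def recommend_views(rows, k=2):
    """Enumerate every (dimension, measure, aggregate) view and return the top k."""
    candidates = []
    for dim, measure, agg_name in product(DIMENSIONS, MEASURES, AGGREGATES):
        view = build_view(rows, dim, measure, agg_name)
        candidates.append(((dim, measure, agg_name), deviation_from_uniform(view)))
    # Highest-scoring (most "interesting") views first; sort is stable on ties.
    candidates.sort(key=lambda item: item[1], reverse=True)
    return candidates[:k]


if __name__ == "__main__":
    for spec, score in recommend_views(ROWS):
        print(spec, round(score, 3))
```

In this toy table, sales differ strongly by region but not by product, so views grouped by region over sales rank first. A budget-aware variant would additionally attach a cost to each view and select a bundle whose total cost stays within the limit.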
