Student name
|
Bruno Souza Cabral
|
Thesis title
|
Exploring Open Information Extraction for Portuguese Using Large Language Models
|
Abstract
|
Open Information Extraction (OpenIE) is an area of computer science concerned with extracting structured information from unstructured text in an unsupervised fashion, without requiring predefined relations. The extracted information supports language-understanding tasks such as knowledge base population, link prediction, and text comprehension. Extracting OpenIE relations from Portuguese text presents substantial challenges, primarily because the language is highly inflected and has a rich grammar and numerous linguistic peculiarities. Although numerous OpenIE studies target English, few have addressed Portuguese using Deep Learning methods. Recently, a new branch of Deep Learning research, generative information extraction, has emerged as a fruitful approach to problems traditionally framed as sequence labeling. In contrast to sequence labeling methods, generative techniques take a sentence as input and autoregressively generate multi-structured semantic representations of the information it conveys. Portuguese appears in only a limited number of previous OpenIE works, and most Deep Learning approaches target multilingual tasks, treating Portuguese as just another dataset during training. Furthermore, most training datasets for Portuguese are automatically translated from English sources. This thesis investigates generative and sequence labeling methods for the automated extraction of OpenIE relations from Portuguese texts. The study proposes building both generative and sequence labeling models, training them on Portuguese data, and comparing their performance in extracting OpenIE relations from Portuguese text. This analysis contributes to the growing body of literature on applying Deep Learning techniques to OpenIE for the Portuguese language and lays the foundation for further advances in this research field.
|
Advisor
|
Daniela Barreiro Claro
|
Co-advisor
|
Marlo Souza
|
External member 1
|
Gabriel Stanovsky
|
Link to Lattes CV
|
https://scholar.google.co.il/
|
Internal member 1
|
Tatiane Rios
|
Link to Lattes CV
|
http://lattes.cnpq.br/
|
Alternate for the external member
|
Aline Paes
|
Link to Lattes CV
|
http://lattes.cnpq.br/
|
Alternate for the internal member
|
Ricardo Rios
|
Link to Lattes CV
|
http://lattes.cnpq.br/
|
Exam date
|
13 Sep, 2023
|
Exam time
|
8:00 AM
|