Bug Analysis in Jupyter Notebook Projects: An Empirical Study

Student name

 

Taijara Loiola de Santana

 

Thesis title

 

Bug Analysis in Jupyter Notebook Projects: An Empirical Study

 

Abstract

 

Computational notebooks are among the new technologies driving data science. They let users build data-oriented code that emphasizes the analysis performed and the data obtained, and notebooks such as Jupyter have been widely adopted by data scientists to write code for analyzing and visualizing data. Although computational notebooks have gained visibility, problems and solutions already discussed and studied in software engineering still need to be addressed in this setting. Left unaddressed, these problems affect the quality of the developed software and, consequently, of the data analysis, and they can also spread bad programming practices.

Despite their growing adoption and popularity, few studies examine Jupyter development challenges from the practitioners’ point of view. This work presents a systematic study of the bugs and challenges that Jupyter practitioners face, based on a large-scale empirical investigation. We mined 14,740 commits from 105 open-source GitHub projects containing Jupyter notebook code. Next, we analyzed 30,416 Stack Overflow posts, which gave us insight into the bugs that practitioners encounter when developing Jupyter notebook projects. Finally, we conducted nineteen interviews with data scientists to uncover more details about Jupyter bugs and to better understand the challenges Jupyter developers face. Based on our results, we propose a bug taxonomy for Jupyter projects. We also highlight bug categories, their root causes, and the challenges that Jupyter practitioners face.

 

Advisor

 

Eduardo Santana de Almeida

 

Co-advisor (optional)

 

Paulo Anselmo da Mota Silveira Neto

 

External committee member (with affiliation)

 

Tayana Conte (UFAM)

 

Link to Lattes CV

 

http://lattes.cnpq.br/6682919653508224

 

Internal committee member or second external committee member (with affiliation)

 

Rodrigo Rocha Gomes e Souza

 

Link to Lattes CV

 

http://lattes.cnpq.br/7697794806460975

 

External alternate member (with affiliation)

 

Leonardo Gresta Paulino Murta

 

Link to Lattes CV

 

http://lattes.cnpq.br/1565296529736448

 

Internal alternate member or second external alternate member (with affiliation)

 

Cláudio Nogueira Sant'Anna

 

Link to Lattes CV

 

http://lattes.cnpq.br/3228159608138969

 

Defense date

 

01 Mar, 2024

 

Defense time

 

9:00 AM

 

 

Defense date:
01 Mar 2024 - 09:00
Defense type:
Master's defense