Luigi Quaranta

Product Manager & Co-founder, PeoplewareAI

Department of Computer Science, University of Bari

Via E. Orabona, 4 · 70125 · Bari, Italy

Luigi Quaranta is a Co-founder and Product Manager at PeoplewareAI, a University of Bari spin-off, where he leads the development of AI-enabled products, including LLM applications and RAG systems. Alongside his startup role, he actively collaborates with the Collab Research Group at the University of Bari’s Department of Computer Science.

Operating at the intersection of Software Engineering and Artificial Intelligence, Luigi’s research explores AI Engineering and MLOps, with a special focus on the healthcare domain. His work spans the development of methodologies and tools for building, deploying, and maintaining AI-based systems, as well as collaborative software development practices, software quality assurance, and the application of AI techniques to Software Engineering challenges.

Research Interests

AI Engineering
MLOps
LLM Applications & RAG Systems
Agentic AI Systems
Software Quality Assurance
Healthcare AI
Collaborative Software Development
AI for Software Engineering

Academic Background

Postdoc
February 2023 – February 2026
University of Bari · Department of Computer Science · Collab Research Group
Funded by PNRR ‘FAIR – Future Artificial Intelligence Research’ project · Spoke 6: Symbiotic AI
Ph.D. in Computer Science
November 2022
University of Bari
Ph.D. Thesis awarded “cum Laude”
Master’s degree in Computer Science
July 2019
University of Bari
Full marks and honors
Bachelor’s degree in Computer Science
October 2016
University of Bari
Full marks and honors

Teaching

Software Engineering for AI-enabled Systems (laboratory)
Master’s Degree in Computer Science · University of Bari
2025/2026 academic year
Introduction to Machine Learning with Python
Short Master in Digital Health · University of Bari
2025/2026 academic year
Design of a Symbiotic AI System
Ph.D. in Digital Innovation and e-Health · University of Bari
2025/2026 academic year
Software Engineering for AI-enabled Systems (laboratory)
Master’s Degree in Computer Science · University of Bari
2024/2025 academic year
Design of a Symbiotic AI System
Ph.D. in Computer Science and Mathematics · University of Bari
2024/2025 academic year

Design of a Symbiotic AI System
Ph.D. in Digital Innovation and e-Health · University of Bari
2024/2025 academic year
Design of a Symbiotic AI System
Ph.D. in Computer Science and Mathematics · University of Bari
2023/2024 academic year
Design of a Symbiotic AI System
Ph.D. in Digital Innovation and e-Health · University of Bari
2023/2024 academic year

Professional Service

Editorial Roles

Associate Editor
Automated Software Engineering (ASE) · 2025–present
Guest Editor
Future Generation Computer Systems (FGCS) · SI: MLOps Advancements: Improving Development, Management, and Interpretability in AI and Machine Learning · 2026–present
Guest Editor
Frontiers in Digital Health · SI: Implementing Digital Twins in Healthcare: Pathways to Person-Centric Solutions · 2024–2025

Event Organization

MLOps25 Co-Chair
ESEM 2024 Proceedings Chair
SIESTA 2024 Co-Chair
ICSME 2024 Social Media/Publicity Chair
CHASE 2022 Web Chair

ESEM 2021 Web Chair
SSBSE 2021 Web Chair
SSBSE 2020 Web Chair

Program Committees

CAIN 2026
CHASE 2026
MSR 2026 · Registered Reports
ICPC 2026 · Early Research Achievements (ERA)
ICSME 2025
CAIN 2025
MSR 2025 · Registered Reports
CHASE 2025
RCIS 2025

ICSME 2024
ASE 2024 · Artifact Evaluation
MSR 2024 · Registered Reports
CHASE 2024
CAIN 2024
SANER 2024
PROFES 2024
ICSME 2023 · New Ideas and Emerging Results (NIER)

Reviewing

ACM Transactions on Software Engineering and Methodology (TOSEM) · Association for Computing Machinery
IEEE Transactions on Software Engineering (TSE) · IEEE Computer Society
Empirical Software Engineering (EMSE) · Springer
Journal of Systems and Software (JSS) · Elsevier
Information and Software Technology (IST) · Elsevier

Automated Software Engineering (ASE) · Springer

Publications

The full publication record is available on the following academic profiles:

Scopus
Google Scholar
DBLP
ResearchGate

Below is a selected list of publications, grouped by year.

2026

Journal Articles

Smart Health

Integrating AI into Healthcare Systems: A Multivocal Literature Review

G. Mallardi, F. Calefato, L. Quaranta, F. Lanubile

Smart Health, Vol. 39, pp. 100631, 2026

Abstract

The integration of artificial intelligence (AI) into healthcare systems promises to improve patient care, enhance operational efficiency, and facilitate personalized medicine. The goal of this paper is to provide a comprehensive review of the current challenges that hinder the seamless adoption of AI in healthcare. Additionally, the paper aims to delineate the best practices for achieving optimal integration of AI within the medical domain. To achieve these objectives, we employ a Multivocal Literature Review (MLR), a systematic literature review methodology that incorporates both peer-reviewed publications and non-peer-reviewed sources, including technical blog posts and white papers. Substantial evidence in the literature points to challenges related to data quality, model bias, interoperability, patient privacy, and the susceptibility of AI systems to adversarial attacks. Additionally, there is growing awareness of challenges such as the distributional shift between training and production data, as well as the critical need for continuous monitoring and retraining of AI models within dynamic clinical settings. Based on our review, we advocate for the adoption of best practices aimed at mitigating the identified challenges, including rigorous model evaluation, standardization of data practices, and promotion of interdisciplinary collaboration. Furthermore, we emphasize the need for responsible AI that aligns with principles of fairness, transparency, security, and reliability, underscoring the importance of multi-stakeholder engagement.

DOI

2025

Journal Articles

IST

A multivocal literature review on the benefits and limitations of industry-leading AutoML tools

L. Quaranta, K. Azevedo, F. Calefato, M. Kalinowski

Information and Software Technology, Vol. 178, pp. 107608, 2025

Abstract

Context:
Rapid advancements in Artificial Intelligence (AI) and Machine Learning (ML) are revolutionizing software engineering in every application domain, driving unprecedented transformations and fostering innovation. However, despite these advances, several organizations are experiencing friction in the adoption of ML-based technologies, mainly due to the current shortage of ML professionals. In this context, Automated Machine Learning (AutoML) techniques have been presented as a promising solution to democratize ML adoption, even in the absence of specialized people.
Objective:
Our research aims to provide an overview of the evidence on the benefits and limitations of AutoML tools being adopted in industry.
Methods:
We conducted a Multivocal Literature Review, which allowed us to identify 54 sources from the academic literature and 108 sources from the grey literature reporting on AutoML benefits and limitations. We extracted explicitly reported benefits and limitations from the papers and applied the thematic analysis method for synthesis.
Results:
In general, we identified 18 reported benefits and 25 limitations. Concerning the benefits, we highlight that AutoML tools can help streamline the core steps of ML workflows, namely data preparation, feature engineering, model construction, and hyperparameter tuning—with concrete benefits on model performance, efficiency, and scalability. In addition, AutoML empowers both novice and experienced data scientists, promoting ML accessibility. However, we highlight several limitations that may represent obstacles to the widespread adoption of AutoML. For instance, AutoML tools may introduce barriers to transparency and interoperability, exhibit limited flexibility for complex scenarios, and offer inconsistent coverage of the ML workflow.
Conclusion:
The effectiveness of AutoML in facilitating the adoption of machine learning by users may vary depending on the specific tool and the context in which it is used. Today, AutoML tools are used to increase human expertise rather than replace it and, as such, require skilled users.

DOI
TOSEM

Self-monitoring of Developers’ Emotions: the Case of Agile Retrospective Meetings

D. Grassi, F. Lanubile, N. Novielli, L. Quaranta, A. Serebrenik

ACM Transactions on Software Engineering and Methodology, 2025

Abstract

Developers experience a wide range of emotions while creating software. Being able to identify the causes of one’s own and peers’ emotions can equip developers with the ability to regulate their behavior to restore positive moods and productivity. In this paper, we investigate to what extent self-monitoring of emotions can enhance agile retrospective meetings by improving the emotion awareness of participants. To this aim, we conducted a controlled experiment involving three software development teams: two student teams and one team of professional developers. The experiment design involves the collection of biometrics and self-reported information about emotions, which are then visualized before the retrospective meetings to inform discussion using EmoVizPhy, a tool that we designed and implemented for this aim. While students found that self-monitoring helped them recall significant emotional episodes, leading to more meaningful contributions during retrospectives, professional developers perceived limited benefits from this practice. Furthermore, based on the analysis of corrective actions identified by the participants during the study, we hypothesize that self-monitoring of emotions through EmoVizPhy may play a valuable role in facilitating the consolidation of new agile teams for which roles and collaboration dynamics are still being defined.

DOI

Conference & Workshop Papers

IRIM-3D

Robotic Applications for Safe Operations in Hospital Isolation Rooms

A. Bottalico, F. Lanubile, L. Quaranta

7th Italian Conference on Robotics and Intelligent Machines (IRIM-3D 2025), 2025

Abstract

Hospital isolation rooms expose healthcare workers to infection risks during routine tasks like IV bag replacement and material delivery. This paper presents a teleoperated TIAGo robot system controlled via Bluetooth joystick through a modular ROS 2 architecture including five specialized Python nodes to manage robot subsystems. Two Gazebo simulation scenarios validate the approach: IV replacement and meal delivery in modeled hospital rooms, demonstrating feasibility for reducing exposure.
SEAA

MLOps in the Healthcare Domain: a Systematic Literature Review

G. Mallardi, L. Quaranta, F. Calefato, F. Lanubile

Lecture Notes in Computer Science, Vol. 16082, pp. 334-349, 2025

Abstract

Machine Learning Operations (MLOps) refers to the set of practices and tools designed to streamline and automate machine learning pipelines, enabling the efficient deployment and continuous evolution of ML models in production environments. In the healthcare domain, where machine learning adoption is growing, MLOps plays a crucial role in ensuring reliable, compliant, and maintainable AI systems. This systematic literature review investigates the current use of MLOps in healthcare, focusing on the practices adopted, tools used, workflow stages supported, and medical specialties involved. We conducted a structured search on scholarly databases and selected 14 primary studies published between 2015 and 2024 based on defined inclusion and exclusion criteria. Our findings reveal that while several MLOps practices and tools are being adopted in healthcare, their coverage remains uneven across the ML workflow, with early stages such as data labeling receiving little attention. Regulatory constraints further limit automation, particularly in deployment. Moreover, applications tend to concentrate on a few medical specialties, reflecting the current narrow scope of adoption. Taken together, these insights offer a structured understanding of how MLOps is currently applied in healthcare and point toward opportunities for more reliable, effective, and regulation-aware integration of machine learning in clinical contexts.

DOI
RAIE

Towards Ensuring Responsible AI for Medical Device Certification

G. Mallardi, L. Quaranta, F. Calefato, F. Lanubile

2025 IEEE/ACM International Workshop on Responsible AI Engineering (RAIE), pp. 29-32, 2025

Abstract

Deploying and evolving machine learning (ML) solutions presents unique challenges in healthcare due to stringent regulatory requirements. This paper discusses the requirements for an extended MLOps framework that supports the certification of ML models as medical devices. By incorporating automated compliance checks, documentation generation, and continuous monitoring, we aim to facilitate adherence to standards and guidelines. This approach could enable healthcare ML models to maintain compliance throughout their lifecycle, fostering a smoother transition from prototype to clinical deployment.

DOI
MLOps25

MLOps-Driven Automation of Regulatory Documentation for AI-Based Medical Software

F. Rosmarino, G. Mallardi, L. Quaranta, F. Lanubile

Proceedings of the 1st ECAI Workshop on Machine Learning Operations (MLOps25), 2025

Abstract

In recent years, the increasing integration of AI within clinical software has led to the emergence of Software as a Medical Device (SaMD), a category of systems subject to strict regulatory oversight. A major challenge for developers in this domain is the continuous production of regulatory documentation, which remains largely manual and disconnected from development pipelines.
This paper proposes a strategy to automate documentation generation by embedding MLOps principles—such as traceability, reproducibility, and continuous integration—into the development workflow. Applied to a representative healthcare AI project, the approach produced consistent, audit-ready artefacts with minimal manual effort, demonstrating its potential to narrow the gap between rapid innovation and regulatory compliance.

CEUR-WS

2024

Journal Articles

IST

A lot of talk and a badge: An exploratory analysis of personal achievements in GitHub

F. Calefato, L. Quaranta, F. Lanubile

Information and Software Technology, Vol. 176, pp. 107561, 2024

Abstract

Context:
GitHub has introduced a new gamification element through personal achievements, whereby badges are unlocked and displayed on developers’ personal profile pages in recognition of their development activities.
Objective:
In this paper, we present an exploratory analysis using mixed methods to study the diffusion of personal badges in GitHub, in addition to the effects and reactions to their introduction.
Method:
First, we conduct an observational study by mining longitudinal data from more than 6,000 developers and perform correlation and regression analysis. Then, we conduct a survey and analyze over 300 GitHub community discussions on the topic of personal badges to gauge how the community responded to the introduction of the new feature.
Results:
We find that most of the developers sampled own at least a badge, but we also observe an increasing number of users who choose to keep their profile private and opt out of displaying badges. Additionally, badges are generally poorly correlated with developers’ skills and dispositions such as timeliness and desire to collaborate. We also find that, except for the Starstruck badge (reflecting the number of followers), their introduction does not have an effect. Finally, the reaction of the community has been in general mixed, as developers find them appealing in principle but without a clear purpose and hardly reflecting their abilities in the current form.
Conclusions:
We provide recommendations to the designers of the GitHub platform on how to improve the current implementation of personal badges as both a gamification mechanism and as sources of reliable cues for assessing the abilities of developers.

DOI arXiv
JSS

Impact of data quality for automatic issue classification using pre-trained language models

G. Colavito, F. Lanubile, N. Novielli, L. Quaranta

Journal of Systems and Software, Vol. 210, pp. 111838, 2024

Abstract

Issue classification aims to recognize whether an issue reports a bug, a request for enhancement or support. In this paper we use pre-trained models for the automatic classification of issues and investigate how the quality of data affects the performance of classifiers. Despite the application of data quality filters, none of our attempts had a significant effect on model quality. As root cause we identify a threat to construct validity underlying the issue labeling.

DOI
SoftwareX

Pynblint: A quality assurance tool to improve the quality of Python Jupyter notebooks

L. Quaranta, F. Calefato, F. Lanubile

SoftwareX, Vol. 28, pp. 101959, 2024

Abstract

Jupyter Notebook is widely recognized as a crucial tool for data science professionals and students. Its interactive and self-documenting nature makes it particularly suitable for data-driven programming tasks. Nonetheless, it faces criticism for its limited support for software engineering best practices and its tendency to encourage bad programming habits, such as non-linear code execution. These issues often result in non-reproducible, poorly documented, and low-quality notebook code. In this paper, we introduce Pynblint, a static analyzer for Python Jupyter notebooks. Pynblint is designed to help data scientists write better notebooks, easy to understand and reproduce. We report on how we validated Pynblint with both professional data scientists and students, receiving overall positive feedback. Additionally, we discuss the potential of Pynblint to facilitate research inquiries into computational notebooks.

DOI

Conference & Workshop Papers

Ital-IA

An MLOps Solution Framework for Transitioning Machine Learning Models into eHealth Systems

A. Basile, F. Calefato, F. Lanubile, G. Mallardi, L. Quaranta

Proceedings of the Ital-IA Intelligenza Artificiale – Thematic Workshops co-located with the 4th CINI National Lab AIIS Conference on Artificial Intelligence (Ital-IA 2024), 2024

Abstract

Over the past few years, there has been a growing experimentation of machine learning (ML)-based technologies in the healthcare domain. However, most related initiatives struggle to progress beyond the prototypical research stage and transition to clinical use. Although this problem affects the adoption of ML across all industries, it is largely exacerbated in the highly regulated medical domain. Lately, MLOps has emerged as a new discipline encompassing practices and tools to streamline the development and maintenance of ML-enabled systems. Rooted in software engineering and inspired by DevOps, it places great emphasis on the automation of ML pipelines and model lifecycle. In this paper, we present an MLOps-based solution framework designed to streamline the transition of experimental ML models to production-ready components for eHealth systems. Our approach is designed to support the reliable integration and clinical deployment of ML-enabled tools that can assist healthcare professionals. The solution framework is being developed and validated in the context of “DARE – Digital Lifelong Prevention”, an Italian research project aimed at leveraging the potential of data to improve health promotion and prevention throughout the life course.

CEUR-WS
ITASEC

Security Risks and Best Practices of MLOps: A Multivocal Literature Review

F. Calefato, F. Lanubile, L. Quaranta

Proceedings of the 8th Italian Conference on Cyber Security (ITASEC 2024), Vol. 3731, 2024

Abstract

MLOps practices and tools are designed to streamline the deployment and maintenance of production-grade ML-enabled systems. As with any software workflow and component, they are susceptible to various security threats. In this paper, we present a Multivocal Literature Review (MLR) aimed at gauging current knowledge of the risks associated with the implementation of MLOps processes and the best practices recommended for their mitigation. By analyzing a varied range of sources of academic papers and non-peer-reviewed technical articles, we synthesize 15 risks and 27 related best practices, which we categorized into 8 themes. We find that while some of the risks are known security threats that can be mitigated through well-established cybersecurity best practices, others represent MLOps-specific risks, mostly concerning the management of data and models.

CEUR-WS
Ital-IA

Large Language Models for Issue Report Classification

G. Colavito, F. Lanubile, N. Novielli, L. Quaranta

Proceedings of the Ital-IA Intelligenza Artificiale – Thematic Workshops, Vol. 3762, 2024

Abstract

Effective issue classification is crucial for efficient software project management. However, labels assigned to issues are often inconsistent, which can negatively impact the performance of supervised classification models. In this work, we investigate how label consistency and training data size affect automatic issue classification. We first evaluate a few-shot learning approach on a manually validated dataset and compare it to fine-tuning on a larger crowd-sourced set. The results show that our approach achieves higher accuracy when trained and tested on consistent labels. We then examine zero-shot classification using GPT-3.5, finding that its performance is comparable to supervised models despite having no fine-tuning. This suggests that generative models can help classify issues when annotated data is limited. Overall, our findings provide insights into balancing data quantity and quality for issue classification.

CEUR-WS
MSR

Leveraging GPT-like LLMs to Automate Issue Labeling

G. Colavito, F. Lanubile, N. Novielli, L. Quaranta

Proceedings of the 21st International Conference on Mining Software Repositories, pp. 469-480, 2024

Abstract

Issue labeling is a crucial task for the effective management of software projects. To date, several approaches have been put forth for the automatic assignment of labels to issue reports. In particular, supervised approaches based on the fine-tuning of BERT-like language models have been proposed, achieving state-of-the-art performance. More recently, decoder-only models such as GPT have become prominent in SE research due to their surprising capabilities to achieve state-of-the-art performance even for tasks they have not been trained for. To the best of our knowledge, GPT-like models have not been applied yet to the problem of issue classification, despite the promising results achieved for many other software engineering tasks. In this paper, we investigate to what extent we can leverage GPT-like LLMs to automate the issue labeling task. Our results demonstrate the ability of GPT-like models to correctly classify issue reports in the absence of labeled data that would be required to fine-tune BERT-like LLMs.

DOI
BIBM

An MLOps Approach for Deploying Machine Learning Models in Healthcare Systems

G. Mallardi, F. Calefato, L. Quaranta, F. Lanubile

2024 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 6832-6837, 2024

Abstract

In recent years, there has been a remarkable increase in the use of machine learning (ML) technologies in healthcare settings. Despite this growth, a significant challenge persists: numerous promising initiatives remain confined to research laboratories, unable to make the critical transition into clinical practice. While the gap between research and production deployment affects ML projects across various sectors, the stringently regulated healthcare environment poses unique and heightened challenges. To address these challenges, MLOps has recently emerged as a specialized discipline that combines engineering best practices with operational excellence. Building upon software engineering foundations and DevOps principles, MLOps introduces a systematic approach to automating ML workflows and managing the complete model lifecycle. This paper introduces a practical and comprehensive MLOps-based framework. This framework is designed to facilitate the transformation of experimental ML models into production-ready healthcare solutions. It provides a structured approach that ensures the seamless integration of ML-powered tools into clinical environments and guarantees their reliability and compliance with medical standards, instilling confidence in their effectiveness. We are currently implementing and evaluating this framework within the "DARE – Digital Lifelong Prevention" project, a national Italian initiative aiming to harness data analytics to enhance preventive healthcare strategies across different life stages.

DOI
ESEM

Continuous Quality Improvement of AI-based Systems: the QualAI Project

N. Novielli, R. Oliveto, F. Palomba, F. Calefato, G. Colavito, V. De Martino, A. Della Porta, G. Giordano, E. Guglielmi, F. Lanubile, L. Quaranta, G. Recupito, S. Scalabrino, A. Spina, A. Vitale

Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, pp. 603-607, 2024

Abstract

QualAI is a two-year project aimed at defining a set of recommenders to continuously monitor, assess, and improve the quality of AI-based systems, with a particular focus on machine learning (ML) applications. We will develop recommenders for the quality assurance of both data and ML models to enable practitioners to mitigate technical debt. Special attention will be paid to communication challenges that may arise in hybrid teams comprising data scientists and software developers. This paper presents the project outline, provides an executive summary of the research activities, outlines the expected project outcomes, and reports the results obtained to date.

DOI
RCIS

QualAI: Continuous Quality Improvement of AI-based Systems

N. Novielli, R. Oliveto, F. Palomba, F. Calefato, G. Colavito, V. De Martino, A. Della Porta, G. Giordano, E. Guglielmi, F. Lanubile, L. Quaranta, G. Recupito, S. Scalabrino, A. Spina, A. Vitale

Joint Proceedings of RCIS 2024 Workshops and Research Projects Track, Vol. 3674, 2024

Abstract

QualAI is a two-year project that aims to define a set of recommenders to continuously monitor, assess, and improve the quality of AI-based systems, with a particular focus on ML-based systems. Quality assurance will be guaranteed from different perspectives and during both the development and operations phases. We will define recommenders for the quality assurance of both data and ML models to enable practitioners to mitigate technical debt. Emphasis will be given to communication issues that could arise in hybrid teams including data scientists and software developers. In this paper, we present the project outline, provide an executive summary of the research activities, and present the expected project results.

CEUR-WS

2023

Journal Articles

IEEE Software

Training Future Machine Learning Engineers: A Project-Based Course on MLOps

F. Lanubile, S. Martínez-Fernández, L. Quaranta

IEEE Software, pp. 1-9, 2023

Abstract

In this paper, we present an overview of a project-based course on MLOps by showcasing a couple of sample projects developed by our students. Additionally, we share the lessons learned from offering the course at two different institutions.

DOI

Conference & Workshop Papers

ESEM

Assessing the Use of AutoML for Data-Driven Software Engineering

F. Calefato, L. Quaranta, F. Lanubile, M. Kalinowski

2023 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), pp. 1-12, 2023

Abstract

Background. Due to the widespread adoption of Artificial Intelligence (AI) and Machine Learning (ML) for building software applications, companies are struggling to recruit employees with a deep understanding of such technologies. In this scenario, AutoML is soaring as a promising solution to fill the AI/ML skills gap since it promises to automate the building of end-to-end AI/ML pipelines that would normally be engineered by specialized team members. Aims. Despite the growing interest and high expectations, there is a dearth of information about the extent to which AutoML is currently adopted by teams developing AI/ML-enabled systems and how it is perceived by practitioners and researchers. Method. To fill these gaps, in this paper, we present a mixed-method study comprising a benchmark of 12 end-to-end AutoML tools on two SE datasets and a user survey with follow-up interviews to further our understanding of AutoML adoption and perception. Results. We found that AutoML solutions can generate models that outperform those trained and optimized by researchers to perform classification tasks in the SE domain. Also, our findings show that the currently available AutoML solutions do not live up to their names as they do not equally support automation across the stages of the ML development workflow and for all the team members. Conclusions. We derive insights to inform the SE research community on how AutoML can facilitate their activities and tool builders on how to design the next generation of AutoML technologies.

DOI arXiv
ICSE-SEET

Teaching MLOps in Higher Education through Project-Based Learning

F. Lanubile, S. Martínez-Fernández, L. Quaranta

2023 IEEE/ACM 45th International Conference on Software Engineering: Software Engineering Education and Training (ICSE-SEET), pp. 95-100, 2023

Abstract

Building and maintaining production-grade ML-enabled components is a complex endeavor that goes beyond the current approach of academic education, focused on the optimization of ML model performance in the lab. In this paper, we present a project-based learning approach to teaching MLOps, focused on the demonstration and experience with emerging practices and tools to automatize the construction of ML-enabled components. We examine the design of a course based on this approach, including laboratory sessions that cover the end-to-end ML component life cycle, from model building to production deployment. Moreover, we report on preliminary results from the first edition of the course. During the present year, an updated version of the same course is being delivered in two independent universities; the related learning outcomes will be evaluated to analyze the effectiveness of project-based learning for this specific subject.

DOI arXiv

2022

Journal Articles

CSCW

Eliciting Best Practices for Collaboration with Computational Notebooks

L. Quaranta, F. Calefato, F. Lanubile

Proc. ACM Hum.-Comput. Interact., Vol. 6, n. CSCW1, Article 87, 2022

Abstract

Despite the widespread adoption of computational notebooks, little is known about best practices for their usage in collaborative contexts. In this paper, we fill this gap by eliciting a catalog of best practices for collaborative data science with computational notebooks. With this aim, we first look for best practices through a multivocal literature review. Then, we conduct interviews with professional data scientists to assess their awareness of these best practices. Finally, we assess the adoption of best practices through the analysis of 1,380 Jupyter notebooks retrieved from the Kaggle platform. Findings reveal that experts are mostly aware of the best practices and tend to adopt them in their daily work. Nonetheless, they do not consistently follow all the recommendations as, depending on specific contexts, some are deemed unfeasible or counterproductive due to the lack of proper tool support. As such, we envision the design of notebook solutions that allow data scientists not to have to prioritize exploration and rapid prototyping over writing code of quality.

DOI arXiv
Neuropsychiatr Dis Treat

Associations of High-Sensitivity C-Reactive Protein and Interleukin-6 with Depression in a Sample of Italian Adolescents During COVID-19 Pandemic

M. Serra, A. Presicci, L. Quaranta, M. Achille, E. Caputo, S. Medicamento, F. Margari, F. Croce, L. Margari

Neuropsychiatr Dis Treat., Vol. 18, pp. 1287-1297, 2022

Abstract

Introduction: Many studies have highlighted the role of inflammation in the pathogenesis of depression, although not for every patient nor for every symptom. It is widely accepted that stressors can increase inflammation and lead to depressive symptoms. Little is known about the symptom-specificity of the inflammation-depression link in adolescence, which we aimed to explore. The single symptom analysis is a core feature of the recent network approach to depression, which supposes that psychiatric disorders consist of co-occurring symptoms and their tendency to cause each other.

Patients and methods: We recruited 52 adolescents diagnosed with a Depressive Disorder during the COVID-19 stressful period. We used regression analysis to measure associations between high sensitivity C-Reactive Protein (hs-CRP) and Interleukin-6 (IL-6) and depressive symptoms assessed by the Children's Depression Inventory 2 (CDI 2). For the study of symptom specificity, we selected 13 items from the CDI 2 Self Report corresponding with the DSM-5 diagnostic criteria for Major Depressive Disorder and we coded them as dichotomous variables to perform a regression analysis.

Results: We found that a higher CDI 2-Parent Version total score was significantly predicted by higher hs-CRP (coefficient 3.393; p 0.0128) and IL-6 (coefficient 3.128; p 0.0398). The endorsement of the symptom self-hatred, measuring the DSM-5 symptom "feelings of worthlessness", was significantly predicted by hs-CRP (OR 10.97; 95% CI 1.29-93.08; p 0.0282).

Conclusion: A novel symptom-specificity emerged, with hs-CRP significantly predicting the endorsement of the symptom self-hatred, recognized as a core feature of adolescent depression, following the network theory. We considered it a possible phenotypic expression of one depression endophenotype previously causally linked to inflammation. Due to the limited sample size, these preliminary findings require confirmation with future research focusing on the relationship between inflammation and self-hatred and other central nodes of the depression network, representing an opportunity for targeting interventions on crucial symptoms.
Ital J Pediatr

Depressive risk among Italian socioeconomically disadvantaged children and adolescents during COVID-19 pandemic: a cross-sectional online survey

M. Serra, A. Presicci, L. Quaranta, M.R.E. Urbano, L. Marzulli, E. Matera, F. Margari, L. Margari

Italian Journal of Pediatrics, Vol. 48, n. 1, pp. 68, 2022

Abstract

Background
Children, adolescents, and low-income individuals are considered particularly vulnerable to the mental health implications of the current COVID-19 pandemic. Depression is a frequent negative emotional response during an epidemic outbreak and is also notably susceptible to environmental risks such as stressors derived from income inequality. We aimed to assess depressive symptomatology in a sample of Italian low-income minors during the COVID-19 outbreak. We hypothesized that the stronger the negative effects of the pandemic on socioeconomic conditions, the higher the risk of showing depressive symptoms would be.

Methods
We performed a cross-sectional study during July 2020, at the end of the first Italian wave of the COVID-19 pandemic. We recruited 109 Italian socioeconomically disadvantaged children and adolescents aged 7 to 17 years. We used an online survey to collect socio-demographic and clinical data and information about pandemic-related stressors and to assess depressive symptoms with the Children’s Depression Inventory 2 (CDI 2), Parent Version (Emotional Problems subscale) and Self-Report Short Form. We performed logistic regression analysis to assess the association between depressive symptoms and potential risk factors for mental health.

Results
22% and 14% of participants showed depressive symptoms on the CDI 2 Parent Version and Self-Report, respectively. Participants coming from families experiencing a lack of basic supplies during the pandemic (34.9%) were more likely to show depressive symptoms on the CDI 2 Parent Version. Participants with a pre-existing neuropsychiatric diagnosis (26.6%) were more likely to exhibit depressive symptoms measured by the CDI 2 Parent Version.

Conclusions
The results of our study showed that a group of Italian socioeconomically disadvantaged children and adolescents were more vulnerable to depressive symptoms if they suffered from a paucity of essential supplies during the pandemic or had pre-existing neurodevelopmental disorders. The promotion of educational and child-care programs and activities could be crucial in sustaining the prevention of mental distress in those frail subjects who particularly need support outside the family. Further studies are needed to detect effective preventive and therapeutic strategies to adopt promptly in the case of another pandemic wave.

DOI
Children

Assessing Clinical Features of Adolescents Suffering from Depression Who Engage in Non-Suicidal Self-Injury

M. Serra, A. Presicci, L. Quaranta, E. Caputo, M. Achille, F. Margari, F. Croce, L. Marzulli, L. Margari

Children, Vol. 9, n. 2, pp. 201, 2022

Abstract

Depressive disorders (DDs) and non-suicidal self-injury (NSSI) are important juvenile mental health issues, showing alarmingly increasing rates. They frequently co-occur, mainly among adolescents, increasing the suicide risk. We aimed to compare the clinical features of two groups of adolescents with DDs, distinguished by whether or not they engaged in NSSI (“DD + NSSI” and “DD”). We hypothesized that NSSI would characterize particularly severe forms of DDs suitable for becoming specific phenotypes of adolescent depression. We enrolled 56 adolescents (11–17 years) diagnosed with a DD according to the DSM-5 criteria. They were assessed for NSSI endorsement (Ottawa Self-Injury Inventory), depressive symptoms (Children’s Depression Inventory 2), emotional dysregulation (Difficulties in Emotional Regulation Scale), and anxiety symptoms (Screen for Child Anxiety-Related Emotional Disorders). The two groups accounted for 31 (“DD + NSSI”) and 25 (“DD”) individuals. The “DD + NSSI” group had significantly higher suicidal ideation (p 0.0039), emotional dysregulation (p 0.0092), depressive symptoms (p 0.0138), and anxiety symptoms (p 0.0153) than the “DD” group. NSSI seemed to characterize more severe phenotypes of adolescent depression, suggesting a potential role as a “specifier” of DDs that conveys relevant information for their management. Further studies are needed to support this hypothesis and its potential opportunities for prevention and treatment.

DOI

Conference & Workshop Papers

ESEM

A Preliminary Investigation of MLOps Practices in GitHub

F. Calefato, F. Lanubile, L. Quaranta

ACM / IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), pp. 283-288, 2022

Abstract

Background. The rapid and growing popularity of machine learning (ML) applications has led to an increasing interest in MLOps, that is, the practice of continuous integration and deployment (CI/CD) of ML-enabled systems. Aims. Since changes may affect not only the code but also the ML model parameters and the data themselves, the automation of traditional CI/CD needs to be extended to manage model retraining in production. Method. In this paper, we present an initial investigation of the MLOps practices implemented in a set of ML-enabled systems retrieved from GitHub, focusing on GitHub Actions and CML, two solutions to automate the development workflow. Results. Our preliminary results suggest that the adoption of MLOps workflows in open-source GitHub projects is currently rather limited. Conclusions. Issues are also identified, which can guide future research work.

DOI arXiv
CAIN

Pynblint: a static analyzer for Python Jupyter notebooks

L. Quaranta, F. Calefato, F. Lanubile

Proceedings of the 1st International Conference on AI Engineering: Software Engineering for AI, pp. 48-49, 2022

Abstract

Jupyter Notebook is the tool of choice of many data scientists in the early stages of ML workflows. The notebook format, however, has been criticized for inducing bad programming practices; indeed, researchers have already shown that open-source repositories are inundated by poor-quality notebooks. Low-quality output from the prototypical stages of ML workflows constitutes a clear bottleneck towards the productization of ML models. To foster the creation of better notebooks, we developed Pynblint, a static analyzer for Jupyter notebooks written in Python. The tool checks the compliance of notebooks (and surrounding repositories) with a set of empirically validated best practices and provides targeted recommendations when violations are detected.

DOI arXiv
ICSE-DS

Assessing the Quality of Computational Notebooks for a Frictionless Transition from Exploration to Production

L. Quaranta

ICSE '22: Proceedings of the ACM/IEEE 44th International Conference on Software Engineering: Companion Proceedings, pp. 256–260, 2022

Abstract

The massive trend of integrating data-driven AI capabilities into traditional software systems is raising new intriguing challenges. One such challenge is achieving a smooth transition from the explorative phase of Machine Learning projects – in which data scientists build prototypical models in the lab – to their production phase – in which software engineers translate prototypes into production-ready AI components. To narrow down the gap between these two phases, tools and practices adopted by data scientists might be improved by incorporating consolidated software engineering solutions. In particular, computational notebooks have a prominent role in determining the quality of data science prototypes. In my research project, I address this challenge by studying the best practices for collaboration with computational notebooks and proposing proof-of-concept tools to foster guideline compliance.

DOI arXiv

2021

Conference & Workshop Papers

Towards Productizing AI/ML Models: An Industry Perspective from Data Scientists

F. Lanubile, F. Calefato, L. Quaranta, M. Amoruso, F. Fumarola, M. Filannino

2021 IEEE/ACM 1st Workshop on AI Engineering – Software Engineering for AI (WAIN), pp. 129-132, 2021

Abstract

The transition from AI/ML models to production-ready AI-based systems is a challenge for both data scientists and software engineers. In this paper, we report the results of a workshop conducted in a consulting company to understand how this transition is perceived by practitioners. Starting from the need for making AI experiments reproducible, the main themes that emerged are related to the use of the Jupyter Notebook as the primary prototyping tool, and the lack of support for software engineering best practices as well as data science specific functionalities.

DOI arXiv
AIxIA

A Taxonomy of Tools for Reproducible Machine Learning Experiments

L. Quaranta, F. Calefato, F. Lanubile

The 20th International Conference of the Italian Association for Artificial Intelligence (AIxIA 2021), 2021

Abstract

The broad availability of machine learning (ML) libraries and frameworks makes the rapid prototyping of ML models a relatively easy task to achieve. However, the quality of prototypes is challenged by their reproducibility. Reproducing an ML experiment typically entails repeating the whole process, from data collection to model building, in addition to multiple optimization steps that must be carefully tracked. In this paper, we define a comprehensive taxonomy to characterize tools for ML experiment tracking and review some of the most popular solutions under the lens of the taxonomy. The taxonomy and related recommendations may help data scientists to more easily orient themselves and make an informed choice when selecting appropriate tools to shape the workflow of their ML experiments.

CEUR-WS
MSR

KGTorrent: A Dataset of Python Jupyter Notebooks from Kaggle

L. Quaranta, F. Calefato, F. Lanubile

2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR), pp. 550-554, 2021

Abstract

Computational notebooks have become the tool of choice for many data scientists and practitioners for performing analyses and disseminating results. Despite their increasing popularity, the research community cannot yet count on a large, curated dataset of computational notebooks. In this paper, we fill this gap by introducing KGTorrent, a dataset of Python Jupyter notebooks with rich metadata retrieved from Kaggle, a platform hosting data science competitions for learners and practitioners at any level of expertise. We describe how we built KGTorrent, and provide instructions on how to use it and refresh the collection to keep it up to date. Our vision is that the research community will use KGTorrent to study how data scientists, especially practitioners, use Jupyter Notebook in the wild and identify potential shortcomings to inform the design of its future extensions.

DOI arXiv

2019

Conference & Workshop Papers

SEmotion

EMTk – The Emotion Mining Toolkit

F. Calefato, F. Lanubile, N. Novielli, L. Quaranta

2019 IEEE/ACM 4th International Workshop on Emotion Awareness in Software Engineering (SEmotion), pp. 34-37, 2019

Abstract

The Emotion Mining Toolkit (EMTk) is a suite of modules and datasets offering a comprehensive solution for mining sentiment and emotions from technical text contributed by developers on communication channels. The toolkit is written in Java, Python, and R, and is released under the MIT open source license. In this paper, we describe its architecture and the benchmark against the previous, standalone versions of our sentiment analysis tools. Results show large improvements in terms of speed.

DOI arXiv
ICPC

A Replication Study on Code Comprehension and Expertise using Lightweight Biometric Sensors

D. Fucci, D. Girardi, N. Novielli, L. Quaranta, F. Lanubile

2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC), pp. 311-322, 2019

Abstract

Code comprehension has been recently investigated from physiological and cognitive perspectives using medical imaging devices. Floyd et al. (i.e., the original study) used fMRI to classify the type of comprehension tasks performed by developers and relate their results to their expertise. We replicate the original study using lightweight biometric sensors. Our study participants (28 undergrads in computer science) performed comprehension tasks on source code and natural language prose. We developed machine learning models to automatically identify what kind of tasks developers are working on, leveraging their brain-, heart-, and skin-related signals. The best improvement over the original study performance is achieved using solely the heart signal obtained through a single device (BAC 87% vs. 79.1%). Differently from the original study, we did not observe a correlation between the participants' expertise and the classifier performance (τ = 0.16, p = 0.31). Our findings show that lightweight biometric sensors can be used to accurately recognize comprehension, opening interesting scenarios for research and practice.

DOI arXiv
SEmotion

Towards Recognizing the Emotions of Developers Using Biometrics: The Design of a Field Study

D. Girardi, F. Lanubile, N. Novielli, L. Quaranta, A. Serebrenik

2019 IEEE/ACM 4th International Workshop on Emotion Awareness in Software Engineering (SEmotion), pp. 13-16, 2019

Abstract

During their daily working activities, developers experience a wide range of emotions that are known to impact their personal wellbeing and, consequently, their work performance. As such, being aware of one's own and collaborators' emotions is crucial to enhance the collaborative development process. In this paper, we present the design of a field study aimed at i) assessing the feasibility of emotion detection using non-invasive biometric sensors and ii) investigating the correlation between daily working activities and positive/negative emotions experienced by software developers. The long-term goal of our research is to provide recommendations to improve developers' mental well-being and productivity based on the emotions they experience.

DOI