Publications - QualAI - Continuous Quality Improvement of AI-based Systems

2026

- G. Colavito, F. Lanubile, N. Novielli, C. Arreza, Y. Shi. “Issue classification with LLMs: An empirical study of the NASA flight software systems”. Journal of Systems and Software, Vol. 237, July 2026, https://doi.org/10.1016/j.jss.2026.112851
- Vitale, A., Guglielmi, E., Scalabrino, S., Oliveto, R. (2026). On the Impact of Code Comments for Automated Bug-Fixing: An Empirical Study. In Proceedings of the International Conference on Program Comprehension (to appear)
- V. De Martino, S. Lambiase, F. Pecorelli, W.J. van den Heuvel, F. Ferrucci, F. Palomba, “Sustainability of Machine Learning-Enabled Systems: The Machine Learning Practitioner’s Perspective”, (2026), in press.
- G. Festa, G. Giordano, V. Pontillo, M. Di Penta, D. Tamburri, F. Palomba, “Pythonic vs Refactorable Pythonic: On the Relationship between Pythonic Idioms and Code Quality in Machine Learning Projects”, (2026) in press

2025

- D. Grassi, F. Lanubile, N. Novielli, L. Quaranta, A. Serebrenik. “Self-monitoring of Developers’ Emotions: the Case of Agile Retrospective Meetings”. ACM Transactions on Software Engineering and Methodology (Accepted: Sept. 2025), https://doi.org/10.1145/3766064
- G. Colavito, F. Lanubile, N. Novielli. “Benchmarking large language models for automated labeling: The case of issue report classification”. Information and Software Technology, 184 (2025). https://doi.org/10.1007/s10664-024-10611-z
- Daniela Grassi, Filippo Lanubile, Alberta Mocta-Schnabel, Nicole Novielli. “A Cluster-based Approach for Emotion Recognition in Software Development”. In Proceedings of the 18th International Conference on Cooperative and Human Aspects of Software Engineering (CHASE 2025), doi: 10.1109/CHASE66643.2025.00034.
- Vitale, A., Oliveto, R., & Scalabrino, S. (2025). A catalog of data smells for coding tasks. ACM Transactions on Software Engineering and Methodology, 34(4), 1-32.
- Spina, A., Russodivito, M., Scalabrino, S., & Oliveto, R. (2025). Peeking Inside the Black Box: Training Data Exposure in Code Language Models. Journal of Systems and Software, 112729.
- A. Parziale, G. Voria, G. Giordano, G. Catolino, G. Robles, F. Palomba. “Fairness on a budget, across the board: A cost-effective evaluation of fairness-aware practices across contexts, tasks, and sensitive attributes.” Information and Software Technology (2025): 107858.
- A. Parziale, G. Voria, G. Giordano, G. Catolino, G. Robles, F. Palomba. “Contextual fairness-aware practices in ML: A cost-effective empirical evaluation.” In 2025 IEEE International Conference on Software Analysis, Evolution and Reengineering-Companion (SANER-C), pp. 1-8. IEEE, 2025.
- G. Voria, B. Scala, L. Todisco, C. Venditto, G. Giordano, G. Catolino, F. Palomba. “Fair and square? Evaluating fairness of LLM-generated synthetic datasets.” Information and Software Technology (2025): 107980.
- G. Recupito, G. Giordano, F. Ferrucci, D. Di Nucci, F. Palomba, “When Code Smells Meet ML: On the Lifecycle of ML-specific Code Smells in ML-enabled Systems”, (2025) Vol. 30, article number 139.
- V. De Martino, G. Recupito, G. Giordano, F. Ferrucci, D. Di Nucci, F. Palomba, “Into the ML-Universe: An Improved Classification and Characterization of Machine-Learning Projects”, (2025) Vol. 230, 112471.
- S. Lambiase, G. Catolino, F. Palomba, F. Ferrucci, D. Russo, “Investigating the Role of Cultural Values in Adopting Large Language Models for Software Engineering”, ACM Transactions on Software Engineering and Methodology, 35(1), 1-43.
- V. De Martino, F. Palomba, “Classification and Challenges of Non-Functional Requirements in ML-Enabled Systems: A Systematic Literature Review”, Information and Software Technology, 181, 107678.
- G. Annunziata, S. Lambiase, D. Tamburri, W.J. Van Den Heuvel, F. Palomba, G. Catolino, F. Ferrucci, A. De Lucia (2025). Uncovering community smells in machine learning-enabled systems: Causes, effects, and mitigation strategies. ACM Transactions on Software Engineering and Methodology, 34(6), 1-48.
- A. Della Porta, S. Lambiase, F. Palomba (2025). Do prompt patterns affect code quality? A first empirical assessment of ChatGPT-generated code. In Proceedings of the 29th International Conference on Evaluation and Assessment in Software Engineering (pp. 181-192).
- G. Annunziata, S. Lambiase, F. Palomba, G. Catolino, F. Ferrucci (2025). How do communities of ML-enabled systems smell? A cross-sectional study on the prevalence of community smells. In Proceedings of the 29th International Conference on Evaluation and Assessment in Software Engineering (pp. 272-282).
- V. De Martino, J. Castano, F. Palomba, X. Franch, S. Martínez-Fernández (2025). A framework for using LLMs for repository mining studies in empirical software engineering. In 2025 IEEE/ACM International Workshop on Methodological Issues with Empirical Studies in Software Engineering (WSESE) (pp. 6-11). IEEE.
- V. De Martino, S. Martínez-Fernández, F. Palomba (2025). Do developers adopt green architectural tactics for ML-enabled systems? A mining software repository study. In 2025 IEEE/ACM 47th International Conference on Software Engineering: Software Engineering in Society (ICSE-SEIS) (pp. 135-139). IEEE.
- G. Recupito, G. Giordano, D. Di Nucci, F. Palomba, “Detecting Semantic Data Smells with BERT: A Transformer-Based Approach to Data Quality”, (2025), ECAI Workshop on MLOps, 2025.
- A. Della Porta, G. Recupito, S. Lambiase, D. Di Nucci, F. Palomba, “Unlocking Code Simplicity: The Role of Prompt Patterns in Managing LLM Code Complexity”, 2025 IEEE International Conference on Software Analysis, Evolution and Reengineering – Companion (SANER-C), (2025).
- G. Recupito, V. De Martino, D. Di Nucci, F. Palomba, “A First Look at the Lifecycle of DL-Specific Self-Admitted Technical Debt”, 2025 IEEE International Conference on Software Analysis, Evolution and Reengineering – Companion (SANER-C), (2025) pp. 150–157.
- Daniela Grassi, Fabio Calefato, Darja Smite, Nicole Novielli, Filippo Lanubile. “Exploring Engagement in Hybrid Meetings”. In Proceedings of the 19th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM 2025), doi: 10.1109/ESEM64174.2025.00032

2024

- Russodivito, M., Spina, A., Scalabrino, S., & Oliveto, R. (2024). Black-Box Reconstruction Attacks on LLMs: A Preliminary Study in Code Summarization. In Proceedings of the International Conference on the Quality of Information and Communications Technology (pp. 391-398). Cham: Springer Nature Switzerland.
- Rosa, G., Scalabrino, S., Robles, G., & Oliveto, R. (2024). Not all Dockerfile smells are the same: An empirical evaluation of Hadolint writing practices by experts. In Proceedings of the 21st International Conference on Mining Software Repositories (pp. 231-241).
- Rosa, G., Zappone, F., Scalabrino, S., & Oliveto, R. (2024). Fixing Dockerfile smells: an empirical study. Empirical Software Engineering, 29(5), 108.
- Vitale, A., Mastropaolo, A., Oliveto, R., Di Penta, M., & Scalabrino, S. (2025). Optimizing datasets for code summarization: Is code-comment coherence enough?. In Proceedings of the International Conference on Program Comprehension (pp. 237-249).
- Nicole Novielli, Rocco Oliveto, Fabio Palomba, Fabio Calefato, Giuseppe Colavito, Vincenzo De Martino, Antonio Della Porta, Giammaria Giordano, Emanuela Guglielmi, Filippo Lanubile, Luigi Quaranta, Gilberto Recupito, Simone Scalabrino, Angelica Spina and Antonio Vitale, “QualAI: Continuous Quality Improvement of AI-based Systems”, in Proceedings of 18th International Conference on Research Challenges in Information Science (RCIS 2024), Research Project Track
- Giuseppe Colavito, Filippo Lanubile, Nicole Novielli, Luigi Quaranta, “Impact of Data Quality for Automatic Issue Classification Using Pre-trained Language Models”, Journal of Systems and Software, 2024, DOI: 10.1016/j.jss.2023.111838
- Giuseppe Colavito, Filippo Lanubile, Nicole Novielli, Luigi Quaranta. “Leveraging GPT-like LLMs to Automate Issue Labeling”. In Proceedings of the 21st International Conference on Mining Software Repositories (MSR 2024), to appear
- Carmine Ferrara, Francesco Casillo, Carmine Gravino, Andrea De Lucia, Fabio Palomba, “ReFAIR: Toward a Context-Aware Recommender for Fairness Requirements Engineering”. In Proceedings of the 2024 IEEE/ACM 46th International Conference on Software Engineering (ICSE ’24), April 14–20, 2024, Lisbon, Portugal. ACM, New York, NY, USA, 13 pages. https://doi.org/10.1145/3597503.3639185
- G. Recupito, F. Pecorelli, G. Catolino, V. Lenarduzzi, D. Taibi, D. Di Nucci, F. Palomba (2024). Technical debt in AI-enabled systems: On the prevalence, severity, impact, and management strategies for code and architecture. Journal of Systems and Software, 216, 112151.
- G. Recupito, R. Rapacciuolo, D. Di Nucci, F. Palomba (2024). Unmasking data secrets: An empirical investigation into data smells and their impact on data quality. In Proceedings of the IEEE/ACM 3rd International Conference on AI Engineering-Software Engineering for AI (CAIN 2024) (pp. 53-63).
- A. Della Porta, V. De Martino, G. Recupito, C. Iemmino, G. Catolino, D. Di Nucci, F. Palomba, “Using Large Language Models to Support Software Engineering Documentation in Waterfall Life Cycles: Are We There Yet?”, Ital-IA, (2024) pp. 42–47.

2023

- Giammaria Giordano, Giusy Annunziata, Andrea De Lucia and Fabio Palomba, “Understanding Developer Practices and Code Smells Diffusion in AI-Enabled Software: A Preliminary Study”. IWSM/MENSURA 23, September 14–15, 2023, Rome, Italy