When Code Smells Meet ML: On the Lifecycle of ML-specific Code Smells in ML-enabled Systems
Context. The adoption of Machine Learning (ML)-enabled systems is steadily increasing. Nevertheless, there is a shortage of ML-specific quality assurance approaches, possibly because of the limited knowledge of how quality-related concerns emerge and evolve in ML-enabled systems.
Objective. We aim to investigate the emergence and evolution of specific types of quality-related concerns known as ML-specific code smells, i.e., sub-optimal implementation solutions applied to ML pipelines that may significantly decrease both the quality and the maintainability of ML-enabled systems. More specifically, we present a plan to study ML-specific code smells by empirically analyzing (i) their prevalence in real ML-enabled systems, (ii) how they are introduced and removed, and (iii) their survivability.
Method. We will conduct an exploratory study, mining a large dataset of ML-enabled systems and analyzing over 400k commits across 337 projects. We will track and inspect the introduction and evolution of ML smells through CodeSmile, a novel ML smell detector that we will build to enable our investigation and to detect ML-specific code smells.
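To make the notions of introduction, removal, and survivability concrete, the following is a minimal sketch of how smell lifecycles could be derived from per-commit detection results. It is not CodeSmile and does not reflect the authors' actual procedure: the `(file, smell type)` key, the `Lifecycle` record, the `track_lifecycles` function, the smell names, and the choice to measure survival in number of commits are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple

# Assumption: a smell instance is identified by (file path, smell type).
SmellKey = Tuple[str, str]


@dataclass
class Lifecycle:
    introduced_at: int          # index of the first commit where the smell is observed
    removed_at: Optional[int]   # index of the first commit where it is gone, or None

    def survival(self, total_commits: int) -> int:
        """Number of commits in which the smell was present (assumed metric)."""
        end = self.removed_at if self.removed_at is not None else total_commits
        return end - self.introduced_at


def track_lifecycles(history: List[Set[SmellKey]]) -> List[Tuple[SmellKey, Lifecycle]]:
    """Derive smell lifecycles from detections ordered from oldest to newest commit.

    history[i] is the set of smells a (hypothetical) detector reported at commit i.
    """
    open_smells: Dict[SmellKey, int] = {}
    finished: List[Tuple[SmellKey, Lifecycle]] = []
    for idx, present in enumerate(history):
        # Smells seen earlier but absent now are treated as removed at this commit.
        for key in sorted(set(open_smells) - present):
            finished.append((key, Lifecycle(open_smells.pop(key), idx)))
        # Smells not currently tracked are treated as introduced at this commit.
        for key in sorted(present - set(open_smells)):
            open_smells[key] = idx
    # Smells still present at the last commit have no removal point.
    for key, start in sorted(open_smells.items()):
        finished.append((key, Lifecycle(start, None)))
    return finished


# Toy history of four commits; file and smell names are placeholders.
history = [
    set(),
    {("train.py", "hypothetical_smell_a")},
    {("train.py", "hypothetical_smell_a"), ("prep.py", "hypothetical_smell_b")},
    {("prep.py", "hypothetical_smell_b")},
]
for key, lifecycle in track_lifecycles(history):
    print(key, lifecycle, lifecycle.survival(len(history)))
```

In this toy history the first smell is introduced at commit 1 and removed at commit 3, while the second is introduced at commit 2 and never removed. A real study would also need to handle file renames and smells that reappear after removal, which this sketch treats as new instances.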
Mon 15 Apr (displayed time zone: Lisbon)
14:00 - 15:30 | Software Quality (Technical Papers / Registered Reports / Data and Tool Showcase Track) at Grande Auditório | Chair(s): Gopi Krishnan Rajbahadur (Centre for Software Excellence, Huawei, Canada)
14:00 | 12m Talk | Not all Dockerfile Smells are the Same: An Empirical Evaluation of Hadolint Writing Practices by Experts | Technical Papers | Giovanni Rosa (University of Molise), Simone Scalabrino (University of Molise), Gregorio Robles (Universidad Rey Juan Carlos), Rocco Oliveto (University of Molise)
14:12 | 12m Talk | Supporting High-Level to Low-Level Requirements Coverage Reviewing with Large Language Models | Technical Papers | Anamaria-Roberta Hartl (Johannes Kepler University Linz), Christoph Mayr-Dorn (Johannes Kepler University Linz), Atif Mashkoor (Johannes Kepler University Linz), Alexander Egyed (Johannes Kepler University Linz)
14:24 | 12m Talk | On the Executability of R Markdown Files | Technical Papers | Md Anaytul Islam (Lakehead University), Muhammad Asaduzzman (University of Windsor), Shaowei Wang (Department of Computer Science, University of Manitoba, Canada)
14:36 | 12m Talk | APIstic: A Large Collection of OpenAPI Metrics | Technical Papers | Souhaila Serbout (Software Institute @ USI), Cesare Pautasso (Software Institute, Faculty of Informatics, USI Lugano)
14:48 | 6m Talk | Improving Automated Code Reviews: Learning From Experience | Technical Papers | Hong Yi Lin (The University of Melbourne), Patanamon Thongtanunam (University of Melbourne), Christoph Treude (Singapore Management University), Wachiraphan (Ping) Charoenwet (The University of Melbourne)
14:55 | 4m Talk | Multi-faceted Code Smell Detection at Scale using DesigniteJava 2.0 | Data and Tool Showcase Track | Tushar Sharma (Dalhousie University)
14:59 | 4m Talk | SATDAUG - A Balanced and Augmented Dataset for Detecting Self-Admitted Technical Debt | Data and Tool Showcase Track | Edi Sutoyo (Bernoulli Institute for Mathematics, Computer Science and Artificial Intelligence, University of Groningen), Andrea Capiluppi (University of Groningen)
15:03 | 4m Talk | Curated Email-Based Code Reviews Datasets | Data and Tool Showcase Track | Mingzhao Liang (The University of Melbourne), Wachiraphan (Ping) Charoenwet (The University of Melbourne), Patanamon Thongtanunam (University of Melbourne)
15:07 | 4m Talk | TestDossier: A Dataset of Tested Values Automatically Extracted from Test Execution | Data and Tool Showcase Track | Andre Hora (UFMG)
15:11 | 4m Talk | Greenlight: Highlighting TensorFlow APIs Energy Footprint | Data and Tool Showcase Track | Saurabhsingh Rajput (Dalhousie University), Maria Kechagia (University College London), Federica Sarro (University College London), Tushar Sharma (Dalhousie University)
15:15 | 5m Talk | When Code Smells Meet ML: On the Lifecycle of ML-specific Code Smells in ML-enabled Systems | Registered Reports | Gilberto Recupito (University of Salerno), Giammaria Giordano (University of Salerno), Filomena Ferrucci (University of Salerno), Dario Di Nucci (University of Salerno), Fabio Palomba (University of Salerno)
15:20 | 5m Talk | Comparison of Static Analysis Architecture Recovery Tools for Microservice Applications | Registered Reports | Simon Schneider (Hamburg University of Technology), Alexander Bakhtin (University of Oulu), Xiaozhou Li (University of Oulu), Jacopo Soldani (University of Pisa, Italy), Antonio Brogi (Università di Pisa), Tomas Cerny (University of Arizona), Riccardo Scandariato (Hamburg University of Technology), Davide Taibi (University of Oulu and Tampere University)