MSR 2024
Mon 15 - Tue 16 April 2024 Lisbon, Portugal
co-located with ICSE 2024
Mon 15 Apr 2024 15:03 - 15:07 at Grande Auditório - Software Quality Chair(s): Gopi Krishnan Rajbahadur

Code review is an important practice that improves the overall quality of a proposed patch (i.e. code changes). While much research has focused on tool-based code reviews (e.g. Gerrit, GitHub), many traditional open-source software (OSS) projects still conduct code review through emails. However, because email-based data is unstructured, mining email-based code reviews can be challenging, which hinders researchers from studying the code review practices of such long-standing OSS projects. Therefore, this paper presents large-scale datasets of email-based code reviews of 160 projects across three OSS communities (i.e. Linux Kernel, OzLabs, and FFmpeg). We mined the data from Patchwork, a web-based patch-tracking system for email-based code review, and curated the data by grouping each submitted patch with its revised versions and by grouping email aliases. Our datasets include a total of 3.8M patches with 1.9M patch groups and 155K email addresses belonging to 130K individuals. Our published artefacts include the datasets as well as a tool suite to crawl, curate, and store Patchwork data. With our datasets, future work can directly study the email-based code review practices of large OSS projects without additional effort in data collection and curation.
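
To make the idea of grouping a submitted patch with its revised versions concrete, here is a minimal Python sketch. It is not the authors' curation method, which the abstract does not describe. It assumes, purely for illustration, that revisions keep the same submitter and the same subject line once leading bracketed tags such as "[PATCH v2]" are removed. The names `normalize_subject` and `group_patches` and the sample data are hypothetical.

```python
import re
from collections import defaultdict

# Matches one or more leading bracketed tags, e.g. "[PATCH v2 1/3]" or "[RFC]".
_TAGS = re.compile(r"^\s*(\[[^\]]*\]\s*)+")


def normalize_subject(subject: str) -> str:
    """Strip leading bracketed tags, collapse whitespace, and lowercase."""
    stripped = _TAGS.sub("", subject)
    return " ".join(stripped.split()).lower()


def group_patches(patches):
    """Group (submitter, subject) pairs by submitter and normalized subject.

    Assumption for illustration only: a revised patch keeps the same
    submitter and the same subject apart from its bracketed tags.
    """
    groups = defaultdict(list)
    for submitter, subject in patches:
        groups[(submitter, normalize_subject(subject))].append(subject)
    return dict(groups)


# Hypothetical sample data, not taken from the datasets.
patches = [
    ("alice@example.org", "[PATCH] net: fix refcount leak"),
    ("alice@example.org", "[PATCH v2] net: fix refcount leak"),
    ("bob@example.org", "[PATCH 1/2] docs: update README"),
]
groups = group_patches(patches)
```

In this sketch the two submissions from the first address fall into one group and the third patch forms its own group. Real mailing-list data would likely need more robust matching, for example to handle retitled patches or submitters who post from several addresses.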

Mon 15 Apr

Displayed time zone: Lisbon

14:00 - 15:30
Software Quality - Technical Papers / Registered Reports / Data and Tool Showcase Track at Grande Auditório
Chair(s): Gopi Krishnan Rajbahadur Centre for Software Excellence, Huawei, Canada
14:00
12m
Talk
Not all Dockerfile Smells are the Same: An Empirical Evaluation of Hadolint Writing Practices by Experts
Technical Papers
Giovanni Rosa University of Molise, Simone Scalabrino University of Molise, Gregorio Robles Universidad Rey Juan Carlos, Rocco Oliveto University of Molise
14:12
12m
Talk
Supporting High-Level to Low-Level Requirements Coverage Reviewing with Large Language Models
Technical Papers
Anamaria-Roberta Hartl Johannes Kepler University Linz, Christoph Mayr-Dorn Johannes Kepler University Linz, Atif Mashkoor Johannes Kepler University Linz, Alexander Egyed Johannes Kepler University Linz
14:24
12m
Talk
On the Executability of R Markdown Files
Technical Papers
Md Anaytul Islam Lakehead University, Muhammad Asaduzzaman University of Windsor, Shaowei Wang Department of Computer Science, University of Manitoba, Canada
14:36
12m
Talk
APIstic: A Large Collection of OpenAPI Metrics
Technical Papers
Souhaila Serbout Software Institute @ USI, Cesare Pautasso Software Institute, Faculty of Informatics, USI Lugano
14:48
6m
Talk
Improving Automated Code Reviews: Learning From Experience
Technical Papers
Hong Yi Lin The University of Melbourne, Patanamon Thongtanunam University of Melbourne, Christoph Treude Singapore Management University, Wachiraphan (Ping) Charoenwet The University of Melbourne
14:55
4m
Talk
Multi-faceted Code Smell Detection at Scale using DesigniteJava 2.0
Data and Tool Showcase Track
Tushar Sharma Dalhousie University
14:59
4m
Talk
SATDAUG - A Balanced and Augmented Dataset for Detecting Self-Admitted Technical Debt
Data and Tool Showcase Track
Edi Sutoyo Bernoulli Institute for Mathematics, Computer Science and Artificial Intelligence, University of Groningen, Andrea Capiluppi University of Groningen
15:03
4m
Talk
Curated Email-Based Code Reviews Datasets
Data and Tool Showcase Track
Mingzhao Liang The University of Melbourne, Wachiraphan (Ping) Charoenwet The University of Melbourne, Patanamon Thongtanunam University of Melbourne
15:07
4m
Talk
TestDossier: A Dataset of Tested Values Automatically Extracted from Test Execution
Data and Tool Showcase Track
15:11
4m
Talk
Greenlight: Highlighting TensorFlow APIs Energy Footprint
Data and Tool Showcase Track
Saurabhsingh Rajput Dalhousie University, Maria Kechagia University College London, Federica Sarro University College London, Tushar Sharma Dalhousie University
15:15
5m
Talk
When Code Smells Meet ML: On the Lifecycle of ML-specific Code Smells in ML-enabled Systems
Registered Reports
Gilberto Recupito University of Salerno, Giammaria Giordano University of Salerno, Filomena Ferrucci University of Salerno, Dario Di Nucci University of Salerno, Fabio Palomba University of Salerno
15:20
5m
Talk
Comparison of Static Analysis Architecture Recovery Tools for Microservice Applications
Registered Reports
Simon Schneider Hamburg University of Technology, Alexander Bakhtin University of Oulu, Xiaozhou Li University of Oulu, Jacopo Soldani University of Pisa, Italy, Antonio Brogi Università di Pisa, Tomas Cerny University of Arizona, Riccardo Scandariato Hamburg University of Technology, Davide Taibi University of Oulu and Tampere University