MSR 2024 - Technical Papers

Mon 15 Apr
Displayed time zone: Lisbon (GMT+01:00)

09:00 - 10:30	Day 1: Opening Technical Papers / MSR Awards / Social Events / Tutorials / Data and Tool Showcase Track / Mining Challenge / Registered Reports / Industry Track / MIP Award / Vision and Reflection / Keynotes at Grande Auditório Chair(s): Diomidis Spinellis Athens University of Economics and Business & Delft University of Technology

09:00 30m Day opening		Opening Session & Award Announcements MSR Awards
09:30 30m Awards		MSR 2024 Foundational Contribution Award talk MSR Awards Margaret-Anne Storey University of Victoria
10:00 30m Talk		Most Influential Paper Award talk MIP Award Eirini Kalliamvakou GitHub

10:30 - 11:00	Coffee Break ICSE Catering at Open Space

10:30 30m Coffee break		Break ICSE Catering

10:30 - 11:00	Coffee for MSR newcomers Social Events at Open Space (reserved area) Chair(s): Federica Sarro University College London, Alexander Serebrenik Eindhoven University of Technology

10:30 30m Coffee break		Coffee for MSR newcomers Social Events Federica Sarro University College London, Alexander Serebrenik Eindhoven University of Technology

11:00 - 12:30	Ecosystems, Reuse and APIs & Tutorials Data and Tool Showcase Track / Technical Papers / Tutorials at Almada Negreiros Chair(s): Mahmoud Alfadel University of Waterloo, Ayushi Rastogi University of Groningen, The Netherlands

11:00 12m Talk		Thirty-Three Years of Mathematicians and Software Engineers: A Case Study of Domain Expertise and Participation in Proof Assistant Ecosystems Technical Papers Gwenyth Lincroft Northeastern University, Minsung Cho Northeastern University, Mahsa Bazzaz Northeastern University, Katherine Hough Northeastern University, Jonathan Bell Northeastern University Pre-print Media Attached
11:12 12m Talk		Boosting API Misuse Detection via Integrating API Constraints from Multiple Sources Technical Papers Can Li Nanjing University of Aeronautics and Astronautics, Jingxuan Zhang Nanjing University of Aeronautics and Astronautics, Yixuan Tang Nanjing University of Aeronautics and Astronautics, Zhuhang Li Nanjing University of Aeronautics and Astronautics, Tianyue Sun Nanjing University of Aeronautics and Astronautics
11:24 6m Talk		Availability and Usage of Platform-Specific APIs: A First Empirical Study Technical Papers Ricardo Job IFPB, Andre Hora UFMG Pre-print Media Attached File Attached
11:30 4m Talk		AndroLibZoo: A Reliable Dataset of Libraries Based on Software Dependency Analysis Data and Tool Showcase Track Jordan Samhi CISPA Helmholtz Center for Information Security, Tegawendé F. Bissyandé University of Luxembourg, Jacques Klein University of Luxembourg
11:34 4m Talk		Goblin: A Framework for Enriching and Querying the Maven Central Dependency Graph Data and Tool Showcase Track Damien Jaime Sorbonne Université - Lip6 - SAP, Joyce El Haddad Paris Dauphine-PSL Université, CNRS, LAMSADE, Pascal Poizat Université Paris Nanterre & LIP6 Pre-print File Attached
11:38 4m Talk		Dataset: Copy-based Reuse in Open Source Software Data and Tool Showcase Track Mahmoud Jahanshahi Research Assistant, University of Tennessee Knoxville, Audris Mockus The University of Tennessee & Vilnius University Pre-print
11:45 45m Talk		Mining Our Way Back to Incremental Builds for DevOps Pipelines Tutorials Shane McIntosh University of Waterloo Pre-print

11:00 - 12:30	Defects, Bugs and Issues Technical Papers / MSR Awards / Social Events / Tutorials / Data and Tool Showcase Track / Mining Challenge / Registered Reports / Industry Track / MIP Award / Vision and Reflection / Keynotes at Grande Auditório Chair(s): Wesley Assunção North Carolina State University

11:00 12m Talk		Enhancing Performance Bug Prediction Using Performance Code Metrics Technical Papers Guoliang Zhao Computer Science of Queen's University, Stefanos Georgio, Safwat Hassan University of Toronto, Canada, Ying Zou Queen's University, Kingston, Ontario, Derek Truong IBM Canada, Toby Corbin IBM UK
11:12 12m Talk		CrashJS: A NodeJS Benchmark for Automated Crash Reproduction Technical Papers Philip Oliver Victoria University of Wellington, Jens Dietrich Victoria University of Wellington, Craig Anslow Victoria University of Wellington, Michael Homer Victoria University of Wellington
11:24 12m Talk		An Empirical Study on Just-in-time Conformal Defect Prediction Technical Papers Xhulja Shahini paluno - University of Duisburg-Essen, Andreas Metzger University of Duisburg-Essen, Klaus Pohl
11:36 12m Talk		Fine-Grained Just-In-Time Defect Prediction at the Block Level in Infrastructure-as-Code (IaC) Technical Papers Mahi Begoug, Moataz Chouchen ETS, Ali Ouni ETS Montreal, University of Quebec, Eman Abdullah AlOmar Stevens Institute of Technology, Mohamed Wiem Mkaouer University of Michigan - Flint
11:48 4m Talk		TrickyBugs: A Dataset of Corner-case Bugs in Plausible Programs Data and Tool Showcase Track Kaibo Liu Peking University, Yudong Han Peking University, Yiyang Liu Peking University, Zhenpeng Chen Nanyang Technological University, Jie M. Zhang King's College London, Federica Sarro University College London, Gang Huang Peking University, Yun Ma Peking University
11:52 4m Talk		GitBugs-Java: A Reproducible Java Benchmark of Recent Bugs Data and Tool Showcase Track André Silva KTH Royal Institute of Technology, Nuno Saavedra INESC-ID and IST, University of Lisbon, Martin Monperrus KTH Royal Institute of Technology
11:56 4m Talk		A Dataset of Partial Program Fixes Data and Tool Showcase Track Dirk Beyer LMU Munich, Lars Grunske Humboldt-Universität zu Berlin, Matthias Kettl LMU Munich, Marian Lingsch-Rosenfeld LMU Munich, Moeketsi Raselimo Humboldt-Universität zu Berlin
12:00 4m Talk		BugsPHP: A dataset for Automated Program Repair in PHP Data and Tool Showcase Track K.D. Pramod University of Moratuwa, Sri Lanka, W.T.N. De Silva University of Moratuwa, Sri Lanka, W.U.K. Thabrew University of Moratuwa, Sri Lanka, Ridwan Salihin Shariffdeen National University of Singapore, Sandareka Wickramanayake University of Moratuwa, Sri Lanka Pre-print
12:04 4m Talk		AW4C: A Commit-Aware C Dataset for Actionable Warning Identification Data and Tool Showcase Track Zhipeng Liu, Meng Yan Chongqing University, Zhipeng Gao Shanghai Institute for Advanced Study - Zhejiang University, Dong Li, Xiaohong Zhang Chongqing University, Dan Yang Chongqing University
12:08 5m Talk		Predicting the Impact of Crashes Across Release Channels Industry Track Suhaib Mujahid Mozilla, Diego Elias Costa Concordia University, Canada, Marco Castelluccio Mozilla
12:13 5m Talk		Zero Shot Learning based Alternatives for Class Imbalanced Learning Problem in Enterprise Software Defect Analysis Industry Track Sangameshwar Patil Dept. of CSE, IIT Madras and TRDDC, TCS, B Ravindran IITM

12:30 - 14:00	Lunch ICSE Catering at Open Space

12:30 90m Lunch		Lunch ICSE Catering

14:00 - 15:30	Mining Challenge Mining Challenge at Almada Negreiros Chair(s): Preetha Chatterjee Drexel University, USA, Fabio Palomba University of Salerno

14:00 5m Talk		ChatGPT Chats Decoded: Uncovering Prompt Patterns for Superior Solutions in Software Development Lifecycle Mining Challenge Liangxuan Wu Huazhong University of Science and Technology, Yanjie Zhao Huazhong University of Science and Technology, Xinyi Hou Huazhong University of Science and Technology, Tianming Liu Monash University, Haoyu Wang Huazhong University of Science and Technology
14:05 5m Talk		Write me this Code: An Analysis of ChatGPT Quality for Producing Source Code Mining Challenge Konstantinos Moratis Electrical and Computer Engineering Dept., Aristotle University of Thessaloniki, Themistoklis Diamantopoulos Electrical and Computer Engineering Dept, Aristotle University of Thessaloniki, Dimitrios-Nikitas Nastos Electrical and Computer Engineering Dept., Aristotle University of Thessaloniki, Andreas Symeonidis Aristotle University of Thessaloniki Pre-print
14:10 5m Talk		Quality Assessment of ChatGPT Generated Code and their Use by Developers Mining Challenge Mohammed Latif Siddiq University of Notre Dame, Lindsay Roney University of Notre Dame, Jiahao Zhang, Joanna C. S. Santos University of Notre Dame Pre-print Media Attached File Attached
14:15 5m Talk		Analyzing Developer Use of ChatGPT Generated Code in Open Source GitHub Projects Mining Challenge Balreet Grewal University of Alberta, Wentao Lu University of Alberta, Sarah Nadi New York University Abu Dhabi, University of Alberta, Cor-Paul Bezemer University of Alberta Pre-print
14:20 5m Talk		How I Learned to Stop Worrying and Love ChatGPT Mining Challenge Piotr Przymus Nicolaus Copernicus University in Toruń, Poland, Mikołaj Fejzer Nicolaus Copernicus University in Toruń, Jakub Narębski Nicolaus Copernicus University in Toruń, Krzysztof Stencel University of Warsaw Pre-print
14:25 5m Talk		Can ChatGPT Support Developers? An Empirical Evaluation of Large Language Models for Code Generation. Mining Challenge Kailun Jin York University, Chung-Yu Wang York University, Hung Viet Pham York University, Hadi Hemmati York University Pre-print
14:30 5m Talk		The role of library versions in Developer-ChatGPT conversations Mining Challenge Rachna Raj Concordia University, Diego Elias Costa Concordia University, Canada Pre-print
14:35 5m Talk		AI Writes, We Analyze: The ChatGPT Python Code Saga Mining Challenge Md Fazle Rabbi Idaho State University, Arifa Islam Champa Idaho State University, Minhaz F. Zibran Idaho State University, Md Rakibul Islam Lamar University DOI Pre-print
14:40 5m Talk		ChatGPT in Action: Analyzing Its Use in Software Development Mining Challenge Arifa Islam Champa Idaho State University, Md Fazle Rabbi Idaho State University, Costain Nachuma Idaho State University, Minhaz F. Zibran Idaho State University DOI Pre-print
14:45 5m Talk		Chatting with AI: Deciphering Developer Conversations with ChatGPT Mining Challenge Suad Mohamed Belmont University, Abdullah Parvin Belmont University, Esteban Parra Belmont University
14:50 5m Talk		Does Generative AI Generate Smells Related to Container Orchestration?: An Exploratory Study with Kubernetes Manifests Mining Challenge Yue Zhang Auburn University, Rachel Meredith Auburn University, Wilson Reaves Auburn University, Julia Coriolano Federal University of Pernambuco, Muhammad Ali Babar School of Computer Science, The University of Adelaide, Akond Rahman Auburn University Pre-print
14:55 5m Talk		On the Taxonomy of Developers' Discussion Topics with ChatGPT Mining Challenge Ertugrul Sagdic Lamar University, Arda Bayram Lamar University, Md Rakibul Islam Lamar University
15:00 5m Talk		How to refactor this code? An exploratory study on developer-ChatGPT refactoring conversations Mining Challenge Eman Abdullah AlOmar Stevens Institute of Technology, AnushKrishna Venkatakrishnan Rochester Institute of Technology, USA, Mohamed Wiem Mkaouer University of Michigan - Flint, Christian Newman, Ali Ouni ETS Montreal, University of Quebec
15:05 5m Talk		Analyzing Developer-ChatGPT Conversations for Software Refactoring: An Exploratory Study Mining Challenge Omkar Sandip Chavan Rochester Institute of Technology, Divya Dilip Hinge Rochester Institute of Technology, Soham Sanjay Deo Rochester Institute of Technology, Yaxuan (Olivia) Wang Rochester Institute of Technology, Mohamed Wiem Mkaouer University of Michigan - Flint
15:10 5m Talk		How Do Software Developers Use ChatGPT? An Exploratory Study on GitHub Pull Requests Mining Challenge Moataz Chouchen ETS, Narjes Bessghaier ETS Montreal, University of Quebec, Mahi Begoug, Ali Ouni ETS Montreal, University of Quebec, Eman Abdullah AlOmar Stevens Institute of Technology, Mohamed Wiem Mkaouer University of Michigan - Flint
15:15 5m Talk		Investigating the Utility of ChatGPT in the Issue Tracking System: An Exploratory Study Mining Challenge Joy Krishan Das University of Saskatchewan, Saikat Mondal University of Saskatchewan, Chanchal K. Roy University of Saskatchewan, Canada Pre-print
15:20 5m Talk		Enhancing User Interaction in ChatGPT: Characterizing and Consolidating Multiple Prompts for Issue Resolution Mining Challenge Saikat Mondal University of Saskatchewan, Suborno Deb Bappon Department of Computer Science, University of Saskatchewan, Canada, Chanchal K. Roy University of Saskatchewan, Canada Pre-print

14:00 - 15:30	Software Quality Technical Papers / Registered Reports / Data and Tool Showcase Track at Grande Auditório Chair(s): Gopi Krishnan Rajbahadur Centre for Software Excellence, Huawei, Canada

14:00 12m Talk		Not all Dockerfile Smells are the Same: An Empirical Evaluation of Hadolint Writing Practices by Experts Technical Papers Giovanni Rosa University of Molise, Simone Scalabrino University of Molise, Gregorio Robles Universidad Rey Juan Carlos, Rocco Oliveto University of Molise
14:12 12m Talk		Supporting High-Level to Low-Level Requirements Coverage Reviewing with Large Language Models Technical Papers Anamaria-Roberta Hartl Johannes Kepler University Linz, Christoph Mayr-Dorn Johannes Kepler University Linz, Atif Mashkoor Johannes Kepler University Linz, Alexander Egyed Johannes Kepler University Linz DOI Authorizer link Pre-print
14:24 12m Talk		On the Executability of R Markdown Files Technical Papers Md Anaytul Islam Lakehead University, Muhammad Asaduzzaman University of Windsor, Shaowei Wang Department of Computer Science, University of Manitoba, Canada
14:36 12m Talk		APIstic: A Large Collection of OpenAPI Metrics Technical Papers Souhaila Serbout Software Institute @ USI, Cesare Pautasso Software Institute, Faculty of Informatics, USI Lugano
14:48 6m Talk		Improving Automated Code Reviews: Learning From Experience Technical Papers Hong Yi Lin The University of Melbourne, Patanamon Thongtanunam University of Melbourne, Christoph Treude Singapore Management University, Wachiraphan (Ping) Charoenwet The University of Melbourne
14:55 4m Talk		Multi-faceted Code Smell Detection at Scale using DesigniteJava 2.0 Data and Tool Showcase Track Tushar Sharma Dalhousie University Pre-print
14:59 4m Talk		SATDAUG - A Balanced and Augmented Dataset for Detecting Self-Admitted Technical Debt Data and Tool Showcase Track Edi Sutoyo Bernoulli Institute for Mathematics, Computer Science and Artificial Intelligence, University of Groningen, Andrea Capiluppi University of Groningen
15:03 4m Talk		Curated Email-Based Code Reviews Datasets Data and Tool Showcase Track Mingzhao Liang The University of Melbourne, Wachiraphan (Ping) Charoenwet The University of Melbourne, Patanamon Thongtanunam University of Melbourne
15:07 4m Talk		TestDossier: A Dataset of Tested Values Automatically Extracted from Test Execution Data and Tool Showcase Track Andre Hora UFMG Pre-print Media Attached
15:11 4m Talk		Greenlight: Highlighting TensorFlow APIs Energy Footprint Data and Tool Showcase Track Saurabhsingh Rajput Dalhousie University, Maria Kechagia University College London, Federica Sarro University College London, Tushar Sharma Dalhousie University Pre-print
15:15 5m Talk		When Code Smells Meet ML: On the Lifecycle of ML-specific Code Smells in ML-enabled Systems Registered Reports Gilberto Recupito University of Salerno, Giammaria Giordano University of Salerno, Filomena Ferrucci University of Salerno, Dario Di Nucci University of Salerno, Fabio Palomba University of Salerno
15:20 5m Talk		Comparison of Static Analysis Architecture Recovery Tools for Microservice Applications Registered Reports Simon Schneider Hamburg University of Technology, Alexander Bakhtin University of Oulu, Xiaozhou Li University of Oulu, Jacopo Soldani University of Pisa, Italy, Antonio Brogi Università di Pisa, Tomas Cerny University of Arizona, Riccardo Scandariato Hamburg University of Technology, Davide Taibi University of Oulu and Tampere University

15:30 - 16:00	Coffee Break ICSE Catering at Open Space

15:30 30m Coffee break		Break ICSE Catering

16:00 - 17:30	Mobile Apps Data and Tool Showcase Track / Technical Papers at Almada Negreiros Chair(s): Dario Di Nucci University of Salerno

16:00 12m Talk		Automating GUI-based Test Oracles for Mobile Apps Technical Papers Kesina Baral CQSE America, Jack Johnson, Junayed Mahmud George Mason University, Sabiha Salma George Mason University, Mattia Fazzini University of Minnesota, Julia Rubin University of British Columbia, Jeff Offutt George Mason University, Kevin Moran University of Central Florida
16:12 12m Talk		Global Prosperity or Local Monopoly? Understanding the Geography of App Popularity Technical Papers Liu Wang Beijing University of Posts and Telecommunications, Conghui Zheng Beijing University of Posts and Telecommunications, Haoyu Wang Huazhong University of Science and Technology, Xiapu Luo The Hong Kong Polytechnic University, Gareth Tyson Queen Mary University of London, Yi Wang, Shangguang Wang Beijing University of Posts and Telecommunications
16:24 12m Talk		GuiEvo: Automated Evolution of Mobile App UIs Technical Papers Sabiha Salma George Mason University, S M Hasan Mansur George Mason University, Yule Zhang George Mason University, Kevin Moran University of Central Florida
16:36 12m Talk		Comparing Apples to Androids: Discovery, Retrieval, and Matching of iOS and Android Apps for Cross-Platform Analyses Technical Papers Magdalena Steinböck TU Wien, Jakob Bleier TU Wien, Mikka Rainer CISPA Helmholtz Center for Information Security, Tobias Urban Institute for Internet Security & secunet Security Networks AG, Christine Utz CISPA Helmholtz Center for Information Security, Martina Lindorfer TU Wien
16:48 12m Talk		Keep Me Updated: An Empirical Study on Embedded Javascript Engines in Android Apps Technical Papers Elliott Wen The University of Auckland, Jiaxiang Liu The Hong Kong Polytechnic University, Xiapu Luo The Hong Kong Polytechnic University, Giovanni Russello University of Auckland, Jens Dietrich Victoria University of Wellington
17:00 12m Talk		Large Language Model vs. Stack Overflow in Addressing Android Permission Related Challenges Technical Papers Sahrima Jannat Oishwee University of Saskatchewan, Natalia Stakhanova University of Saskatchewan, Zadia Codabux University of Saskatchewan, Canada
17:12 4m Talk		DATAR: A Dataset for Tracking App Releases Data and Tool Showcase Track Yasaman Abedini Sharif University of Technology, Mohammad Hadi Hajihosseini Sharif University of Technology, Abbas Heydarnoori Bowling Green State University
17:16 4m Talk		AndroZoo: A Retrospective with a Glimpse into the Future Data and Tool Showcase Track Marco Alecci University of Luxembourg, Pedro Jesús Ruiz Jiménez University of Luxembourg, Kevin Allix Independent Researcher, Tegawendé F. Bissyandé University of Luxembourg, Jacques Klein University of Luxembourg

16:00 - 17:30	Machine learning for Software Engineering Technical Papers at Grande Auditório Chair(s): Diego Elias Costa Concordia University, Canada

16:00 12m Talk		Whodunit: Classifying Code as Human Authored or GPT-4 Generated - A case study on CodeChef problems Technical Papers Oseremen Joy Idialu University of Waterloo, Noble Saji Mathews University of Waterloo, Canada, Rungroj Maipradit University of Waterloo, Joanne M. Atlee University of Waterloo, Mei Nagappan University of Waterloo DOI Pre-print
16:12 12m Talk		GIRT-Model: Automated Generation of Issue Report Templates Technical Papers Nafiseh Nikehgbal Sharif University of Technology, Amir Hossein Kargaran LMU Munich, Abbas Heydarnoori Bowling Green State University DOI Pre-print
16:24 12m Talk		MicroRec: Leveraging Large Language Models for Microservice Recommendation Technical Papers Ahmed Saeed Alsayed University of Wollongong, Hoa Khanh Dam University of Wollongong, Chau Nguyen University of Wollongong
16:36 12m Talk		PeaTMOSS: A Dataset and Initial Analysis of Pre-Trained Models in Open-Source Software Technical Papers Wenxin Jiang Purdue University, Jerin Yasmin Queen's University, Canada, Jason Jones Purdue University, Nicholas Synovic Loyola University Chicago, Jiashen Kuo Purdue University, Nathaniel Bielanski Purdue University, Yuan Tian Queen's University, Kingston, Ontario, George K. Thiruvathukal Loyola University Chicago and Argonne National Laboratory, James C. Davis Purdue University DOI Pre-print
16:48 12m Talk		Data Augmentation for Supervised Code Translation Learning Technical Papers Binger Chen Technische Universität Berlin, Jacek Golebiowski Amazon AWS, Ziawasch Abedjan Leibniz Universität Hannover
17:00 12m Talk		On the Effectiveness of Machine Learning-based Call-Graph Pruning: An Empirical Study Technical Papers Amir Mir Delft University of Technology, Mehdi Keshani Delft University of Technology, Sebastian Proksch Delft University of Technology Pre-print
17:12 12m Talk		Leveraging GPT-like LLMs to Automate Issue Labeling Technical Papers Giuseppe Colavito University of Bari, Italy, Filippo Lanubile University of Bari, Nicole Novielli University of Bari, Luigi Quaranta University of Bari, Italy Pre-print

19:30 - 22:00	Banquet Social Events at Casa do Alentejo

19:30 2h30m Dinner		Social Event Social Events

Tue 16 Apr
Displayed time zone: Lisbon

09:00 - 10:30	Development: practices and humans Data and Tool Showcase Track / Technical Papers at Almada Negreiros Chair(s): Gema Rodríguez-Pérez University of British Columbia (UBC)

09:50 6m Talk		Exploring the Effect of Multiple Natural Languages on Code Suggestion Using GitHub Copilot Technical Papers Kei Koyanagi Kyushu University, Dong Wang Kyushu University, Japan, Kotaro Noguchi Kyushu University, Masanari Kondo Kyushu University, Alexander Serebrenik Eindhoven University of Technology, Yasutaka Kamei Kyushu University, Naoyasu Ubayashi Kyushu University Pre-print
09:56 4m Talk		A Four-Dimension Gold Standard Dataset for Opinion Mining in Software Engineering Data and Tool Showcase Track Md Rakibul Islam Lamar University, Md Fazle Rabbi Idaho State University, Youngeun Jo Lamar University, Arifa Islam Champa Idaho State University, Ethan J Young Lamar University, Camden M Wilson Lamar University, Gavin J Scott Lamar University, Minhaz F. Zibran Idaho State University
10:00 4m Talk		Opening the Valve on Pure-Data: Usage Patterns and Programming Practices of a Data-Flow Based Visual Programming Language Data and Tool Showcase Track Anisha Islam Department of Computing Science, University of Alberta, Kalvin Eng University of Alberta, Abram Hindle University of Alberta
10:04 4m Talk		The PIPr Dataset of Public Infrastructure as Code Programs Data and Tool Showcase Track Daniel Sokolowski University of St. Gallen, David Spielmann University of St. Gallen, Guido Salvaneschi University of St. Gallen Link to publication DOI Pre-print
10:08 4m Talk		A Dataset of Microservices-based Open-Source Projects Data and Tool Showcase Track Dario Amoroso d'Aragona Tampere University, Alexander Bakhtin University of Oulu, Xiaozhou Li University of Oulu, Ruoyu Su University of Oulu, Lauren Adams Baylor University, Ernesto Aponte Universidad del Sagrado Corazón, Francis Boyle Baylor University, Patrick Boyle Baylor University, Rachel Koerner Baylor University, Joseph Lee University of Richmond, Fangchao Tian University of Oulu, Yuqing Wang University of Oulu, Jesse Nyyssölä University of Helsinki, Ernesto Quevedo Baylor University, Shahidur Md Rahaman Baylor University, Amr Elsayed Baylor University, Mika Mäntylä University of Helsinki and University of Oulu, Tomas Cerny University of Arizona, Davide Taibi University of Oulu and Tampere University
10:12 4m Talk		SensoDat: Simulation-based Sensor Dataset of Self-driving Cars Data and Tool Showcase Track Christian Birchler Zurich University of Applied Sciences & University of Bern, Cyrill Rohrbach University of Bern, Switzerland, Timo Kehrer University of Bern, Sebastiano Panichella Zurich University of Applied Sciences
10:16 4m Talk		Incivility in Open Source Projects: A Comprehensive Annotated Dataset of Locked GitHub Issue Threads Data and Tool Showcase Track Ramtin Ehsani Drexel University, Mia Mohammad Imran Virginia Commonwealth University, Robert Zita Elmhurst University, Kostadin Damevski Virginia Commonwealth University, Preetha Chatterjee Drexel University, USA
10:20 4m Talk		A Dataset of Atoms of Confusion in the Android Open Source Project Data and Tool Showcase Track Davi Batista Tabosa Federal University of Ceará, Oton Pinheiro Federal University of Ceará, Lincoln Rocha Federal University of Ceará, Windson Viana Federal University of Ceará
10:24 4m Talk		PlayMyData: a curated dataset of multi-platform video games Data and Tool Showcase Track Andrea D'Angelo University of L'Aquila, Claudio Di Sipio University of L'Aquila, Cristiano Politowski DIRO, University of Montreal, Riccardo Rubei University of L'Aquila

09:00 - 10:30	Keynote and Tutorial Tutorials / Keynotes at Grande Auditório Chair(s): Romain Robbes CNRS, LaBRI, University of Bordeaux

09:00 45m Keynote		Questioning the questions we ask about the impact of AI on software engineering Keynotes Margaret-Anne Storey University of Victoria
09:45 45m Talk		Open Source Software Digital Sociology: Quantifying and Managing Complex Open Source Software Ecosystem Tutorials Minghui Zhou Peking University, Yuxia Zhang Beijing Institute of Technology, Xin Tan Beihang University

10:30 - 11:00	Coffee Break ICSE Catering at Open Space

10:30 30m Coffee break		Break ICSE Catering

11:00 - 12:30	Process automation & DevOps and Tutorial I Technical Papers / Tutorials at Almada Negreiros Chair(s): Tom Mens University of Mons, Ayushi Rastogi University of Groningen, The Netherlands

11:00 12m Talk		Learning to Predict and Improve Build Successes in Package Ecosystems Technical Papers Harshitha Menon Lawrence Livermore National Lab, Daniel Nichols University of Maryland, College Park, Abhinav Bhatele University of Maryland, College Park, Todd Gamblin Lawrence Livermore National Laboratory
11:12 12m Talk		The Impact of Code Ownership of DevOps Artefacts on the Outcome of DevOps CI Builds Technical Papers Ajiromola Kola-Olawuyi University of Waterloo, Nimmi Rashinika Weeraddana University of Waterloo, Mei Nagappan University of Waterloo
11:24 12m Talk		A Mutation-Guided Assessment of Acceleration Approaches for Continuous Integration: An Empirical Study of YourBase Technical Papers Zhili Zeng University of Waterloo, Tao Xiao Nara Institute of Science and Technology, Maxime Lamothe Polytechnique Montreal, Hideaki Hata Shinshu University, Shane McIntosh University of Waterloo Pre-print
11:45 45m Talk		Cohort Studies for Mining Software Repositories Tutorials Nyyti Saarimäki Tampere University, Sira Vegas Universidad Politecnica de Madrid, Valentina Lenarduzzi University of Oulu, Davide Taibi University of Oulu and Tampere University, Mikel Robredo University of Oulu

11:00 - 12:30	Software Evolution & Analysis Technical Papers / Data and Tool Showcase Track / Industry Track at Grande Auditório Chair(s): Vladimir Kovalenko JetBrains Research

11:00 12m Talk		Unveiling ChatGPT's Usage in Open Source Projects: A Mining-based Study Technical Papers Rosalia Tufano Università della Svizzera Italiana, Antonio Mastropaolo Università della Svizzera italiana, Federica Pepe University of Sannio, Ozren Dabic Software Institute, Università della Svizzera italiana (USI), Switzerland, Massimiliano Di Penta University of Sannio, Italy, Gabriele Bavota Software Institute @ Università della Svizzera Italiana
11:12 12m Talk		DRMiner: A Tool For Identifying And Analyzing Refactorings In Dockerfile Technical Papers Emna Ksontini University of Michigan - Dearborn, Aycha Abid Oakland University, Rania Khalsi University of Michigan - Flint, Marouane Kessentini University of Michigan - Flint
11:24 12m Talk		A Large-Scale Empirical Study of Open Source License Usage: Practices and Challenges Technical Papers Jiaqi Wu Zhejiang University, Lingfeng Bao Zhejiang University, Xiaohu Yang Zhejiang University, Xin Xia Huawei Technologies, Xing Hu Zhejiang University
11:36 12m Talk		Analyzing the Evolution and Maintenance of ML Models on Hugging Face Technical Papers Joel Castaño Fernández Universitat Politècnica de Catalunya, Silverio Martínez-Fernández UPC-BarcelonaTech, Xavier Franch Universitat Politècnica de Catalunya, Justus Bogner Vrije Universiteit Amsterdam Link to publication Pre-print
11:48 12m Talk		On the Anatomy of Real-World R Code for Static Analysis Technical Papers Florian Sihler Ulm University, Lukas Pietzschmann Ulm University, Raphael Straub Ulm University, Matthias Tichy Ulm University, Germany, Andor Diera Ulm University, Abdelhalim Dahou GESIS Leibniz Institute for the Social Sciences Pre-print File Attached
12:00 6m Talk		Encoding Version History Context for Better Code Representation Technical Papers Huy Nguyen The University of Melbourne, Christoph Treude Singapore Management University, Patanamon Thongtanunam University of Melbourne Pre-print
12:06 4m Talk		CodeLL: A Lifelong Learning Dataset to Support the Co-Evolution of Data and Language Models of Code Data and Tool Showcase Track Martin Weyssow DIRO, Université de Montréal, Claudio Di Sipio University of L'Aquila, Davide Di Ruscio University of L'Aquila, Houari Sahraoui DIRO, Université de Montréal
12:10 4m Talk		Bidirectional Paper-Repository Tracing in Software Engineering Data and Tool Showcase Track Daniel Garijo, Miguel Arroyo Universidad Politécnica de Madrid, Esteban González Guardia Universidad Politécnica de Madrid, Christoph Treude Singapore Management University, Nicola Tarocco CERN
12:14 4m Talk		DistilKaggle: A Distilled Dataset of Kaggle Jupyter Notebooks Data and Tool Showcase Track Mojtaba Mostafavi Department of Computer Engineering of Sharif University of Technology, Arash Asgari Department of Computer Engineering of Sharif University of Technology, Mohammad Abolnejadian Department of Computer Engineering of Sharif University of Technology, Abbas Heydarnoori Bowling Green State University
12:18 5m Talk		Estimating Usage of Open Source Projects Industry Track Sophia Vargas Google LLC, Georg Link Bitergia, JaYoung Lee Google

12:30 - 14:00	Lunch ICSE Catering at Open Space

12:30 90m Lunch		Lunch ICSE Catering

14:00 - 15:30	Process automation & DevOps II Technical Papers / Data and Tool Showcase Track at Almada Negreiros Chair(s): Shane McIntosh University of Waterloo

14:00 12m Talk		Options Matter: Documenting and Fixing Non-Reproducible Builds in Highly-Configurable Systems Technical Papers Georges Aaron RANDRIANAINA Université de Rennes 1, IRISA, Djamel Eddine Khelladi CNRS, IRISA, University of Rennes, Olivier Zendra Inria, Mathieu Acher University of Rennes, France / Inria, France / CNRS, France / IRISA, France
14:12 12m Talk		How do Machine Learning Projects use Continuous Integration Practices? An Empirical Study on GitHub Actions Technical Papers João Helis Bernardo Federal Institute of Education, Science and Technology of Rio Grande do Norte, Daniel Alencar Da Costa University of Otago, Sergio Queiroz de Medeiros Universidade Federal do Rio Grande do Norte, Uirá Kulesza Federal University of Rio Grande do Norte DOI Pre-print
14:24 4m Talk		A dataset of GitHub Actions workflow histories Data and Tool Showcase Track Guillaume Cardoen University of Mons, Tom Mens University of Mons, Alexandre Decan University of Mons; F.R.S.-FNRS
14:28 4m Talk		gawd: A Differencing Tool for GitHub Actions Workflows Data and Tool Showcase Track Pooya Rostami Mazrae University of Mons, Alexandre Decan University of Mons; F.R.S.-FNRS, Tom Mens University of Mons
14:32 4m Talk		RABBIT: A tool for identifying bot accounts based on their recent GitHub event history Data and Tool Showcase Track Natarajan Chidambaram University of Mons, Tom Mens University of Mons, Alexandre Decan University of Mons; F.R.S.-FNRS
14:36 12m Talk		An Investigation of Patch Porting Practices of the Linux Kernel Ecosystem Technical Papers Xingyu Li UC Riverside, Zheng Zhang UC Riverside, Zhiyun Qian University of California at Riverside, USA, Trent Jaeger UC Riverside, Chengyu Song University of California at Riverside, USA
14:48 4m Talk		BugsPHP: A dataset for Automated Program Repair in PHP Data and Tool Showcase Track K.D. Pramod University of Moratuwa, Sri Lanka, W.T.N. De Silva University of Moratuwa, Sri Lanka, W.U.K. Thabrew University of Moratuwa, Sri Lanka, Ridwan Salihin Shariffdeen National University of Singapore, Sandareka Wickramanayake University of Moratuwa, Sri Lanka Pre-print

14:00 - 15:30	Security and Vision & Reflection Data and Tool Showcase Track / Technical Papers / Registered Reports / Vision and Reflection at Grande Auditório Chair(s): Tim Menzies North Carolina State University

14:00 12m Talk		Quantifying Security Issues in Reusable JavaScript Actions in GitHub Workflows Technical Papers Hassan Onsori Delicheh University of Mons, Belgium, Alexandre Decan University of Mons; F.R.S.-FNRS, Tom Mens University of Mons Pre-print
14:12 12m Talk		What Can Self-Admitted Technical Debt Tell Us About Security? A Mixed-Methods Study Technical Papers Nicolás E. Díaz Ferreyra Hamburg University of Technology, Mojtaba Shahin RMIT University, Mansooreh Zahedi The University of Melbourne, Sodiq Quadri Hamburg University of Technology, Riccardo Scandariato Hamburg University of Technology Pre-print
14:24 12m Talk		Are Latent Vulnerabilities Hidden Gems for Software Vulnerability Prediction? An Empirical Study Technical Papers Triet Le The University of Adelaide, Xiaoning Du Monash University, Australia, Muhammad Ali Babar School of Computer Science, The University of Adelaide
14:36 4m Talk		MalwareBench: Malware samples are not enough Data and Tool Showcase Track Nusrat Zahan North Carolina State University, Philipp Burckhardt Socket, Inc, Mikola Lysenko Socket, Inc, Feross Aboukhadijeh Socket, Inc, Laurie Williams North Carolina State University
14:40 4m Talk		Hash4Patch: A Lightweight Low False Positive Tool for Finding Vulnerability Patch Commits Data and Tool Showcase Track Simone Scalco University of Trento, Ranindya Paramitha University of Trento
14:44 4m Talk		MegaVul: A C/C++ Vulnerability Dataset with Comprehensive Code Representations Data and Tool Showcase Track Chao Ni School of Software Technology, Zhejiang University, Liyu Shen Zhejiang University, Xiaohu Yang Zhejiang University, Yan Zhu Zhejiang University, Shaohua Wang Central University of Finance and Economics Pre-print
14:48 5m Talk		Analyzing and Mitigating (with LLMs) the Security Misconfigurations of Helm Charts from Artifact Hub Registered Reports Francesco Minna Vrije Universiteit Amsterdam, Fabio Massacci University of Trento; Vrije Universiteit Amsterdam, Katja Tuma Vrije Universiteit Amsterdam
14:53 5m Talk		Fixing Smart Contract Vulnerabilities: A Comparative Analysis of Literature and Developer's Practices Registered Reports Francesco Salzano University of Molise, Simone Scalabrino University of Molise, Rocco Oliveto University of Molise, Remo Pareschi University of Molise
15:00 30m Talk		Then, Now, and Next: Constants in Changing MSR Research Landscape Vision and Reflection Ayushi Rastogi University of Groningen, The Netherlands

15:30 - 16:00	Coffee Break ICSE Catering at Open Space

15:30 30m Coffee break		Break ICSE Catering

16:00 - 17:30	Day 2: Closing MSR Awards / Vision and Reflection at Grande Auditório Chair(s): Alberto Bacchelli University of Zurich

16:00 30m Talk		MSR in the age of LLMs Vision and Reflection Christoph Treude Singapore Management University
16:30 30m Talk		Idealists and Pragmatists—An Only Somewhat Self-Indulgent Reflection on the Development of an MSR Paper (and Researcher) Vision and Reflection Shane McIntosh University of Waterloo
17:00 30m Day closing		Closing session MSR Awards Diomidis Spinellis Athens University of Economics and Business & Delft University of Technology, Olga Baysal

Accepted Papers

	Title
	A Large-Scale Empirical Study of Open Source License Usage: Practices and Challenges Technical Papers Jiaqi Wu, Lingfeng Bao, Xiaohu Yang, Xin Xia, Xing Hu
	A Mutation-Guided Assessment of Acceleration Approaches for Continuous Integration: An Empirical Study of YourBase Technical Papers Zhili Zeng, Tao Xiao, Maxime Lamothe, Hideaki Hata, Shane McIntosh Pre-print
	Analyzing the Evolution and Maintenance of ML Models on Hugging Face Technical Papers Joel Castaño Fernández, Silverio Martínez-Fernández, Xavier Franch, Justus Bogner Link to publication Pre-print
	An Empirical Study on Just-in-time Conformal Defect Prediction Technical Papers Xhulja Shahini, Andreas Metzger, Klaus Pohl
	An Investigation of Patch Porting Practices of the Linux Kernel Ecosystem Technical Papers Xingyu Li, Zheng Zhang, Zhiyun Qian, Trent Jaeger, Chengyu Song
	APIstic: A Large Collection of OpenAPI Metrics Technical Papers Souhaila Serbout, Cesare Pautasso
	Are Latent Vulnerabilities Hidden Gems for Software Vulnerability Prediction? An Empirical Study Technical Papers Triet Le, Xiaoning Du, Muhammad Ali Babar
	Automating GUI-based Test Oracles for Mobile Apps Technical Papers Kesina Baral, Jack Johnson, Junayed Mahmud, Sabiha Salma, Mattia Fazzini, Julia Rubin, Jeff Offutt, Kevin Moran
	Availability and Usage of Platform-Specific APIs: A First Empirical Study Technical Papers Ricardo Job, Andre Hora Pre-print Media Attached File Attached
	Boosting API Misuse Detection via Integrating API Constraints from Multiple Sources Technical Papers Can Li, Jingxuan Zhang, Yixuan Tang, Zhuhang Li, Tianyue Sun
	Comparing Apples to Androids: Discovery, Retrieval, and Matching of iOS and Android Apps for Cross-Platform Analyses Technical Papers Magdalena Steinböck, Jakob Bleier, Mikka Rainer, Tobias Urban, Christine Utz, Martina Lindorfer
	CrashJS: A NodeJS Benchmark for Automated Crash Reproduction Technical Papers Philip Oliver, Jens Dietrich, Craig Anslow, Michael Homer
	Data Augmentation for Supervised Code Translation Learning Technical Papers Binger Chen, Jacek Golebiowski, Ziawasch Abedjan
	DRMiner: A Tool For Identifying And Analyzing Refactorings In Dockerfile Technical Papers Emna Ksontini, Aycha Abid, Rania Khalsi, Marouane Kessentini
	Encoding Version History Context for Better Code Representation Technical Papers Huy Nguyen, Christoph Treude, Patanamon Thongtanunam Pre-print
	Enhancing Performance Bug Prediction Using Performance Code Metrics Technical Papers Guoliang Zhao, Stefanos Georgio, Safwat Hassan, Ying Zou, Derek Truong, Toby Corbin
	Exploring the Effect of Multiple Natural Languages on Code Suggestion Using GitHub Copilot Technical Papers Kei Koyanagi, Dong Wang, Kotaro Noguchi, Masanari Kondo, Alexander Serebrenik, Yasutaka Kamei, Naoyasu Ubayashi Pre-print
	Fine-Grained Just-In-Time Defect Prediction at the Block Level in Infrastructure-as-Code (IaC) Technical Papers Mahi Begoug, Moataz Chouchen, Ali Ouni, Eman Abdullah AlOmar, Mohamed Wiem Mkaouer
	GIRT-Model: Automated Generation of Issue Report Templates Technical Papers Nafiseh Nikehgbal, Amir Hossein Kargaran, Abbas Heydarnoori DOI Pre-print
	Global Prosperity or Local Monopoly? Understanding the Geography of App Popularity Technical Papers Liu Wang, Conghui Zheng, Haoyu Wang, Xiapu Luo, Gareth Tyson, Yi Wang, Shangguang Wang
	GuiEvo: Automated Evolution of Mobile App UIs Technical Papers Sabiha Salma, S M Hasan Mansur, Yule Zhang, Kevin Moran
	How do Machine Learning Projects use Continuous Integration Practices? An Empirical Study on GitHub Actions Technical Papers João Helis Bernardo, Daniel Alencar Da Costa, Sergio Queiroz de Medeiros, Uirá Kulesza DOI Pre-print
	Improving Automated Code Reviews: Learning From Experience Technical Papers Hong Yi Lin, Patanamon Thongtanunam, Christoph Treude, Wachiraphan (Ping) Charoenwet
	Keep Me Updated: An Empirical Study on Embedded Javascript Engines in Android Apps Technical Papers Elliott Wen, Jiaxiang Liu, Xiapu Luo, Giovanni Russello, Jens Dietrich
	Large Language Model vs. Stack Overflow in Addressing Android Permission Related Challenges Technical Papers Sahrima Jannat Oishwee, Natalia Stakhanova, Zadia Codabux
	Learning to Predict and Improve Build Successes in Package Ecosystems Technical Papers Harshitha Menon, Daniel Nichols, Abhinav Bhatele, Todd Gamblin
	Leveraging GPT-like LLMs to Automate Issue Labeling Technical Papers Giuseppe Colavito, Filippo Lanubile, Nicole Novielli, Luigi Quaranta Pre-print
	MicroRec: Leveraging Large Language Models for Microservice Recommendation Technical Papers Ahmed Saeed Alsayed, Hoa Khanh Dam, Chau Nguyen
	Not all Dockerfile Smells are the Same: An Empirical Evaluation of Hadolint Writing Practices by Experts Technical Papers Giovanni Rosa, Simone Scalabrino, Gregorio Robles, Rocco Oliveto
	On the Anatomy of Real-World R Code for Static Analysis Technical Papers Florian Sihler, Lukas Pietzschmann, Raphael Straub, Matthias Tichy, Andor Diera, Abdelhalim Dahou Pre-print File Attached
	On the Effectiveness of Machine Learning-based Call-Graph Pruning: An Empirical Study Technical Papers Amir Mir, Mehdi Keshani, Sebastian Proksch Pre-print
	On the Executability of R Markdown Files Technical Papers Md Anaytul Islam, Muhammad Asaduzzaman, Shaowei Wang
	Options Matter: Documenting and Fixing Non-Reproducible Builds in Highly-Configurable Systems Technical Papers Georges Aaron RANDRIANAINA, Djamel Eddine Khelladi, Olivier Zendra, Mathieu Acher
	PeaTMOSS: A Dataset and Initial Analysis of Pre-Trained Models in Open-Source Software Technical Papers Wenxin Jiang, Jerin Yasmin, Jason Jones, Nicholas Synovic, Jiashen Kuo, Nathaniel Bielanski, Yuan Tian, George K. Thiruvathukal, James C. Davis DOI Pre-print
	Quantifying Security Issues in Reusable JavaScript Actions in GitHub Workflows Technical Papers Hassan Onsori Delicheh, Alexandre Decan, Tom Mens Pre-print
	Supporting High-Level to Low-Level Requirements Coverage Reviewing with Large Language Models Technical Papers Anamaria-Roberta Hartl, Christoph Mayr-Dorn, Atif Mashkoor, Alexander Egyed DOI Authorizer link Pre-print
	The Impact of Code Ownership of DevOps Artefacts on the Outcome of DevOps CI Builds Technical Papers Ajiromola Kola-Olawuyi, Nimmi Rashinika Weeraddana, Mei Nagappan
	Thirty-Three Years of Mathematicians and Software Engineers: A Case Study of Domain Expertise and Participation in Proof Assistant Ecosystems Technical Papers Gwenyth Lincroft, Minsung Cho, Mahsa Bazzaz, Katherine Hough, Jonathan Bell Pre-print Media Attached
	Unveiling ChatGPT's Usage in Open Source Projects: A Mining-based Study Technical Papers Rosalia Tufano, Antonio Mastropaolo, Federica Pepe, Ozren Dabic, Massimiliano Di Penta, Gabriele Bavota
	What Can Self-Admitted Technical Debt Tell Us About Security? A Mixed-Methods Study Technical Papers Nicolás E. Díaz Ferreyra, Mojtaba Shahin, Mansooreh Zahedi, Sodiq Quadri, Riccardo Scandariato Pre-print
	Whodunit: Classifying Code as Human Authored or GPT-4 Generated - A case study on CodeChef problems Technical Papers Oseremen Joy Idialu, Noble Saji Mathews, Rungroj Maipradit, Joanne M. Atlee, Mei Nagappan DOI Pre-print

Call for Papers

The International Conference on Mining Software Repositories (MSR) is the premier conference for data science (DS), machine learning (ML), and artificial intelligence (AI) in software engineering. There are vast amounts of data available in software-related repositories, such as source control systems, defect trackers, code review repositories, app stores, archived communications between project personnel, question-and-answer sites, CI build servers, package registries, and run-time telemetry. The MSR conference invites significant research contributions in which software data plays a central role. MSR research track submissions using data from software repositories, either solely or combined with data from other sources, can take many forms, including: studies applying existing DS/ML/AI techniques to better understand the practice of software engineering, software users, and software behavior; empirically-validated applications of existing or novel DS/ML/AI-based techniques to improve software development and support the maintenance of software systems; and cross-cutting concerns around the engineering of DS/ML/AI-enabled software systems.

The 21st International Conference on Mining Software Repositories will be held on April 15-16, 2024, in Lisbon, Portugal.

Evaluation Criteria

We invite both full (maximum ten pages, plus two additional pages of references) and short (four pages, plus references) papers to the Research Track. Full papers are expected to describe new techniques and/or novel research results, to have a high degree of technical rigor, and to be evaluated scientifically. Short papers are expected to discuss controversial issues in the field, or present interesting or thought-provoking ideas that are not yet fully developed. Submissions will be evaluated according to the following criteria:

Soundness: This aspect pertains to how well the paper’s contributions — whether they involve new methodologies, applications of existing techniques to unfamiliar problems, empirical studies, or other research methods — address the research questions posed and are backed by thorough application of relevant research procedures. For short papers, the expectation is for more limited evaluations given their narrower scope.
Relevance: The extent to which the paper successfully argues or illustrates that its contributions help bridge a significant knowledge gap or tackle a crucial practical issue within the field of software engineering.
Novelty: How original the paper’s contributions are in comparison to existing knowledge or how significantly they contribute to the current body of knowledge. Note that this doesn’t discourage well-motivated replication studies.
Presentation: How well-structured and clear the paper’s argumentation is, how clearly the contributions are articulated, the legibility of figures and tables, and the adequacy of English language usage. All papers should comply with the formatting instructions provided.
Replicability: The extent to which the paper’s claims can be independently verified through available replication packages and/or sufficient information included in the paper to understand how data was obtained, analyzed, and interpreted, or how a proposed technique works. All submissions are expected to adhere to the Open Science policy below.

Junior PC

Following two successful editions of the MSR Shadow PC in 2021 and 2022 (see also this paper and this presentation for more context), and the success of the Junior PC in MSR 2023, MSR 2024 will once again integrate the junior reviewers into the main technical track program committee!

The main goal remains unchanged: to train the next generation of MSR (and, more broadly, SE) reviewers and program committee members, in response to a widely-recognized challenge of scaling peer review capacity as the research community and volume of submissions grow over time. As with the previous Shadow and Junior PC, the primary audience for the Junior PC is early-career researchers (PhD students, postdocs, new faculty members, and industry practitioners) who are keen to get more involved in the academic peer-review process but have not yet served on a technical research track program committee at big international SE conferences (e.g., ICSE, ESEC/FSE, ASE, MSR, ICSME, SANER).

Prior to the MSR submission deadline, all PC members, including the junior reviewers, will receive guidance on review quality, confidentiality, and ethics standards, how to write good reviews, and how to participate in discussions (see ACM reviewers’ responsibilities). Junior reviewers will then serve alongside regular PC members on the main technical track PC, participating fully in the review process, including author responses and PC discussions to reach consensus. In addition, Junior PC members will receive feedback on how to improve their reviews throughout the process.

All submissions to the MSR research track will be reviewed jointly by both regular and junior PC members, as part of the same process. We expect that each paper will receive three reviews from regular PC members and two additional reviews from Junior PC members. The final decisions will be made by consensus among all reviewers, as always. Based on our experience with the MSR Shadow and Junior PC, we expect that the addition of junior reviewers to each paper will increase the overall quality of reviews the authors receive, since junior reviewers will typically have a deep understanding of recent topics, and can thus provide deep technical feedback on the subject.

Submission Process

All authors should use the official “ACM Primary Article Template”, as can be obtained from the ACM Proceedings Template page. LaTeX users should use the sigconf option, as well as the review (to produce line numbers for easy reference by the reviewers) and anonymous (omitting author names) options. To that end, the following LaTeX code can be placed at the start of the LaTeX document:

\documentclass[sigconf,review,anonymous]{acmart}
\acmConference[MSR 2024]{21st International Conference on Mining Software Repositories}{April 2024}{Lisbon, Portugal}
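
Building on the preamble above, a minimal document skeleton might look like the following. This is only a sketch: the title, author, institution, section names, and the references.bib file name are placeholders, and the "Data Availability" heading is an assumption, since the call asks for a data availability statement but does not prescribe where or how it should be titled.

```latex
\documentclass[sigconf,review,anonymous]{acmart}
\acmConference[MSR 2024]{21st International Conference on Mining Software Repositories}{April 2024}{Lisbon, Portugal}

\begin{document}

% Placeholder title; all names below are illustrative only.
\title{Placeholder Title}

% With the anonymous option, acmart hides author details in the output.
\author{Placeholder Author}
\affiliation{%
  \institution{Placeholder Institution}
  \country{Placeholder Country}}

% acmart expects the abstract to appear before \maketitle.
\begin{abstract}
Placeholder abstract.
\end{abstract}

\maketitle

\section{Introduction}
Placeholder text.

% Hypothetical heading for the supporting statement on data availability.
\section{Data Availability}
Placeholder statement describing where the data and code are archived,
or why (some of) the data cannot be made available.

% Assumes a references.bib file next to this document.
\bibliographystyle{ACM-Reference-Format}
\bibliography{references}

\end{document}
```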

Submissions to the Research Track can be made via the submission site by the submission deadline. However, we encourage authors to submit at least the paper abstract and author details well in advance of the deadline, to leave enough time to properly enter conflicts of interest for anonymous reviewing. All submissions must adhere to the following requirements:

All submissions must not exceed 10 pages for the main text, inclusive of all figures, tables, appendices, etc. Two more pages containing only references are permitted. All submissions must be in PDF. Accepted papers will be allowed one extra page for the main text of the camera-ready version. The page limit is strict, and it will not be possible to purchase additional pages at any point in the process (including after acceptance).
Submissions must strictly conform to the ACM conference proceedings formatting instructions specified above. Alterations of spacing, font size, and other changes that deviate from the instructions may result in desk rejection without further review.
By submitting to MSR, authors acknowledge that they are aware of and agree to be bound by the ACM Policy and Procedures on Plagiarism and the IEEE Plagiarism FAQ. Papers submitted to MSR 2024 must not have been published elsewhere and must not be under review or submitted for review elsewhere whilst under consideration for MSR 2024. Contravention of this concurrent submission policy will be deemed a serious breach of scientific ethics, and appropriate action will be taken in all such cases. To check for double submission and plagiarism issues, the chairs reserve the right to (1) share the list of submissions with the PC Chairs of other conferences with overlapping review periods and (2) use external plagiarism detection software, under contract to the ACM or IEEE, to detect violations of these policies.
By submitting your article to an ACM Publication, you are hereby acknowledging that you and your co-authors are subject to all ACM Publications Policies, including ACM’s new Publications Policy on Research Involving Human Participants and Subjects. Alleged violations of this policy or any ACM Publications Policy will be investigated by ACM and may result in a full retraction of your paper, in addition to other potential penalties, as per ACM Publications Policy.
Please ensure that you and your co-authors obtain an ORCID ID, so you can complete the publishing process for your accepted paper. ACM has been involved in ORCID from the start and ICSE has recently made a commitment to collect ORCID IDs from all of the published authors. We are committed to improving author discoverability, ensuring proper attribution, and contributing to ongoing community efforts around name normalization; your ORCID ID will help in these efforts.
The MSR 2024 Technical Track will employ a double-anonymous review process. Thus, no submission may reveal its authors’ identities. The authors must make every effort to honor the double-anonymous review process. In particular:
- Authors’ names must be omitted from the submission.
- All references to the authors’ prior work should be in the third person.
- While authors have the right to upload preprints on arXiv or similar sites, they must avoid specifying that the manuscript was submitted to MSR 2024.
- During review, authors should not publicly use the submission title. We recommend using a different paper title for any pre-print on arXiv or similar websites.
Further advice, guidance, and explanation about the double-anonymous review process can be found in the Q&A page from ICSE.
By submitting to MSR 2024, authors acknowledge that they conform to the authorship policy of the ACM, and the authorship policy of the IEEE.

Submissions should also include a supporting statement on the data availability, per the Open Science policy below.

Any submission that does not comply with these requirements is likely to be desk rejected by the PC Chairs without further review.

Authors will have a chance to see the reviews and respond to reviewer comments before any decision about the submission is made.

Upon notification of acceptance, all authors of accepted papers will be asked to fill out a copyright form and will receive further instructions for preparing the camera-ready version of their papers. At least one author of each paper is expected to register and present the paper at the MSR 2024 conference. All accepted contributions will be published in the electronic proceedings of the conference.

A selection of the best papers will be invited to an Empirical Software Engineering (EMSE) Special Issue. The authors of accepted papers that show outstanding contributions to the FOSS community will have a chance to self-nominate their paper for the MSR FOSS Impact Award.

Open Science Policy

The MSR conference actively supports the adoption of open science principles. Indeed, we consider replicability as an explicit evaluation criterion. We expect all contributing authors to disclose the (anonymized and curated) data to increase reproducibility, replicability, and/or recoverability of the studies, provided that there are no ethical, legal, technical, economic, or sensible barriers preventing the disclosure. Please provide a supporting statement on the data availability in your submitted papers, including an argument for why (some of) the data cannot be made available, if that is the case.

Specifically, we expect all contributing authors to disclose:

the source code of relevant software used or proposed in the paper, including that used to retrieve and analyze data.
the data used in the paper (e.g., evaluation data and anonymized survey data).
instructions for other researchers describing how to reproduce or replicate the results.

Artifacts should be shared as open data and open source as follows:

Archived on preserved digital repositories such as zenodo.org, figshare.com, www.softwareheritage.org, osf.io, or institutional repositories. GitHub, GitLab, and similar services for version control systems do not offer properly archived and preserved data. Personal or institutional websites, consumer cloud storage such as Dropbox, or services such as Academia.edu and Researchgate.net may not provide properly archived and preserved data and may increase the risk of violating anonymity if used at submission time.
Data should be released under a recognized open data license such as the CC0 dedication or the CC-BY 4.0 license when publishing the data.
Software should be released under an open source license.
Different open licenses, if mandated by institutions or regulations, are also permitted.

We encourage authors to make artifacts available upon submission (either privately or publicly) and upon acceptance (publicly).

We recognize that anonymizing artifacts such as source code is more difficult than preserving anonymity in a paper. We ask authors to make a best effort not to reveal their identities. We will also ask reviewers to avoid trying to identify authors by looking at commit histories and other such information that is not easily anonymized. Authors wanting to share GitHub repositories may also look into using https://anonymous.4open.science/, which is an open source tool that helps you to quickly double-anonymize your repository.

For additional information on creating open artifacts and open access pre- and post-prints, please see this ICSE 2023 page.

Submission Link

Papers must be submitted through HotCRP: https://msr2024-technical.hotcrp.com

Important Dates

Abstract Deadline: November 14, 2023 AoE
Paper Deadline: November 17, 2023 AoE
Author Response Period: December 19 – 22, 2023 AoE
Author Notification: January 12, 2024 AoE
Camera Ready Deadline: January 28, 2024 AoE

Accepted Papers and Attendance Expectation

Accepted papers will be permitted an additional page of content to allow authors to incorporate review feedback. Therefore, the page limit for published papers will be 11 pages for full papers (or 5 pages, for short papers), plus 2 pages which may only contain references.

The official publication date is the date the proceedings are made available in the ACM or IEEE Digital Libraries. This date may be up to two weeks prior to the first day of the ICSE 2024 conference week. The official publication date affects the deadline for any patent filings related to published work.
Purchases of additional pages in the proceedings are not allowed.

After acceptance, the list of paper authors cannot be changed under any circumstances, and the list of authors on camera-ready papers must be identical to that on submitted papers. After acceptance, paper titles cannot be changed except by permission of the Program Co-Chairs, and only when referees have recommended a change for clarity or accuracy with paper content.

If a submission is accepted, at least one author of the paper is required to register for MSR 2024 and present the paper.

Alberto Bacchelli (Co-chair)

University of Zurich

Switzerland

Eleni Constantinou (Co-chair)

University of Cyprus

Cyprus

Ajay Jha

North Dakota State University

United States

Lina Ochoa

Eindhoven University of Technology

Wesley Assunção

North Carolina State University

United States

Vadim Zaytsev

Yaroslav Golubev

JetBrains Research

Serbia

Alberto Martin-Lopez

Software Institute - USI, Lugano

Switzerland

Pouria Derakhshanfar

JetBrains Research

Netherlands

Kevin Jesse

Accenture

United States

Abdul Ali Bangash

Software Analysis and Intelligence Lab (SAIL), Queen's University, Canada

Canada

Jason Tsay

IBM Research

United States

Max Hort

Simula Research Laboratory

Norway

Shengcheng Yu

Nanjing University

China

Sidong Feng

Monash University

Australia

Tien N. Nguyen

University of Texas at Dallas

United States

Michael Fu

Monash University

Australia

Abigail Koay

University of Sunshine Coast

Jinqiu Yang

Concordia University

Canada

Neng Zhang

Sun Yat-sen University

China

Marcelo De Almeida Maia

Federal University of Uberlandia

Brazil

Chong Wang

Nanyang Technological University

Shamsa Abid

Singapore Management University, Singapore

Singapore

Bin Lin

Radboud University

Netherlands
