MSR 2024
Mon 15 - Tue 16 April 2024 Lisbon, Portugal
co-located with ICSE 2024

The presence of toxic and gender-based derogatory language in open-source software (OSS) communities has recently become a focal point for researchers. Such comments not only lead to frustration and disengagement among developers but may also influence their leave from the OSS projects. Despite ample evidence suggesting that diverse teams enhance productivity, the existence of toxic communications poses a significant threat to the participation of individuals from marginalized groups and, as such, may act as a barrier to fostering diversity and inclusion in OSS projects. However, there is a notable lack of research dedicated to exploring the association between gender-based toxic and derogatory language with a perceptible diversity of open-source software teams. Consequently, this study aims to investigate how such content influences the gender and ethnic diversity of open-source software development teams. To achieve this, we extract data from active GitHub projects, assess various project characteristics, and identify instances of toxic and gender-discriminatory language within issue/pull request comments. Using these attributes, we construct a regression model to explore how they contribute to the diversity index (Blau Index) of these projects.