TY - GEN
T1 - On the Prevalence, Impact, and Evolution of SQL Code Smells in Data-Intensive Systems
AU - Asmare Muse, Biruk
AU - Rahman, Masud
AU - Nagy, Csaba
AU - Cleve, Anthony
AU - Khomh, Foutse
AU - Antoniol, Giuliano
N1 - Funding Information:
Acknowledgements: This research was partly supported by the Excellence of Science project 30446992 SECO-ASSIST, funded by the F.R.S.-FNRS, FWO and Natural Sciences and Engineering Research Council of Canada (NSERC).
Publisher Copyright:
© 2020 ACM.
PY - 2020/6/29
Y1 - 2020/6/29
AB - Code smells indicate software design problems that harm software quality. Data-intensive systems that frequently access databases often suffer from SQL code smells besides the traditional smells. While there have been extensive studies on traditional code smells, recently, there has been a growing interest in SQL code smells. In this paper, we conduct an empirical study to investigate the prevalence and evolution of SQL code smells in open-source, data-intensive systems. We collected 150 projects and examined both traditional and SQL code smells in these projects. Our investigation delivers several important findings. First, SQL code smells are indeed prevalent in data-intensive software systems. Second, SQL code smells have a weak co-occurrence with traditional code smells. Third, SQL code smells have a weaker association with bugs than that of traditional code smells. Fourth, SQL code smells are more likely to be introduced at the beginning of the project lifetime and likely to be left in the code without a fix, compared to traditional code smells. Overall, our results show that SQL code smells are indeed prevalent and persistent in the studied data-intensive software systems. Developers should be aware of these smells and consider detecting and refactoring SQL code smells and traditional code smells separately, using dedicated tools.
KW - Code smells
KW - SQL code smells
KW - data-intensive systems
KW - database access
UR - http://www.scopus.com/inward/record.url?scp=85093696468&partnerID=8YFLogxK
U2 - 10.1145/3379597.3387467
DO - 10.1145/3379597.3387467
M3 - Conference contribution
T3 - Proceedings - 2020 IEEE/ACM 17th International Conference on Mining Software Repositories, MSR 2020
SP - 327
EP - 338
BT - Proceedings - 2020 IEEE/ACM 17th International Conference on Mining Software Repositories, MSR 2020
PB - ACM Press
ER -