Noises in Interaction Traces Data and Their Impact on Previous Research Studies

Zéphyrin Soh, Thomas Drioul, Pierre Antoine Rappe, Foutse Khomh, Yann Gaël Guéhéneuc, Naji Habra

Research output: Contribution to a book, catalogue, report, or conference proceedings; article in conference proceedings

Abstract

Context: Developers' interaction traces (ITs) are commonly used in software engineering to understand how developers maintain and evolve software systems. Researchers make several assumptions when mining ITs, e.g., edit events are considered to be change activities and the time mined from ITs is considered to be the time spent by the developers performing the maintenance task. Goal: We investigate the extent to which these assumptions are correct. We examine noises in developers' ITs data and the impact of these noises on previous results derived from these traces. Approach: We perform an experiment with 15 participants, whom we asked to perform bug-fixing activities and collect Mylyn ITs and VLC video captures. We then investigate noises between the two data sets and propose an approach to correct noises in ITs. Results: We find that Mylyn ITs can miss on average about 6% of the time spent performing a task and contain on average about 28% of false edit-events. We report that these noises may have led researchers to mislabel some participants' editing styles in about 34% of the cases and that the numbers of edit-events performed by developers and the times that they spent on tasks are correlated, when they were considered not to be. Conclusion: We show that ITs must be carefully cleaned before being used in research studies.

Language: English
Title: International Symposium on Empirical Software Engineering and Measurement
Publisher: IEEE Computer Society Press
Pages: 1-10
Number of pages: 10
Volume: 2015-November
ISBN (print): 9781467378994
DOI: 10.1109/ESEM.2015.7321209
Status: Published - 5 Nov 2015
Event: ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM 2015 - Beijing, China
Duration: 22 Oct 2015 to 23 Oct 2015

Conference

Conference: ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM 2015
Country: China
City: Beijing
Period: 22/10/15 to 23/10/15

Fingerprint

Software engineering
Experiments

Keywords

interaction traces, maintenance effort, noises, Software maintenance, video captures

    Cite this

    Soh, Z., Drioul, T., Rappe, P. A., Khomh, F., Guéhéneuc, Y. G., & Habra, N. (2015). Noises in Interaction Traces Data and Their Impact on Previous Research Studies. In International Symposium on Empirical Software Engineering and Measurement (Vol 2015-November, p. 1-10). [7321209] IEEE Computer Society Press. DOI: 10.1109/ESEM.2015.7321209
    Soh, Zéphyrin ; Drioul, Thomas ; Rappe, Pierre Antoine ; Khomh, Foutse ; Guéhéneuc, Yann Gaël ; Habra, Naji. / Noises in Interaction Traces Data and Their Impact on Previous Research Studies. International Symposium on Empirical Software Engineering and Measurement. Vol 2015-November IEEE Computer Society Press, 2015. p. 1-10
    @inproceedings{7091f3ce92f646d1b9a307dd889df9d7,
    title = "Noises in Interaction Traces Data and Their Impact on Previous Research Studies",
    abstract = "Context: Developers' interaction traces (ITs) are commonly used in software engineering to understand how developers maintain and evolve software systems. Researchers make several assumptions when mining ITs, e.g., edit events are considered to be change activities and the time mined from ITs is considered to be the time spent by the developers performing the maintenance task. Goal: We investigate the extent to which these assumptions are correct. We examine noises in developers' ITs data and the impact of these noises on previous results derived from these traces. Approach: We perform an experiment with 15 participants, whom we asked to perform bug-fixing activities and collect Mylyn ITs and VLC video captures. We then investigate noises between the two data sets and propose an approach to correct noises in ITs. Results: We find that Mylyn ITs can miss on average about 6{\%} of the time spent performing a task and contain on average about 28{\%} of false edit-events. We report that these noises may have led researchers to mislabel some participants' editing styles in about 34{\%} of the cases and that the numbers of edit-events performed by developers and the times that they spent on tasks are correlated, when they were considered not to be. Conclusion: We show that ITs must be carefully cleaned before being used in research studies.",
    keywords = "interaction traces, maintenance effort, noises, Software maintenance, video captures",
    author = "Z{\'e}phyrin Soh and Thomas Drioul and Rappe, {Pierre Antoine} and Foutse Khomh and Gu{\'e}h{\'e}neuc, {Yann Ga{\"e}l} and Naji Habra",
    year = "2015",
    month = "11",
    day = "5",
    doi = "10.1109/ESEM.2015.7321209",
    language = "English",
    isbn = "9781467378994",
    volume = "2015-November",
    pages = "1--10",
    booktitle = "International Symposium on Empirical Software Engineering and Measurement",
    publisher = "IEEE Computer Society Press",

    }

    Soh, Z, Drioul, T, Rappe, PA, Khomh, F, Guéhéneuc, YG & Habra, N 2015, Noises in Interaction Traces Data and Their Impact on Previous Research Studies. In International Symposium on Empirical Software Engineering and Measurement. VOL. 2015-November, 7321209, IEEE Computer Society Press, p. 1-10, ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM 2015, Beijing, China, 22/10/15. DOI: 10.1109/ESEM.2015.7321209

    Noises in Interaction Traces Data and Their Impact on Previous Research Studies. / Soh, Zéphyrin; Drioul, Thomas; Rappe, Pierre Antoine; Khomh, Foutse; Guéhéneuc, Yann Gaël; Habra, Naji.

    International Symposium on Empirical Software Engineering and Measurement. Vol 2015-November IEEE Computer Society Press, 2015. p. 1-10 7321209.


    TY - GEN

    T1 - Noises in Interaction Traces Data and Their Impact on Previous Research Studies

    AU - Soh,Zéphyrin

    AU - Drioul,Thomas

    AU - Rappe,Pierre Antoine

    AU - Khomh,Foutse

    AU - Guéhéneuc,Yann Gaël

    AU - Habra,Naji

    PY - 2015/11/5

    Y1 - 2015/11/5

    AB - Context: Developers' interaction traces (ITs) are commonly used in software engineering to understand how developers maintain and evolve software systems. Researchers make several assumptions when mining ITs, e.g., edit events are considered to be change activities and the time mined from ITs is considered to be the time spent by the developers performing the maintenance task. Goal: We investigate the extent to which these assumptions are correct. We examine noises in developers' ITs data and the impact of these noises on previous results derived from these traces. Approach: We perform an experiment with 15 participants, whom we asked to perform bug-fixing activities and collect Mylyn ITs and VLC video captures. We then investigate noises between the two data sets and propose an approach to correct noises in ITs. Results: We find that Mylyn ITs can miss on average about 6% of the time spent performing a task and contain on average about 28% of false edit-events. We report that these noises may have led researchers to mislabel some participants' editing styles in about 34% of the cases and that the numbers of edit-events performed by developers and the times that they spent on tasks are correlated, when they were considered not to be. Conclusion: We show that ITs must be carefully cleaned before being used in research studies.

    KW - interaction traces

    KW - maintenance effort

    KW - noises

    KW - Software maintenance

    KW - video captures

    UR - http://www.scopus.com/inward/record.url?scp=84961612939&partnerID=8YFLogxK

    U2 - 10.1109/ESEM.2015.7321209

    DO - 10.1109/ESEM.2015.7321209

    M3 - Conference contribution

    SN - 9781467378994

    VL - 2015-November

    SP - 1

    EP - 10

    BT - International Symposium on Empirical Software Engineering and Measurement

    PB - IEEE Computer Society Press

    ER -

    Soh Z, Drioul T, Rappe PA, Khomh F, Guéhéneuc YG, Habra N. Noises in Interaction Traces Data and Their Impact on Previous Research Studies. In International Symposium on Empirical Software Engineering and Measurement. Vol 2015-November. IEEE Computer Society Press. 2015. p. 1-10. 7321209. Available from, DOI: 10.1109/ESEM.2015.7321209