VaryMinions: Leveraging RNNs to Identify Variants in Variability-intensive Systems’ Logs

Sophie Fortz; Paul Temple; Xavier Devroey; Patrick Heymans; Gilles Perrouin

Research output: Contribution to journal › Special issue › peer-review

Abstract

From business processes to course management, variability-intensive software systems (VIS) are now ubiquitous. One can configure these systems’ behaviour by activating options, e.g., to derive variants handling building permits across municipalities or implementing different functionalities (quizzes, forums) for a given course. These customisation facilities allow VIS to support distinct relevant customer requirements while taking advantage of reuse for common parts. Customisation thus allows realising both scope and scale economies. Behavioural differences amongst variants manifest themselves in event logs. To re-engineer this kind of system, one must know which variant(s) have produced which behaviour. Since variant information is barely present in logs, this paper supports this task by employing machine learning techniques to classify behaviours (event sequences) among variants. Specifically, we train Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) recurrent neural networks to relate event sequences with the variants they belong to on six different datasets issued from the configurable process and VIS domains. After having evaluated 20 different architectures of LSTM/GRU, our results demonstrate that it is possible to effectively learn the trace-to-variant mapping with high accuracy (at least 80% and up to 99%) and at scale, i.e., identifying 50 variants using 5000+ traces for each variant.
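
The trace-to-variant classification described in the abstract needs event sequences in a numeric form before a recurrent network can consume them. The following is a minimal sketch of one plausible preprocessing step, not the authors' pipeline: the event names, variant names, helper functions, and the single-label simplification (each trace assigned to exactly one variant) are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Sequence, Tuple

PAD = 0  # index reserved for padding (assumption: 0 never denotes a real event)


def build_vocabulary(traces: Sequence[Sequence[str]]) -> Dict[str, int]:
    """Map each distinct event name to a positive integer, in order of first appearance."""
    vocab: Dict[str, int] = {}
    for trace in traces:
        for event in trace:
            if event not in vocab:
                vocab[event] = len(vocab) + 1
    return vocab


def encode_traces(
    traces: Sequence[Sequence[str]],
    vocab: Dict[str, int],
    max_len: Optional[int] = None,
) -> List[List[int]]:
    """Turn traces into equal-length integer sequences, right-padded with PAD."""
    if max_len is None:
        max_len = max(len(trace) for trace in traces)
    encoded = []
    for trace in traces:
        ids = [vocab[event] for event in trace][:max_len]  # truncate long traces
        encoded.append(ids + [PAD] * (max_len - len(ids)))
    return encoded


def encode_labels(variants: Sequence[str]) -> Tuple[List[List[int]], List[str]]:
    """One-hot encode variant names; also return the class order used."""
    classes = sorted(set(variants))
    index = {name: i for i, name in enumerate(classes)}
    one_hot = [
        [1 if index[variant] == i else 0 for i in range(len(classes))]
        for variant in variants
    ]
    return one_hot, classes


# Hypothetical toy log: three traces produced by two variants.
traces = [
    ["submit", "review", "approve"],
    ["submit", "reject"],
    ["submit", "review", "review", "approve"],
]
variants = ["variant_A", "variant_B", "variant_A"]

vocab = build_vocabulary(traces)
X = encode_traces(traces, vocab)
y, classes = encode_labels(variants)
```

The padded matrix `X` and label matrix `y` have the shape that an embedding layer followed by an LSTM or GRU layer typically expects. Since the abstract notes that a behaviour may come from several variants, a multi-hot label vector would replace the one-hot rows in that setting.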

Original language	English
Journal	Empirical Software Engineering
Publication status	Accepted/In press - 6 Mar 2024

Keywords

configurable processes
recurrent neural networks
Variability Intensive Systems
variability mining
Software product lines

Access to Document

VaryMinions___Extended_Version___EMSE___2024
Accepted author manuscript, 2.29 MB
Licence: CC BY-NC

Cite this

@article{2243662b47f247b6bfff957c555d5355,

title = "VaryMinions: Leveraging RNNs to Identify Variants in Variability-intensive Systems{\textquoteright} Logs",

abstract = "From business processes to course management, variability-intensive software systems (VIS) are now ubiquitous. One can configure these systems{\textquoteright} behaviour by activating options, e.g., to derive variants handling building permits across municipalities or implementing different functionalities (quizzes, forums) for a given course. These customisation facilities allow VIS to support distinct relevant customer requirements while taking advantage of reuse for common parts. Customisation thus allows realising both scope and scale economies. Behavioural differences amongst variants manifest themselves in event logs. To re-engineer this kind of system, one must know which variant(s) have produced which behaviour. Since variant information is barely present in logs, this paper supports this task by employing machine learning techniques to classify behaviours (event sequences) among variants. Specifically, we train Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) recurrent neural networks to relate event sequences with the variants they belong to on six different datasets issued from the configurable process and VIS domains. After having evaluated 20 different architectures of LSTM/GRU, our results demonstrate that it is possible to effectively learn the trace-to-variant mapping with high accuracy (at least 80% and up to 99%) and at scale, i.e., identifying 50 variants using 5000+ traces for each variant.",

keywords = "configurable processes, recurrent neural networks, Variability Intensive Systems, variability mining, Software product lines",

author = "Sophie Fortz and Paul Temple and Xavier Devroey and Patrick Heymans and Gilles Perrouin",

year = "2024",

month = mar,

day = "6",

language = "English",

journal = "Empirical Software Engineering",

issn = "1382-3256",

publisher = "Springer",

}

TY - JOUR

T1 - VaryMinions

T2 - Leveraging RNNs to Identify Variants in Variability-intensive Systems’ Logs

AU - Fortz, Sophie

AU - Temple, Paul

AU - Devroey, Xavier

AU - Heymans, Patrick

AU - Perrouin, Gilles

PY - 2024/3/6

Y1 - 2024/3/6

N2 - From business processes to course management, variability-intensive software systems (VIS) are now ubiquitous. One can configure these systems’ behaviour by activating options, e.g., to derive variants handling building permits across municipalities or implementing different functionalities (quizzes, forums) for a given course. These customisation facilities allow VIS to support distinct relevant customer requirements while taking advantage of reuse for common parts. Customisation thus allows realising both scope and scale economies. Behavioural differences amongst variants manifest themselves in event logs. To re-engineer this kind of system, one must know which variant(s) have produced which behaviour. Since variant information is barely present in logs, this paper supports this task by employing machine learning techniques to classify behaviours (event sequences) among variants. Specifically, we train Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) recurrent neural networks to relate event sequences with the variants they belong to on six different datasets issued from the configurable process and VIS domains. After having evaluated 20 different architectures of LSTM/GRU, our results demonstrate that it is possible to effectively learn the trace-to-variant mapping with high accuracy (at least 80% and up to 99%) and at scale, i.e., identifying 50 variants using 5000+ traces for each variant.

AB - From business processes to course management, variability-intensive software systems (VIS) are now ubiquitous. One can configure these systems’ behaviour by activating options, e.g., to derive variants handling building permits across municipalities or implementing different functionalities (quizzes, forums) for a given course. These customisation facilities allow VIS to support distinct relevant customer requirements while taking advantage of reuse for common parts. Customisation thus allows realising both scope and scale economies. Behavioural differences amongst variants manifest themselves in event logs. To re-engineer this kind of system, one must know which variant(s) have produced which behaviour. Since variant information is barely present in logs, this paper supports this task by employing machine learning techniques to classify behaviours (event sequences) among variants. Specifically, we train Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) recurrent neural networks to relate event sequences with the variants they belong to on six different datasets issued from the configurable process and VIS domains. After having evaluated 20 different architectures of LSTM/GRU, our results demonstrate that it is possible to effectively learn the trace-to-variant mapping with high accuracy (at least 80% and up to 99%) and at scale, i.e., identifying 50 variants using 5000+ traces for each variant.

KW - configurable processes

KW - recurrent neural networks

KW - Variability Intensive Systems

KW - variability mining

KW - Software product lines

M3 - Special issue

SN - 1382-3256

JO - Empirical Software Engineering

JF - Empirical Software Engineering

ER -
