VaryMinions: Leveraging RNNs to Identify Variants in Variability-intensive Systems’ Logs

Research output: Contribution to journal › Special issue › peer-review



From business processes to course management, variability-intensive software systems (VIS) are now ubiquitous. One can configure these systems’ behaviour by activating options, e.g., to derive variants handling building permits across municipalities or implementing different functionalities (quizzes, forums) for a given course. These customisation facilities allow VIS to support distinct relevant customer requirements while taking advantage of reuse for common parts. Customisation thus allows realising both scope and scale economies. Behavioural differences amongst variants manifest themselves in event logs. To re-engineer this kind of system, one must know which variant(s) have produced which behaviour. Since variant information is barely present in logs, this paper supports this task by employing machine learning techniques to classify behaviours (event sequences) among variants. Specifically, we train Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) recurrent neural networks to relate event sequences to the variants they belong to, on six different datasets drawn from the configurable process and VIS domains. After evaluating 20 different LSTM/GRU architectures, our results demonstrate that it is possible to effectively learn the trace-to-variant mapping with high accuracy (at least 80% and up to 99%) and at scale, i.e., identifying 50 variants using 5000+ traces for each variant.
Original language: English
Journal: Empirical Software Engineering
Publication status: Accepted/In press - 6 Mar 2024


  • configurable processes
  • recurrent neural networks
  • Variability Intensive Systems
  • variability mining
  • Software product lines

