An Empirical Study of (Multi-) Database Models in Open-Source Projects

Research output: Contribution in Book/Catalog/Report/Conference proceeding › Conference contribution

Abstract

Managing data-intensive systems has long been recognized as an expensive and error-prone process. This is mainly due to the often implicit consistency relationships that hold between applications and their database. As new technologies emerged for specialized purposes (e.g., graph databases, document stores), the joint use of database models has also become popular. There are undeniable benefits of such multi-database models where developers combine various technologies. However, the side effects on design, querying, and maintenance are not yet well understood. In this paper, we study multi-database models in software systems by mining major open-source repositories. We consider four years of history, from 2017 to 2020, of a total of 40,609 projects with databases. Our results confirm the emergence of hybrid data-intensive systems, as we found (multi-) database models (e.g., relational and non-relational) used together in 16% of all database-dependent projects. One percent of the systems added, deleted, or changed a database during the four years. The majority (62%) of these systems had a single database before becoming hybrid, and another significant part (19%) became “mono-database” after initially using multiple databases. We examine the evolution of these systems to understand the rationale behind the developers' design choices. Our study aims to guide future research towards new challenges posed by those emerging data management architectures.

Original language: English
Title of host publication: Conceptual Modeling - 40th International Conference, ER 2021, Proceedings
Subtitle of host publication: 40th International Conference, ER 2021, Virtual Event, October 18–21, 2021, Proceedings
Editors: Aditya Ghose, Jennifer Horkoff, Vítor E. Silva Souza, Jeffrey Parsons, Joerg Evermann
Publisher: Springer
Pages: 87-101
Number of pages: 15
ISBN (Electronic): 978-3-030-89022-3
ISBN (Print): 978-3-030-89021-6
Publication status: Published - 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13011 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Keywords

  • Data models
  • Open-source projects
  • Empirical study
