Understanding database schema evolution: A case study

Anthony Cleve, Loup Meurice, Maxime Gobert, Jerome Maes, Jens Weber

Research output: Contribution to journal, Article

Abstract

Database reverse engineering (DRE) has traditionally been carried out by considering three main information sources: (1) the database schema, (2) the stored data, and (3) the application programs. Not all of these information sources are always available, or of sufficient quality to inform the DRE process. For example, getting access to real-world data is often extremely problematic for information systems that maintain private data. In recent years, the analysis of the evolution history of software programs has gained an increasing role in reverse engineering in general, but comparatively little such research has been carried out in the context of database reverse engineering. The goal of this paper is to contribute to narrowing this gap and to explore the use of the database evolution history as an additional information source to aid database schema reverse engineering. We present a tool-supported method for analyzing the evolution history of legacy databases, and we report on a large-scale case study of reverse engineering a complex information system and curate it as a benchmark for future research efforts within the community.

Original language: English
Pages (from-to): 113-121
Number of pages: 9
Journal: Science of Computer Programming
Volume: 97
Issue number: P1
Early online date: 22 Nov 2013
DOI: 10.1016/j.scico.2013.11.025
Publication status: Published - 2015

Fingerprint

Reverse engineering
Information systems
Application programs

Keywords

  • Database understanding
  • Schema evolution
  • Software repository mining

Cite this

@article{08e48561e56540a3991eb82b7147c25d,
title = "Understanding database schema evolution: A case study",
abstract = "Database reverse engineering (DRE) has traditionally been carried out by considering three main information sources: (1) the database schema, (2) the stored data, and (3) the application programs. Not all of these information sources are always available, or of sufficient quality to inform the DRE process. For example, getting access to real-world data is often extremely problematic for information systems that maintain private data. In recent years, the analysis of the evolution history of software programs has gained an increasing role in reverse engineering in general, but comparatively little such research has been carried out in the context of database reverse engineering. The goal of this paper is to contribute to narrowing this gap and to explore the use of the database evolution history as an additional information source to aid database schema reverse engineering. We present a tool-supported method for analyzing the evolution history of legacy databases, and we report on a large-scale case study of reverse engineering a complex information system and curate it as a benchmark for future research efforts within the community.",
keywords = "Database understanding, Schema evolution, Software repository mining",
author = "Anthony Cleve and Loup Meurice and Maxime Gobert and Jerome Maes and Jens Weber",
year = "2015",
doi = "10.1016/j.scico.2013.11.025",
language = "English",
volume = "97",
pages = "113--121",
journal = "Science of Computer Programming",
issn = "0167-6423",
publisher = "Elsevier",
number = "P1",

}

Understanding database schema evolution: A case study. / Cleve, Anthony; Meurice, Loup; Gobert, Maxime; Maes, Jerome; Weber, Jens.

In: Science of Computer Programming, Vol. 97, No. P1, 2015, p. 113-121.

TY  - JOUR
T1  - Understanding database schema evolution
T2  - A case study
AU  - Cleve, Anthony
AU  - Meurice, Loup
AU  - Gobert, Maxime
AU  - Maes, Jerome
AU  - Weber, Jens
PY  - 2015
Y1  - 2015
AB  - Database reverse engineering (DRE) has traditionally been carried out by considering three main information sources: (1) the database schema, (2) the stored data, and (3) the application programs. Not all of these information sources are always available, or of sufficient quality to inform the DRE process. For example, getting access to real-world data is often extremely problematic for information systems that maintain private data. In recent years, the analysis of the evolution history of software programs has gained an increasing role in reverse engineering in general, but comparatively little such research has been carried out in the context of database reverse engineering. The goal of this paper is to contribute to narrowing this gap and to explore the use of the database evolution history as an additional information source to aid database schema reverse engineering. We present a tool-supported method for analyzing the evolution history of legacy databases, and we report on a large-scale case study of reverse engineering a complex information system and curate it as a benchmark for future research efforts within the community.
KW  - Database understanding
KW  - Schema evolution
KW  - Software repository mining
UR  - http://www.scopus.com/inward/record.url?scp=84910118991&partnerID=8YFLogxK
U2  - 10.1016/j.scico.2013.11.025
DO  - 10.1016/j.scico.2013.11.025
M3  - Article
VL  - 97
SP  - 113
EP  - 121
JO  - Science of Computer Programming
JF  - Science of Computer Programming
SN  - 0167-6423
IS  - P1
ER  -