Understanding database schema evolution: A case study

Anthony Cleve; Loup Meurice; Maxime Gobert; Jerome Maes; Jens Weber

doi:10.1016/j.scico.2013.11.025

Research output: Contribution to journal › Article › peer-review

Abstract

Database reverse engineering (DRE) has traditionally been carried out by considering three main information sources: (1) the database schema, (2) the stored data, and (3) the application programs. Not all of these information sources are always available, or of sufficient quality to inform the DRE process. For example, getting access to real-world data is often extremely problematic for information systems that maintain private data. In recent years, the analysis of the evolution history of software programs has gained an increasing role in reverse engineering in general, but comparatively little such research has been carried out in the context of database reverse engineering. The goal of this paper is to contribute to narrowing this gap and exploring the use of the database evolution history as an additional information source to aid database schema reverse engineering. We present a tool-supported method for analyzing the evolution history of legacy databases, and we report on a large-scale case study of reverse engineering a complex information system and curate it as a benchmark for future research efforts within the community.

Original language	English
Pages (from-to)	113-121
Number of pages	9
Journal	Science of Computer Programming
Volume	97
Issue number	P1
Early online date	22 Nov 2013
DOIs	https://doi.org/10.1016/j.scico.2013.11.025
Publication status	Published - 2015

Keywords

Database understanding
Schema evolution
Software repository mining

Access to Document

10.1016/j.scico.2013.11.025

CleveEtAl2015
© 2013 Elsevier B.V. All rights reserved.
Accepted author manuscript, 1.38 MB

Cite this

@article{08e48561e56540a3991eb82b7147c25d,
  title = "Understanding database schema evolution: A case study",
  abstract = "Database reverse engineering (DRE) has traditionally been carried out by considering three main information sources: (1) the database schema, (2) the stored data, and (3) the application programs. Not all of these information sources are always available, or of sufficient quality to inform the DRE process. For example, getting access to real-world data is often extremely problematic for information systems that maintain private data. In recent years, the analysis of the evolution history of software programs has gained an increasing role in reverse engineering in general, but comparatively little such research has been carried out in the context of database reverse engineering. The goal of this paper is to contribute to narrowing this gap and exploring the use of the database evolution history as an additional information source to aid database schema reverse engineering. We present a tool-supported method for analyzing the evolution history of legacy databases, and we report on a large-scale case study of reverse engineering a complex information system and curate it as a benchmark for future research efforts within the community.",
  keywords = "Database understanding, Schema evolution, Software repository mining",
  author = "Anthony Cleve and Loup Meurice and Maxime Gobert and Jerome Maes and Jens Weber",
  year = "2015",
  doi = "10.1016/j.scico.2013.11.025",
  language = "English",
  volume = "97",
  pages = "113--121",
  journal = "Science of Computer Programming",
  issn = "0167-6423",
  publisher = "Elsevier",
  number = "P1",
}

TY - JOUR
T1 - Understanding database schema evolution
T2 - A case study
AU - Cleve, Anthony
AU - Meurice, Loup
AU - Gobert, Maxime
AU - Maes, Jerome
AU - Weber, Jens
PY - 2015
Y1 - 2015
N2 - Database reverse engineering (DRE) has traditionally been carried out by considering three main information sources: (1) the database schema, (2) the stored data, and (3) the application programs. Not all of these information sources are always available, or of sufficient quality to inform the DRE process. For example, getting access to real-world data is often extremely problematic for information systems that maintain private data. In recent years, the analysis of the evolution history of software programs has gained an increasing role in reverse engineering in general, but comparatively little such research has been carried out in the context of database reverse engineering. The goal of this paper is to contribute to narrowing this gap and exploring the use of the database evolution history as an additional information source to aid database schema reverse engineering. We present a tool-supported method for analyzing the evolution history of legacy databases, and we report on a large-scale case study of reverse engineering a complex information system and curate it as a benchmark for future research efforts within the community.
AB - Database reverse engineering (DRE) has traditionally been carried out by considering three main information sources: (1) the database schema, (2) the stored data, and (3) the application programs. Not all of these information sources are always available, or of sufficient quality to inform the DRE process. For example, getting access to real-world data is often extremely problematic for information systems that maintain private data. In recent years, the analysis of the evolution history of software programs has gained an increasing role in reverse engineering in general, but comparatively little such research has been carried out in the context of database reverse engineering. The goal of this paper is to contribute to narrowing this gap and exploring the use of the database evolution history as an additional information source to aid database schema reverse engineering. We present a tool-supported method for analyzing the evolution history of legacy databases, and we report on a large-scale case study of reverse engineering a complex information system and curate it as a benchmark for future research efforts within the community.
KW - Database understanding
KW - Schema evolution
KW - Software repository mining
UR - http://www.scopus.com/inward/record.url?scp=84910118991&partnerID=8YFLogxK
U2 - 10.1016/j.scico.2013.11.025
DO - 10.1016/j.scico.2013.11.025
M3 - Article
AN - SCOPUS:84910118991
SN - 0167-6423
VL - 97
SP - 113
EP - 121
JO - Science of Computer Programming
JF - Science of Computer Programming
IS - P1
ER -

Projects

Analyse empirique de la co-évolution et l'interaction sociale dans les systèmes logiciels orientés données (Empirical analysis of co-evolution and social interaction in data-oriented software systems)

Evolution: Evolution

Student theses

Analyzing, Understanding and Supporting the Evolution of Dynamic and Heterogeneous Data-Intensive Software Systems