Title: Data Integration for Multidimensional Data Models
Abstract
Consider a scenario where, as a result of company acquisitions and
mergers, a number of related, but possibly disparate, data marts need
to be integrated into a global data warehouse. The ability to retrieve
data across these disparate, but related, data marts poses an
important challenge. For example, forming an all-inclusive data
warehouse involves the tedious task of identifying related fact and
dimension table attributes, as required for a schema merge.
Additionally, evaluating the combined set of correct answers to
queries that are likely to be posed independently to such data marts
becomes difficult.
Model management refers to a high-level, abstract programming language
designed to efficiently manipulate schemas and mappings, with
applications in meta-data management, e-Commerce and data
integration, amongst others. In particular, model and meta-data
management operations, in the form of Match and Merge algorithms,
offer a way to address the above-mentioned data integration and
schema matching issues within the data warehousing domain.
In this presentation, we introduce an approach for the integration of
source data marts into a global data warehouse. We discuss the
development of three streamlined steps to facilitate the generation
of a global data warehouse: techniques for deriving attribute
correspondences, techniques for schema mapping discovery, and a merge
algorithm, all within the context of multidimensional star schemas.
Our approach focuses on delivering a polynomial-time, near-optimal
solution, as needed for the expected large volumes of data and the
associated large-scale query processing.
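
To make the Match and Merge steps concrete, the following is a minimal sketch, not the algorithms of this work. It assumes that attribute correspondences can be approximated by name similarity alone, and that a merge is the union of attributes with matched pairs collapsed into one. The function names `match_attributes` and `merge_tables`, the threshold value, and the sample fact tables are hypothetical and chosen only for illustration.

```python
from difflib import SequenceMatcher


def match_attributes(source_a, source_b, threshold=0.7):
    """Pair each attribute of source_a with its most similar attribute of source_b.

    Assumption: lexical name similarity stands in for a real Match operator,
    which would also use data types, instances, and schema structure.
    """
    correspondences = {}
    for a in source_a:
        best, best_score = None, 0.0
        for b in source_b:
            score = SequenceMatcher(None, a.lower(), b.lower()).ratio()
            if score > best_score:
                best, best_score = b, score
        # Keep only correspondences that are similar enough.
        if best is not None and best_score >= threshold:
            correspondences[a] = best
    return correspondences


def merge_tables(source_a, source_b, correspondences):
    """Union the attributes of two tables, collapsing matched pairs into one."""
    matched_b = set(correspondences.values())
    merged = list(source_a)
    # Attributes of source_b without a counterpart are carried over unchanged.
    merged += [b for b in source_b if b not in matched_b]
    return merged


# Hypothetical fact tables from two source data marts.
fact_a = ["sale_id", "product_key", "store_key", "amount"]
fact_b = ["sales_id", "product_key", "region_key", "sale_amount"]

mapping = match_attributes(fact_a, fact_b)
global_fact = merge_tables(fact_a, fact_b, mapping)

print(mapping)
print(global_fact)
```

Both functions run in time polynomial in the number of attributes (quadratic in the number of attribute pairs compared). The sketch does not enforce one-to-one correspondences, so two attributes of the first table may map to the same attribute of the second; a fuller Match step would need to resolve such conflicts.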