

2.3.1 ETL Scenario


   Extract-Transform-Load (ETL) is a common term for the warehouse load process comprising a set of data movement operations, each from a data source to a data target with some transforming or restructuring logic applied.

   The ETL Scenario starts by defining a CWM Transformation model for movement from a data source to a data target. Parameters of the source data, target data, and transformation logic are assigned values in the model. Source data parameters depend on the type of the data source (object-oriented, relational, record-oriented, multidimensional, or XML). Target data parameters are similarly chosen. Transformation logic parameters include identification of a transformation component and of data sources and data targets. The transformation component is a method composed of a possibly large hierarchy of components (commercial tools, commercial libraries, custom scripts) whose detailed structure is defined elsewhere.
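   As a rough illustration of the parameters just described, the following minimal Python sketch records a source, a target, and the identifier of a transformation component. The names `DataKind`, `DataResource`, `TransformationModel`, and the parameter keys are hypothetical and chosen for this example only; they are not classes of the CWM metamodel.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class DataKind(Enum):
    """The kinds of data source or target named in the text."""
    OBJECT_ORIENTED = "object-oriented"
    RELATIONAL = "relational"
    RECORD_ORIENTED = "record-oriented"
    MULTIDIMENSIONAL = "multidimensional"
    XML = "XML"


@dataclass
class DataResource:
    """A data source or data target (hypothetical helper, not a CWM class)."""
    name: str
    kind: DataKind
    # Parameters depend on the kind of resource; the keys used below are assumptions.
    parameters: Dict[str, object] = field(default_factory=dict)


@dataclass
class TransformationModel:
    """One data movement: sources, targets, and the component that does the work."""
    component: str  # identifier only; the component's internal structure is defined elsewhere
    sources: List[DataResource]
    targets: List[DataResource]

    def describe(self) -> str:
        src = ", ".join(f"{s.name} ({s.kind.value})" for s in self.sources)
        tgt = ", ".join(f"{t.name} ({t.kind.value})" for t in self.targets)
        return f"{self.component}: {src} -> {tgt}"


# Assign values to the source, target, and transformation logic parameters.
orders = DataResource("orders_feed", DataKind.RECORD_ORIENTED, {"record_length": 120})
staging = DataResource("staging_db", DataKind.RELATIONAL, {"table": "orders"})
model = TransformationModel("load_orders", [orders], [staging])

print(model.describe())
```

   The model holds only an identifier for the transformation component, which mirrors the point that the component's detailed structure is defined elsewhere.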

   An ETL process is realized by a number of components across several CWM packages. A CWM warehouse process may launch an ETL process as a scheduled operation consisting of a number of transformation steps executed in sequence.

   For example, a first transformation might extract and filter data from any of a number of possible data sources. A second transformation cleanses, combines, or otherwise reduces the data and then stores it in a normalized format in a primary relational database of the warehouse. A third transformation selects certain rows from the primary relational database and loads their values into the input cells of a multidimensional database. Finally, the CWM warehouse process might instruct the multidimensional database to recalculate its aggregated cells based on the new input data.
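   The sequence in this example can be sketched as a list of steps run in order, each consuming the output of the one before it. This is a toy, in-memory illustration under stated assumptions: the function names, the `region` and `amount` fields, and the use of plain lists and dictionaries in place of a relational and a multidimensional database are all invented for the sketch.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]


def extract_and_filter(source_rows: List[Row]) -> List[Row]:
    # Step 1: extract rows and filter out those with no amount.
    return [r for r in source_rows if r.get("amount") is not None]


def cleanse(rows: List[Row]) -> List[Row]:
    # Step 2: cleanse into a normalized form (trimmed upper-case region, float amount).
    return [
        {"region": str(r["region"]).strip().upper(), "amount": float(r["amount"])}
        for r in rows
    ]


def load_input_cells(rows: List[Row]) -> Dict[str, List[float]]:
    # Step 3: select certain rows (positive amounts) and load them into input cells.
    cells: Dict[str, List[float]] = {}
    for r in rows:
        if r["amount"] > 0:
            cells.setdefault(r["region"], []).append(r["amount"])
    return cells


def recalculate(cells: Dict[str, List[float]]) -> Dict[str, float]:
    # Final step: recompute the aggregated cells from the new input cells.
    return {region: sum(values) for region, values in cells.items()}


def run_process(steps: List[Callable], data):
    # Execute the transformation steps in sequence.
    for step in steps:
        data = step(data)
    return data


raw: List[Row] = [
    {"region": " east ", "amount": "10.5"},
    {"region": "West", "amount": None},
    {"region": "EAST", "amount": 4.5},
    {"region": "west", "amount": "2"},
]

totals = run_process([extract_and_filter, cleanse, load_input_cells, recalculate], raw)
print(totals)
```

   Keeping each step as a separate function reflects the idea that a warehouse process schedules independent transformation steps, so a step can be replaced without touching the others.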