An Innovative Method to Extract Data in a Real-time Data Warehousing Environment

Authors

Flavio de Assis Vilela¹ and Ricardo Rodrigues Ciferri², ¹Federal Institute of Goiás, Brazil, ²Federal University of São Carlos, Brazil

Abstract

ETL (Extract, Transform, and Load) is an essential process for data extraction in knowledge discovery in databases and in data warehousing environments. The ETL process gathers data from operational sources, processes it, and stores it in an integrated data repository. It can also be performed in a real-time data warehousing environment, where the data is stored in a data warehouse. This paper presents a new method, named Data Extraction Magnet (DEM), that performs the extraction phase of the ETL process in a real-time data warehousing environment and is based on the concepts of non-intrusiveness, tags, and parallelism. DEM was validated in the dairy farming domain using synthetic data. The results showed a large performance gain over the traditional trigger technique and showed that DEM meets real-time requirements.

Keywords

ETL, real-time, data warehousing, data extraction.

Full Text  Volume 11, Number 24