An Innovative Method to Extract Data in a Real-time Data Warehousing Environment


Flavio de Assis Vilela¹ and Ricardo Rodrigues Ciferri², ¹Federal Institute of Goiás, Brazil, ²Federal University of São Carlos, Brazil


ETL (Extract, Transform, and Load) is an essential process for data extraction in knowledge discovery in databases and in data warehousing environments. The ETL process gathers data from operational sources, processes it, and stores it in an integrated data repository. The ETL process can also run in a real-time data warehousing environment, storing data in a data warehouse. This paper presents a new method named Data Extraction Magnet (DEM) that performs the extraction phase of the ETL process in a real-time data warehousing environment, based on the concepts of non-intrusiveness, tagging, and parallelism. DEM was validated in a dairy farming domain using synthetic data. The results showed a large performance gain over the traditional trigger technique and showed that real-time requirements were met.


ETL, real-time, data warehousing, data extraction.

Volume 11, Number 24