A Brief Review Along With a New Proposed Approach of Data De Duplication

Authors

Suprativ Saha1 and Avik Samanta2, 1Global Institute of Management and Technology, India and 2JIS College of Engineering, India

Abstract

Storage capacity and data volume now grow in parallel, and at any instant the data may exceed the available storage. A good RDBMS should reduce redundancy as far as possible to maintain consistency and control storage cost. Moreover, a large database with replicated copies wastes space that could be put to other use. The first aim, therefore, is to apply data-deduplication techniques in the field of RDBMS, while evaluating both access-time complexity and space complexity. Here, different data-deduplication approaches are discussed. Finally, based on the drawbacks of those approaches, a new approach involving the row id, column id, and domain-key constraint of an RDBMS is theoretically illustrated. Although this model may at first appear tedious and unpromising, for a large database with many tables containing many lengthy fields it can be shown to reduce space complexity drastically while preserving the same access speed.
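The abstract only sketches the proposed row-id/column-id scheme, so the following is a hypothetical illustration rather than the authors' actual method: each distinct lengthy field value is stored once in a shared value store keyed by column, and table rows hold only small reference ids. All table and function names (`value_store`, `employee`, `intern`) are invented for this sketch.

```python
import sqlite3

# In-memory database for the sketch; a real deployment would use a persistent RDBMS.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# One copy of each distinct value per column; the UNIQUE constraint
# plays the role of a domain-key-style restriction on duplicates.
cur.execute("""CREATE TABLE value_store (
                   val_id   INTEGER PRIMARY KEY,
                   col_name TEXT,
                   value    TEXT,
                   UNIQUE (col_name, value))""")

# Data rows store only reference ids instead of the lengthy values themselves.
cur.execute("""CREATE TABLE employee (
                   row_id   INTEGER PRIMARY KEY,
                   name_ref INTEGER,
                   dept_ref INTEGER)""")

def intern(col_name, value):
    """Return the id of the single stored copy of value, inserting it at most once."""
    cur.execute("INSERT OR IGNORE INTO value_store (col_name, value) VALUES (?, ?)",
                (col_name, value))
    cur.execute("SELECT val_id FROM value_store WHERE col_name = ? AND value = ?",
                (col_name, value))
    return cur.fetchone()[0]

def insert_employee(name, dept):
    cur.execute("INSERT INTO employee (name_ref, dept_ref) VALUES (?, ?)",
                (intern("name", name), intern("dept", dept)))

# Three rows share the same department; "Research" is physically stored once.
for name, dept in [("Alice", "Research"), ("Bob", "Research"), ("Carol", "Research")]:
    insert_employee(name, dept)
```

Reads reconstruct the original rows with a join on the reference ids, so lookup remains a single indexed access per field while the lengthy value occupies space only once.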

Keywords

De-Duplication, SQL, Chunk, Genetic Algorithm, Replicated Copy, Domain-Key Constraint

Volume 3, Number 2