Linux Symposium

Inline Block Level Data Deduplication for EXT3 File System

Amar Hemant More

Deduplication is a storage and compression technique that avoids saving redundant data to disk. Solid-state disk (SSD) media have gained popularity owing to their low power demands, resistance to shock and vibration, and high random-access performance. However, these media have limitations: high cost, small capacity, and a limited number of erase-write cycles. Inline deduplication helps alleviate these problems by avoiding redundant writes to the disk and using disk space efficiently. This paper proposes DEXT3, a block-level inline deduplication layer for the EXT3 file system. The layer detects when a write would store redundant data by maintaining an in-core metadata structure that describes previously written data. This data structure is persisted to disk, so that deduplication continues to work correctly across a system shutdown or reboot. The DEXT3 layer also handles the modification and deletion of a file whose blocks are referenced by other files, which would otherwise cause data loss for the files that reference those blocks.
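The idea summarized above can be illustrated with a minimal user-space sketch. It is not the DEXT3 implementation: the class `DedupStore`, its method names, the 4096-byte block size, and the use of SHA-256 as the content fingerprint are all assumptions made for illustration, and persistence of the metadata to disk is left out. It shows only the three behaviours the abstract names: skipping a redundant write, keeping a shared block alive on deletion, and leaving other referrers untouched on modification.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size; the abstract does not specify one


class DedupStore:
    """Toy in-memory model of block-level inline deduplication (hypothetical)."""

    def __init__(self):
        self.blocks = {}      # block number -> contents (stands in for the disk)
        self.index = {}       # content digest -> block number (in-core metadata)
        self.refcount = {}    # block number -> number of references to it
        self.next_block = 0
        self.physical_writes = 0

    @staticmethod
    def _digest(data):
        # SHA-256 is an assumption; any collision-resistant fingerprint would do.
        return hashlib.sha256(data).hexdigest()

    def write_block(self, data):
        """Return the block number holding `data`, writing only if it is new."""
        digest = self._digest(data)
        block_no = self.index.get(digest)
        if block_no is not None:
            # Redundant data: add a reference instead of writing again.
            self.refcount[block_no] += 1
            return block_no
        block_no = self.next_block
        self.next_block += 1
        self.blocks[block_no] = data
        self.index[digest] = block_no
        self.refcount[block_no] = 1
        self.physical_writes += 1
        return block_no

    def release_block(self, block_no):
        """Drop one reference; free the block only when none remain."""
        self.refcount[block_no] -= 1
        if self.refcount[block_no] == 0:
            data = self.blocks.pop(block_no)
            del self.index[self._digest(data)]
            del self.refcount[block_no]

    def modify_block(self, block_no, new_data):
        """Point one reference at new contents without touching other referrers."""
        new_block = self.write_block(new_data)
        self.release_block(block_no)
        return new_block
```

In this sketch, two files that write the same block contents receive the same block number and cause a single physical write. Modifying one file's copy allocates (or reuses) a different block and decrements the shared block's reference count, so the other file still reads its original data; the shared block is freed only when its last reference is released.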

