Corporate PC backup: Data Deduplication Requirements
Posted by Puneesh Chaudhry on Fri, Aug 20, 2010 @ 10:00 AM
This post is part of a series on planning for corporate PC backup in your organization. Previously, I’ve focused on backup requirements. In this series of posts on data deduplication, I’ll look at the deduplication technologies that are available and which ones are best suited to desktop and laptop backup. I’ll explain why global deduplication is the only type of dedupe that makes sense for desktop and laptop backup, why object-based deduplication provides the best efficiency, and how it overcomes the shortcomings of block-level dedupe. The posts will cover the topics listed below.
- Anatomy of a dedupe system: This post provides an introduction to the dedupe process, its various components, and how they work together to identify duplicate data in your system.
- Where to deduplicate: The old adage “location, location, location” applies to dedupe too. This post looks at the various places in the system where deduplication can be performed and analyzes how well each is suited to managing laptop and desktop data.
- Identifying duplicate data: This post evaluates four approaches to chunking data for deduplication (file-based, delta-block-based, block-level and object-based) and their relative merits.
- Block-level deduplication challenges: This post identifies two major shortcomings of block-level deduplication, namely 1) finding duplicate data on the first pass or first backup, and 2) finding duplicate data when the physical layout of a file changes, e.g. when the slides in a PowerPoint presentation are reordered.
- Variable-length deduplication: This post examines what is commonly called “variable-length deduplication”, which is a slight variation of fixed-length block deduplication, and the situations where it is useful.
- Object-based data deduplication: This post introduces the concept of object-level data deduplication and explains why it is superior to block-level deduplication.
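
To give a rough feel for the second block-level challenge listed above, here is a minimal Python sketch of fixed-length block matching. It is only an illustration under stated assumptions, not how any particular backup product works: the 8-byte block size, the sample data, and the helper names `block_hashes` and `shared_blocks` are all hypothetical. It shows how inserting a single byte at the front of a file shifts every block boundary, so that none of the fixed-length blocks match any more even though almost all of the content is unchanged.

```python
import hashlib

# Hypothetical, deliberately tiny block size so the effect is easy to see.
BLOCK_SIZE = 8


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Split data into fixed-length blocks and fingerprint each one."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def shared_blocks(a: bytes, b: bytes) -> int:
    """Count the distinct block fingerprints that appear in both inputs."""
    return len(set(block_hashes(a)) & set(block_hashes(b)))


# Illustrative data: four 8-byte "slides" that line up with block boundaries.
original = b"slide-01slide-02slide-03slide-04"

# The same content with one extra byte at the front, which shifts every
# block boundary by one position.
shifted = b"X" + original

print(shared_blocks(original, original))  # 4: every block matches itself
print(shared_blocks(original, shifted))   # 0: no fixed-length block lines up
```

In this toy case the shifted copy shares no blocks with the original, so a fixed-length block comparison would treat it as entirely new data.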