One thing that always gives me pause for thought when dealing with any of the new storage or HCI offerings is the discussion of dedupe or space savings. Some are presented as ratios while others are a percentage. I was recently telling a colleague I was getting 97% dedupe on some VDI workloads while he was getting 4:1 on another type of workload.
It took us a few minutes to get the maths right to work out the numbers so we could compare like for like, and so to that end I decided to create myself a little table to save my aging brain cells at a later date.
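One thing that is not immediately apparent to most is that it is a case of diminishing returns the higher your ratio goes: moving from 2:1 to 4:1 takes you from 50% to 75% saved, while moving from 10:1 to 20:1 only takes you from 90% to 95%. That is why I have done more of the lower values and included some higher ones for completeness.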
At some point I might do a little calculator that gives exact answers but as a rough and ready aid, it served me well.
[table id=1 /]
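For anyone who wants exact figures rather than a lookup, the conversion behind the table can be sketched in a few lines of Python. This is only a rough sketch and the function names are my own invention: it assumes an N:1 ratio means N units of logical data stored in 1 unit of physical space, so the saving is 1 - 1/N.

```python
def ratio_to_percent(ratio: float) -> float:
    """Convert an N:1 space-saving ratio to a percentage saved.

    Assumes N:1 means N units of logical data fit in 1 unit of physical space.
    """
    if ratio <= 0:
        raise ValueError("ratio must be greater than zero")
    return (1 - 1 / ratio) * 100


def percent_to_ratio(percent: float) -> float:
    """Convert a percentage saved back to the N in an N:1 ratio."""
    if not 0 <= percent < 100:
        raise ValueError("percent must be at least 0 and below 100")
    return 100 / (100 - percent)


if __name__ == "__main__":
    # Print a small ratio-to-percentage list, one line per ratio.
    for ratio in (2, 3, 4, 5, 10, 20, 50):
        print(f"{ratio}:1 = {ratio_to_percent(ratio):.1f}% saved")
```

Run against the two numbers from the conversation above, 4:1 works out at 75% saved, and 97% works out at a little over 33:1.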
EDIT: While the table above allows you to compare apples with apples numbers wise, it is not a full list of things to consider when looking at storage savings vendor vs vendor. The comment below by Rich Fenton from Nimble Storage gives a lot more insight into the other considerations. Great points and all worth considering when storage shopping.
Yes, it’s a diminishing return: doubling the ratio from 5:1 to 10:1 does not double the savings (80% only becomes 90%)!
In fact, one ought to really consider the Effective useable capacity of a solution rather than look at Space Efficiency in isolation. Let me explain with this example (I’ll keep the math simple):
Let’s say we have three vendors all providing 10TB.
Vendor A has an efficiency savings claim of 4:1,
Vendor B gets 3.5:1, and
Vendor C gets 5:1.
Immediately you will think Vendor C is the most efficient solution, followed by Vendor A and finally Vendor B in last place.
However, this doesn’t show the entire picture…
In order to see the true efficiency, you ought to work out how much useable capacity you have after the taxes (RAID, filesystem reserves etc). There is a lot of variation here. Some vendors do very well with a low overhead, leaving 70% useable (meaning from 10TB you get 7TB of useable before efficiency is applied). Other vendors do less well (at around 55% = 5.5TB) and some are downright atrocious (they can be as low as 33% = 3.3TB useable).
So let’s say Vendor A is one that gets 55% (5.5TB) and then provides 4:1 efficiency (5.5TB x 4 = 22TB effective capacity).
Vendor B gets 70% (7TB) and then provides 3.5:1 efficiency (7TB x 3.5 = 24.5TB effective capacity).
Vendor C gets 33% (3.3TB) and then provides 5:1 efficiency (3.3TB x 5 = 16.5TB effective capacity).
At first glance, Vendor C looked the most efficient, but looking at the complete effective capacity picture, Vendor B actually gives you the most!
In fact Vendor C offers roughly a third less effective capacity than Vendor B (16.5TB vs 24.5TB)…
Remember: always look at the complete picture and not just a number in isolation!
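The vendor comparison above can be sketched as a short Python calculation. This is just an illustration of the arithmetic in the example: the function name and the dictionary layout are my own, and the useable fractions and ratios are the made-up figures from the comment, not real vendor data.

```python
def effective_capacity(raw_tb: float, useable_fraction: float, efficiency_ratio: float) -> float:
    """Effective capacity = raw capacity x useable fraction x efficiency ratio."""
    return raw_tb * useable_fraction * efficiency_ratio


RAW_TB = 10  # every vendor in the example supplies 10TB raw

# Illustrative figures from the example: (useable fraction after overheads, N in N:1)
VENDORS = {
    "A": (0.55, 4.0),
    "B": (0.70, 3.5),
    "C": (0.33, 5.0),
}

RESULTS = {
    name: effective_capacity(RAW_TB, useable, ratio)
    for name, (useable, ratio) in VENDORS.items()
}

if __name__ == "__main__":
    # List vendors from most to least effective capacity.
    for name, capacity in sorted(RESULTS.items(), key=lambda item: item[1], reverse=True):
        print(f"Vendor {name}: {capacity:.1f}TB effective")

    # How far the headline winner (C) falls short of the real winner (B).
    shortfall = (1 - RESULTS["C"] / RESULTS["B"]) * 100
    print(f"Vendor C offers {shortfall:.0f}% less than Vendor B")
```

Swapping in the real useable fraction and the measured (not claimed) ratio for each quote is the whole point: the ranking can flip once the overheads are included.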