[Gluster-users] What NAS device(s) do you use? And why?

Marc Villemade mastachand at gmail.com
Wed Dec 15 10:15:34 UTC 2010



Sent from my iPad

On Dec 13, 2010, at 11:28 AM, Rudi Ahlers <Rudi at SoftDux.com> wrote:

> On Mon, Dec 13, 2010 at 2:52 AM, Marc Villemade <mastachand at gmail.com> wrote:
>> 
>> On Dec 11, 2010, at 5:34 PM, Rudi Ahlers wrote:
>> 
>>> On Sat, Dec 11, 2010 at 6:27 PM, Joe Landman
>>> <landman at scalableinformatics.com> wrote:
>>>> On 12/11/2010 11:17 AM, Rudi Ahlers wrote:
>> 
>>>> [..]
>>>> 
>>>> --
>>> 
>>> [...]
>> 
>> At Scality, we have developed such an object store, which scales smoothly up to petabytes with off-the-shelf servers logically brought together in a ring.
>> While other solutions' performance usually degrades with time, our performance is similar to a high-end SAN from the start and stays roughly the same as we scale up to petabytes.
>> 
> [...]
> 
> Thanks, it's the first time I've heard the term "Object Storage". In all
> honesty, from a technical viewpoint, how does this differ from NAS /
> SAN?
> 

Hey Rudi,

[Once again, disclaimer: I work for Scality, an object store platform developer]

Sure, let me try and explain.

I guess the core difference between object storage and NAS/SAN is that there is no filesystem involved, whether we're talking about a server-side (NAS) or client-side (SAN) managed filesystem. This means that none of the limitations inherent to filesystems apply: number of inodes, number of files in a directory, etc.

In a nutshell, object storage is a system where stored data is referenced by a key that is assigned to the object at creation and used for subsequent retrievals. There are no folders or file paths in its core concept. Objects are usually replicated to ensure reliability and availability, and metadata is attached to the objects for many uses (replication and retention policy, tiering, keyword tagging ...).
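
To make that concrete, here is a minimal sketch in Python of what "key in, object out" looks like. It is a toy in-memory model, not any vendor's API: the ObjectStore class, its put/get/metadata methods, the random UUID key and the example metadata fields are all assumptions made up for illustration, and replication is left out entirely.

```python
import uuid


class ObjectStore:
    """Toy flat object store (hypothetical, for illustration only)."""

    def __init__(self):
        # One flat namespace: key -> (payload, metadata). No folders, no paths.
        self._objects = {}

    def put(self, payload, **metadata):
        # The store assigns the key when the object is created.
        key = uuid.uuid4().hex
        self._objects[key] = (bytes(payload), dict(metadata))
        return key

    def get(self, key):
        # Retrieval only ever needs the key.
        payload, _ = self._objects[key]
        return payload

    def metadata(self, key):
        _, meta = self._objects[key]
        return dict(meta)


store = ObjectStore()
key = store.put(b"holiday photo bytes", retention_days=365, tags=["photo"])
print(key)
print(store.get(key))
print(store.metadata(key))
```

The caller keeps the returned key and hands it back later; there is no directory to list and no path to resolve.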

Object storage is sometimes referred to as cloud storage as well. It is true that the cloud storage services (a la Amazon, Rackspace in the US or Dunkel/ScaleUp in Europe) are basically storing objects, but the difference is that the underlying storage is not necessarily "object".
Object storage is also fairly closely related to CAS (content-addressable storage), which is mostly used for fixed-content storage and is therefore very popular for archival and for storage that requires a high level of compliance with government regulations. Objects in CAS are addressed through a hash of their payload (hence the name), which makes it hard to have modifiable content, as the address would change with each modification.
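
The CAS point can be sketched in a few lines of Python. This is only an illustration: SHA-256 is my assumption here (real CAS products choose their own hash and address format), and the content_address and cas_put helpers and the cas dictionary are hypothetical names.

```python
import hashlib

# Hypothetical in-memory content-addressable store: address -> payload.
cas = {}


def content_address(payload: bytes) -> str:
    # The address is derived purely from the object's content.
    return hashlib.sha256(payload).hexdigest()


def cas_put(payload: bytes) -> str:
    address = content_address(payload)
    cas[address] = payload
    return address


original = b"quarterly report, version 1"
modified = b"quarterly report, version 2"

addr_v1 = cas_put(original)
addr_v2 = cas_put(modified)

# Changing a single byte yields a different address, so an "update"
# is really a brand-new object; the old one is untouched.
print(addr_v1 == addr_v2)            # False
# Storing identical content again gives back the same address.
print(cas_put(original) == addr_v1)  # True
```

That is why CAS suits fixed content so well: the address doubles as an integrity check, but there is no stable name for something that keeps changing.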

For unstructured data, the fastest-growing kind of data in the world right now, object storage is perfect as it maps really easily onto these datasets' needs:
	- "unstructured" storage (objects are not necessarily linked to each other, although they can be through metadata tagging),
	- when correctly implemented, object storage should be a much more scalable system than regular filesystems, so for exploding datasets it makes much more sense.

Now, why should it be a much more scalable system, you might ask? :-D

Setting aside the economic aspect, object storage technologies are devoid of volume management, as the store should be a flat addressable space with virtually no limits. Without the filesystem limits, growing to billions of objects is possible (whereas storing billions of objects on a NAS/SAN without losing performance might prove difficult). It also depends on the technology: some have an object location database that creates a bottleneck and lowers the scalability and reliability of the system.
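
One common way to avoid a central object location database is to compute an object's location from its key, for example with consistent hashing on a ring. The Python sketch below shows that generic technique only; it is not a description of how Scality or any other product actually places objects, and the HashRing class, the ring_position helper and the server names are all hypothetical.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    # Map any string (server name or object key) to a point on the ring.
    return int(hashlib.sha256(name.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Generic consistent-hashing ring (illustrative, no replication)."""

    def __init__(self, nodes):
        # Each server sits at the ring position given by the hash of its name.
        self._ring = sorted((ring_position(node), node) for node in nodes)
        self._positions = [position for position, _ in self._ring]

    def node_for(self, key: str) -> str:
        # Walk clockwise from the key's position to the next server,
        # wrapping around at the end of the ring.
        index = bisect.bisect_right(self._positions, ring_position(key))
        return self._ring[index % len(self._ring)][1]


ring = HashRing(["server-a", "server-b", "server-c"])
for key in ("obj-001", "obj-002", "obj-003"):
    print(key, "->", ring.node_for(key))
```

Every client that knows the list of servers computes the same answer on its own, so there is no lookup table to query, to keep consistent, or to become the bottleneck.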

Then, there is the economic aspect. Depending on the technology, off-the-shelf servers and disks can be used, which makes it easy to set up a new service in a competitive market, or to move a large existing dataset to object storage without investing millions and millions of dollars.

Object storage is perfect for unstructured data and for other applications (email, backup, media, archiving ...). It is not a good fit for relational databases, for example. But, as I said earlier, the fastest-growing datasets these days are in the unstructured data realm.

Depending on the type of data, the access patterns and the applications one needs to use, object storage is usually the way to go to control costs while keeping a reliable and durable storage environment once you hit hundreds of terabytes and more.

Over here at Scality, our object storage platform has all these characteristics (no volume management, elastic growth, no central database ...). Our key differentiation from others in the space is that we bring roughly the same performance as SAN/NAS systems along with all the object storage advantages. And then some ... If you want more information, let me know ;)

There, I hope this helps you understand a bit more about object storage. Sorry I carried on so much ;)

Happy holidays everyone !

Cheers

-Marc Villemade
http://j.mp/e1pjfo

