[Gluster-users] ZFS + Linux + Glusterfs for a production ready 100+ TB NAS on cloud

Di Pe dipeit at gmail.com
Tue Sep 27 00:50:00 UTC 2011


On Sun, Sep 25, 2011 at 5:51 AM, Joe Landman
<landman at scalableinformatics.com> wrote:
> On 09/25/2011 03:56 AM, Di Pe wrote:
>
>> So far the discussion has been focusing on XFS vs ZFS. I admit that I
>> am a fan of ZFS and I have only used XFS for performance reasons on
>> mysql servers where it did well. When I read something like this
>> http://oss.sgi.com/archives/xfs/2011-08/msg00320.html that makes me
>> not want to use XFS for big data. You can assume that this is a real
>
> This is a corner case bug, and one we are hoping we can get more data to the
> XFS team for.  They asked for specific information that we couldn't provide
> (as we had to fix the problem).  Note: other file systems which allow for
> sparse files *may* have similar issues.  We haven't tried yet.

Fair enough, but one of the things LLNL pointed out was that you have
to run fsck in the first place (i.e. standard file systems are not
self-healing).

>
> The issues with ZFS on Linux have to do with legal hazards.  Neither Oracle,
> nor those who claim ZFS violates their patents, would be happy to see
> license violations, or further deployment of ZFS on Linux.  I know the
> national labs in the US are happily doing the integration from source.  But
> I don't think Oracle and the patent holders would sit idly by while others
> do this.  So you'd need to use a ZFS based system such as Solaris 11 express
> to be able to use it without hassle.  BSD and Illumos may work without issue
> as well, and should be somewhat better on the legal front than Linux + ZFS.
>  I am obviously not a lawyer, and you should consult one before you proceed
> down this route.
>
>> recent bug because Joe is a smart guy who knows exactly what he is
>> doing. Joe and the Gluster guys are vendors who can work around these
>> issues and provide support. If XFS is the choice, maybe you should
>> hire them for this gig.
>>
>> ZFS typically does not have these FS repair issues in the first place.
>> The motivation of Lawrence Livermore for porting ZFS to Linux was
>> quite clear:
>>
>> http://zfsonlinux.org/docs/SC10_BoF_ZFS_on_Linux_for_Lustre.pdf
>>
>> OK, they have 50PB and we are talking about much smaller deployments.
>> However some of the limitations they report I can confirm. Also,
>> recovering from a drive failure with this whole LVM/Linux Raid stuff
>> is unpredictable. Hot swapping does not always work and if you
>> prioritize the re-sync of data to the new drive you can strangle the
>> entire box (by default the priority of the re-sync process is low on
>> linux). If you are a Linux expert you can handle this kind of stuff
>> (or hire someone) but if you ever want to give this setup to a Storage
>> Administrator you better give them something that they can use with
>> confidence (maybe less of an issue in the cloud).
>> Compare this to ZFS: re-silvering works with a very predictable
>> result and timing. There is a ton of info out there on this topic.  I
>> think that gluster users may be getting around many of the linux raid
>> issues by simply taking the entire node down (which is ok in mirrored
>> node settings) or by using hardware raid controllers. (which are often
>> not available in the cloud )
>
> There are definite advantages to better technology.  But the issue in this
> case is the legal baggage that goes along with them.
>
> BTRFS may, eventually, be a better choice.  The national labs can do this
> with something of an immunity to prosecution for license violation, by
> claiming the work is part of a research project, and won't actively be used
> in a way that would harm Oracle's interests.  And it would be ... bad ...
> for Oracle (and others) to sue the government over a relatively trivial
> violation.
>

I am trying to make sense of what people are saying about the ZFS
licensing issue. Did you hear anything from anyone at Oracle that
would indicate that they don't like ZFS on Linux? If I think it
through, I can't see why this would make any sense. The ZFS on Linux
community is extremely small and will probably always be. The main
reason, besides data size, is that the GPL doesn't like the CDDL, not
vice versa, so distros shy away from it.
The LLNL people have found a way around the GPL2 issue by implementing
it as a driver.
Why doesn't Oracle sue Nexenta? Those guys have deployed 330PB of
their storage and would be a worthy target.
The only company that seems to have issues with ZFS in general is
NetApp, and I'm sure that they don't care whether it's installed on
Solaris or on Linux. Interestingly, NetApp sued CoRaid, a disk shelf
vendor that was using Nexenta as its OS, but they did not sue Nexenta
itself. NetApp knew that their case was very weak. If they had sued
Nexenta, Nexenta would have fought back because the very existence of
the company would have been at risk. NetApp feared that Nexenta might
win, which would have confirmed the legitimacy of ZFS. CoRaid, on
the other hand, did not depend on its ZFS solution to stay in
business. They were not willing to take any risk, and this allowed
NetApp to spread some excellent FUD.
I would be interested to hear whether there have been any comments
anywhere (written or verbal) by noteworthy Oracle representatives
regarding the ZFS on Linux question (other than "Oh, we won't GPL ZFS").


> Until Oracle comes out with an absolute declaration that it's OK to use ZFS
> with Linux in a commercial setting ... yeah ... most vendors are gonna stay
> away from that scenario.
>

>> Some in the Linux community seem to be slightly opposed to ZFS (I
>> assume because of the licensing issue) and make sometimes odd
>> suggestions ("You should use BTRFS").
>
> Licensing mainly.  BTRFS has a better design, but it's not ready yet. Won't
> be for a while.

I wonder if you can compare the two. ZFS is also a volume manager and
a software RAID layer and has multiple built-in caching layers, while
the other file systems are just file systems. BTRFS seems to have some
volume management in place. I would be interested to know why you
think BTRFS has a better design than ZFS.



>
>
> --
> Joseph Landman, Ph.D
> Founder and CEO
> Scalable Informatics, Inc.
> email: landman at scalableinformatics.com
> web  : http://scalableinformatics.com
>       http://scalableinformatics.com/sicluster
> phone: +1 734 786 8423 x121
> fax  : +1 866 888 3112
> cell : +1 734 612 4615
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>


