[Gluster-users] Rsync
Hiren Joshi
josh at moonfruit.com
Wed Oct 7 10:23:30 UTC 2009
The initial copy has to happen via gluster, as I'm using
distribution as well as replication....
> -----Original Message-----
> From: Stephan von Krawczynski [mailto:skraw at ithnet.com]
> Sent: 06 October 2009 16:39
> To: Hiren Joshi
> Cc: Pavan Vilas Sondur; gluster-users at gluster.org
> Subject: Re: [Gluster-users] Rsync
>
> Remember, the gluster-team does not like my way of data-feeding. If your
> setup blows up, don't blame them (or me :-)
> I can only tell you what I am doing: simply move (or copy) the initial data
> to the primary server of the replication setup and then start glusterfsd
> for exporting.
> You will notice that the data gets replicated as soon as stat is going on
> (first ls or the like). If you already exported the data via nfs before, you
> probably only need to set up glusterfs on the very same box and use it as
> primary server. Then there is no data copying at all.
>
> After months of experiments I can say that glusterfs runs pretty stable on
> _low_ performance setups. But you have to do one thing: lengthen the
> ping-timeout (something like "option ping-timeout 120").
> If you do not do that you will lose some of your server(s) at any time and
> that will turn your glusterfs setup into a mess.
> If your environment is ok, it works. If your environment fails it will fail,
> too, sooner or later. In other words: it exports data, but it does not
> fulfill the promise of keeping your setup alive during failures - at this
> stage.
> My advice for the team is to stop whatever they may work on and take four
> physical boxes (2 server, 2 client), run a lot of bonnies and unplug/re-plug
> the servers non-deterministically. You can find all kinds of weirdos this way.
>
> Regards,
> Stephan
>
>
> On Mon, 5 Oct 2009 16:49:53 +0100
> "Hiren Joshi" <josh at moonfruit.com> wrote:
>
> > My users are more pitch fork less shooting.....
> >
> > I don't understand what you're saying, should I have locally copied all
> > the files over not using gluster before attempting an rsync?
> >
> > > -----Original Message-----
> > > From: Stephan von Krawczynski [mailto:skraw at ithnet.com]
> > > Sent: 05 October 2009 14:13
> > > To: Hiren Joshi
> > > Cc: Pavan Vilas Sondur; gluster-users at gluster.org
> > > Subject: Re: [Gluster-users] Rsync
> > >
> > > > It would be nice to remember my thread about _not_ copying data
> > > > initially to gluster via the mountpoint. And one major reason for
> > > > _local_ feed was: speed.
> > > > Obviously a lot of cases are merely impossible because of the pure
> > > > waiting time. If you had a live setup people would have already shot
> > > > you...
> > > > This is why I talked about a feature and not an accepted bug behaviour.
> > >
> > > Regards,
> > > Stephan
> > >
> > >
> > > On Mon, 5 Oct 2009 11:00:36 +0100
> > > "Hiren Joshi" <josh at moonfruit.com> wrote:
> > >
> > > > Just a quick update: The rsync is *still* not finished.
> > > >
> > > > > -----Original Message-----
> > > > > > From: gluster-users-bounces at gluster.org
> > > > > > [mailto:gluster-users-bounces at gluster.org] On Behalf Of Hiren Joshi
> > > > > Sent: 01 October 2009 16:50
> > > > > To: Pavan Vilas Sondur
> > > > > Cc: gluster-users at gluster.org
> > > > > Subject: Re: [Gluster-users] Rsync
> > > > >
> > > > > Thanks!
> > > > >
> > > > > > I'm keeping a close eye on the "is glusterfs DHT really distributed?"
> > > > > > thread =)
> > > > > >
> > > > > > I tried nodelay on and unhashd no. I tarred about 400G to the share in
> > > > > > about 17 hours (~6MB/s?) and am running an rsync now. Will post the
> > > > > > results when it's done.
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Pavan Vilas Sondur [mailto:pavan at gluster.com]
> > > > > > Sent: 01 October 2009 09:00
> > > > > > To: Hiren Joshi
> > > > > > Cc: gluster-users at gluster.org
> > > > > > Subject: Re: Rsync
> > > > > >
> > > > > > Hi,
> > > > > > > We're looking into the problem on similar setups and working on it.
> > > > > > > Meanwhile can you let us know if performance increases if you
> > > > > > > use this option:
> > > > > > >
> > > > > > > 'option transport.socket.nodelay on' in each of your
> > > > > > > protocol/client and protocol/server volumes.
> > > > > >
> > > > > > Pavan
> > > > > >
> > > > > > On 28/09/09 11:25 +0100, Hiren Joshi wrote:
> > > > > > > Another update:
> > > > > > > > It took 1240 minutes (over 20 hours) to complete on the simplified
> > > > > > > > system (without mirroring). What else can I do to debug?
> > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > > From: gluster-users-bounces at gluster.org
> > > > > > > > > [mailto:gluster-users-bounces at gluster.org] On Behalf Of Hiren Joshi
> > > > > > > > Sent: 24 September 2009 13:05
> > > > > > > > To: Pavan Vilas Sondur
> > > > > > > > Cc: gluster-users at gluster.org
> > > > > > > > Subject: Re: [Gluster-users] Rsync
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > -----Original Message-----
> > > > > > > > > From: Pavan Vilas Sondur [mailto:pavan at gluster.com]
> > > > > > > > > Sent: 24 September 2009 12:42
> > > > > > > > > To: Hiren Joshi
> > > > > > > > > Cc: gluster-users at gluster.org
> > > > > > > > > Subject: Re: Rsync
> > > > > > > > >
> > > > > > > > > Can you let us know the following:
> > > > > > > > >
> > > > > > > > > * What is the exact directory structure?
> > > > > > > > /abc/def/ghi/jkl/[1-4]
> > > > > > > > now abc, def, ghi and jkl are one of a thousand dirs.
> > > > > > > >
> > > > > > > > > > * How many files are there in each individual directory and
> > > > > > > > > > of what size?
> > > > > > > > > Each of the [1-4] dirs has about 100 files in, all under 1MB.
> > > > > > > > >
> > > > > > > > > > * It looks like each server process has 6 export
> > > > > > > > > > directories. Can you run one server process each for a single
> > > > > > > > > > export directory and check if the rsync speeds up?
> > > > > > > > > I had no idea you could do that. How? Would I need to create 6 config
> > > > > > > > > files and start gluster:
> > > > > > > > >
> > > > > > > > > /usr/sbin/glusterfsd -f /etc/glusterfs/export1.vol or similar?
> > > > > > > > >
> > > > > > > > > I'll give this a go....
> > > > > > > > >
> > > > > > > > > > * Also, do you have any benchmarks with a similar setup on say, NFS?
> > > > > > > > > NFS will create the dir tree in about 20 minutes then start copying the
> > > > > > > > > files over, it takes about 2-3 hours.
> > > > > > > >
> > > > > > > > >
> > > > > > > > > Pavan
> > > > > > > > >
> > > > > > > > > On 24/09/09 12:13 +0100, Hiren Joshi wrote:
> > > > > > > > > > It's been running for over 24 hours now.
> > > > > > > > > > > Network traffic is nominal, top shows about 200-400% cpu (7 cores so
> > > > > > > > > > > it's not too bad).
> > > > > > > > > > > About 14G of memory used (the rest is being used as disk cache).
> > > > > > > > > >
> > > > > > > > > > Thoughts?
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > <snip>
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > An update, after running the rsync for a day, I killed it and remounted
> > > > > > > > > > > > > > > > all the disks (the underlying filesystem, not the gluster) with noatime,
> > > > > > > > > > > > > > > > the rsync completed in about 600 minutes. I'm now going to try one level
> > > > > > > > > > > > > > > > up (about 1,000,000,000 dirs).
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > -----Original Message-----
> > > > > > > > > > > > > > > > > From: Pavan Vilas Sondur [mailto:pavan at gluster.com]
> > > > > > > > > > > > > > > Sent: 23 September 2009 07:55
> > > > > > > > > > > > > > > To: Hiren Joshi
> > > > > > > > > > > > > > > Cc: gluster-users at gluster.org
> > > > > > > > > > > > > > > Subject: Re: Rsync
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi Hiren,
> > > > > > > > > > > > > > > > > What glusterfs version are you using? Can you send us the
> > > > > > > > > > > > > > > > > volfiles and the log files.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Pavan
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On 22/09/09 16:01 +0100, Hiren Joshi wrote:
> > > > > > > > > > > > > > > > > > I forgot to mention, the mount is mounted with direct-io, would this
> > > > > > > > > > > > > > > > > > make a difference?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > -----Original Message-----
> > > > > > > > > > > > > > > > > > > From: gluster-users-bounces at gluster.org
> > > > > > > > > > > > > > > > > > > [mailto:gluster-users-bounces at gluster.org] On Behalf Of Hiren Joshi
> > > > > > > > > > > > > > > > > Sent: 22 September 2009 11:40
> > > > > > > > > > > > > > > > > To: gluster-users at gluster.org
> > > > > > > > > > > > > > > > > Subject: [Gluster-users] Rsync
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Hello all,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > I'm getting what I think is bizarre behaviour.... I have about 400G to
> > > > > > > > > > > > > > > > > > > rsync (rsync -av) onto a gluster share, the data is in a directory
> > > > > > > > > > > > > > > > > > > structure which has about 1000 directories per parent and about 1000
> > > > > > > > > > > > > > > > > > > directories in each of them.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > When I try to rsync an end leaf directory (this has about 4 dirs and 100
> > > > > > > > > > > > > > > > > > > files in each) the operation takes about 10 seconds. When I go one level
> > > > > > > > > > > > > > > > > > > above (1000 dirs with about 4 dirs in each with about 100 files in each)
> > > > > > > > > > > > > > > > > > > the operation takes about 10 minutes.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Now, if I then go one level above that (that's 1000 dirs with 1000 dirs
> > > > > > > > > > > > > > > > > > > in each with about 4 dirs in each with about 100 files in each) the
> > > > > > > > > > > > > > > > > > > operation takes days! Top shows glusterfsd takes 300-600% cpu usage
> > > > > > > > > > > > > > > > > > > (2X4core), I have about 48G of memory (usage is 0% as expected).
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Has anyone seen anything like this? How can I speed it up?
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Josh.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > _______________________________________________
> > > > > > > > > > > > > > > > > > Gluster-users mailing list
> > > > > > > > > > > > > > > > > > Gluster-users at gluster.org
> > > > > > > > > > > > > > > > > > http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
> > > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > >
> > > > > _______________________________________________
> > > > > Gluster-users mailing list
> > > > > Gluster-users at gluster.org
> > > > > http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
> > > > >
> > > > _______________________________________________
> > > > Gluster-users mailing list
> > > > Gluster-users at gluster.org
> > > > http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
> > > >
> > >
> > >
> >
>
>
>
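
As a rough check on two figures quoted in this thread (about 400G tarred in
about 17 hours, given as ~6MB/s, and 1000 dirs taking about 10 minutes), here
is a minimal Python sketch. It assumes that 400G means 400 GiB, that the tar
ran for exactly 17 hours, and that rsync time grows linearly with the number
of directories. None of these assumptions come from the thread, and the
function names are made up for illustration.

```python
# Rough sanity checks on figures quoted in the thread.
# Assumptions (not from the thread): "400G" means 400 GiB, the tar ran for
# exactly 17 hours, and rsync time scales linearly with directory count.


def throughput_mib_per_s(size_gib: float, hours: float) -> float:
    """Average throughput in MiB/s for size_gib GiB moved in the given hours."""
    return size_gib * 1024 / (hours * 3600)


def extrapolate_minutes(minutes_at_level: float, fanout: int) -> float:
    """Linear guess: going one directory level up multiplies the work by fanout."""
    return minutes_at_level * fanout


if __name__ == "__main__":
    rate = throughput_mib_per_s(400, 17)
    print(f"tar to the share: {rate:.1f} MiB/s")

    # 1000 dirs took about 10 minutes, so 1000 x 1000 dirs would take roughly:
    minutes = extrapolate_minutes(10, 1000)
    print(f"one level up: {minutes / 60 / 24:.1f} days")
```

Under these assumptions the script prints about 6.7 MiB/s and about 6.9 days,
so the "~6MB/s" estimate holds up and "days" for the top level is what plain
linear scaling from the 10 minute case would predict.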