[Gluster-users] Problems with write-behind with large files on Gluster 3.8.4
rgowdapp at redhat.com
Tue Feb 27 02:44:25 UTC 2018
On Tue, Feb 27, 2018 at 2:49 AM, Jim Prewett <download at carc.unm.edu> wrote:
> I'm having problems when write-behind is enabled on Gluster 3.8.4.
> I have 2 Gluster servers each with a single brick that is mirrored between
> them. The code causing these issues reads two data files each approx. 128G
> in size. It opens a third file, mmap()'s that file, and subsequently reads
> and writes to it. The third file, on successful runs (without write-behind
> enabled) is ultimately approx. 224G in size.
What exactly is the problem you are facing with write-behind enabled? Is it
that the file size is smaller?
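If it helps to narrow this down, a small reproducer may be easier to share than the full job. The following is only a sketch of the access pattern as I understand it from your description (size a file, mmap() it, then do small reads and writes at scattered offsets), not your application's actual logic. The function name, sizes, and the temporary path are made up for illustration; point it at a file on the Gluster mount and compare the reported size with write-behind on and off.

```python
import mmap
import os
import random
import tempfile


def mmap_random_rw(path, size, ops=1000, chunk=64, seed=0):
    """Create a file of `size` bytes, mmap it, and do small random reads/writes.

    Returns the file size reported by the filesystem afterwards.
    All parameters are illustrative defaults, not values from the real workload.
    """
    rng = random.Random(seed)
    with open(path, "w+b") as f:
        f.truncate(size)  # size the file before mapping it
        with mmap.mmap(f.fileno(), size) as m:
            for _ in range(ops):
                off = rng.randrange(0, size - chunk)
                _ = m[off:off + chunk]             # small read at a random offset
                m[off:off + chunk] = b"x" * chunk  # small write at the same offset
            m.flush()  # push dirty pages of the mapping back to the file
        os.fsync(f.fileno())
    return os.path.getsize(path)


if __name__ == "__main__":
    # Hypothetical target: a temp file. Replace with a path on the Gluster mount.
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "mmap_test.bin")
        print(mmap_random_rw(target, 1 << 20))
```

If the size printed on the Gluster mount differs from the size you asked for only when write-behind is enabled, that would confirm the symptom without needing the 128G inputs.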
> The servers have the IP addresses 172.17.2.254 and 172.17.2.255 and the
> client has the IP address 172.17.1.61. These are all IP over InfiniBand.
> I'm attaching logfiles for the brick and for the volume from each of the
> servers and for the client. I'm also attaching the output of "gluster
> volume info" and "gluster volume get <volume> all".
> I have only noticed problems with write-behind being enabled with this one
> particular workload. When I ran it under strace, I see it seeking all over
> the place and reading and writing little bits of data to/from the third file.
What is the pattern you see when write-behind is disabled? Can you attach
strace of the application for both scenarios - write-behind enabled and
disabled? Can you also explain the workload and its data access pattern?
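For collecting the two traces, something along these lines should work. It is a sketch that only prints the commands rather than running them, since the `gluster volume set` step is run on a server node and the strace step on the client. The volume name, application command line, and output file names are placeholders; `performance.write-behind` is the volume option that controls the translator, and `-f`, `-tt`, and `-o` are standard strace flags.

```python
import shlex


def write_behind_cmd(volume, enabled):
    """Build the argv that toggles write-behind on a volume (run on a server)."""
    return ["gluster", "volume", "set", volume,
            "performance.write-behind", "on" if enabled else "off"]


def strace_cmd(app_argv, out_file):
    """Build the argv that traces the application (run on the client).

    -f follows child processes and threads, -tt adds timestamps,
    -o writes the trace to a file instead of stderr.
    """
    return ["strace", "-f", "-tt", "-o", out_file] + list(app_argv)


def capture_plan(volume, app_argv):
    """Return the command lines for both scenarios, write-behind on then off."""
    plan = []
    for enabled in (True, False):
        tag = "wb-on" if enabled else "wb-off"
        plan.append(shlex.join(write_behind_cmd(volume, enabled)))
        plan.append(shlex.join(strace_cmd(app_argv, f"strace-{tag}.txt")))
    return plan


if __name__ == "__main__":
    # "myvol" and "./myapp ..." are placeholders for the real volume and job.
    for line in capture_plan("myvol", ["./myapp", "input1", "input2"]):
        print(line)
```

Keeping the two output files separately named makes it easy to attach both and compare them side by side.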
> For now, I'm leaving write-behind disabled. What are the performance
> implications of this for jobs that don't have this strange access pattern?
Disabling write-behind can reduce performance for sequential workloads,
because writes are then no longer acknowledged early and aggregated before
being sent to the bricks.
> My co-worker who usually maintains the Gluster filesystems here is busy
> having a baby right now and I've gotten it while he's out, so I'm /really/
> new to Gluster and am not confident that anything is correct in my
> configuration (nor do I have a specific reason to doubt its correctness! :)
> I have checked the InfiniBand fabric for errors and do not see any beyond
> the normal PortXmitWait counter. There is no firewall on any of these
> machines. Their system clocks seem to all be synchronized.
> Is there anything additional I can provide to help diagnose this problem?
> Thanks for any help you can provide! :)
> James E. Prewett Jim at Prewett.org download at hpc.unm.edu
> Systems Team Leader LoGS: http://www.hpc.unm.edu/~download/LoGS/
> Designated Security Officer OpenPGP key: pub 1024D/31816D93
> HPC Systems Engineer III UNM HPC 505.277.8210