[Gluster-users] File IO issues during brick unreachable in replica config
David Coulson
david at davidcoulson.net
Sun Jun 3 15:05:09 UTC 2012
I have a volume in a 4-way replica configuration running 3.3.0. Two
bricks are in one datacenter and two are in the other. We had some sort of
connectivity issue between the two facilities this morning, and
applications using gluster mounts (via NFS; in this case a read-only
workload) experienced IO timeouts.
I have a 5s network timeout (network.ping-timeout) on the volume and a
20s timeout in the application. Even if a read went through 3 bricks
before it found a good one, I'd expect it to take about 10s.
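
To spell out the arithmetic behind that expectation, here is a rough sketch in Python. It assumes that a read tries bricks one at a time and that each unreachable brick costs one full network.ping-timeout before the next is tried. That serial-failover model is only an assumption, not documented Gluster behaviour, and the function and constant names are made up for illustration.

```python
def worst_case_read_delay(failed_bricks_tried, ping_timeout_s):
    """Upper bound on the delay before a read reaches a good brick.

    Hypothetical model: bricks are tried serially and every unreachable
    brick costs one full ping timeout. Real failover may differ.
    """
    if failed_bricks_tried < 0 or ping_timeout_s < 0:
        raise ValueError("inputs must be non-negative")
    return failed_bricks_tried * ping_timeout_s


PING_TIMEOUT_S = 5   # network.ping-timeout set on the volume
APP_TIMEOUT_S = 20   # IO timeout configured in the application

# Two of the four bricks are in the remote datacenter, so at most two
# should be unreachable when the inter-site link drops.
delay = worst_case_read_delay(2, PING_TIMEOUT_S)
print(delay, delay < APP_TIMEOUT_S)
```

Under this model, even three unreachable bricks ahead of the good one would give 15s, which is still inside the 20s application timeout.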
What is the expectation for a read which occurs when a brick is in the
process of failing? Should the IO fail, or should it be re-routed to an
available brick? I don't see anything specific in nfs.log indicating a
particular read failed, just that the bricks went up/down.
Volume info is below. Let me know if there are other logs I should look at.
[root@dresproddns02 glusterfs]# gluster volume info svn
Volume Name: svn
Type: Replicate
Volume ID: fabe320d-5ef2-4f35-9720-eab617e13dde
Status: Started
Number of Bricks: 1 x 4 = 4
Transport-type: tcp
Bricks:
Brick1: rhesproddns01:/gluster/svn
Brick2: rhesproddns02:/gluster/svn
Brick3: dresproddns01:/gluster/svn
Brick4: dresproddns02:/gluster/svn
Options Reconfigured:
performance.write-behind-window-size: 128Mb
performance.cache-size: 256Mb
auth.allow: 10.250.53.*,10.252.248.*,169.254.*,127.0.0.1
nfs.register-with-portmap: on
nfs.disable: off
performance.stat-prefetch: 1
network.ping-timeout: 5
performance.flush-behind: on
performance.client-io-threads: 1
nfs.rpc-auth-allow: 127.0.0.1
nfs.log output is here:
http://pastebin.com/CNmP4s32