[Gluster-users] NFS I/O errors after replicated -> distributed+replicate add-brick

Tom Pepper tom at encoding.com
Thu Feb 26 22:25:40 UTC 2015


Hi, all:

We had a two-node Gluster cluster (replicated, 2 replicas). We recently added two more nodes/bricks and ran a rebalance, which made it a distributed-replicate volume.  Since then, every NFS operation (stat, read, write, whatever) returns a “Remote I/O error”, although the operation does in fact appear to succeed.

I don’t see anything in the gluster logs that would help.  The bricks are backed by ZFS vols.

Any hints?  It’s Gluster 3.6.2 on Ubuntu Trusty.

Clients using glusterfs log some concerning messages as well; see the end of this message for examples.

Thanks,
-t



Status of volume: edc1
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick fs4:/fs4/edc1					49154	Y	2435
Brick fs5:/fs5/edc1					49154	Y	2328
Brick hdfs5:/hdfs5/edc1					49152	Y	26725
Brick hdfs6:/hdfs6/edc1					49152	Y	4994
NFS Server on localhost					2049	Y	31503
Self-heal Daemon on localhost				N/A	Y	31510
NFS Server on 10.54.90.13				2049	Y	16310
Self-heal Daemon on 10.54.90.13				N/A	Y	16317
NFS Server on hdfs6					2049	Y	5006
Self-heal Daemon on hdfs6				N/A	Y	5013
NFS Server on hdfs5					2049	Y	26737
Self-heal Daemon on hdfs5				N/A	Y	26744
 
Task Status of Volume edc1
------------------------------------------------------------------------------
Task                 : Rebalance           
ID                   : b3095ab2-c428-4681-b545-36941a8816f6
Status               : completed           
 
Volume Name: edc1
Type: Distributed-Replicate
Volume ID: 2f6b5804-e2d8-4400-93e9-b172952b1aae
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: fs4:/fs4/edc1
Brick2: fs5:/fs5/edc1
Brick3: hdfs5:/hdfs5/edc1
Brick4: hdfs6:/hdfs6/edc1
Options Reconfigured:
performance.cache-size: 1GB
performance.write-behind-window-size: 1GB

volume edc1-client-0
    type protocol/client
    option send-gids true
    option transport-type tcp
    option remote-subvolume /fs4/edc1
    option remote-host fs4
    option ping-timeout 42
end-volume

volume edc1-client-1
    type protocol/client
    option send-gids true
    option transport-type tcp
    option remote-subvolume /fs5/edc1
    option remote-host fs5
    option ping-timeout 42
end-volume

volume edc1-client-2
    type protocol/client
    option send-gids true
    option transport-type tcp
    option remote-subvolume /hdfs5/edc1
    option remote-host hdfs5
    option ping-timeout 42
end-volume

volume edc1-client-3
    type protocol/client
    option send-gids true
    option transport-type tcp
    option remote-subvolume /hdfs6/edc1
    option remote-host hdfs6
    option ping-timeout 42
end-volume

volume edc1-replicate-0
    type cluster/replicate
    subvolumes edc1-client-0 edc1-client-1
end-volume

volume edc1-replicate-1
    type cluster/replicate
    subvolumes edc1-client-2 edc1-client-3
end-volume

volume edc1-dht
    type cluster/distribute
    subvolumes edc1-replicate-0 edc1-replicate-1
end-volume

volume edc1-write-behind
    type performance/write-behind
    option cache-size 1GB
    subvolumes edc1-dht
end-volume

volume edc1-read-ahead
    type performance/read-ahead
    subvolumes edc1-write-behind
end-volume

volume edc1-io-cache
    type performance/io-cache
    option cache-size 1GB
    subvolumes edc1-read-ahead
end-volume

volume edc1-quick-read
    type performance/quick-read
    option cache-size 1GB
    subvolumes edc1-io-cache
end-volume

volume edc1-open-behind
    type performance/open-behind
    subvolumes edc1-quick-read
end-volume

volume edc1-md-cache
    type performance/md-cache
    subvolumes edc1-open-behind
end-volume

volume edc1
    type debug/io-stats
    option count-fop-hits off
    option latency-measurement off
    subvolumes edc1-md-cache
end-volume
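
As a rough cross-check of the graph above, here is a minimal Python sketch that parses volfile text and lists which bricks sit under the distribute translator. It is not a Gluster tool: the names parse_volfile and leaves_under are made up for illustration, and SAMPLE is a trimmed copy of the volfile above with most options omitted.

```python
# Minimal volfile reader: an illustrative sketch, not part of Gluster.
# SAMPLE is a trimmed excerpt of the volfile in this message.
SAMPLE = """
volume edc1-client-0
    type protocol/client
    option remote-subvolume /fs4/edc1
    option remote-host fs4
end-volume

volume edc1-client-1
    type protocol/client
    option remote-subvolume /fs5/edc1
    option remote-host fs5
end-volume

volume edc1-client-2
    type protocol/client
    option remote-subvolume /hdfs5/edc1
    option remote-host hdfs5
end-volume

volume edc1-client-3
    type protocol/client
    option remote-subvolume /hdfs6/edc1
    option remote-host hdfs6
end-volume

volume edc1-replicate-0
    type cluster/replicate
    subvolumes edc1-client-0 edc1-client-1
end-volume

volume edc1-replicate-1
    type cluster/replicate
    subvolumes edc1-client-2 edc1-client-3
end-volume

volume edc1-dht
    type cluster/distribute
    subvolumes edc1-replicate-0 edc1-replicate-1
end-volume
"""


def parse_volfile(text):
    """Return {name: {"type": str, "options": dict, "subvolumes": list}}."""
    graph = {}
    current = None
    for raw in text.splitlines():
        parts = raw.split()
        if not parts:
            continue
        keyword = parts[0]
        if keyword == "volume" and len(parts) == 2:
            current = parts[1]
            graph[current] = {"type": None, "options": {}, "subvolumes": []}
        elif keyword == "end-volume":
            current = None
        elif current is None:
            continue  # ignore anything outside a volume ... end-volume block
        elif keyword == "type" and len(parts) >= 2:
            graph[current]["type"] = parts[1]
        elif keyword == "option" and len(parts) >= 3:
            graph[current]["options"][parts[1]] = " ".join(parts[2:])
        elif keyword == "subvolumes":
            graph[current]["subvolumes"] = parts[1:]
    return graph


def leaves_under(graph, name):
    """Collect, in order, the leaf volumes reachable from `name`."""
    children = graph[name]["subvolumes"]
    if not children:
        return [name]
    leaves = []
    for child in children:
        leaves.extend(leaves_under(graph, child))
    return leaves


graph = parse_volfile(SAMPLE)
for leaf in leaves_under(graph, "edc1-dht"):
    opts = graph[leaf]["options"]
    print(leaf, opts["remote-host"] + ":" + opts["remote-subvolume"])
```

If the add-brick took effect as intended, all four bricks should show up under edc1-dht, two per replicate subvolume.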





[2015-02-26 22:13:07.473839] I [dht-common.c:1822:dht_lookup_cbk] 0-edc1-dht: Entry /cc/aspera/sandbox/mwt-wea/.local missing on subvol edc1-replicate-0
[2015-02-26 22:13:07.474890] I [dht-common.c:1822:dht_lookup_cbk] 0-edc1-dht: Entry /cc/aspera/sandbox/mwt-wea/.local missing on subvol edc1-replicate-0
[2015-02-26 22:13:07.475891] I [dht-common.c:1822:dht_lookup_cbk] 0-edc1-dht: Entry /cc/aspera/sandbox/mwt-wea/.local missing on subvol edc1-replicate-0
[2015-02-26 22:13:07.531037] I [dht-common.c:1822:dht_lookup_cbk] 0-edc1-dht: Entry /cc/aspera/sandbox/mwt-wea/.local missing on subvol edc1-replicate-0
[2015-02-26 22:13:07.532210] I [dht-common.c:1822:dht_lookup_cbk] 0-edc1-dht: Entry /cc/aspera/sandbox/mwt-wea/.local missing on subvol edc1-replicate-0
[2015-02-26 22:13:07.533238] I [dht-common.c:1822:dht_lookup_cbk] 0-edc1-dht: Entry /cc/aspera/sandbox/mwt-wea/.local missing on subvol edc1-replicate-0
[2015-02-26 22:13:08.157109] I [dht-common.c:1822:dht_lookup_cbk] 0-edc1-dht: Entry /cc/aspera/sandbox/mwt-wea/.pam_environment missing on subvol edc1-replicate-0
[2015-02-26 22:13:08.158555] I [dht-common.c:1822:dht_lookup_cbk] 0-edc1-dht: Entry /cc/aspera/sandbox/mwt-wea/.pam_environment missing on subvol edc1-replicate-0
[2015-02-26 22:13:08.200168] I [dht-common.c:1822:dht_lookup_cbk] 0-edc1-dht: Entry /cc/aspera/sandbox/mwt-wea/.ssh missing on subvol edc1-replicate-0
[2015-02-26 22:13:08.252177] I [dht-common.c:1822:dht_lookup_cbk] 0-edc1-dht: Entry /cc/aspera/sandbox/mwt-wea/.local missing on subvol edc1-replicate-0
[2015-02-26 22:13:08.253435] I [dht-common.c:1822:dht_lookup_cbk] 0-edc1-dht: Entry /cc/aspera/sandbox/mwt-wea/.local missing on subvol edc1-replicate-0
[2015-02-26 22:13:08.254591] I [dht-common.c:1822:dht_lookup_cbk] 0-edc1-dht: Entry /cc/aspera/sandbox/mwt-wea/.local missing on subvol edc1-replicate-0
[2015-02-26 22:13:18.356816] I [dht-common.c:1822:dht_lookup_cbk] 0-edc1-dht: Entry /cc/aspera/sandbox/mwt-wea/.local missing on subvol edc1-replicate-0
[2015-02-26 22:13:18.358082] I [dht-common.c:1822:dht_lookup_cbk] 0-edc1-dht: Entry /cc/aspera/sandbox/mwt-wea/.local missing on subvol edc1-replicate-0
[2015-02-26 22:13:18.359225] I [dht-common.c:1822:dht_lookup_cbk] 0-edc1-dht: Entry /cc/aspera/sandbox/mwt-wea/.local missing on subvol edc1-replicate-0
[2015-02-26 22:13:23.915730] I [dht-common.c:1822:dht_lookup_cbk] 0-edc1-dht: Entry /cc/aspera/sandbox/mwt-wea/.pam_environment missing on subvol edc1-replicate-0
[2015-02-26 22:13:23.916856] I [dht-common.c:1822:dht_lookup_cbk] 0-edc1-dht: Entry /cc/aspera/sandbox/mwt-wea/.pam_environment missing on subvol edc1-replicate-0
[2015-02-26 22:13:23.968891] I [dht-common.c:1822:dht_lookup_cbk] 0-edc1-dht: Entry /cc/aspera/sandbox/mwt-wea/.ssh missing on subvol edc1-replicate-0
[2015-02-26 22:13:24.024415] I [dht-common.c:1822:dht_lookup_cbk] 0-edc1-dht: Entry /cc/aspera/sandbox/mwt-wea/.local missing on subvol edc1-replicate-0
[2015-02-26 22:13:24.025687] I [dht-common.c:1822:dht_lookup_cbk] 0-edc1-dht: Entry /cc/aspera/sandbox/mwt-wea/.local missing on subvol edc1-replicate-0
[2015-02-26 22:13:24.026904] I [dht-common.c:1822:dht_lookup_cbk] 0-edc1-dht: Entry /cc/aspera/sandbox/mwt-wea/.local missing on subvol edc1-replicate-0
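
Since the log above is repetitive, a small Python sketch can condense it to one count per (path, subvolume) pair. The regular expression is only an assumption fitted to the lines shown here, not a general Gluster log grammar, and tally_missing is a hypothetical helper name. SAMPLE_LOG holds three of the lines above.

```python
import re
from collections import Counter

# Pattern fitted to the dht_lookup_cbk lines in this message; an assumption,
# not an official description of the Gluster log format.
LOG_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"\[(?P<src>[^\]]+)\] (?P<xlator>\S+): "
    r"Entry (?P<path>.+) missing on subvol (?P<subvol>\S+)$"
)

SAMPLE_LOG = """\
[2015-02-26 22:13:07.473839] I [dht-common.c:1822:dht_lookup_cbk] 0-edc1-dht: Entry /cc/aspera/sandbox/mwt-wea/.local missing on subvol edc1-replicate-0
[2015-02-26 22:13:08.157109] I [dht-common.c:1822:dht_lookup_cbk] 0-edc1-dht: Entry /cc/aspera/sandbox/mwt-wea/.pam_environment missing on subvol edc1-replicate-0
[2015-02-26 22:13:08.252177] I [dht-common.c:1822:dht_lookup_cbk] 0-edc1-dht: Entry /cc/aspera/sandbox/mwt-wea/.local missing on subvol edc1-replicate-0
"""


def tally_missing(lines):
    """Count 'Entry ... missing on subvol ...' messages per (path, subvol)."""
    counts = Counter()
    for line in lines:
        match = LOG_RE.match(line.strip())
        if match:
            counts[(match.group("path"), match.group("subvol"))] += 1
    return counts


counts = tally_missing(SAMPLE_LOG.splitlines())
for (path, subvol), n in counts.most_common():
    print(n, path, subvol)
```

Counted by hand, the 21 lines above break down as 15 for .local, 4 for .pam_environment and 2 for .ssh, all reported missing on edc1-replicate-0, which is the original replica pair (fs4/fs5).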

