[Gluster-users] gluster 3.4.5: lots of permission problems after add-brick/rebalance

Colin Coghill colin.coghill at koordinates.com
Thu Oct 30 04:46:35 UTC 2014


Hi,

We run a fairly large two-server replicated volume with five 2TB bricks on
each server. We recently added two more bricks to each server and started a
rebalance.

We immediately started getting client errors and halted the rebalance.

Since then we've been getting more and more errors.

Symptoms:

Client:

  gets "Permission denied" errors when accessing a file. If root (or
  occasionally another user) then accesses the file, it works, and after
  that it also works for the original user on that client.

  logs contain lots of errors like:

     [2014-10-30 03:14:08.912942] W [client-rpc-fops.c:259:client3_3_mknod_cbk] 0-storage-client-3: remote operation failed: Permission denied. Path: /wms_pyramid/r_00000525/12285/r_00000525/8/8.qix (00000000-0000-0000-0000-000000000000)



Server:

   Link files and the corresponding data files (for paths that have caused
   the problems above) seem to have different ownership:

   $ ls -l brick*b/vol/raster/r_000002ff/000008ba.tif
    ---------T 2 otm       otm              0 Oct 30 17:03 brick0b/vol/raster/r_000002ff/000008ba.tif
    -rw-rw-r-- 2 dandelion dandelion 15867977 Mar 22  2013 brick1b/vol/raster/r_000002ff/000008ba.tif
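
   The pattern in this listing (a zero-byte entry whose only mode bit is
   the sticky bit, next to a normal data file with a different owner) can
   be checked mechanically. A minimal sketch in Python, assuming it is run
   directly against the brick directories on a server; the function names
   are hypothetical and not part of any gluster tool:

```python
import os
import stat

def looks_like_dht_linkfile(st):
    """True for a zero-byte regular file whose only mode bit is sticky (---------T)."""
    return (stat.S_ISREG(st.st_mode)
            and stat.S_IMODE(st.st_mode) == stat.S_ISVTX
            and st.st_size == 0)

def owner_mismatch(link_path, data_path):
    """True if link_path looks like a link file and its uid/gid differ from data_path's."""
    link_st = os.stat(link_path)
    data_st = os.stat(data_path)
    return looks_like_dht_linkfile(link_st) and (
        (link_st.st_uid, link_st.st_gid) != (data_st.st_uid, data_st.st_gid)
    )
```

   For the two paths listed here, owner_mismatch("brick0b/...", "brick1b/...")
   would be expected to return True, since the link file is owned by otm
   and the data file by dandelion.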


  $ sudo getfattr -e hex -m- -d brick*b/vol/raster/r_000002ff/000008ba.tif
  # file: brick0b/vol/raster/r_000002ff/000008ba.tif
  trusted.afr.storage-client-0=0x000000000000000100000000
  trusted.afr.storage-client-1=0x000000000000000100000000
  trusted.gfid=0x54b3b2ae42504754a193505a933c30b7
  trusted.glusterfs.dht.linkto=0x73746f726167652d7265706c69636174652d3100

  # file: brick1b/vol/raster/r_000002ff/000008ba.tif
  trusted.afr.storage-client-2=0x000000000000000000000000
  trusted.afr.storage-client-3=0x000000000000000000000000
  trusted.gfid=0x54b3b2ae42504754a193505a933c30b7
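
  The hex values in this getfattr output can be decoded by hand. A minimal
  sketch in Python, assuming the usual layout of a trusted.afr.* value as
  three big-endian 32-bit pending counters (data, metadata, entry); the
  function names are made up for illustration:

```python
import struct
import uuid

def _strip_0x(hex_value):
    """Drop the leading 0x that getfattr -e hex prints."""
    return hex_value[2:] if hex_value.startswith("0x") else hex_value

def decode_afr_changelog(hex_value):
    """Return (data, metadata, entry) pending counters from a trusted.afr.* value."""
    raw = bytes.fromhex(_strip_0x(hex_value))
    return struct.unpack(">III", raw)

def decode_linkto(hex_value):
    """Return the subvolume name stored in trusted.glusterfs.dht.linkto."""
    raw = bytes.fromhex(_strip_0x(hex_value))
    return raw.rstrip(b"\x00").decode("ascii")

def decode_gfid(hex_value):
    """Return trusted.gfid in the dashed UUID form used in the logs."""
    return str(uuid.UUID(hex=_strip_0x(hex_value)))

print(decode_linkto("0x73746f726167652d7265706c69636174652d3100"))
print(decode_afr_changelog("0x000000000000000100000000"))
print(decode_gfid("0x54b3b2ae42504754a193505a933c30b7"))
```

  Under that assumption, the link file on brick0b points at
  storage-replicate-1 and carries one pending metadata operation for both
  storage-client-0 and storage-client-1, while the data file on brick1b
  has nothing pending. Both entries share the same gfid.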


  logs contain lots of:

   [2014-10-29 19:43:35.838942] I [server-rpc-fops.c:575:server_mknod_cbk] 0-storage-server: 15505: MKNOD /media/settings/branding/mfe-logo-white-2-1.png (1776f2b8-8857-4801-8c58-266eafcd7a87/mfe-logo-white-2-1.png) ==> (Permission denied)

  and

   [2014-10-30 04:28:58.707352] E [marker.c:2080:marker_setattr_cbk] 0-storage-marker: Operation not permitted occurred during setattr of <nul>
   [2014-10-30 04:28:58.707407] I [server-rpc-fops.c:1778:server_setattr_cbk] 0-storage-server: 1210198: SETATTR /raster/r_0000081a/0000005e.jp2.aux.xml (0df0682b-0791-4327-bb5d-72ed916349fd) ==> (Operation not permitted)


  This is happening to old files that have not been changed since long
before the rebalance, and it is still happening even though I believe the
rebalance has been stopped.

We have restarted gluster-server on both servers. The volume heal output
shows no current split-brain or heal-failed entries.


 It does seem, to me, to match:
     https://bugzilla.redhat.com/show_bug.cgi?id=884597

 Except that bug was supposedly fixed before 3.4.5.


Help!?


- Colin
-- 
Colin Coghill
DevOps Engineer
Koordinates
colin.coghill at koordinates.com