[Bugs] [Bug 1233518] New: [Backup]: Glusterfind session(s) created before starting the volume results in 'changelog not available' error, eventually

bugzilla at redhat.com bugzilla at redhat.com
Fri Jun 19 05:38:01 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1233518

            Bug ID: 1233518
           Summary: [Backup]: Glusterfind session(s) created before
                    starting the volume results in 'changelog not
                    available' error, eventually
           Product: GlusterFS
           Version: 3.7.0
         Component: glusterfind
          Severity: high
          Assignee: bugs at gluster.org
          Reporter: khiremat at redhat.com
        QA Contact: bugs at gluster.org
                CC: avishwan at redhat.com, bugs at gluster.org,
                    khiremat at redhat.com, mchangir at redhat.com,
                    sanandpa at redhat.com
        Depends On: 1228598, 1232729
            Blocks: 1223636



+++ This bug was initially created as a clone of Bug #1232729 +++

+++ This bug was initially created as a clone of Bug #1228598 +++

Description of problem:

Have a 4+2 disperse volume 'disperse' and create glusterfind sessions for the
same. Mount it over nfs/fuse and create a couple of files/dirs. Execute the
glusterfind pre command it fails with 'Historical changelogs not available' on
both the nodes of the cluster


[root at dhcp43-140 ~]# 
[root at dhcp43-140 ~]# gluster v status dispersevol
Status of volume: dispersevol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.43.140:/rhs/thinbrick1/disperse 49188     0          Y       21579
Brick 10.70.43.140:/rhs/thinbrick2/disperse 49189     0          Y       21599
Brick 10.70.42.75:/rhs/thinbrick2/disperse  49184     0          Y       17892
Brick 10.70.42.75:/rhs/thinbrick1/disperse  49185     0          Y       17912
Brick 10.70.43.140:/rhs/thinbrick3/disperse 49190     0          Y       21619
Brick 10.70.42.75:/rhs/thinbrick3/disperse  49186     0          Y       17932
NFS Server on localhost                     2049      0          Y       21640
NFS Server on 10.70.42.75                   2049      0          Y       17953

Task Status of Volume dispersevol
------------------------------------------------------------------------------
There are no active volume tasks

[root at dhcp43-140 ~]# mkdir /mnt/disperse
[root at dhcp43-140 ~]# mount -t glusterfs 10.70.43.140:/disperse /mnt/disperse
Mount failed. Please check the log file for more details.
[root at dhcp43-140 ~]# mount -t glusterfs 10.70.43.140:/dispersevol /mnt/disperse
[root at dhcp43-140 ~]# cd /mnt/disperse
[root at dhcp43-140 disperse]# ls -a
.  ..  .trashcan
[root at dhcp43-140 disperse]# 
[root at dhcp43-140 disperse]# 
[root at dhcp43-140 disperse]# 
[root at dhcp43-140 disperse]# echo "what a day.. hmph" > test1
[root at dhcp43-140 disperse]# mkdir dir2
[root at dhcp43-140 disperse]# ln test1 test1_ln
[root at dhcp43-140 disperse]# echo "what a day.. hmph" > test2
[root at dhcp43-140 disperse]# ln -s test2 test2_sln
[root at dhcp43-140 disperse]# cd
[root at dhcp43-140 ~]# 
[root at dhcp43-140 ~]# 
[root at dhcp43-140 ~]# 
[root at dhcp43-140 ~]# gluster v info dispersevol

Volume Name: dispersevol
Type: Disperse
Volume ID: e70dccdf-9f39-494a-a166-aa142049de07
Status: Created
Number of Bricks: 1 x (4 + 2) = 6
Transport-type: tcp
Bricks:
Brick1: 10.70.43.140:/rhs/thinbrick1/disperse
Brick2: 10.70.43.140:/rhs/thinbrick2/disperse
Brick3: 10.70.42.75:/rhs/thinbrick2/disperse
Brick4: 10.70.42.75:/rhs/thinbrick1/disperse
Brick5: 10.70.43.140:/rhs/thinbrick3/disperse
Brick6: 10.70.42.75:/rhs/thinbrick3/disperse
Options Reconfigured:
changelog.changelog: on
storage.build-pgfid: on
performance.readdir-ahead: on
[root at dhcp43-140 ~]# gluster v start dispersevol
volume start: dispersevol: success
[root at dhcp43-140 ~]# 
[root at dhcp43-140 ~]# glusterfind list
SESSION                   VOLUME                    SESSION TIME             
---------------------------------------------------------------------------
testvol_sess              testvol                   2015-06-04 23:38:02      
cross32                   cross3vol                 2015-06-04 22:46:01      
sessd2                    dispersevol               2015-06-05 19:44:31      
sessd3                    dispersevol               2015-06-05 19:44:39      
sessd1                    dispersevol               2015-06-05 19:44:23      
cross31                   cross3vol                 2015-06-05 00:00:28      
sessc                     cross3vol                 2015-06-05 15:44:47      
[root at dhcp43-140 ~]# 
[root at dhcp43-140 ~]# glusterfind pre sessd1 dispersevol outd1.txt
--regenerate-outfile
10.70.43.140 - pre failed: /rhs/thinbrick2/disperse Historical Changelogs not
available: [Errno 2] No such file or directory

10.70.43.140 - pre failed: /rhs/thinbrick1/disperse Historical Changelogs not
available: [Errno 2] No such file or directory

10.70.43.140 - pre failed: /rhs/thinbrick3/disperse Historical Changelogs not
available: [Errno 2] No such file or directory

10.70.42.75 - pre failed: /rhs/thinbrick2/disperse Historical Changelogs not
available: [Errno 2] No such file or directory

10.70.42.75 - pre failed: /rhs/thinbrick1/disperse Historical Changelogs not
available: [Errno 2] No such file or directory

10.70.42.75 - pre failed: /rhs/thinbrick3/disperse Historical Changelogs not
available: [Errno 2] No such file or directory

Generated output file /root/outd1.txt
[root at dhcp43-140 ~]#

***********************************************************88
This is what is seen in the backend brick logs:


[2015-06-05 14:32:15.422293] I [rpcsvc.c:2213:rpcsvc_set_outstanding_rpc_limit]
0-rpc-service: Configured rpc.outstanding-rpc-limit with value 64
[2015-06-05 14:32:15.422417] W [options.c:936:xl_opt_validate]
0-dispersevol-server: option 'listen-port' is deprecated, preferred is
'transport.socket.listen-port', continuing with correction
[2015-06-05 14:32:15.425732] E [changelog-helpers.c:634:htime_open]
0-dispersevol-changelog: Error extracting HTIME_CURRENT: No data available.
[2015-06-05 14:32:15.427484] E [ctr-helper.c:250:extract_ctr_options]
0-gfdbdatastore: CTR Xlator is disabled.
...
...
...
[2015-06-05 14:42:29.646309] I [server-handshake.c:585:server_setvolume]
0-dispersevol-server: accepted client from
dhcp43-140.lab.eng.blr.redhat.com-21847-2015/06/05-14:42:29:558126-dispersevol-client-0-0-0
(version: 3.7.0)
[2015-06-05 14:42:53.867949] E [posix-helpers.c:1088:posix_handle_pair]
0-dispersevol-posix: /rhs/thinbrick1/disperse/dir2:
key:glusterfs.inodelk-dom-count flags: 1 length:23 error:Operation not
supported
[2015-06-05 14:42:53.868063] E [posix.c:1391:posix_mkdir] 0-dispersevol-posix:
setting xattrs on /rhs/thinbrick1/disperse/dir2 failed (Operation not
supported)
[2015-06-05 14:43:19.746591] E [posix-helpers.c:1088:posix_handle_pair]
0-dispersevol-posix: /rhs/thinbrick1/disperse/test2_sln:
key:glusterfs.inodelk-dom-count flags: 1 length:23 error:Operation not
supported
[2015-06-05 14:43:19.746643] E [posix.c:1897:posix_symlink]
0-dispersevol-posix: setting xattrs on /rhs/thinbrick1/disperse/test2_sln
failed (Operation not supported)
[2015-06-05 14:44:27.716197] I [server-handshake.c:585:server_setvolume]
0-dispersevol-server: accepted client from
dhcp42-75.lab.eng.blr.redhat.com-18098-2015/06/05-14:44:27:456748-dispersevol-client-0-0-0
(version: 3.7.0)
[2015-06-05 14:44:56.663817] W [socket.c:642:__socket_rwv]
0-dispersevol-changelog: readv on
/var/run/gluster/.0ea84248fa0f63dc12f2a29bd7f49b5921945.sock failed (No data
available)

*********************************************************



Version-Release number of selected component (if applicable):


How reproducible: 1:1

RCA:
---

This is nothing to do with disperse volume. It should happen with any other
volume if it is done in following way.

1. gluster volume is created first
2. Immediately glusterfind session is created which updates cuurent time as
   say 't1' as start time for changelogs available time.
   Even though, the changelog is enabled (marked 'on' in vol file), backend  
   .glusterfs and actual changelog files gets created during volume start.
3. gluster volume is started.
   The changelog and HITIME.TSTAMP gets created during volume start and the
   TSTAMP is current time say 't1+n'

  If above is the case, then glustefind pre is requesting History API with
  start time as 't1' where as the changelog is actually available from 't1+s'
  which will always fail.

Solution:
--------
With the following patches which is already merged, the glusterfind session
creation will fail unless volume is oneline. So that would fix the above issue


As mentioned in the RCA, changing the title of this bug.

Also, the patch mentioned above fixes the issue only when the status of the
volume is in 'stopped' state. The issue still persists if the volume status is
'created'.


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1223636
[Bug 1223636] 3.1 QE Tracker
https://bugzilla.redhat.com/show_bug.cgi?id=1228598
[Bug 1228598] [Backup]: Glusterfind session(s) created before starting the
volume results in 'changelog not available' error, eventually
https://bugzilla.redhat.com/show_bug.cgi?id=1232729
[Bug 1232729] [Backup]: Glusterfind session(s) created before starting the
volume results in 'changelog not available' error, eventually
-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are on the CC list for the bug.
You are the assignee for the bug.


More information about the Bugs mailing list