[Bugs] [Bug 1147236] New: gluster 3.6.0 beta1 cannot write file on gluster 3.3 volume

bugzilla at redhat.com bugzilla at redhat.com
Sun Sep 28 10:09:25 UTC 2014


https://bugzilla.redhat.com/show_bug.cgi?id=1147236

            Bug ID: 1147236
           Summary: gluster 3.6.0 beta1 cannot write file on gluster 3.3
                    volume
           Product: GlusterFS
           Version: 3.6.0
         Component: protocol
          Severity: high
          Assignee: gluster-bugs at redhat.com
          Reporter: jiaowopan at gmail.com
                CC: bugs at gluster.org



Created attachment 941976
  --> https://bugzilla.redhat.com/attachment.cgi?id=941976&action=edit
mount log

Description of problem:

with gluster 3.6.0 beta1 as client, and gluster 3.3.0 as server, the write
operation fail   while  the mount operation and read operation success.

Version-Release number of selected component (if applicable):
glusterfs 3.6.0 beta1

How reproducible:

mount gluster 3.3 volume with gluster 3.6 client

Steps to Reproduce:
1. client install gluster 3.6.0 beta1
2. server install gluster 3.3.0 
3. client mount volume of gluster 3.3.0
4. write file under mount point.

Actual results:

cannot write successfully.

Expected results:

can write successfully.

Additional info:

hi  , maybe I need to comment some more infomation of the situation.

----->>>>>  version of client and server  <<<<<-----

client version :  glusterfs 3.6.0 beta1
server version:   glusterfs 3.3.0


---------->>>> volume info <<<<<-------------

Volume Name: myvol
Type: Distributed-Replicate
Volume ID: c36dfe1c-3f95-4d64-9dae-1b5916b56b19
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.10.10.10:/mnt/xfsd/myvol-0
Brick2: 10.10.10.10:/mnt/xfsd/myvol-1
Brick3: 10.10.10.10:/mnt/xfsd/myvol-2
Brick4: 10.10.10.10:/mnt/xfsd/myvol-3

-------------->>>> mount with --debug   <<<<<-------------------
/usr/sbin/glusterfs --volfile-server=10.10.10.10 --volfile-id=myvol
/mnt/myvol --debug

then, at another window of client, I  goto the mount point /mnt/myvol and
 execute "ls"
then a couple of lines present. I think no error message.   following, I
execute the command

echo "hello,gluster3.3" > /mnt/myvol/file

then some "error like" message are present .  Again, with "ls" , I can find
the file with name "file" under
mount point.  but "cat /mnt/myvol/file" will result in nothing. which means
, this is an empyt file !!!
After many tests, a conclusion is coming :   I cannot write anything to a
file, while I can create file successfully,
and can read an  existing file successfully.


-------------->>>>> mount log  (some "error like" message when write file )
<<<<<<----------------

[2014-09-28 07:41:28.585181] D [logging.c:1781:__gf_log_inject_timer_event]
0-logging-infra: Starting timer now. Timeout = 120, current buf size = 5
[2014-09-28 07:42:42.433338] D [MSGID: 0]
[dht-common.c:621:dht_revalidate_cbk] 0-myvol-dht: revalidate lookup of /
returned with op_ret 0 and op_errno 117
[2014-09-28 07:43:19.487414] D [MSGID: 0] [dht-common.c:2182:dht_lookup]
0-myvol-dht: Calling fresh lookup for /file on myvol-replicate-1
[2014-09-28 07:43:19.558459] D [MSGID: 0]
[dht-common.c:1818:dht_lookup_cbk] 0-myvol-dht: fresh_lookup returned for
/file with op_ret -1 and op_errno 2
[2014-09-28 07:43:19.558497] I [dht-common.c:1822:dht_lookup_cbk]
0-myvol-dht: Entry /file missing on subvol myvol-replicate-1
[2014-09-28 07:43:19.558517] D [MSGID: 0]
[dht-common.c:1607:dht_lookup_everywhere] 0-myvol-dht: winding lookup call
to 2 subvols
[2014-09-28 07:43:19.634376] D [MSGID: 0]
[dht-common.c:1413:dht_lookup_everywhere_cbk] 0-myvol-dht: returned with
op_ret -1 and op_errno 2 (/file) from subvol myvol-replicate-0
[2014-09-28 07:43:19.634573] D [MSGID: 0]
[dht-common.c:1413:dht_lookup_everywhere_cbk] 0-myvol-dht: returned with
op_ret -1 and op_errno 2 (/file) from subvol myvol-replicate-1
[2014-09-28 07:43:19.634605] D [MSGID: 0]
[dht-common.c:1086:dht_lookup_everywhere_done] 0-myvol-dht: STATUS:
hashed_subvol myvol-replicate-1 cached_subvol null
[2014-09-28 07:43:19.634624] D [MSGID: 0]
[dht-common.c:1147:dht_lookup_everywhere_done] 0-myvol-dht: There was no
cached file and  unlink on hashed is not skipped /file
[2014-09-28 07:43:19.634663] D [fuse-resolve.c:83:fuse_resolve_entry_cbk]
0-fuse: 00000000-0000-0000-0000-000000000001/file: failed to resolve (No
such file or directory)
[2014-09-28 07:43:19.708608] I [dht-common.c:1822:dht_lookup_cbk]
0-myvol-dht: Entry /file missing on subvol myvol-replicate-1
[2014-09-28 07:43:19.781420] D [logging.c:1937:_gf_msg_internal]
0-logging-infra: Buffer overflow of a buffer whose size limit is 5. About
to flush least recently used log message to disk
[2014-09-28 07:43:19.708640] D [MSGID: 0]
[dht-common.c:1607:dht_lookup_everywhere] 0-myvol-dht: winding lookup call
to 2 subvols
[2014-09-28 07:43:19.781418] D [MSGID: 0]
[dht-common.c:1413:dht_lookup_everywhere_cbk] 0-myvol-dht: returned with
op_ret -1 and op_errno 2 (/file) from subvol myvol-replicate-0
[2014-09-28 07:43:19.781629] D [MSGID: 0]
[dht-common.c:1413:dht_lookup_everywhere_cbk] 0-myvol-dht: returned with
op_ret -1 and op_errno 2 (/file) from subvol myvol-replicate-1
[2014-09-28 07:43:19.781653] D [MSGID: 0]
[dht-common.c:1086:dht_lookup_everywhere_done] 0-myvol-dht: STATUS:
hashed_subvol myvol-replicate-1 cached_subvol null
[2014-09-28 07:43:19.851925] I [dht-common.c:1822:dht_lookup_cbk]
0-myvol-dht: Entry /file missing on subvol myvol-replicate-1
[2014-09-28 07:43:19.851954] D [logging.c:1937:_gf_msg_internal]
0-logging-infra: Buffer overflow of a buffer whose size limit is 5. About
to flush least recently used log message to disk
The message "D [MSGID: 0] [dht-common.c:1818:dht_lookup_cbk] 0-myvol-dht:
fresh_lookup returned for /file with op_ret -1 and op_errno 2" repeated 2
times between [2014-09-28 07:43:19.558459] and [2014-09-28 07:43:19.851922]
[2014-09-28 07:43:19.851954] D [MSGID: 0]
[dht-common.c:1607:dht_lookup_everywhere] 0-myvol-dht: winding lookup call
to 2 subvols
[2014-09-28 07:43:19.922764] D [MSGID: 0]
[dht-common.c:1413:dht_lookup_everywhere_cbk] 0-myvol-dht: returned with
op_ret -1 and op_errno 2 (/file) from subvol myvol-replicate-0
[2014-09-28 07:43:19.922925] D [MSGID: 0]
[dht-common.c:1413:dht_lookup_everywhere_cbk] 0-myvol-dht: returned with
op_ret -1 and op_errno 2 (/file) from subvol myvol-replicate-1
[2014-09-28 07:43:19.922974] D [fuse-resolve.c:83:fuse_resolve_entry_cbk]
0-fuse: 00000000-0000-0000-0000-000000000001/file: failed to resolve (No
such file or directory)
[2014-09-28 07:43:19.997012] D [logging.c:1937:_gf_msg_internal]
0-logging-infra: Buffer overflow of a buffer whose size limit is 5. About
to flush least recently used log message to disk
The message "D [MSGID: 0] [dht-common.c:1147:dht_lookup_everywhere_done]
0-myvol-dht: There was no cached file and  unlink on hashed is not skipped
/file" repeated 2 times between [2014-09-28 07:43:19.634624] and
[2014-09-28 07:43:19.922951]
[2014-09-28 07:43:19.997011] D [MSGID: 0]
[dht-diskusage.c:96:dht_du_info_cbk] 0-myvol-dht: subvolume
'myvol-replicate-0': avail_percent is: 99.00 and avail_space is:
44000304627712 and avail_inodes is: 99.00
[2014-09-28 07:43:19.997134] D [MSGID: 0]
[dht-diskusage.c:96:dht_du_info_cbk] 0-myvol-dht: subvolume
'myvol-replicate-1': avail_percent is: 99.00 and avail_space is:
44000304627712 and avail_inodes is: 99.00
[2014-09-28 07:43:19.997179] D
[afr-transaction.c:1166:afr_post_nonblocking_entrylk_cbk]
0-myvol-replicate-1: Non blocking entrylks done. Proceeding to FOP
[2014-09-28 07:43:20.067587] D [afr-lk-common.c:447:transaction_lk_op]
0-myvol-replicate-1: lk op is for a transaction
[2014-09-28 07:43:20.216287] D
[afr-transaction.c:1116:afr_post_nonblocking_inodelk_cbk]
0-myvol-replicate-1: Non blocking inodelks done. Proceeding to FOP
[2014-09-28 07:43:20.356844] W [client-rpc-fops.c:850:client3_3_writev_cbk]
0-myvol-client-3: remote operation failed: Transport endpoint is not
connected
[2014-09-28 07:43:20.356979] W [client-rpc-fops.c:850:client3_3_writev_cbk]
0-myvol-client-2: remote operation failed: Transport endpoint is not
connected
[2014-09-28 07:43:20.357009] D [afr-lk-common.c:447:transaction_lk_op]
0-myvol-replicate-1: lk op is for a transaction
[2014-09-28 07:43:20.428013] W [fuse-bridge.c:1261:fuse_err_cbk]
0-glusterfs-fuse: 14: FLUSH() ERR => -1 (Transport endpoint is not
connected)
[2014-09-28 07:43:28.593512] D [logging.c:1816:gf_log_flush_timeout_cbk]
0-logging-infra: Log timer timed out. About to flush outstanding messages
if present
The message "D [MSGID: 0] [dht-common.c:621:dht_revalidate_cbk]
0-myvol-dht: revalidate lookup of / returned with op_ret 0 and op_errno
117" repeated 5 times between [2014-09-28 07:42:42.433338] and [2014-09-28
07:43:19.487219]
The message "D [MSGID: 0] [dht-common.c:2182:dht_lookup] 0-myvol-dht:
Calling fresh lookup for /file on myvol-replicate-1" repeated 2 times
between [2014-09-28 07:43:19.487414] and [2014-09-28 07:43:19.781835]
[2014-09-28 07:43:19.922950] D [MSGID: 0]
[dht-common.c:1086:dht_lookup_everywhere_done] 0-myvol-dht: STATUS:
hashed_subvol myvol-replicate-1 cached_subvol null
[2014-09-28 07:43:28.593615] D [logging.c:1781:__gf_log_inject_timer_event]
0-logging-infra: Starting timer now. Timeout = 120, current buf size = 5
[2014-09-28 07:44:26.517010] D [MSGID: 0]
[dht-common.c:621:dht_revalidate_cbk] 0-myvol-dht: revalidate lookup of /
returned with op_ret 0 and op_errno 117
[2014-09-28 07:44:26.950203] D [MSGID: 0] [dht-common.c:2108:dht_lookup]
0-myvol-dht: calling revalidate lookup for /file at myvol-replicate-1
[2014-09-28 07:44:27.022986] D [MSGID: 0]
[dht-common.c:621:dht_revalidate_cbk] 0-myvol-dht: revalidate lookup of
/file returned with op_ret 0 and op_errno 0



Pleasy give me some ideas to fix this problem.   Thanks all !

-- 
You are receiving this mail because:
You are on the CC list for the bug.
Unsubscribe from this bug https://bugzilla.redhat.com/token.cgi?t=900k6QXjNj&a=cc_unsubscribe


More information about the Bugs mailing list