[Gluster-users] cannot write files when a gluster 3.6 client mounts a volume of gluster 3.3

panpan feng jiaowopan at gmail.com
Sun Sep 28 08:05:22 UTC 2014


Hi, maybe I need to add some more information about the situation.

----->>>>>  version of client and server  <<<<<-----

client version :  glusterfs 3.6.0 beta1
server version:   glusterfs 3.3.0
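
(For reference, a quick sanity-check sketch for confirming the version on
each machine; the expected values in the comments are only illustrative:)

glusterfs --version | head -n 1    # client should report glusterfs 3.6.0beta1
                                   # server should report glusterfs 3.3.0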


---------->>>> volume info <<<<<-------------

Volume Name: myvol
Type: Distributed-Replicate
Volume ID: c36dfe1c-3f95-4d64-9dae-1b5916b56b19
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.10.10.10:/mnt/xfsd/myvol-0
Brick2: 10.10.10.10:/mnt/xfsd/myvol-1
Brick3: 10.10.10.10:/mnt/xfsd/myvol-2
Brick4: 10.10.10.10:/mnt/xfsd/myvol-3

-------------->>>> mount with --debug   <<<<<-------------------
/usr/sbin/glusterfs --volfile-server=10.10.10.10 --volfile-id=myvol /mnt/myvol --debug

Then, in another window on the client, I go to the mount point /mnt/myvol
and execute "ls". A couple of lines are printed, with no error message as
far as I can tell. Next, I execute the command

echo "hello,gluster3.3" > /mnt/myvol/file

Then some error-like messages appear. With "ls" I can again find the file
named "file" under the mount point, but "cat /mnt/myvol/file" prints
nothing, which means it is an empty file!
After many tests, my conclusion is: I cannot write anything to a file,
although I can create a file successfully and can read an existing file
successfully.
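
To summarize, this minimal sequence reproduces the symptom (a sketch,
assuming the volume is already mounted at /mnt/myvol; "existing" stands
for any file that was already on the volume):

ls /mnt/myvol                               # listing works, no errors
echo "hello,gluster3.3" > /mnt/myvol/file   # file is created, but...
cat /mnt/myvol/file                         # ...prints nothing: it is empty
cat /mnt/myvol/existing                     # reading a pre-existing file works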


-------------->>>>> mount log (the error-like messages when writing a file) <<<<<<----------------

[2014-09-28 07:41:28.585181] D [logging.c:1781:__gf_log_inject_timer_event]
0-logging-infra: Starting timer now. Timeout = 120, current buf size = 5
[2014-09-28 07:42:42.433338] D [MSGID: 0]
[dht-common.c:621:dht_revalidate_cbk] 0-myvol-dht: revalidate lookup of /
returned with op_ret 0 and op_errno 117
[2014-09-28 07:43:19.487414] D [MSGID: 0] [dht-common.c:2182:dht_lookup]
0-myvol-dht: Calling fresh lookup for /file on myvol-replicate-1
[2014-09-28 07:43:19.558459] D [MSGID: 0]
[dht-common.c:1818:dht_lookup_cbk] 0-myvol-dht: fresh_lookup returned for
/file with op_ret -1 and op_errno 2
[2014-09-28 07:43:19.558497] I [dht-common.c:1822:dht_lookup_cbk]
0-myvol-dht: Entry /file missing on subvol myvol-replicate-1
[2014-09-28 07:43:19.558517] D [MSGID: 0]
[dht-common.c:1607:dht_lookup_everywhere] 0-myvol-dht: winding lookup call
to 2 subvols
[2014-09-28 07:43:19.634376] D [MSGID: 0]
[dht-common.c:1413:dht_lookup_everywhere_cbk] 0-myvol-dht: returned with
op_ret -1 and op_errno 2 (/file) from subvol myvol-replicate-0
[2014-09-28 07:43:19.634573] D [MSGID: 0]
[dht-common.c:1413:dht_lookup_everywhere_cbk] 0-myvol-dht: returned with
op_ret -1 and op_errno 2 (/file) from subvol myvol-replicate-1
[2014-09-28 07:43:19.634605] D [MSGID: 0]
[dht-common.c:1086:dht_lookup_everywhere_done] 0-myvol-dht: STATUS:
hashed_subvol myvol-replicate-1 cached_subvol null
[2014-09-28 07:43:19.634624] D [MSGID: 0]
[dht-common.c:1147:dht_lookup_everywhere_done] 0-myvol-dht: There was no
cached file and  unlink on hashed is not skipped /file
[2014-09-28 07:43:19.634663] D [fuse-resolve.c:83:fuse_resolve_entry_cbk]
0-fuse: 00000000-0000-0000-0000-000000000001/file: failed to resolve (No
such file or directory)
[2014-09-28 07:43:19.708608] I [dht-common.c:1822:dht_lookup_cbk]
0-myvol-dht: Entry /file missing on subvol myvol-replicate-1
[2014-09-28 07:43:19.781420] D [logging.c:1937:_gf_msg_internal]
0-logging-infra: Buffer overflow of a buffer whose size limit is 5. About
to flush least recently used log message to disk
[2014-09-28 07:43:19.708640] D [MSGID: 0]
[dht-common.c:1607:dht_lookup_everywhere] 0-myvol-dht: winding lookup call
to 2 subvols
[2014-09-28 07:43:19.781418] D [MSGID: 0]
[dht-common.c:1413:dht_lookup_everywhere_cbk] 0-myvol-dht: returned with
op_ret -1 and op_errno 2 (/file) from subvol myvol-replicate-0
[2014-09-28 07:43:19.781629] D [MSGID: 0]
[dht-common.c:1413:dht_lookup_everywhere_cbk] 0-myvol-dht: returned with
op_ret -1 and op_errno 2 (/file) from subvol myvol-replicate-1
[2014-09-28 07:43:19.781653] D [MSGID: 0]
[dht-common.c:1086:dht_lookup_everywhere_done] 0-myvol-dht: STATUS:
hashed_subvol myvol-replicate-1 cached_subvol null
[2014-09-28 07:43:19.851925] I [dht-common.c:1822:dht_lookup_cbk]
0-myvol-dht: Entry /file missing on subvol myvol-replicate-1
[2014-09-28 07:43:19.851954] D [logging.c:1937:_gf_msg_internal]
0-logging-infra: Buffer overflow of a buffer whose size limit is 5. About
to flush least recently used log message to disk
The message "D [MSGID: 0] [dht-common.c:1818:dht_lookup_cbk] 0-myvol-dht:
fresh_lookup returned for /file with op_ret -1 and op_errno 2" repeated 2
times between [2014-09-28 07:43:19.558459] and [2014-09-28 07:43:19.851922]
[2014-09-28 07:43:19.851954] D [MSGID: 0]
[dht-common.c:1607:dht_lookup_everywhere] 0-myvol-dht: winding lookup call
to 2 subvols
[2014-09-28 07:43:19.922764] D [MSGID: 0]
[dht-common.c:1413:dht_lookup_everywhere_cbk] 0-myvol-dht: returned with
op_ret -1 and op_errno 2 (/file) from subvol myvol-replicate-0
[2014-09-28 07:43:19.922925] D [MSGID: 0]
[dht-common.c:1413:dht_lookup_everywhere_cbk] 0-myvol-dht: returned with
op_ret -1 and op_errno 2 (/file) from subvol myvol-replicate-1
[2014-09-28 07:43:19.922974] D [fuse-resolve.c:83:fuse_resolve_entry_cbk]
0-fuse: 00000000-0000-0000-0000-000000000001/file: failed to resolve (No
such file or directory)
[2014-09-28 07:43:19.997012] D [logging.c:1937:_gf_msg_internal]
0-logging-infra: Buffer overflow of a buffer whose size limit is 5. About
to flush least recently used log message to disk
The message "D [MSGID: 0] [dht-common.c:1147:dht_lookup_everywhere_done]
0-myvol-dht: There was no cached file and  unlink on hashed is not skipped
/file" repeated 2 times between [2014-09-28 07:43:19.634624] and
[2014-09-28 07:43:19.922951]
[2014-09-28 07:43:19.997011] D [MSGID: 0]
[dht-diskusage.c:96:dht_du_info_cbk] 0-myvol-dht: subvolume
'myvol-replicate-0': avail_percent is: 99.00 and avail_space is:
44000304627712 and avail_inodes is: 99.00
[2014-09-28 07:43:19.997134] D [MSGID: 0]
[dht-diskusage.c:96:dht_du_info_cbk] 0-myvol-dht: subvolume
'myvol-replicate-1': avail_percent is: 99.00 and avail_space is:
44000304627712 and avail_inodes is: 99.00
[2014-09-28 07:43:19.997179] D
[afr-transaction.c:1166:afr_post_nonblocking_entrylk_cbk]
0-myvol-replicate-1: Non blocking entrylks done. Proceeding to FOP
[2014-09-28 07:43:20.067587] D [afr-lk-common.c:447:transaction_lk_op]
0-myvol-replicate-1: lk op is for a transaction
[2014-09-28 07:43:20.216287] D
[afr-transaction.c:1116:afr_post_nonblocking_inodelk_cbk]
0-myvol-replicate-1: Non blocking inodelks done. Proceeding to FOP
[2014-09-28 07:43:20.356844] W [client-rpc-fops.c:850:client3_3_writev_cbk]
0-myvol-client-3: remote operation failed: Transport endpoint is not
connected
[2014-09-28 07:43:20.356979] W [client-rpc-fops.c:850:client3_3_writev_cbk]
0-myvol-client-2: remote operation failed: Transport endpoint is not
connected
[2014-09-28 07:43:20.357009] D [afr-lk-common.c:447:transaction_lk_op]
0-myvol-replicate-1: lk op is for a transaction
[2014-09-28 07:43:20.428013] W [fuse-bridge.c:1261:fuse_err_cbk]
0-glusterfs-fuse: 14: FLUSH() ERR => -1 (Transport endpoint is not
connected)
[2014-09-28 07:43:28.593512] D [logging.c:1816:gf_log_flush_timeout_cbk]
0-logging-infra: Log timer timed out. About to flush outstanding messages
if present
The message "D [MSGID: 0] [dht-common.c:621:dht_revalidate_cbk]
0-myvol-dht: revalidate lookup of / returned with op_ret 0 and op_errno
117" repeated 5 times between [2014-09-28 07:42:42.433338] and [2014-09-28
07:43:19.487219]
The message "D [MSGID: 0] [dht-common.c:2182:dht_lookup] 0-myvol-dht:
Calling fresh lookup for /file on myvol-replicate-1" repeated 2 times
between [2014-09-28 07:43:19.487414] and [2014-09-28 07:43:19.781835]
[2014-09-28 07:43:19.922950] D [MSGID: 0]
[dht-common.c:1086:dht_lookup_everywhere_done] 0-myvol-dht: STATUS:
hashed_subvol myvol-replicate-1 cached_subvol null
[2014-09-28 07:43:28.593615] D [logging.c:1781:__gf_log_inject_timer_event]
0-logging-infra: Starting timer now. Timeout = 120, current buf size = 5
[2014-09-28 07:44:26.517010] D [MSGID: 0]
[dht-common.c:621:dht_revalidate_cbk] 0-myvol-dht: revalidate lookup of /
returned with op_ret 0 and op_errno 117
[2014-09-28 07:44:26.950203] D [MSGID: 0] [dht-common.c:2108:dht_lookup]
0-myvol-dht: calling revalidate lookup for /file at myvol-replicate-1
[2014-09-28 07:44:27.022986] D [MSGID: 0]
[dht-common.c:621:dht_revalidate_cbk] 0-myvol-dht: revalidate lookup of
/file returned with op_ret 0 and op_errno 0
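
The writev failures above ("Transport endpoint is not connected" from
myvol-client-2 and myvol-client-3) suggest the client never established,
or lost, its connections to bricks 3 and 4, the myvol-replicate-1 pair.
For what it's worth, a sketch of connectivity checks (the brick ports
below are placeholders; "gluster volume status myvol" on the server
reports the real ones):

gluster volume status myvol      # on the server: are all 4 bricks online?
nc -z 10.10.10.10 24009          # on the client: is each brick port reachable?
nc -z 10.10.10.10 24010          # (substitute the ports reported above)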



Please give me some ideas to fix this problem. Thanks, all!






2014-09-27 6:16 GMT+08:00 Justin Clift <justin at gluster.org>:

> On 25/09/2014, at 10:29 AM, panpan feng wrote:
> > Hi, dear experts of gluster,
> >      Today I have met a problem. I installed glusterfs 3.6 beta1 on the
> > client and mounted a volume served by a glusterfs 3.3 server. The mount
> > operation succeeded and I can read files successfully, but write
> > operations fail with the error "E72 Close error on swap file" (an error
> > reported by vim). There are many strange messages in the log file.
> >
> >
> > [2014-09-25 05:44:17.659510] W [graph.c:344:_log_if_unknown_option]
> 0-maintain4-quota: option 'timeout' is not recognized
> >
> > [2014-09-25 05:45:26.022305] I [MSGID: 109018]
> [dht-common.c:715:dht_revalidate_cbk] 0-maintain4-dht: Mismatching layouts
> for /StorageReport, gfid = 00000000-0000-0000-0000-000000000000
> > [2014-09-25 05:45:26.022616] I
> [dht-layout.c:754:dht_layout_dir_mismatch] 0-maintain4-dht: /StorageReport:
> Disk layout missing, gfid = 4f8abc71-b771-4fc3-b7fa-42ef0ea09dc5
> > [2014-09-25 05:45:26.022673] I
> [dht-layout.c:754:dht_layout_dir_mismatch] 0-maintain4-dht: /StorageReport:
> Disk layout missing, gfid = 4f8abc71-b771-4fc3-b7fa-42ef0ea09dc5
> > [2014-09-25 05:45:26.022973] I
> [dht-layout.c:754:dht_layout_dir_mismatch] 0-maintain4-dht: /StorageReport:
> Disk layout missing, gfid = 4f8abc71-b771-4fc3-b7fa-42ef0ea09dc5
> > [2014-09-25 05:45:26.023216] I
> [dht-layout.c:754:dht_layout_dir_mismatch] 0-maintain4-dht: /StorageReport:
> Disk layout missing, gfid = 4f8abc71-b771-4fc3-b7fa-42ef0ea09dc5
> > [2014-09-25 05:45:26.023430] I
> [dht-layout.c:754:dht_layout_dir_mismatch] 0-maintain4-dht: /StorageReport:
> Disk layout missing, gfid = 4f8abc71-b771-4fc3-b7fa-42ef0ea09dc5
> > [2014-09-25 05:45:26.059022] I
> [afr-self-heal-metadata.c:41:__afr_selfheal_metadata_do]
> 0-maintain4-replicate-1: performing metadata selfheal on
> 4f8abc71-b771-4fc3-b7fa-42ef0ea09dc5
> > [2014-09-25 05:45:26.059208] I
> [afr-self-heal-metadata.c:41:__afr_selfheal_metadata_do]
> 0-maintain4-replicate-2: performing metadata selfheal on
> 4f8abc71-b771-4fc3-b7fa-42ef0ea09dc5
> > [2014-09-25 05:45:26.059238] I
> [afr-self-heal-metadata.c:41:__afr_selfheal_metadata_do]
> 0-maintain4-replicate-0: performing metadata selfheal on
> 4f8abc71-b771-4fc3-b7fa-42ef0ea09dc5
> > [2014-09-25 05:45:26.059281] I
> [afr-self-heal-metadata.c:41:__afr_selfheal_metadata_do]
> 0-maintain4-replicate-3: performing metadata selfheal on
> 4f8abc71-b771-4fc3-b7fa-42ef0ea09dc5
> > [2014-09-25 05:45:26.059594] I
> [afr-self-heal-metadata.c:41:__afr_selfheal_metadata_do]
> 0-maintain4-replicate-4: performing metadata selfheal on
> 4f8abc71-b771-4fc3-b7fa-42ef0ea09dc5
> > [2014-09-25 05:45:26.059783] I
> [afr-self-heal-metadata.c:41:__afr_selfheal_metadata_do]
> 0-maintain4-replicate-5: performing metadata selfheal on
> 4f8abc71-b771-4fc3-b7fa-42ef0ea09dc5
> > [2014-09-25 05:45:26.112882] I [dht-layout.c:663:dht_layout_normalize]
> 0-maintain4-dht: Found anomalies in /StorageReport (gfid =
> 00000000-0000-0000-0000-000000000000). Holes=1 overlaps=0
> > [2014-09-25 05:45:26.112903] I
> [dht-selfheal.c:1065:dht_selfheal_layout_new_directory] 0-maintain4-dht:
> chunk size = 0xffffffff / 188817468 = 0x16
> > [2014-09-25 05:45:26.112915] I
> [dht-selfheal.c:1103:dht_selfheal_layout_new_directory] 0-maintain4-dht:
> assigning range size 0x294420dc to maintain4-replicate-4
> > [2014-09-25 05:45:26.112928] I
> [dht-selfheal.c:1103:dht_selfheal_layout_new_directory] 0-maintain4-dht:
> assigning range size 0x294420dc to maintain4-replicate-5
> > [2014-09-25 05:45:26.112937] I
> [dht-selfheal.c:1103:dht_selfheal_layout_new_directory] 0-maintain4-dht:
> assigning range size 0x294420dc to maintain4-replicate-0
> > [2014-09-25 05:45:26.112945] I
> [dht-selfheal.c:1103:dht_selfheal_layout_new_directory] 0-maintain4-dht:
> assigning range size 0x294420dc to maintain4-replicate-1
> > [2014-09-25 05:45:26.112952] I
> [dht-selfheal.c:1103:dht_selfheal_layout_new_directory] 0-maintain4-dht:
> assigning range size 0x294420dc to maintain4-replicate-2
> > [2014-09-25 05:45:26.112960] I
> [dht-selfheal.c:1103:dht_selfheal_layout_new_directory] 0-maintain4-dht:
> assigning range size 0x294420dc to maintain4-replicate-3
> > The message "I [MSGID: 109018] [dht-common.c:715:dht_revalidate_cbk]
> 0-maintain4-dht: Mismatching layouts for /StorageReport, gfid =
> 00000000-0000-0000-0000-000000000000" repeated 5 times between [2014-09-25
> 05:45:26.022305] and [2014-09-25 05:45:26.023456]
> > [2014-09-25 05:45:26.130803] I [MSGID: 109036]
> [dht-common.c:6221:dht_log_new_layout_for_dir_selfheal] 0-maintain4-dht:
> Setting layout of /StorageReport with [Subvol_name: maintain4-replicate-0,
> Err: -1 , Start: 1384661432 , Stop: 2076992147 ], [Subvol_name:
> maintain4-replicate-1, Err: -1 , Start: 2076992148 , Stop: 2769322863 ],
> [Subvol_name: maintain4-replicate-2, Err: -1 , Start: 2769322864 , Stop:
> 3461653579 ], [Subvol_name: maintain4-replicate-3, Err: -1 , Start:
> 3461653580 , Stop: 4294967295 ], [Subvol_name: maintain4-replicate-4, Err:
> -1 , Start: 0 , Stop: 692330715 ], [Subvol_name: maintain4-replicate-5,
> Err: -1 , Start: 692330716 , Stop: 1384661431 ],
> >
> > What can I do to fix this problem?
>
> Interesting.  The guys will definitely want to look at this next week. :)
>
> Regards and best wishes,
>
> Justin Clift
>
> --
> GlusterFS - http://www.gluster.org
>
> An open source, distributed file system scaling to several
> petabytes, and handling thousands of clients.
>
> My personal twitter: twitter.com/realjustinclift
>
>

