[Gluster-users] Strange behavior on replicated volume: one directory does not replicate

Juan Pablo Baudoin jpbaudoin at gmail.com
Wed Apr 13 12:42:00 UTC 2011


I have set up a replicated volume in which one of the directories does not replicate.

The configuration:

 - I have srvfs1 (master) and srvfs2 (peer) as Gluster servers and
liveweb as the client; all of them are micro instances on EC2.
 - The IP addresses are configured in /etc/hosts.
 - Volume config:
       gluster volume create gls_tray01-data replica 2 transport tcp \
           srvfs1:/var/trays-data/gls_tray01/export_root/ \
           srvfs2:/var/trays-data/gls_tray01/export_root/
       gluster volume set gls_tray01-data nfs.disable on
 - Mount:
       mount -t glusterfs srvfs1:/gls_tray01-data \
           /mnt/glusterfs/data/gls_tray01 \
           -o log-level=DEBUG,log-file=/tmp/log_gluster

The base directory structure:
├── archivos
│   ├── configs
│   │   └── apache2
│   │       ├── sites-available
│   │       └── sites-enabled
│   ├── projects
│   └── prueba
└── files
    ├── configs
    │   └── apache2
    │       ├── sites-available
    │       └── sites-enabled
    └── projects

The problem:
 When I create a file in the "files" dir, it is replicated correctly
to both servers, but when I create a file in the "archivos" dir, the
file is created only on srvfs1.
 If I unmount the volume and remount it, the files get synchronized
the first time the directory is inspected, but after that sync the
described behavior repeats.

Logs:
I've checked the logs for a file that is replicated correctly and for
one that is not, and noticed this error in the latter:

    [2011-04-13 12:10:58.403971] D
[client3_1-fops.c:3283:client3_1_flush] 0-gls_tray01-data-client-1:
(1572916): failed to get fd ctx. EBADFD

 Attached are both files.
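
 To make the difference between the two logs easier to see, the
interesting lines can be pulled out of the client debug log with a
small filter. This is only a sketch: the function names are made up
for illustration, and the patterns are simply the messages that show
up in the failing log (EBADFD, "gfid changed", self-heal, "failed").

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of GlusterFS).

# Print only the log lines that mention a failure, an fd context error,
# a gfid change or a self-heal event.
# Usage: filter_gluster_log FILE
filter_gluster_log() {
    grep -E 'EBADFD|gfid changed|self-heal|failed' "$1"
}

# Print how many lines report the flush error seen on the client translator.
# Usage: count_ebadfd FILE
count_ebadfd() {
    grep -cF 'failed to get fd ctx. EBADFD' "$1"
}

# Example, using the log file given in the mount options:
#   filter_gluster_log /tmp/log_gluster
#   count_ebadfd /tmp/log_gluster
```

 Running the filter on a log taken during a create in "files" and on
one taken during a create in "archivos" should leave only the lines
that differ in kind between the two cases.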


Thanks in advance,
Juan Pablo Baudoin
-------------- next part --------------
[2011-04-13 12:09:54.463593] D [afr-transaction.c:1029:afr_post_nonblocking_entrylk_cbk] 0-gls_tray01-data-replicate-0: Non blocking entrylks done. Proceeding to FOP
[2011-04-13 12:09:54.465592] D [afr-lk-common.c:410:transaction_lk_op] 0-gls_tray01-data-replicate-0: lk op is for a transaction
[2011-04-13 12:09:54.467543] D [afr-transaction.c:979:afr_post_nonblocking_inodelk_cbk] 0-gls_tray01-data-replicate-0: Non blocking inodelks done. Proceeding to FOP
[2011-04-13 12:09:54.467632] D [afr-lk-common.c:410:transaction_lk_op] 0-gls_tray01-data-replicate-0: lk op is for a transaction
[2011-04-13 12:09:54.467651] D [afr-transaction.c:973:afr_post_nonblocking_inodelk_cbk] 0-gls_tray01-data-replicate-0: Non blocking inodelks failed. Proceeding to blocking
[2011-04-13 12:09:54.468032] D [client3_1-fops.c:641:client3_1_flush_cbk] 0-gls_tray01-data-client-0: Attempting to delete locks of owner=14147872386574953654
[2011-04-13 12:09:54.468055] D [client-lk.c:407:delete_granted_locks_owner] 0-gls_tray01-data-client-0: Number of locks cleared=0
[2011-04-13 12:09:54.469408] D [client3_1-fops.c:641:client3_1_flush_cbk] 0-gls_tray01-data-client-1: Attempting to delete locks of owner=14147872386574953654
[2011-04-13 12:09:54.469435] D [client-lk.c:407:delete_granted_locks_owner] 0-gls_tray01-data-client-1: Number of locks cleared=0
[2011-04-13 12:09:54.469453] D [afr-lk-common.c:410:transaction_lk_op] 0-gls_tray01-data-replicate-0: lk op is for a transaction
[2011-04-13 12:09:54.471663] D [afr-lk-common.c:987:afr_lock_blocking] 0-gls_tray01-data-replicate-0: we're done locking
[2011-04-13 12:09:54.471738] D [afr-transaction.c:953:afr_post_blocking_inodelk_cbk] 0-gls_tray01-data-replicate-0: Blocking inodelks done. Proceeding to FOP
[2011-04-13 12:09:54.473632] D [afr-lk-common.c:410:transaction_lk_op] 0-gls_tray01-data-replicate-0: lk op is for a transaction
[2011-04-13 12:09:54.475600] D [afr-transaction.c:979:afr_post_nonblocking_inodelk_cbk] 0-gls_tray01-data-replicate-0: Non blocking inodelks done. Proceeding to FOP
[2011-04-13 12:09:54.476068] D [client3_1-fops.c:641:client3_1_flush_cbk] 0-gls_tray01-data-client-0: Attempting to delete locks of owner=14147872386574953654
[2011-04-13 12:09:54.476140] D [client-lk.c:407:delete_granted_locks_owner] 0-gls_tray01-data-client-0: Number of locks cleared=0
[2011-04-13 12:09:54.477454] D [client3_1-fops.c:641:client3_1_flush_cbk] 0-gls_tray01-data-client-1: Attempting to delete locks of owner=14147872386574953654
[2011-04-13 12:09:54.477524] D [client-lk.c:407:delete_granted_locks_owner] 0-gls_tray01-data-client-1: Number of locks cleared=0
[2011-04-13 12:09:54.477590] D [afr-lk-common.c:410:transaction_lk_op] 0-gls_tray01-data-replicate-0: lk op is for a transaction
[2011-04-13 12:09:54.479489] D [client-lk.c:441:delete_granted_locks_fd] 0-gls_tray01-data-client-0: Number of locks cleared=0
[2011-04-13 12:09:54.479577] D [client-lk.c:441:delete_granted_locks_fd] 0-gls_tray01-data-client-1: Number of locks cleared=0
-------------- next part --------------
[2011-04-13 12:10:58.388588] D [afr-common.c:561:afr_lookup_collect_xattr] 0-gls_tray01-data-replicate-0: entry self-heal is pending for /archivos.
[2011-04-13 12:10:58.392093] D [client3_1-fops.c:1937:client3_1_lookup_cbk] 0-gls_tray01-data-client-1: gfid changed for /archivos
[2011-04-13 12:10:58.392126] I [afr-common.c:716:afr_lookup_done] 0-gls_tray01-data-replicate-0: background  entry self-heal triggered. path: /archivos
[2011-04-13 12:10:58.393914] D [afr-lk-common.c:415:transaction_lk_op] 0-gls_tray01-data-replicate-0: lk op is for a self heal
[2011-04-13 12:10:58.394349] D [afr-self-heal-entry.c:2230:afr_sh_post_nonblocking_entry_cbk] 0-gls_tray01-data-replicate-0: Non Blocking entrylks failed.
[2011-04-13 12:10:58.394372] I [afr-self-heal-common.c:1527:afr_self_heal_completion_cbk] 0-gls_tray01-data-replicate-0: background  entry self-heal completed on /archivos
[2011-04-13 12:10:58.398257] D [afr-lk-common.c:410:transaction_lk_op] 0-gls_tray01-data-replicate-0: lk op is for a transaction
[2011-04-13 12:10:58.398822] D [afr-transaction.c:1023:afr_post_nonblocking_entrylk_cbk] 0-gls_tray01-data-replicate-0: Non blocking entrylks failed. Proceeding to blocking
[2011-04-13 12:10:58.401286] D [afr-lk-common.c:987:afr_lock_blocking] 0-gls_tray01-data-replicate-0: we're done locking
[2011-04-13 12:10:58.401359] D [afr-transaction.c:1003:afr_post_blocking_entrylk_cbk] 0-gls_tray01-data-replicate-0: Blocking entrylks done. Proceeding to FOP
[2011-04-13 12:10:58.402915] D [afr-lk-common.c:410:transaction_lk_op] 0-gls_tray01-data-replicate-0: lk op is for a transaction
[2011-04-13 12:10:58.403064] D [afr-lk-common.c:410:transaction_lk_op] 0-gls_tray01-data-replicate-0: lk op is for a transaction
[2011-04-13 12:10:58.403414] D [afr-transaction.c:973:afr_post_nonblocking_inodelk_cbk] 0-gls_tray01-data-replicate-0: Non blocking inodelks failed. Proceeding to blocking
[2011-04-13 12:10:58.403815] D [afr-lk-common.c:987:afr_lock_blocking] 0-gls_tray01-data-replicate-0: we're done locking
[2011-04-13 12:10:58.403884] D [afr-transaction.c:953:afr_post_blocking_inodelk_cbk] 0-gls_tray01-data-replicate-0: Blocking inodelks done. Proceeding to FOP
[2011-04-13 12:10:58.403971] D [client3_1-fops.c:3283:client3_1_flush] 0-gls_tray01-data-client-1: (1572916): failed to get fd ctx. EBADFD
[2011-04-13 12:10:58.404111] D [afr-lk-common.c:410:transaction_lk_op] 0-gls_tray01-data-replicate-0: lk op is for a transaction
[2011-04-13 12:10:58.404179] D [afr-transaction.c:973:afr_post_nonblocking_inodelk_cbk] 0-gls_tray01-data-replicate-0: Non blocking inodelks failed. Proceeding to blocking
[2011-04-13 12:10:58.404291] D [client3_1-fops.c:641:client3_1_flush_cbk] 0-gls_tray01-data-client-0: Attempting to delete locks of owner=14147872386574953654
[2011-04-13 12:10:58.404366] D [client-lk.c:407:delete_granted_locks_owner] 0-gls_tray01-data-client-0: Number of locks cleared=0
[2011-04-13 12:10:58.404435] D [afr-lk-common.c:410:transaction_lk_op] 0-gls_tray01-data-replicate-0: lk op is for a transaction
[2011-04-13 12:10:58.406589] D [afr-lk-common.c:987:afr_lock_blocking] 0-gls_tray01-data-replicate-0: we're done locking
[2011-04-13 12:10:58.406658] D [afr-transaction.c:953:afr_post_blocking_inodelk_cbk] 0-gls_tray01-data-replicate-0: Blocking inodelks done. Proceeding to FOP
[2011-04-13 12:10:58.407532] D [afr-lk-common.c:410:transaction_lk_op] 0-gls_tray01-data-replicate-0: lk op is for a transaction
[2011-04-13 12:10:58.409431] D [afr-lk-common.c:410:transaction_lk_op] 0-gls_tray01-data-replicate-0: lk op is for a transaction
[2011-04-13 12:10:58.409913] D [afr-transaction.c:973:afr_post_nonblocking_inodelk_cbk] 0-gls_tray01-data-replicate-0: Non blocking inodelks failed. Proceeding to blocking
[2011-04-13 12:10:58.410301] D [afr-lk-common.c:987:afr_lock_blocking] 0-gls_tray01-data-replicate-0: we're done locking
[2011-04-13 12:10:58.410370] D [afr-transaction.c:953:afr_post_blocking_inodelk_cbk] 0-gls_tray01-data-replicate-0: Blocking inodelks done. Proceeding to FOP
[2011-04-13 12:10:58.410458] D [client3_1-fops.c:3283:client3_1_flush] 0-gls_tray01-data-client-1: (1572916): failed to get fd ctx. EBADFD
[2011-04-13 12:10:58.410777] D [client3_1-fops.c:641:client3_1_flush_cbk] 0-gls_tray01-data-client-0: Attempting to delete locks of owner=14147872386574953654
[2011-04-13 12:10:58.410849] D [client-lk.c:407:delete_granted_locks_owner] 0-gls_tray01-data-client-0: Number of locks cleared=0
[2011-04-13 12:10:58.410920] D [afr-lk-common.c:410:transaction_lk_op] 0-gls_tray01-data-replicate-0: lk op is for a transaction
[2011-04-13 12:10:58.411309] D [client-lk.c:441:delete_granted_locks_fd] 0-gls_tray01-data-client-0: Number of locks cleared=0

