[Gluster-users] Problem with add-brick

Ravishankar N ravishankar at redhat.com
Tue Sep 27 16:40:36 UTC 2016


On 09/27/2016 09:53 PM, Dennis Michael wrote:
> Yes, you are right.  I mixed up the logs.  I just ran the add-brick 
> command again after cleaning up fs4 and re-installing gluster.  This 
> is the complete fs4 data-brick.log.
>
> [root@fs1 ~]# gluster volume add-brick cees-data fs4:/data/brick
> volume add-brick: failed: Commit failed on fs4. Please check log file 
> for details.
>
> [root@fs4 bricks]# pwd
> /var/log/glusterfs/bricks
> [root@fs4 bricks]# cat data-brick.log
> [2016-09-27 16:16:28.095661] I [MSGID: 100030] 
> [glusterfsd.c:2338:main] 0-/usr/sbin/glusterfsd: Started running 
> /usr/sbin/glusterfsd version 3.7.14 (args: /usr/sbin/glusterfsd -s fs4 
> --volfile-id cees-data.fs4.data-brick -p 
> /var/lib/glusterd/vols/cees-data/run/fs4-data-brick.pid -S 
> /var/run/gluster/5203ab38be21e1d37c04f6bdfee77d4a.socket --brick-name 
> /data/brick -l /var/log/glusterfs/bricks/data-brick.log 
> --xlator-option 
> *-posix.glusterd-uuid=f04b231e-63f8-4374-91ae-17c0c623f165 
> --brick-port 49152 --xlator-option 
> cees-data-server.transport.rdma.listen-port=49153 --xlator-option 
> cees-data-server.listen-port=49152 --volfile-server-transport=socket,rdma)
> [2016-09-27 16:16:28.101547] I [MSGID: 101190] 
> [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started 
> thread with index 1
> [2016-09-27 16:16:28.104637] I [graph.c:269:gf_add_cmdline_options] 
> 0-cees-data-server: adding option 'listen-port' for volume 
> 'cees-data-server' with value '49152'
> [2016-09-27 16:16:28.104646] I [graph.c:269:gf_add_cmdline_options] 
> 0-cees-data-server: adding option 'transport.rdma.listen-port' for 
> volume 'cees-data-server' with value '49153'
> [2016-09-27 16:16:28.104662] I [graph.c:269:gf_add_cmdline_options] 
> 0-cees-data-posix: adding option 'glusterd-uuid' for volume 
> 'cees-data-posix' with value 'f04b231e-63f8-4374-91ae-17c0c623f165'
> [2016-09-27 16:16:28.104808] I [MSGID: 115034] 
> [server.c:403:_check_for_auth_option] 0-/data/brick: skip format check 
> for non-addr auth option auth.login./data/brick.allow
> [2016-09-27 16:16:28.104814] I [MSGID: 115034] 
> [server.c:403:_check_for_auth_option] 0-/data/brick: skip format check 
> for non-addr auth option 
> auth.login.18ddaf4c-ad98-4155-9372-717eae718b4c.password
> [2016-09-27 16:16:28.104883] I [MSGID: 101190] 
> [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started 
> thread with index 2
> [2016-09-27 16:16:28.105479] I 
> [rpcsvc.c:2196:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service: 
> Configured rpc.outstanding-rpc-limit with value 64
> [2016-09-27 16:16:28.105532] W [MSGID: 101002] 
> [options.c:957:xl_opt_validate] 0-cees-data-server: option 
> 'listen-port' is deprecated, preferred is 
> 'transport.socket.listen-port', continuing with correction
> [2016-09-27 16:16:28.109456] W [socket.c:3665:reconfigure] 
> 0-cees-data-quota: NBIO on -1 failed (Bad file descriptor)
> [2016-09-27 16:16:28.489255] I [MSGID: 121050] 
> [ctr-helper.c:259:extract_ctr_options] 0-gfdbdatastore: CTR Xlator is 
> disabled.
> [2016-09-27 16:16:28.489272] W [MSGID: 101105] 
> [gfdb_sqlite3.h:239:gfdb_set_sql_params] 
> 0-cees-data-changetimerecorder: Failed to retrieve sql-db-pagesize 
> from params.Assigning default value: 4096
> [2016-09-27 16:16:28.489278] W [MSGID: 101105] 
> [gfdb_sqlite3.h:239:gfdb_set_sql_params] 
> 0-cees-data-changetimerecorder: Failed to retrieve sql-db-journalmode 
> from params.Assigning default value: wal
> [2016-09-27 16:16:28.489284] W [MSGID: 101105] 
> [gfdb_sqlite3.h:239:gfdb_set_sql_params] 
> 0-cees-data-changetimerecorder: Failed to retrieve sql-db-sync from 
> params.Assigning default value: off
> [2016-09-27 16:16:28.489288] W [MSGID: 101105] 
> [gfdb_sqlite3.h:239:gfdb_set_sql_params] 
> 0-cees-data-changetimerecorder: Failed to retrieve sql-db-autovacuum 
> from params.Assigning default value: none
> [2016-09-27 16:16:28.490431] I [trash.c:2412:init] 0-cees-data-trash: 
> no option specified for 'eliminate', using NULL
> [2016-09-27 16:16:28.672814] W [graph.c:357:_log_if_unknown_option] 
> 0-cees-data-server: option 'rpc-auth.auth-glusterfs' is not recognized
> [2016-09-27 16:16:28.672854] W [graph.c:357:_log_if_unknown_option] 
> 0-cees-data-server: option 'rpc-auth.auth-unix' is not recognized
> [2016-09-27 16:16:28.672872] W [graph.c:357:_log_if_unknown_option] 
> 0-cees-data-server: option 'rpc-auth.auth-null' is not recognized
> [2016-09-27 16:16:28.672924] W [graph.c:357:_log_if_unknown_option] 
> 0-cees-data-quota: option 'timeout' is not recognized
> [2016-09-27 16:16:28.672955] W [graph.c:357:_log_if_unknown_option] 
> 0-cees-data-trash: option 'brick-path' is not recognized
> Final graph:
> +------------------------------------------------------------------------------+
>   1: volume cees-data-posix
>   2:     type storage/posix
>   3:     option glusterd-uuid f04b231e-63f8-4374-91ae-17c0c623f165
>   4:     option directory /data/brick
>   5:     option volume-id 27d2a59c-bdac-4f66-bcd8-e6124e53a4a2
>   6:     option update-link-count-parent on
>   7: end-volume
>   8:
>   9: volume cees-data-trash
>  10:     type features/trash
>  11:     option trash-dir .trashcan
>  12:     option brick-path /data/brick
>  13:     option trash-internal-op off
>  14:     subvolumes cees-data-posix
>  15: end-volume
>  16:
>  17: volume cees-data-changetimerecorder
>  18:     type features/changetimerecorder
>  19:     option db-type sqlite3
>  20:     option hot-brick off
>  21:     option db-name brick.db
>  22:     option db-path /data/brick/.glusterfs/
>  23:     option record-exit off
>  24:     option ctr_link_consistency off
>  25:     option ctr_lookupheal_link_timeout 300
>  26:     option ctr_lookupheal_inode_timeout 300
>  27:     option record-entry on
>  28:     option ctr-enabled off
>  29:     option record-counters off
>  30:     option ctr-record-metadata-heat off
>  31:     option sql-db-cachesize 1000
>  32:     option sql-db-wal-autocheckpoint 1000
>  33:     subvolumes cees-data-trash
>  34: end-volume
>  35:
>  36: volume cees-data-changelog
>  37:     type features/changelog
>  38:     option changelog-brick /data/brick
>  39:     option changelog-dir /data/brick/.glusterfs/changelogs
>  40:     option changelog-barrier-timeout 120
>  41:     subvolumes cees-data-changetimerecorder
>  42: end-volume
>  43:
>  44: volume cees-data-bitrot-stub
>  45:     type features/bitrot-stub
>  46:     option export /data/brick
>  47:     subvolumes cees-data-changelog
>  48: end-volume
>  49:
>  50: volume cees-data-access-control
>  51:     type features/access-control
>  52:     subvolumes cees-data-bitrot-stub
>  53: end-volume
>  54:
>  55: volume cees-data-locks
>  56:     type features/locks
>  57:     subvolumes cees-data-access-control
>  58: end-volume
>  59:
>  60: volume cees-data-upcall
>  61:     type features/upcall
>  62:     option cache-invalidation off
>  63:     subvolumes cees-data-locks
>  64: end-volume
>  65:
>  66: volume cees-data-io-threads
>  67:     type performance/io-threads
>  68:     subvolumes cees-data-upcall
>  69: end-volume
>  70:
>  71: volume cees-data-marker
>  72:     type features/marker
>  73:     option volume-uuid 27d2a59c-bdac-4f66-bcd8-e6124e53a4a2
>  74:     option timestamp-file 
> /var/lib/glusterd/vols/cees-data/marker.tstamp
>  75:     option quota-version 1
>  76:     option xtime off
>  77:     option gsync-force-xtime off
>  78:     option quota on
>  79:     option inode-quota on
>  80:     subvolumes cees-data-io-threads
>  81: end-volume
>  82:
>  83: volume cees-data-barrier
>  84:     type features/barrier
>  85:     option barrier disable
>  86:     option barrier-timeout 120
>  87:     subvolumes cees-data-marker
>  88: end-volume
>  89:
>  90: volume cees-data-index
>  91:     type features/index
>  92:     option index-base /data/brick/.glusterfs/indices
>  93:     subvolumes cees-data-barrier
>  94: end-volume
>  95:
>  96: volume cees-data-quota
>  97:     type features/quota
>  98:     option transport.socket.connect-path 
> /var/run/gluster/quotad.socket
>  99:     option transport-type socket
> 100:     option transport.address-family unix
> 101:     option volume-uuid cees-data
> 102:     option server-quota on
> 103:     option timeout 0
> 104:     option deem-statfs on
> 105:     subvolumes cees-data-index
> 106: end-volume
> 107:
> 108: volume cees-data-worm
> 109:     type features/worm
> 110:     option worm off
> 111:     subvolumes cees-data-quota
> 112: end-volume
> 113:
> 114: volume cees-data-read-only
> 115:     type features/read-only
> 116:     option read-only off
> 117:     subvolumes cees-data-worm
> 118: end-volume
> 119:
> 120: volume /data/brick
> 121:     type debug/io-stats
> 122:     option log-level INFO
> 123:     option latency-measurement off
> 124:     option count-fop-hits off
> 125:     subvolumes cees-data-read-only
> 126: end-volume
> 127:
> 128: volume cees-data-server
> 129:     type protocol/server
> 130:     option transport.socket.listen-port 49152
> 131:     option rpc-auth.auth-glusterfs on
> 132:     option rpc-auth.auth-unix on
> 133:     option rpc-auth.auth-null on
> 134:     option rpc-auth-allow-insecure on
> 135:     option transport.rdma.listen-port 49153
> 136:     option transport-type tcp,rdma
> 137:     option auth.login./data/brick.allow 
> 18ddaf4c-ad98-4155-9372-717eae718b4c
> 138:     option 
> auth.login.18ddaf4c-ad98-4155-9372-717eae718b4c.password 
> 9e913e92-7de0-47f9-94ed-d08cbb130d23
> 139:     option auth.addr./data/brick.allow *
> 140:     subvolumes /data/brick
> 141: end-volume
> 142:
> +------------------------------------------------------------------------------+
> [2016-09-27 16:16:30.079541] I [login.c:81:gf_auth] 0-auth/login: 
> allowed user names: 18ddaf4c-ad98-4155-9372-717eae718b4c
> [2016-09-27 16:16:30.079567] I [MSGID: 115029] 
> [server-handshake.c:690:server_setvolume] 0-cees-data-server: accepted 
> client from fs3-12560-2016/09/27-16:16:30:47674-cees-data-client-3-0-0 
> (version: 3.7.14)
> [2016-09-27 16:16:30.081487] I [login.c:81:gf_auth] 0-auth/login: 
> allowed user names: 18ddaf4c-ad98-4155-9372-717eae718b4c
> [2016-09-27 16:16:30.081505] I [MSGID: 115029] 
> [server-handshake.c:690:server_setvolume] 0-cees-data-server: accepted 
> client from fs2-11709-2016/09/27-16:16:30:50047-cees-data-client-3-0-0 
> (version: 3.7.14)
> [2016-09-27 16:16:30.111091] I [login.c:81:gf_auth] 0-auth/login: 
> allowed user names: 18ddaf4c-ad98-4155-9372-717eae718b4c
> [2016-09-27 16:16:30.111113] I [MSGID: 115029] 
> [server-handshake.c:690:server_setvolume] 0-cees-data-server: accepted 
> client from fs2-11701-2016/09/27-16:16:29:24060-cees-data-client-3-0-0 
> (version: 3.7.14)
> [2016-09-27 16:16:30.112822] I [login.c:81:gf_auth] 0-auth/login: 
> allowed user names: 18ddaf4c-ad98-4155-9372-717eae718b4c
> [2016-09-27 16:16:30.112836] I [MSGID: 115029] 
> [server-handshake.c:690:server_setvolume] 0-cees-data-server: accepted 
> client from fs3-12552-2016/09/27-16:16:29:23041-cees-data-client-3-0-0 
> (version: 3.7.14)
> [2016-09-27 16:16:31.950978] I [login.c:81:gf_auth] 0-auth/login: 
> allowed user names: 18ddaf4c-ad98-4155-9372-717eae718b4c
> [2016-09-27 16:16:31.950998] I [MSGID: 115029] 
> [server-handshake.c:690:server_setvolume] 0-cees-data-server: accepted 
> client from fs1-6721-2016/09/27-16:16:26:939991-cees-data-client-3-0-0 
> (version: 3.7.14)
> [2016-09-27 16:16:31.981977] I [login.c:81:gf_auth] 0-auth/login: 
> allowed user names: 18ddaf4c-ad98-4155-9372-717eae718b4c
> [2016-09-27 16:16:31.981994] I [MSGID: 115029] 
> [server-handshake.c:690:server_setvolume] 0-cees-data-server: accepted 
> client from fs1-6729-2016/09/27-16:16:27:971228-cees-data-client-3-0-0 
> (version: 3.7.14)
>

Hmm, this shows the brick has started.
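
One quick way to double-check that against the full log file is to list only the error-level entries; an empty result would be consistent with a clean brick start. This is a minimal sketch, assuming the usual `[timestamp] LEVEL [...]` layout seen in the log above. The `log_errors` helper name is made up for illustration.

```shell
# Hypothetical helper: print only the E (error) level entries from a
# gluster log read on stdin. Assumes each entry starts with
# "[YYYY-MM-DD HH:MM:SS.usec] LEVEL ", as in the brick log quoted above.
log_errors() {
    # "|| true" keeps the exit status at 0 when no error lines are found.
    grep -E '^\[[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9:.]+\] E ' || true
}

# Usage on fs4 (path taken from the log quoted above):
# log_errors < /var/log/glusterfs/bricks/data-brick.log
```

Run against the unwrapped log file on fs4, this should print nothing if the brick log really has no errors.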
Does gluster volume info on fs4 show all 4 bricks? (I guess it does 
based on your first email.)
Does gluster volume status on fs4 (or ps aux|grep glusterfsd) show the 
brick as running?
Does gluster peer status on all nodes list the other 3 nodes as connected?

If yes, you could try `service glusterd restart` on fs4 and see if it 
brings up the brick. I'm just shooting in the dark here for possible clues.
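
The checks above can be strung together roughly like this. It is only a sketch: the volume name cees-data comes from your add-brick command, the `count_bricks` helper is a made-up name, and it assumes `gluster volume info` prints one `BrickN: host:/path` line per brick.

```shell
# Hypothetical helper: count the "BrickN: host:/path" lines in
# `gluster volume info` output read on stdin.
count_bricks() {
    # grep -c prints 0 and exits non-zero when nothing matches; "|| true"
    # keeps the exit status at 0 in that case.
    grep -cE '^Brick[0-9]+: ' || true
}

# Only run the live checks where the gluster CLI is installed (e.g. on fs4).
if command -v gluster >/dev/null 2>&1; then
    echo "bricks listed: $(gluster volume info cees-data | count_bricks)"
    gluster volume status cees-data   # is fs4:/data/brick shown as online?
    ps aux | grep '[g]lusterfsd' || echo "no glusterfsd process found"
    gluster peer status               # are the other 3 peers connected?
fi

# If all of the above look fine, the restart to try on fs4 is:
# service glusterd restart
```

The brick count should come out as 4, and peer status has to be run on every node, not just fs4.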
-Ravi

> On Tue, Sep 27, 2016 at 8:46 AM, Ravishankar N <ravishankar at redhat.com> wrote:
>
>     On 09/27/2016 09:06 PM, Dennis Michael wrote:
>>     Yes, the brick log /var/log/glusterfs/bricks/data-brick.log is
>>     created on fs4, and the snippets showing the errors were from
>>     that log.
>>
>     Unless I'm missing something, the snippet below is from glusterd's
>     log and not the brick's as is evident from the function names.
>     -Ravi
>>     Dennis
>>
>>     On Mon, Sep 26, 2016 at 5:58 PM, Ravishankar N
>>     <ravishankar at redhat.com> wrote:
>>
>>         On 09/27/2016 05:25 AM, Dennis Michael wrote:
>>
>>             [2016-09-26 22:44:39.254921] E [MSGID: 106005]
>>             [glusterd-utils.c:4771:glusterd_brick_start]
>>             0-management: Unable to start brick fs4:/data/brick
>>             [2016-09-26 22:44:39.254949] E [MSGID: 106074]
>>             [glusterd-brick-ops.c:2372:glusterd_op_add_brick]
>>             0-glusterd: Unable to add bricks
>>
>>
>>         Is the brick log created on fs4? Does it contain warnings/errors?
>>
>>         -Ravi
>>
>>
>
>
