[Gluster-devel] Skipped files during rebalance

Susant Palai spalai at redhat.com
Tue Aug 18 08:45:53 UTC 2015


Hi Christophe,
  
   Need some info regarding the high mem-usage.

1. Top output, to see whether any other process is eating up memory.
2. Gluster volume info.
3. Is the rebalance process still running? If yes, can you point to the specific memory usage of the rebalance process? Was the high mem-usage seen during rebalance, or even post rebalance?
4. Gluster version.

I will ask for more information if needed.
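For reference, the four items above could be gathered with something like the following. This is only a sketch: it assumes the volume is named "live" (taken from the logs later in this thread) and that the rebalance daemon runs as a glusterfs process with "rebalance" in its command line; adjust to your environment.

```shell
# Sketch of collecting the requested diagnostics (volume name "live" assumed).
top -b -n 1 | head -n 20                 # 1. biggest memory consumers overall
gluster volume info live                 # 2. volume configuration
gluster volume rebalance live status     # 3. is the rebalance still running?
ps -o pid,rss,vsz,cmd -C glusterfs \
  | grep rebalance                       #    per-process memory (RSS) of rebalance
glusterfs --version                      # 4. installed Gluster version
```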

Regards,
Susant


----- Original Message -----
> From: "Christophe TREFOIS" <christophe.trefois at uni.lu>
> To: "Raghavendra Gowdappa" <rgowdapp at redhat.com>, "Nithya Balachandran" <nbalacha at redhat.com>, "Susant Palai"
> <spalai at redhat.com>, "Shyamsundar Ranganathan" <srangana at redhat.com>
> Cc: "Mohammed Rafi K C" <rkavunga at redhat.com>
> Sent: Monday, 17 August, 2015 7:03:20 PM
> Subject: Fwd: [Gluster-devel] Skipped files during rebalance
> 
> Hi DHT team,
> 
> This email somehow didn’t get forwarded to you.
> 
> In addition to my problem described below, here is one example of free memory
> after everything failed
> 
> [root at highlander ~]# pdsh -g live 'free -m'
> stor106:               total        used        free      shared  buff/cache   available
> stor106: Mem:         193249      124784        1347           9       67118       12769
> stor106: Swap:             0           0           0
> stor104:               total        used        free      shared  buff/cache   available
> stor104: Mem:         193249      107617       31323           9       54308       42752
> stor104: Swap:             0           0           0
> stor105:               total        used        free      shared  buff/cache   available
> stor105: Mem:         193248      141804        6736           9       44707        9713
> stor105: Swap:             0           0           0
> 
> So after the failed operation there is almost no memory available, and it is
> not freed afterwards either.
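As a quick sanity check on the stor106 numbers above (all values in MiB from `free -m`), the columns do add up, and the figure to watch is "available" rather than "free":

```python
# Sanity-check the stor106 `free -m` output quoted above (values in MiB).
total, used, free, cache, available = 193249, 124784, 1347, 67118, 12769

# `free` partitions total memory into used + free + buff/cache:
assert used + free + cache == total

# "available" estimates what new allocations can get without swapping;
# it is well below free + buff/cache here because the kernel does not
# consider all of that cache immediately reclaimable.
print(f"available: {available / 1024:.1f} GiB of {total / 1024:.1f} GiB")
# → available: 12.5 GiB of 188.7 GiB
```

So the node genuinely had only ~12.5 GiB usable out of ~189 GiB, consistent with the high rebalance memory usage described below.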
> 
> Thank you for pointing me to any directions,
> 
> Kind regards,
> 
>> Christophe
> 
> 
> Begin forwarded message:
> 
> From: Christophe TREFOIS
> <christophe.trefois at uni.lu<mailto:christophe.trefois at uni.lu>>
> Subject: Re: [Gluster-devel] Skipped files during rebalance
> Date: 17 Aug 2015 11:54:32 CEST
> To: Mohammed Rafi K C <rkavunga at redhat.com<mailto:rkavunga at redhat.com>>
> Cc: "gluster-devel at gluster.org<mailto:gluster-devel at gluster.org>"
> <gluster-devel at gluster.org<mailto:gluster-devel at gluster.org>>
> 
> Dear Rafi,
> 
> Thanks for submitting a patch.
> 
> @DHT, I have two additional questions / problems.
> 
> 1. When doing a rebalance (with data) RAM consumption on the nodes goes
> dramatically high, eg out of 196 GB available per node, RAM usage would fill
> up to 195.6 GB. This seems quite excessive and strange to me.
> 
> 2. As you can see, the rebalance (with data) failed as one endpoint becomes
> unconnected (even though it still is connected). I’m thinking this could be
> due to the high RAM usage?
> 
> Thank you for your help,
> 
>> Christophe
> 
> Dr Christophe Trefois, Dipl.-Ing.
> Technical Specialist / Post-Doc
> 
> UNIVERSITÉ DU LUXEMBOURG
> 
> LUXEMBOURG CENTRE FOR SYSTEMS BIOMEDICINE
> Campus Belval | House of Biomedicine
> 6, avenue du Swing
> L-4367 Belvaux
> T: +352 46 66 44 6124
> F: +352 46 66 44 6949
> http://www.uni.lu/lcsb
> 
> 
> 
> ----
> This message is confidential and may contain privileged information.
> It is intended for the named recipient only.
> If you receive it in error please notify me and permanently delete the
> original message and any copies.
> ----
> 
> 
> 
> On 17 Aug 2015, at 11:27, Mohammed Rafi K C
> <rkavunga at redhat.com<mailto:rkavunga at redhat.com>> wrote:
> 
> 
> 
> On 08/17/2015 01:58 AM, Christophe TREFOIS wrote:
> Dear all,
> 
> I have successfully added a new node to our setup, and finally managed to get
> a successful fix-layout run as well with no errors.
> 
> Now, as per the documentation, I started a gluster volume rebalance live
> start task and I see many skipped files.
> The error log then contains entries like the following for each skipped file.
> 
> [2015-08-16 20:23:30.591161] E [MSGID: 109023] [dht-rebalance.c:1965:gf_defrag_get_entry] 0-live-dht: Migrate file failed:/hcs/hcs/OperaArchiveCol/SK 20131011_Oligo_Rot_lowConc_P1/Meas_05(2013-10-11_17-12-02)/004010008.flex lookup failed
> [2015-08-16 20:23:30.768391] E [MSGID: 109023] [dht-rebalance.c:1965:gf_defrag_get_entry] 0-live-dht: Migrate file failed:/hcs/hcs/OperaArchiveCol/SK 20131011_Oligo_Rot_lowConc_P1/Meas_05(2013-10-11_17-12-02)/007005003.flex lookup failed
> [2015-08-16 20:23:30.804811] E [MSGID: 109023] [dht-rebalance.c:1965:gf_defrag_get_entry] 0-live-dht: Migrate file failed:/hcs/hcs/OperaArchiveCol/SK 20131011_Oligo_Rot_lowConc_P1/Meas_05(2013-10-11_17-12-02)/006005009.flex lookup failed
> [2015-08-16 20:23:30.805201] E [MSGID: 109023] [dht-rebalance.c:1965:gf_defrag_get_entry] 0-live-dht: Migrate file failed:/hcs/hcs/OperaArchiveCol/SK 20131011_Oligo_Rot_lowConc_P1/Meas_05(2013-10-11_17-12-02)/005006011.flex lookup failed
> [2015-08-16 20:23:30.880037] E [MSGID: 109023] [dht-rebalance.c:1965:gf_defrag_get_entry] 0-live-dht: Migrate file failed:/hcs/hcs/OperaArchiveCol/SK 20131011_Oligo_Rot_lowConc_P1/Meas_05(2013-10-11_17-12-02)/005009012.flex lookup failed
> [2015-08-16 20:23:31.038236] E [MSGID: 109023] [dht-rebalance.c:1965:gf_defrag_get_entry] 0-live-dht: Migrate file failed:/hcs/hcs/OperaArchiveCol/SK 20131011_Oligo_Rot_lowConc_P1/Meas_05(2013-10-11_17-12-02)/003008007.flex lookup failed
> [2015-08-16 20:23:31.259762] E [MSGID: 109023] [dht-rebalance.c:1965:gf_defrag_get_entry] 0-live-dht: Migrate file failed:/hcs/hcs/OperaArchiveCol/SK 20131011_Oligo_Rot_lowConc_P1/Meas_05(2013-10-11_17-12-02)/004008006.flex lookup failed
> [2015-08-16 20:23:31.333764] E [MSGID: 109023] [dht-rebalance.c:1965:gf_defrag_get_entry] 0-live-dht: Migrate file failed:/hcs/hcs/OperaArchiveCol/SK 20131011_Oligo_Rot_lowConc_P1/Meas_05(2013-10-11_17-12-02)/007008001.flex lookup failed
> [2015-08-16 20:23:31.340190] E [MSGID: 109023] [dht-rebalance.c:1965:gf_defrag_get_entry] 0-live-dht: Migrate file failed:/hcs/hcs/OperaArchiveCol/SK 20131011_Oligo_Rot_lowConc_P1/Meas_05(2013-10-11_17-12-02)/006007004.flex lookup failed
> 
> Update: one of the rebalance tasks now failed.
> 
> @Rafi, I got the same error as Friday except this time with data.
> 
> Packets carrying the ping request could be waiting in the queue for the whole
> time-out period because of heavy traffic on the network. I have sent a patch
> for this; you can track its status here:
> http://review.gluster.org/11935
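Until the patch lands, one possible stopgap is to raise the client ping timeout above the default 42 seconds seen in the log below, so a ping stuck behind heavy rebalance traffic is less likely to expire. This is a workaround sketch, not a fix for the queuing itself, and it assumes the volume is named "live" as in the logs:

```shell
# Workaround sketch: raise the ping timeout above the default 42s
# (volume name "live" assumed from the logs in this thread).
gluster volume set live network.ping-timeout 120
gluster volume get live network.ping-timeout   # verify the new value
```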
> 
> 
> 
> [2015-08-16 20:24:34.533167] C
> [rpc-clnt-ping.c:161:rpc_clnt_ping_timer_expired] 0-live-client-0: server
> 192.168.123.104:49164 has not responded in the last 42 seconds,
> disconnecting.
> [2015-08-16 20:24:34.533614] E [rpc-clnt.c:362:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x196)[0x7fa454de59e6] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7fa454bb09be] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fa454bb0ace] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x9c)[0x7fa454bb247c] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x48)[0x7fa454bb2c38] ))))) 0-live-client-0: forced unwinding frame type(GlusterFS 3.3) op(INODELK(29)) called at 2015-08-16 20:23:51.305640 (xid=0x5dd4da)
> [2015-08-16 20:24:34.533672] E [MSGID: 114031]
> [client-rpc-fops.c:1621:client3_3_inodelk_cbk] 0-live-client-0: remote
> operation failed [Transport endpoint is not connected]
> [2015-08-16 20:24:34.534201] E [rpc-clnt.c:362:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x196)[0x7fa454de59e6] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7fa454bb09be] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fa454bb0ace] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x9c)[0x7fa454bb247c] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x48)[0x7fa454bb2c38] ))))) 0-live-client-0: forced unwinding frame type(GlusterFS 3.3) op(READ(12)) called at 2015-08-16 20:23:51.303938 (xid=0x5dd4d7)
> [2015-08-16 20:24:34.534347] E [MSGID: 109023] [dht-rebalance.c:1124:dht_migrate_file] 0-live-dht: Migrate file failed: /hcs/hcs/OperaArchiveCol/SK 20131011_Oligo_Rot_lowConc_P1/Meas_12(2013-10-12_00-12-55)/007008007.flex: failed to migrate data
> [2015-08-16 20:24:34.534413] E [rpc-clnt.c:362:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x196)[0x7fa454de59e6] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7fa454bb09be] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fa454bb0ace] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x9c)[0x7fa454bb247c] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x48)[0x7fa454bb2c38] ))))) 0-live-client-0: forced unwinding frame type(GlusterFS 3.3) op(READ(12)) called at 2015-08-16 20:23:51.303969 (xid=0x5dd4d8)
> [2015-08-16 20:24:34.534579] E [MSGID: 109023]
> [dht-rebalance.c:1124:dht_migrate_file] 0-live-dht: Migrate file failed:
> /hcs/hcs/OperaArchiveCol/SK
> 20131011_Oligo_Rot_lowConc_P1/Meas_12(2013-10-12_00-12-55)/007009012.flex:
> failed to migrate data
> [2015-08-16 20:24:34.534676] E [rpc-clnt.c:362:saved_frames_unwind] (-->
> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x196)[0x7fa454de59e6] (-->
> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7fa454bb09be] (-->
> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fa454bb0ace] (-->
> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x9c)[0x7fa454bb247c] (-->
> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x48)[0x7fa454bb2c38] )))))
> 0-live-client-0: forced unwinding frame type(GlusterFS 3.3) op(READ(12))
> called at 2015-08-16 20:23:51.313548 (xid=0x5dd4db)
> [2015-08-16 20:24:34.534745] E [MSGID: 109023]
> [dht-rebalance.c:1124:dht_migrate_file] 0-live-dht: Migrate file failed:
> /hcs/hcs/OperaArchiveCol/SK
> 20131011_Oligo_Rot_lowConc_P1/Meas_12(2013-10-12_00-12-55)/006008011.flex:
> failed to migrate data
> [2015-08-16 20:24:34.535199] E [rpc-clnt.c:362:saved_frames_unwind] (-->
> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x196)[0x7fa454de59e6] (-->
> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7fa454bb09be] (-->
> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fa454bb0ace] (-->
> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x9c)[0x7fa454bb247c] (-->
> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x48)[0x7fa454bb2c38] )))))
> 0-live-client-0: forced unwinding frame type(GlusterFS 3.3) op(READ(12))
> called at 2015-08-16 20:23:51.326369 (xid=0x5dd4dc)
> [2015-08-16 20:24:34.535232] E [MSGID: 109023]
> [dht-rebalance.c:1124:dht_migrate_file] 0-live-dht: Migrate file failed:
> /hcs/hcs/OperaArchiveCol/SK
> 20131011_Oligo_Rot_lowConc_P1/Meas_12(2013-10-12_00-12-55)/005003001.flex:
> failed to migrate data
> [2015-08-16 20:24:34.535984] E [rpc-clnt.c:362:saved_frames_unwind] (-->
> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x196)[0x7fa454de59e6] (-->
> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7fa454bb09be] (-->
> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fa454bb0ace] (-->
> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x9c)[0x7fa454bb247c] (-->
> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x48)[0x7fa454bb2c38] )))))
> 0-live-client-0: forced unwinding frame type(GlusterFS 3.3) op(READ(12))
> called at 2015-08-16 20:23:51.326437 (xid=0x5dd4dd)
> [2015-08-16 20:24:34.536069] E [MSGID: 109023]
> [dht-rebalance.c:1124:dht_migrate_file] 0-live-dht: Migrate file failed:
> /hcs/hcs/OperaArchiveCol/SK
> 20131011_Oligo_Rot_lowConc_P1/Meas_12(2013-10-12_00-12-55)/007010012.flex:
> failed to migrate data
> [2015-08-16 20:24:34.536267] E [rpc-clnt.c:362:saved_frames_unwind] (-->
> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x196)[0x7fa454de59e6] (-->
> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7fa454bb09be] (-->
> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fa454bb0ace] (-->
> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x9c)[0x7fa454bb247c] (-->
> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x48)[0x7fa454bb2c38] )))))
> 0-live-client-0: forced unwinding frame type(GlusterFS 3.3) op(LOOKUP(27))
> called at 2015-08-16 20:23:51.337240 (xid=0x5dd4de)
> [2015-08-16 20:24:34.536339] E [MSGID: 109023]
> [dht-rebalance.c:1965:gf_defrag_get_entry] 0-live-dht: Migrate file
> failed:/hcs/hcs/OperaArchiveCol/SK
> 20131011_Oligo_Rot_lowConc_P1/Meas_08(2013-10-11_20-12-25)/002005012.flex
> lookup failed
> [2015-08-16 20:24:34.536487] E [rpc-clnt.c:362:saved_frames_unwind] (-->
> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x196)[0x7fa454de59e6] (-->
> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7fa454bb09be] (-->
> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fa454bb0ace] (-->
> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x9c)[0x7fa454bb247c] (-->
> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x48)[0x7fa454bb2c38] )))))
> 0-live-client-0: forced unwinding frame type(GlusterFS 3.3) op(LOOKUP(27))
> called at 2015-08-16 20:23:51.425254 (xid=0x5dd4df)
> [2015-08-16 20:24:34.536685] E [rpc-clnt.c:362:saved_frames_unwind] (-->
> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x196)[0x7fa454de59e6] (-->
> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7fa454bb09be] (-->
> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fa454bb0ace] (-->
> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x9c)[0x7fa454bb247c] (-->
> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x48)[0x7fa454bb2c38] )))))
> 0-live-client-0: forced unwinding frame type(GlusterFS 3.3) op(LOOKUP(27))
> called at 2015-08-16 20:23:51.738907 (xid=0x5dd4e0)
> [2015-08-16 20:24:34.536891] E [rpc-clnt.c:362:saved_frames_unwind] (-->
> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x196)[0x7fa454de59e6] (-->
> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7fa454bb09be] (-->
> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fa454bb0ace] (-->
> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x9c)[0x7fa454bb247c] (-->
> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x48)[0x7fa454bb2c38] )))))
> 0-live-client-0: forced unwinding frame type(GlusterFS 3.3) op(LOOKUP(27))
> called at 2015-08-16 20:23:51.805096 (xid=0x5dd4e1)
> [2015-08-16 20:24:34.537316] E [rpc-clnt.c:362:saved_frames_unwind] (-->
> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x196)[0x7fa454de59e6] (-->
> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7fa454bb09be] (-->
> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fa454bb0ace] (-->
> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x9c)[0x7fa454bb247c] (-->
> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x48)[0x7fa454bb2c38] )))))
> 0-live-client-0: forced unwinding frame type(GlusterFS 3.3) op(LOOKUP(27))
> called at 2015-08-16 20:23:51.805977 (xid=0x5dd4e2)
> [2015-08-16 20:24:34.537735] E [rpc-clnt.c:362:saved_frames_unwind] (-->
> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x196)[0x7fa454de59e6] (-->
> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7fa454bb09be] (-->
> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fa454bb0ace] (-->
> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x9c)[0x7fa454bb247c] (-->
> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x48)[0x7fa454bb2c38] )))))
> 0-live-client-0: forced unwinding frame type(GF-DUMP) op(NULL(2)) called at
> 2015-08-16 20:23:52.530107 (xid=0x5dd4e3)
> [2015-08-16 20:24:34.538475] E [MSGID: 114031]
> [client-rpc-fops.c:1621:client3_3_inodelk_cbk] 0-live-client-0: remote
> operation failed [Transport endpoint is not connected]
> The message "E [MSGID: 114031] [client-rpc-fops.c:1621:client3_3_inodelk_cbk]
> 0-live-client-0: remote operation failed [Transport endpoint is not
> connected]" repeated 4 times between [2015-08-16 20:24:34.538475] and
> [2015-08-16 20:24:34.538535]
> [2015-08-16 20:24:34.538584] E [MSGID: 109023]
> [dht-rebalance.c:1617:gf_defrag_migrate_single_file] 0-live-dht: Migrate
> file failed: 002004003.flex lookup failed
> [2015-08-16 20:24:34.538904] E [MSGID: 109023]
> [dht-rebalance.c:1617:gf_defrag_migrate_single_file] 0-live-dht: Migrate
> file failed: 003009008.flex lookup failed
> [2015-08-16 20:24:34.539724] E [MSGID: 109023]
> [dht-rebalance.c:1965:gf_defrag_get_entry] 0-live-dht: Migrate file
> failed:/hcs/hcs/OperaArchiveCol/SK
> 20131011_Oligo_Rot_lowConc_P1/Meas_08(2013-10-11_20-12-25)/005009006.flex
> lookup failed
> [2015-08-16 20:24:34.539820] E [MSGID: 109016]
> [dht-rebalance.c:2554:gf_defrag_fix_layout] 0-live-dht: Fix layout failed
> for /hcs/hcs/OperaArchiveCol/SK
> 20131011_Oligo_Rot_lowConc_P1/Meas_08(2013-10-11_20-12-25)
> [2015-08-16 20:24:34.540031] E [MSGID: 109016]
> [dht-rebalance.c:2554:gf_defrag_fix_layout] 0-live-dht: Fix layout failed
> for /hcs/hcs/OperaArchiveCol/SK 20131011_Oligo_Rot_lowConc_P1
> [2015-08-16 20:24:34.540691] E [MSGID: 114031]
> [client-rpc-fops.c:251:client3_3_mknod_cbk] 0-live-client-0: remote
> operation failed. Path: /hcs/hcs/OperaArchiveCol/SK
> 20131011_Oligo_Rot_lowConc_P1/Meas_12(2013-10-12_00-12-55)/002005008.flex
> [Transport endpoint is not connected]
> [2015-08-16 20:24:34.541152] E [MSGID: 114031]
> [client-rpc-fops.c:251:client3_3_mknod_cbk] 0-live-client-0: remote
> operation failed. Path: /hcs/hcs/OperaArchiveCol/SK
> 20131011_Oligo_Rot_lowConc_P1/Meas_12(2013-10-12_00-12-55)/005004009.flex
> [Transport endpoint is not connected]
> [2015-08-16 20:24:34.541331] E [MSGID: 114031]
> [client-rpc-fops.c:251:client3_3_mknod_cbk] 0-live-client-0: remote
> operation failed. Path: /hcs/hcs/OperaArchiveCol/SK
> 20131011_Oligo_Rot_lowConc_P1/Meas_12(2013-10-12_00-12-55)/007005011.flex
> [Transport endpoint is not connected]
> [2015-08-16 20:24:34.541486] E [MSGID: 109016]
> [dht-rebalance.c:2554:gf_defrag_fix_layout] 0-live-dht: Fix layout failed
> for /hcs/hcs/OperaArchiveCol
> [2015-08-16 20:24:34.541572] E [MSGID: 109016]
> [dht-rebalance.c:2554:gf_defrag_fix_layout] 0-live-dht: Fix layout failed
> for /hcs/hcs
> [2015-08-16 20:24:34.541639] E [MSGID: 109016]
> [dht-rebalance.c:2554:gf_defrag_fix_layout] 0-live-dht: Fix layout failed
> for /hcs
> 
> Any help would be greatly appreciated.
> CCing the DHT team to give you a better idea about why the rebalance failed,
> and about the huge memory consumption by the rebalance process (~200 GB RAM).
> 
> Regards
> Rafi KC
> 
> 
> 
> 
> Thanks,
> 
> --
> Christophe
> 
> 
> 
> 
> 
> 
> 
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org<mailto:Gluster-devel at gluster.org>
> http://www.gluster.org/mailman/listinfo/gluster-devel
> 
> 

