[Gluster-users] Gluster 11.1 - heal hangs (again)

Hu Bert revirii at googlemail.com
Tue Apr 23 06:54:51 UTC 2024


As for logs: there is nothing in glustershd.log on the 3 gluster servers,
but one client shows this in /var/log/glusterfs/data-sourceimages.log:

[2024-04-23 06:54:21.456157 +0000] W [MSGID: 114061]
[client-common.c:796:client_pre_lk_v2] 0-sourceimages-client-2:
remote_fd is -1. EBADFD [{gfid=a1817071-2949-4145-a96a-874159e46511},
{errno=77}, {error=File descriptor in bad state}]
[2024-04-23 06:54:21.456195 +0000] E [MSGID: 108028]
[afr-open.c:361:afr_is_reopen_allowed_cbk] 0-sourceimages-replicate-0:
Failed getlk for a1817071-2949-4145-a96a-874159e46511 [File descriptor
in bad state]
[2024-04-23 06:54:21.488511 +0000] W [MSGID: 114061]
[client-common.c:530:client_pre_flush_v2] 0-sourceimages-client-2:
remote_fd is -1. EBADFD [{gfid=a1817071-2949-4145-a96a-874159e46511},
{errno=77}, {error=File descriptor in bad state}]
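
To see which files these warnings refer to, the gfids can be tallied from the client log. This is only a minimal sketch in Python: the helper name ebadfd_gfids is made up for illustration, and it assumes the log entries look like the excerpt above (the sample is copied from it, including the mail line wrapping).

```python
import re
from collections import Counter

# Matches the "remote_fd is -1. EBADFD" warnings shown in the excerpt above
# and captures the gfid. The pattern is based only on that excerpt.
GFID_RE = re.compile(r"remote_fd is -1\. EBADFD \[\{gfid=([0-9a-f-]{36})\}")

SAMPLE_LOG = """\
[2024-04-23 06:54:21.456157 +0000] W [MSGID: 114061]
[client-common.c:796:client_pre_lk_v2] 0-sourceimages-client-2:
remote_fd is -1. EBADFD [{gfid=a1817071-2949-4145-a96a-874159e46511},
{errno=77}, {error=File descriptor in bad state}]
[2024-04-23 06:54:21.456195 +0000] E [MSGID: 108028]
[afr-open.c:361:afr_is_reopen_allowed_cbk] 0-sourceimages-replicate-0:
Failed getlk for a1817071-2949-4145-a96a-874159e46511 [File descriptor
in bad state]
[2024-04-23 06:54:21.488511 +0000] W [MSGID: 114061]
[client-common.c:530:client_pre_flush_v2] 0-sourceimages-client-2:
remote_fd is -1. EBADFD [{gfid=a1817071-2949-4145-a96a-874159e46511},
{errno=77}, {error=File descriptor in bad state}]
"""


def ebadfd_gfids(log_text):
    """Count EBADFD warnings per gfid (hypothetical helper)."""
    # An entry may be wrapped over several lines, so collapse all
    # whitespace runs into single spaces before matching.
    flat = " ".join(log_text.split())
    return Counter(GFID_RE.findall(flat))


if __name__ == "__main__":
    for gfid, count in ebadfd_gfids(SAMPLE_LOG).items():
        print(gfid, count)
```

In this excerpt all warnings point at the same gfid, so a single file descriptor seems to be affected on that client.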


On Tue, 23 Apr 2024 at 08:46, Hu Bert <revirii at googlemail.com> wrote:
>
> Hi,
>
> referring to this thread:
> https://lists.gluster.org/pipermail/gluster-users/2024-January/040465.html
> especially: https://lists.gluster.org/pipermail/gluster-users/2024-January/040513.html
>
> I've updated and rebooted 3 servers (Debian bookworm) running gluster
> 11.1. The first 2 servers went fine: gluster volume ok, no heals. So
> after a couple of minutes I rebooted the 3rd server, and now I have the
> same problem again: the heal count keeps going up, but no heals happen.
> gluster volume status+info ok, gluster peer status ok.
>
> Full volume status+info: https://pastebin.com/aEEEKn7h
>
> Volume Name: sourceimages
> Type: Replicate
> Volume ID: d6a559a1-ca4c-48c7-8adf-89048333bb58
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: gluster188:/gluster/md3/sourceimages
> Brick2: gluster189:/gluster/md3/sourceimages
> Brick3: gluster190:/gluster/md3/sourceimages
>
> Internal IPs:
> gluster188: 192.168.0.188
> gluster189: 192.168.0.189
> gluster190: 192.168.0.190
>
> After rebooting the 3rd server (gluster190) the client info looks like this:
>
> gluster volume status sourceimages clients
> Client connections for volume sourceimages
> ----------------------------------------------
> Brick : gluster188:/gluster/md3/sourceimages
> Clients connected : 17
> Hostname               BytesRead  BytesWritten  OpVersion
> --------               ---------  ------------  ---------
> 192.168.0.188:49151      1047856        988364     110000
> 192.168.0.189:49149       930792        654096     110000
> 192.168.0.109:49147       271598        279908     110000
> 192.168.0.223:49147       126764        130964     110000
> 192.168.0.222:49146       125848        130144     110000
> 192.168.0.2:49147         273756      43400387     110000
> 192.168.0.15:49147      57248531      14327465     110000
> 192.168.0.126:49147     32282645     671284763     110000
> 192.168.0.94:49146        125520        128864     110000
> 192.168.0.66:49146      34086248     666519388     110000
> 192.168.0.99:49146       3051076     522652843     110000
> 192.168.0.16:49146     149773024       1049035     110000
> 192.168.0.110:49146      1574768     566124922     110000
> 192.168.0.106:49146    152640790     146483580     110000
> 192.168.0.91:49133      89548971      82709793     110000
> 192.168.0.190:49149         4132          6540     110000
> 192.168.0.118:49133        92176         92884     110000
> ----------------------------------------------
> Brick : gluster189:/gluster/md3/sourceimages
> Clients connected : 17
> Hostname               BytesRead  BytesWritten  OpVersion
> --------               ---------  ------------  ---------
> 192.168.0.188:49146       935172        658268     110000
> 192.168.0.189:49151      1039048        977920     110000
> 192.168.0.126:49146     27106555     231766764     110000
> 192.168.0.110:49147      1121696     226426262     110000
> 192.168.0.16:49147     147165735        994015     110000
> 192.168.0.106:49147    152476618       1091156     110000
> 192.168.0.94:49147        109612        112688     110000
> 192.168.0.109:49146       180819       1489715     110000
> 192.168.0.223:49146       110708        114316     110000
> 192.168.0.99:49147       2573412     157737429     110000
> 192.168.0.2:49145         242696      26088710     110000
> 192.168.0.222:49145       109728        113064     110000
> 192.168.0.66:49145      27003740     215124678     110000
> 192.168.0.15:49145      57217513        594699     110000
> 192.168.0.91:49132      89463431       2714920     110000
> 192.168.0.190:49148         4132          6540     110000
> 192.168.0.118:49131        92380         94996     110000
> ----------------------------------------------
> Brick : gluster190:/gluster/md3/sourceimages
> Clients connected : 2
> Hostname               BytesRead  BytesWritten  OpVersion
> --------               ---------  ------------  ---------
> 192.168.0.190:49151        21252         27988     110000
> 192.168.0.118:49132        92176         92884     110000
>
> The bad server (gluster190) has only 2 clients: itself and
> 192.168.0.118 (which was rebooted after gluster190). I remounted the
> volume on the other clients (without a reboot), and they appear now. But
> the most important thing: the other 2 gluster servers are still missing.
> Output shortened, with the reconnected clients removed:
>
> gluster volume status sourceimages clients
> Client connections for volume sourceimages
> ----------------------------------------------
> Brick : gluster188:/gluster/md3/sourceimages
> Clients connected : 17
> Hostname               BytesRead  BytesWritten  OpVersion
> --------               ---------  ------------  ---------
> 192.168.0.188:49151      3707272       3387700     110000
> 192.168.0.189:49149      3346388       2264688     110000
> 192.168.0.190:49149         4132          6540     110000
> ----------------------------------------------
> Brick : gluster189:/gluster/md3/sourceimages
> Clients connected : 17
> Hostname               BytesRead  BytesWritten  OpVersion
> --------               ---------  ------------  ---------
> 192.168.0.189:49151      3698464       3377496     110000
> 192.168.0.188:49146      3350768       2268260     110000
> 192.168.0.190:49148         4132          6540     110000
> ----------------------------------------------
> Brick : gluster190:/gluster/md3/sourceimages
> Clients connected : 15
> Hostname               BytesRead  BytesWritten  OpVersion
> --------               ---------  ------------  ---------
> 192.168.0.190:49151        38692         49988     110000
> ----------------------------------------------
>
> The 2 good peer servers are missing as clients on the brick of the
> 3rd (bad) server. As these are not normal clients: how do I
> re-add/re-connect them? The 3 servers do not mount the volume at any
> mountpoint during normal operation.
>
>
> Best regards,
> Hubert
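
The per-brick comparison done by eye in the quoted listings can be automated. The following is only a minimal sketch in Python: the helper names clients_per_brick and missing_peers are made up for illustration, and it assumes the output of "gluster volume status sourceimages clients" has a "Brick : ..." line per brick followed by client lines that start with IP:port, as in the listings above. The sample text is the shortened second listing, with one client per line.

```python
import re

# Patterns based only on the listings quoted above.
BRICK_RE = re.compile(r"^Brick\s*:\s*(\S+)")
CLIENT_RE = re.compile(r"^(\d{1,3}(?:\.\d{1,3}){3}):\d+")

# Server addresses as given under "Internal IPs" in the quoted message.
SERVER_IPS = {"192.168.0.188", "192.168.0.189", "192.168.0.190"}

SAMPLE = """\
Brick : gluster188:/gluster/md3/sourceimages
Clients connected : 17
192.168.0.188:49151      3707272       3387700     110000
192.168.0.189:49149      3346388       2264688     110000
192.168.0.190:49149         4132          6540     110000
----------------------------------------------
Brick : gluster189:/gluster/md3/sourceimages
Clients connected : 17
192.168.0.189:49151      3698464       3377496     110000
192.168.0.188:49146      3350768       2268260     110000
192.168.0.190:49148         4132          6540     110000
----------------------------------------------
Brick : gluster190:/gluster/md3/sourceimages
Clients connected : 15
192.168.0.190:49151        38692         49988     110000
"""


def clients_per_brick(status_text):
    """Map each brick to the set of client IPs listed under it."""
    bricks = {}
    current = None
    for raw in status_text.splitlines():
        # Drop a mail quote prefix ("> ") if the text was copied from a mail.
        line = raw.lstrip("> ").strip()
        brick = BRICK_RE.match(line)
        if brick:
            current = brick.group(1)
            bricks[current] = set()
            continue
        client = CLIENT_RE.match(line)
        if client and current is not None:
            bricks[current].add(client.group(1))
    return bricks


def missing_peers(bricks, server_ips):
    """Return, per brick, the server IPs that are not connected as clients."""
    result = {}
    for brick, ips in bricks.items():
        missing = sorted(set(server_ips) - ips)
        if missing:
            result[brick] = missing
    return result


if __name__ == "__main__":
    for brick, ips in missing_peers(clients_per_brick(SAMPLE), SERVER_IPS).items():
        print(brick, "is missing", ", ".join(ips))
```

For the shortened listing this flags only the gluster190 brick, with 192.168.0.188 and 192.168.0.189 absent. That matches the observation in the quoted message. It does not explain why those two servers fail to reconnect.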

