[Gluster-users] Geo-replication log file not closed

David Cunningham dcunningham at voisonics.com
Mon Aug 31 04:11:40 UTC 2020


Hello all,

Apparently we don't want to "kill -HUP" the two processes that still have
the rotated log file open:
root      4495     1  0 Aug10 ?        00:00:59 /usr/bin/python2
/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/gsyncd.py
--path=/nodirectwritedata/gluster/gvol0  --monitor -c
/var/lib/glusterd/geo-replication/gvol0_nvfs10_gvol0/gsyncd.conf
--iprefix=/var :gvol0 --glusterd-uuid=b7521445-ee93-4fed-8ced-6a609fa8c7d4
nvfs10::gvol0
root      4508  4495  0 Aug10 ?        00:01:56 python2
/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/gsyncd.py agent gvol0
nvfs10::gvol0 --local-path /nodirectwritedata/gluster/gvol0 --local-node
cafs30 --local-node-id b7521445-ee93-4fed-8ced-6a609fa8c7d4 --slave-id
cdcdb210-839c-4306-a4dc-e696b165ed17 --rpc-fd 9,12,11,10
... because a kill -HUP on those processes stops them rather than making
them re-open the log file.

Does anyone know if these processes are supposed to have gsyncd.log open?
If so, how do we tell them to close and re-open their file handle?
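
One way to sidestep the signal question is to rotate without renaming the
file at all, which is what logrotate's "copytruncate" directive does: it
copies the live log aside and then truncates the original in place, so the
writer keeps its existing handle. The sketch below imitates that idea in
plain shell. The function name copy_truncate is hypothetical, and it assumes
the writer opened the log in append mode (not confirmed for gsyncd). Lines
written between the copy and the truncate can be lost.

```shell
#!/bin/sh
# Hypothetical helper imitating logrotate's copytruncate behaviour:
# copy the live log aside, then empty the original in place.
# Assumes the writing process opened the file in append mode.
copy_truncate() {
    log=$1
    cp -p "$log" "$log.1" || return 1   # keep the old contents as <log>.1
    : > "$log"                          # truncate in place; inode is unchanged
}
```

For example, running copy_truncate on the live gsyncd.log would leave the
processes writing to the same, now empty, file. Whether this suits gsyncd
has not been verified here.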

Thanks in advance!


On Tue, 25 Aug 2020 at 15:24, David Cunningham <dcunningham at voisonics.com>
wrote:

> Hello,
>
> We're having an issue with the rotated gsyncd.log not being released.
> Here's the output of 'lsof':
>
> # lsof | grep 'gsyncd.log.1'
> python2    4495                  root    3w      REG                8,1  991675023    4332241 /var/log/glusterfs/geo-replication/gvol0_nvfs10_gvol0/gsyncd.log.1 (deleted)
> python2    4495  4496            root    3w      REG                8,1  991675023    4332241 /var/log/glusterfs/geo-replication/gvol0_nvfs10_gvol0/gsyncd.log.1 (deleted)
> python2    4495  4507            root    3w      REG                8,1  991675023    4332241 /var/log/glusterfs/geo-replication/gvol0_nvfs10_gvol0/gsyncd.log.1 (deleted)
> python2    4508                  root    3w      REG                8,1  991675023    4332241 /var/log/glusterfs/geo-replication/gvol0_nvfs10_gvol0/gsyncd.log.1 (deleted)
> python2    4508                  root    5w      REG                8,1  991675023    4332241 /var/log/glusterfs/geo-replication/gvol0_nvfs10_gvol0/gsyncd.log.1 (deleted)
> python2    4508  4511            root    3w      REG                8,1  991675023    4332241 /var/log/glusterfs/geo-replication/gvol0_nvfs10_gvol0/gsyncd.log.1 (deleted)
> ... etc...
>
> Those processes are:
> # ps -ef | egrep '4495|4508'
> root      4495     1  0 Aug10 ?        00:00:59 /usr/bin/python2
> /usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/gsyncd.py
> --path=/nodirectwritedata/gluster/gvol0  --monitor -c
> /var/lib/glusterd/geo-replication/gvol0_nvfs10_gvol0/gsyncd.conf
> --iprefix=/var :gvol0 --glusterd-uuid=b7521445-ee93-4fed-8ced-6a609fa8c7d4
> nvfs10::gvol0
> root      4508  4495  0 Aug10 ?        00:01:56 python2
> /usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/gsyncd.py agent gvol0
> nvfs10::gvol0 --local-path /nodirectwritedata/gluster/gvol0 --local-node
> cafs30 --local-node-id b7521445-ee93-4fed-8ced-6a609fa8c7d4 --slave-id
> cdcdb210-839c-4306-a4dc-e696b165ed17 --rpc-fd 9,12,11,10
>
> And here's the relevant part of the /etc/logrotate.d/glusterfs-georep
> script:
>
> /var/log/glusterfs/geo-replication/*/*.log {
>     sharedscripts
>     rotate 52
>     missingok
>     compress
>     delaycompress
>     notifempty
>     postrotate
>         for pid in `ps -aef | grep glusterfs | egrep "\-\-aux-gfid-mount" | awk '{print $2}'`; do
>             /usr/bin/kill -HUP $pid > /dev/null 2>&1 || true
>         done
>     endscript
> }
>
> If I run the postrotate part manually:
> # ps -aef | grep glusterfs | egrep "\-\-aux-gfid-mount" | awk '{print $2}'
> 4520
>
> # ps -aef | grep 4520
> root      4520     1  0 Aug10 ?        01:24:23 /usr/sbin/glusterfs
> --aux-gfid-mount --acl --log-level=INFO
> --log-file=/var/log/glusterfs/geo-replication/gvol0_nvfs10_gvol0/mnt-nodirectwritedata-gluster-gvol0.log
> --volfile-server=localhost --volfile-id=gvol0 --client-pid=-1
> /tmp/gsyncd-aux-mount-Tq_3sU
>
> Perhaps the problem is that the kill -HUP in the logrotate script doesn't
> act on the right process? If so, does anyone have a command to get the
> right PID?
>
> Thanks in advance for any help.
>
> --
> David Cunningham, Voisonics Limited
> http://voisonics.com/
> USA: +1 213 221 1092
> New Zealand: +64 (0)28 2558 3782
>
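
The lsof check in the quoted message can also be done by reading /proc
directly, which gives a plain list of PIDs that a script could act on. This
is a minimal sketch for Linux only. The function name pids_holding_deleted
is hypothetical, and it only reports processes whose fd directory the caller
can read (run it as root to see everything).

```shell
#!/bin/sh
# Hypothetical helper: list PIDs that still hold open a deleted file whose
# path contains the given substring. Reads /proc directly (Linux only).
pids_holding_deleted() {
    pattern=$1                          # e.g. gsyncd.log.1
    for fd in /proc/[0-9]*/fd/*; do
        # Each fd entry is a symlink; skip ones that vanish or are unreadable.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            *"$pattern"*" (deleted)")
                pid=${fd#/proc/}        # strip the leading /proc/
                echo "${pid%%/*}"       # keep only the PID component
                ;;
        esac
    done | sort -un                     # one line per PID, numerically sorted
}
```

Calling pids_holding_deleted gsyncd.log.1 should report the same processes
that lsof shows as holding the deleted file. It only identifies them; as
noted at the top of this thread, sending those gsyncd processes a HUP stops
them rather than making them re-open the log.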


-- 
David Cunningham, Voisonics Limited
http://voisonics.com/
USA: +1 213 221 1092
New Zealand: +64 (0)28 2558 3782