[Bugs] [Bug 1642638] New: Log-file rotation on a Disperse Volume with a failed brick results in files that cannot be healed.

bugzilla at redhat.com
Wed Oct 24 21:04:28 UTC 2018


https://bugzilla.redhat.com/show_bug.cgi?id=1642638

            Bug ID: 1642638
           Summary: Log-file rotation on a Disperse Volume with a failed
                    brick results in files that cannot be healed.
           Product: GlusterFS
           Version: 3.12
         Component: disperse
          Severity: medium
          Assignee: bugs at gluster.org
          Reporter: jbyers at stonefly.com
                CC: bugs at gluster.org



Log-file rotation on a Disperse Volume with a failed brick results in files
that cannot be healed.

Simple log-file rotation on a Disperse Volume while there is a
failed brick results in files that cannot later be healed when
the brick is restored.

The rotation here is simply closing and renaming the old file
and creating a new file with the old name in the same directory.
What seems to happen is that, on the failed and then restored
brick, the new file name and the renamed old file both end up
hard-linked to the same inode (the old file's fragment and gfid).
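
One way to see this on the restored brick (brick path as used in
the test plan below) is to compare the inode numbers and
hard-link counts of the two names; both should report the same
inode and a link count of 3 (the two names plus the .glusterfs
gfid link):
# stat -c '%i %h %n' /exports/nas-segment-0002/brick3/messages \
    /exports/nas-segment-0002/brick3/messages-1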

One manual way to get the file to heal is to copy the file to a
temporary name, delete the original, and move the temporary file
back to the original name. However, this is not an acceptable
procedure when the application has the file open.
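
A sketch of that workaround from the volume mount point (the
.tmp suffix here is just an example name):
# cp /mnt/disperse-vol/messages /mnt/disperse-vol/messages.tmp
# rm /mnt/disperse-vol/messages
# mv /mnt/disperse-vol/messages.tmp /mnt/disperse-vol/messages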

Another manual procedure that seems to work is to delete all of
the old-name and new-name files from the previously failed
brick and then run a heal (demonstrated at the end of the test
plan below).

This problem is likely related to:

  Bug 1359681 – [disperse] Data gain while brick is down and rename a file
  https://bugzilla.redhat.com/show_bug.cgi?id=1359681

with the added issue of persistent heal failures that require
manual intervention.

Test plan below.

# gluster --version
glusterfs 3.12.14
##### Create the disperse volume.
# mkdir brick1 brick2 brick3
# gluster volume create disperse-vol disperse-data 2 redundancy 1 transport tcp \
    10.10.60.181:/exports/nas-segment-0002/brick1 \
    10.10.60.181:/exports/nas-segment-0002/brick2 \
    10.10.60.181:/exports/nas-segment-0002/brick3 force
volume create: disperse-vol: success: please start the volume to access data
# gluster volume start disperse-vol
volume start: disperse-vol: success
# mkdir /mnt/disperse-vol
# mount -t glusterfs 10.10.60.181:/disperse-vol /mnt/disperse-vol/

##### Place a test file, verify the contents.
# cp -i /var/log/messages /mnt/disperse-vol/
# sum /mnt/disperse-vol/messages
36906  4082
# gluster volume status disperse-vol
Status of volume: disperse-vol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.10.60.181:/exports/nas-segment-000
2/brick1                                    62001     0          Y       24636
Brick 10.10.60.181:/exports/nas-segment-000
2/brick2                                    62008     0          Y       24656
Brick 10.10.60.181:/exports/nas-segment-000
2/brick3                                    62009     0          Y       24676
Self-heal Daemon on localhost               N/A       N/A        Y       24717
Task Status of Volume disperse-vol
------------------------------------------------------------------------------
There are no active volume tasks

##### Simulate a brick failure by killing the brick process for brick3.
# kill 24676
##### Verify the contents.
# sum /mnt/disperse-vol/messages
36906  4082
##### Manually simulate a logfile rotation.
# mv /mnt/disperse-vol/messages /mnt/disperse-vol/messages-1
# cp -i /var/log/boot.log /mnt/disperse-vol/messages
##### Verify the contents.
# sum /mnt/disperse-vol/messages
58391     5
# sum /mnt/disperse-vol/messages-1
36906  4082

##### Fix the brick failure by force starting disperse-vol.
# gluster volume start disperse-vol force
volume start: disperse-vol: success

##### Force healing, and note that the new logfile cannot be healed.
# gluster volume heal disperse-vol full
# gluster volume heal disperse-vol info
Brick 10.10.60.181:/exports/nas-segment-0002/brick1
/
/messages
Status: Connected
Number of entries: 2
Brick 10.10.60.181:/exports/nas-segment-0002/brick2
/
/messages
Status: Connected
Number of entries: 2
Brick 10.10.60.181:/exports/nas-segment-0002/brick3
Status: Connected
Number of entries: 0

##### 'Not able to heal' messages in the GlusterFS self-heal daemon log file.
[2018-10-24 20:19:25.611393] I [MSGID: 122059]
[ec-heald.c:407:ec_shd_full_healer] 0-disperse-vol-disperse-0: starting full
sweep on subvol disperse-vol-client-0
[2018-10-24 20:19:25.611824] I [MSGID: 122059]
[ec-heald.c:407:ec_shd_full_healer] 0-disperse-vol-disperse-0: starting full
sweep on subvol disperse-vol-client-1
[2018-10-24 20:19:25.612487] I [MSGID: 122059]
[ec-heald.c:407:ec_shd_full_healer] 0-disperse-vol-disperse-0: starting full
sweep on subvol disperse-vol-client-2
[2018-10-24 20:19:25.619637] I [MSGID: 122002] [ec-heal.c:1343:__ec_heal_name]
0-disperse-vol-disperse-0: 00000000-0000-0000-0000-000000000001/messages: Not
able to heal
[2018-10-24 20:19:25.628517] I [MSGID: 122059]
[ec-heald.c:416:ec_shd_full_healer] 0-disperse-vol-disperse-0: finished full
sweep on subvol disperse-vol-client-0
[2018-10-24 20:19:25.629298] I [MSGID: 122002] [ec-heal.c:1343:__ec_heal_name]
0-disperse-vol-disperse-0: 00000000-0000-0000-0000-000000000001/messages: Not
able to heal
[2018-10-24 20:19:25.633801] I [MSGID: 122059]
[ec-heald.c:416:ec_shd_full_healer] 0-disperse-vol-disperse-0: finished full
sweep on subvol disperse-vol-client-2
[2018-10-24 20:19:25.636178] I [MSGID: 122002] [ec-heal.c:1343:__ec_heal_name]
0-disperse-vol-disperse-0: 00000000-0000-0000-0000-000000000001/messages: Not
able to heal
[2018-10-24 20:19:25.637871] I [MSGID: 122059]
[ec-heald.c:416:ec_shd_full_healer] 0-disperse-vol-disperse-0: finished full
sweep on subvol disperse-vol-client-1

##### Verify the contents.
# sum /mnt/disperse-vol/messages
58391     5
# sum /mnt/disperse-vol/messages-1
36906  4082

##### Dump the 'messages' file attributes. Note that the previously failed
brick has 2 gfid2paths.
# getfattr --absolute-names -m '.*' -d -e hex \
    /exports/nas-segment-0002/brick*/messages
# file: /exports/nas-segment-0002/brick1/messages
trusted.ec.config=0x0000080301000200
trusted.ec.dirty=0x00000000000000010000000000000001
trusted.ec.size=0x000000000000128e
trusted.ec.version=0x00000000000000010000000000000001
trusted.gfid=0x4d041ae05fd649e0a9ee6fc502899c73
trusted.gfid2path.f07d8a47ec444a18=0x30303030303030302d303030302d303030302d303030302d3030303030303030303030312f6d65737361676573
trusted.gfid2path.f07d8a47ec444a18="00000000-0000-0000-0000-000000000001/messages" (in text)

# file: /exports/nas-segment-0002/brick2/messages
trusted.ec.config=0x0000080301000200
trusted.ec.dirty=0x00000000000000010000000000000001
trusted.ec.size=0x000000000000128e
trusted.ec.version=0x00000000000000010000000000000001
trusted.gfid=0x4d041ae05fd649e0a9ee6fc502899c73
trusted.gfid2path.f07d8a47ec444a18=0x30303030303030302d303030302d303030302d303030302d3030303030303030303030312f6d65737361676573
trusted.gfid2path.f07d8a47ec444a18="00000000-0000-0000-0000-000000000001/messages" (in text)

# file: /exports/nas-segment-0002/brick3/messages
trusted.ec.config=0x0000080301000200
trusted.ec.dirty=0x00000000000000000000000000000000
trusted.ec.size=0x00000000003fc5d1
trusted.ec.version=0x00000000000000200000000000000020
trusted.gfid=0x1d288eab6fca4561a162f0a6b13a6d3a
trusted.gfid2path.084a6f56cf401310=0x30303030303030302d303030302d303030302d303030302d3030303030303030303030312f6d657373616765732d31
trusted.gfid2path.f07d8a47ec444a18=0x30303030303030302d303030302d303030302d303030302d3030303030303030303030312f6d65737361676573
trusted.gfid2path.084a6f56cf401310="00000000-0000-0000-0000-000000000001/messages-1" (in text)
trusted.gfid2path.f07d8a47ec444a18="00000000-0000-0000-0000-000000000001/messages" (in text)

##### Dump the 'messages-1' file attributes. Note that the previously failed
brick has 2 gfid2paths.
# getfattr --absolute-names -m '.*' -d -e hex \
    /exports/nas-segment-0002/brick*/messages-1
# file: /exports/nas-segment-0002/brick1/messages-1
trusted.ec.config=0x0000080301000200
trusted.ec.dirty=0x00000000000000000000000000000000
trusted.ec.size=0x00000000003fc5d1
trusted.ec.version=0x00000000000000200000000000000020
trusted.gfid=0x1d288eab6fca4561a162f0a6b13a6d3a
trusted.gfid2path.084a6f56cf401310=0x30303030303030302d303030302d303030302d303030302d3030303030303030303030312f6d657373616765732d31
trusted.gfid2path.084a6f56cf401310="00000000-0000-0000-0000-000000000001/messages-1" (in text)

# file: /exports/nas-segment-0002/brick2/messages-1
trusted.ec.config=0x0000080301000200
trusted.ec.dirty=0x00000000000000000000000000000000
trusted.ec.size=0x00000000003fc5d1
trusted.ec.version=0x00000000000000200000000000000020
trusted.gfid=0x1d288eab6fca4561a162f0a6b13a6d3a
trusted.gfid2path.084a6f56cf401310=0x30303030303030302d303030302d303030302d303030302d3030303030303030303030312f6d657373616765732d31
trusted.gfid2path.084a6f56cf401310="00000000-0000-0000-0000-000000000001/messages-1" (in text)

# file: /exports/nas-segment-0002/brick3/messages-1
trusted.ec.config=0x0000080301000200
trusted.ec.dirty=0x00000000000000000000000000000000
trusted.ec.size=0x00000000003fc5d1
trusted.ec.version=0x00000000000000200000000000000020
trusted.gfid=0x1d288eab6fca4561a162f0a6b13a6d3a
trusted.gfid2path.084a6f56cf401310=0x30303030303030302d303030302d303030302d303030302d3030303030303030303030312f6d657373616765732d31
trusted.gfid2path.f07d8a47ec444a18=0x30303030303030302d303030302d303030302d303030302d3030303030303030303030312f6d65737361676573
trusted.gfid2path.084a6f56cf401310="00000000-0000-0000-0000-000000000001/messages-1" (in text)
trusted.gfid2path.f07d8a47ec444a18="00000000-0000-0000-0000-000000000001/messages" (in text)
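
For reference, the "(in text)" annotations above are just the
hex xattr values decoded; the same path strings can be printed
directly by asking getfattr for only the raw value of a named
attribute, for example:
# getfattr --absolute-names --only-values \
    -n trusted.gfid2path.f07d8a47ec444a18 \
    /exports/nas-segment-0002/brick3/messages; echo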

##### File 'messages' brick fragment stats.
# stat /exports/nas-segment-0002/brick*/messages
  File: `/exports/nas-segment-0002/brick1/messages'
  Size: 2560            Blocks: 8          IO Block: 4096   regular file
Device: 880h/2176d      Inode: 92          Links: 2
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2018-10-24 13:18:37.935361600 -0700
Modify: 2018-10-24 13:18:37.953361595 -0700
Change: 2018-10-24 13:18:37.954361598 -0700
  File: `/exports/nas-segment-0002/brick2/messages'
  Size: 2560            Blocks: 8          IO Block: 4096   regular file
Device: 880h/2176d      Inode: 268435548   Links: 2
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2018-10-24 13:18:37.935361600 -0700
Modify: 2018-10-24 13:18:37.953361595 -0700
Change: 2018-10-24 13:18:37.954361598 -0700
  File: `/exports/nas-segment-0002/brick3/messages'
  Size: 2089984         Blocks: 4088       IO Block: 4096   regular file
Device: 880h/2176d      Inode: 537347155   Links: 3
Access: (0600/-rw-------)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2018-10-24 13:17:16.301361665 -0700
Modify: 2018-10-24 13:17:16.438361664 -0700
Change: 2018-10-24 13:18:59.853361581 -0700

##### File 'messages-1' brick fragment stats.
# stat /exports/nas-segment-0002/brick*/messages-1
  File: `/exports/nas-segment-0002/brick1/messages-1'
  Size: 2089984         Blocks: 4088       IO Block: 4096   regular file
Device: 880h/2176d      Inode: 89          Links: 2
Access: (0600/-rw-------)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2018-10-24 13:17:16.301361665 -0700
Modify: 2018-10-24 13:17:16.438361664 -0700
Change: 2018-10-24 13:18:59.872361580 -0700
  File: `/exports/nas-segment-0002/brick2/messages-1'
  Size: 2089984         Blocks: 4088       IO Block: 4096   regular file
Device: 880h/2176d      Inode: 268435544   Links: 2
Access: (0600/-rw-------)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2018-10-24 13:17:16.301361665 -0700
Modify: 2018-10-24 13:17:16.438361664 -0700
Change: 2018-10-24 13:18:59.872361580 -0700
  File: `/exports/nas-segment-0002/brick3/messages-1'
  Size: 2089984         Blocks: 4088       IO Block: 4096   regular file
Device: 880h/2176d      Inode: 537347155   Links: 3
Access: (0600/-rw-------)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2018-10-24 13:17:16.301361665 -0700
Modify: 2018-10-24 13:17:16.438361664 -0700
Change: 2018-10-24 13:18:59.853361581 -0700

##### Find the 'messages' file fragments on previously failed brick3.
# find /exports/nas-segment-0002/brick3 -samefile \
    /exports/nas-segment-0002/brick3/messages -ls
537347155 2044 -rw-------   3 root     root      2089984 Oct 24 13:17
/exports/nas-segment-0002/brick3/.glusterfs/1d/28/1d288eab-6fca-4561-a162-f0a6b13a6d3a
537347155 2044 -rw-------   3 root     root      2089984 Oct 24 13:17
/exports/nas-segment-0002/brick3/messages
537347155 2044 -rw-------   3 root     root      2089984 Oct 24 13:17
/exports/nas-segment-0002/brick3/messages-1
##### Find the 'messages-1' file fragments on previously failed brick3.
# find /exports/nas-segment-0002/brick3 -samefile \
    /exports/nas-segment-0002/brick3/messages-1 -ls
537347155 2044 -rw-------   3 root     root      2089984 Oct 24 13:17
/exports/nas-segment-0002/brick3/.glusterfs/1d/28/1d288eab-6fca-4561-a162-f0a6b13a6d3a
537347155 2044 -rw-------   3 root     root      2089984 Oct 24 13:17
/exports/nas-segment-0002/brick3/messages
537347155 2044 -rw-------   3 root     root      2089984 Oct 24 13:17
/exports/nas-segment-0002/brick3/messages-1

##### Find the 'messages' file fragments on always good brick1.
# find /exports/nas-segment-0002/brick1 -samefile \
    /exports/nas-segment-0002/brick1/messages -ls
    92    4 -rw-r--r--   2 root     root         2560 Oct 24 13:18
/exports/nas-segment-0002/brick1/.glusterfs/4d/04/4d041ae0-5fd6-49e0-a9ee-6fc502899c73
    92    4 -rw-r--r--   2 root     root         2560 Oct 24 13:18
/exports/nas-segment-0002/brick1/messages
##### Find the 'messages-1' file fragments on always good brick1.
# find /exports/nas-segment-0002/brick1 -samefile \
    /exports/nas-segment-0002/brick1/messages-1 -ls
    89 2044 -rw-------   2 root     root      2089984 Oct 24 13:17
/exports/nas-segment-0002/brick1/.glusterfs/1d/28/1d288eab-6fca-4561-a162-f0a6b13a6d3a
    89 2044 -rw-------   2 root     root      2089984 Oct 24 13:17
/exports/nas-segment-0002/brick1/messages-1

##### "Fix" the healing by deleting all of the unhealable file 'messages' from
the previously failed brick3.
# find /exports/nas-segment-0002/brick3 -samefile
/exports/nas-segment-0002/brick3/messages -delete

##### Disperse Volume healing now works on the 'messages' file.
# gluster volume heal disperse-vol full
# gluster volume heal disperse-vol info
Brick 10.10.60.181:/exports/nas-segment-0002/brick1
Status: Connected
Number of entries: 0
Brick 10.10.60.181:/exports/nas-segment-0002/brick2
Status: Connected
Number of entries: 0
Brick 10.10.60.181:/exports/nas-segment-0002/brick3
Status: Connected
Number of entries: 0

# sum /mnt/disperse-vol/messages
58391     5
# sum /mnt/disperse-vol/messages-1
36906  4082
