[Bugs] [Bug 1505597] New: ZFS backed bricks fail to rebalance distributed volumes [ubuntu?]

Tue Oct 24 00:25:34 UTC 2017

https://bugzilla.redhat.com/show_bug.cgi?id=1505597

            Bug ID: 1505597
           Summary: ZFS backed bricks fail to rebalance distributed
                    volumes [ubuntu?]
           Product: GlusterFS
           Version: 3.12
         Component: posix
          Severity: medium
          Assignee: bugs at gluster.org
          Reporter: crackerjackmack at gmail.com
                CC: bugs at gluster.org

Description of problem:
It appears that posix.c calls sys_fallocate() with the FALLOC_FL_KEEP_SIZE
flag regardless of whether the underlying filesystem supports it.  On Ubuntu
16.04, FALLOC_FL_KEEP_SIZE is available at compile time, but not every
filesystem supports it at run time; ZFS on Linux in particular does not.

This bug appears to be triggered only by a rebalance on a distributed volume.
I could not reproduce it with normal store and unlink operations.

If this kind of setup "just works" on RHEL/CentOS systems using ZFS, then I
can only guess that there are platform differences which the compile-time
check does not account for.

Version-Release number of selected component (if applicable):
3.12.2-ubuntu1~xenial2

How reproducible: Add a ZFS-backed brick to an existing distributed volume and
ask gluster to rebalance that volume.

Steps to Reproduce:
1. Install Ubuntu 16.04 (xenial) on a normal ext4/xfs root.
2. Install zfsutils-linux (this installs a 0.6.5 variant).
3. Install glusterfs-server from
https://launchpad.net/~gluster/+archive/ubuntu/glusterfs-3.12
4. Create a zpool "tank".
5. zfs set compression=lz4 tank
6. zfs set xattr=sa tank
7. zfs set sync=disabled tank
8. zfs create -o mountpoint=/gluster tank/gluster
9. zfs create tank/gluster/brick1
10. zfs create tank/gluster/brick2
11. mkdir /gluster/brick{1,2}/brick
12. gluster volume create media server:/gluster/brick1/brick server:/gluster/brick2/brick
13. mount -t glusterfs localhost:/media
14. Populate the volume with test data.
15. zfs create tank/gluster/brick3; mkdir /gluster/brick3/brick
16. gluster volume add-brick media server:/gluster/brick3/brick
17. gluster volume rebalance media start
18. Observe the errors in the rebalance log and the brick logs.
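
For convenience, the steps above could be collected into a script along these
lines. It is only a sketch: the pool device argument, the /mnt/media mount
point, and the "gluster volume start" call are assumptions that the steps do
not spell out, and run and reproduce are hypothetical helper names. By default
it only prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch of the reproduction steps. Prints the commands unless DRY_RUN=0.

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Usage: reproduce <pool-device>
# The device is an assumption (any spare disk or partition for the pool).
reproduce() {
    dev="$1"
    run zpool create tank "$dev"
    run zfs set compression=lz4 tank
    run zfs set xattr=sa tank
    run zfs set sync=disabled tank
    run zfs create -o mountpoint=/gluster tank/gluster
    for n in 1 2; do
        run zfs create "tank/gluster/brick$n"
        run mkdir -p "/gluster/brick$n/brick"
    done
    run gluster volume create media \
        server:/gluster/brick1/brick server:/gluster/brick2/brick
    run gluster volume start media      # assumed; not listed in the steps
    run mkdir -p /mnt/media             # assumed mount point
    run mount -t glusterfs localhost:/media /mnt/media
    # Populate /mnt/media with test data at this point.
    run zfs create tank/gluster/brick3
    run mkdir -p /gluster/brick3/brick
    run gluster volume add-brick media server:/gluster/brick3/brick
    run gluster volume rebalance media start
}
```

Calling "reproduce /dev/sdX" prints the command list for review; setting
DRY_RUN=0 executes it, which should only be done as root on a scratch machine.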

Actual results:
[2017-10-23 20:09:57.258434] T [MSGID: 0]
[server-rpc-fops.c:3229:server_fallocate_resume] 0-stack-trace: stack-address:
0x7ff138001940, winding from media-server to /gluster/media/brick
[2017-10-23 20:09:57.258449] T [MSGID: 0] [defaults.c:2606:default_fallocate]
0-stack-trace: stack-address: 0x7ff138001940, winding from /gluster/media/brick
to media-io-stats
[2017-10-23 20:09:57.258462] T [MSGID: 0] [io-stats.c:3426:io_stats_fallocate]
0-stack-trace: stack-address: 0x7ff138001940, winding from media-io-stats to
media-quota
[2017-10-23 20:09:57.258475] T [MSGID: 0] [quota.c:4925:quota_fallocate]
0-stack-trace: stack-address: 0x7ff138001940, winding from media-quota to
media-index
[2017-10-23 20:09:57.258488] T [MSGID: 0] [defaults.c:2606:default_fallocate]
0-stack-trace: stack-address: 0x7ff138001940, winding from media-index to
media-barrier
[2017-10-23 20:09:57.258501] T [MSGID: 0] [defaults.c:2606:default_fallocate]
0-stack-trace: stack-address: 0x7ff138001940, winding from media-barrier to
media-marker
[2017-10-23 20:09:57.258515] T [MSGID: 0] [marker.c:2184:marker_fallocate]
0-stack-trace: stack-address: 0x7ff138001940, winding from media-marker to
media-selinux
[2017-10-23 20:09:57.258528] T [MSGID: 0] [defaults.c:2606:default_fallocate]
0-stack-trace: stack-address: 0x7ff138001940, winding from media-selinux to
media-io-threads
[2017-10-23 20:09:57.258564] T [MSGID: 0]
[defaults.c:1900:default_fallocate_resume] 0-stack-trace: stack-address:
0x7ff138001940, winding from media-io-threads to media-upcall
[2017-10-23 20:09:57.258581] T [MSGID: 0] [upcall.c:1487:up_fallocate]
0-stack-trace: stack-address: 0x7ff138001940, winding from media-upcall to
media-leases
[2017-10-23 20:09:57.258595] T [MSGID: 0] [leases.c:771:leases_fallocate]
0-stack-trace: stack-address: 0x7ff138001940, winding from media-leases to
media-read-only
[2017-10-23 20:09:57.258608] T [MSGID: 0] [defaults.c:2606:default_fallocate]
0-stack-trace: stack-address: 0x7ff138001940, winding from media-read-only to
media-worm
[2017-10-23 20:09:57.258621] T [MSGID: 0] [defaults.c:2606:default_fallocate]
0-stack-trace: stack-address: 0x7ff138001940, winding from media-worm to
media-locks
[2017-10-23 20:09:57.258635] T [MSGID: 0] [posix.c:4294:pl_fallocate]
0-stack-trace: stack-address: 0x7ff138001940, winding from media-locks to
media-access-control
[2017-10-23 20:09:57.258648] T [MSGID: 0] [defaults.c:2606:default_fallocate]
0-stack-trace: stack-address: 0x7ff138001940, winding from media-access-control
to media-bitrot-stub
[2017-10-23 20:09:57.258661] T [MSGID: 0] [defaults.c:2606:default_fallocate]
0-stack-trace: stack-address: 0x7ff138001940, winding from media-bitrot-stub to
media-changelog
[2017-10-23 20:09:57.258674] T [MSGID: 0] [defaults.c:2606:default_fallocate]
0-stack-trace: stack-address: 0x7ff138001940, winding from media-changelog to
media-changetimerecorder
[2017-10-23 20:09:57.258687] T [MSGID: 0] [defaults.c:2606:default_fallocate]
0-stack-trace: stack-address: 0x7ff138001940, winding from
media-changetimerecorder to media-trash
[2017-10-23 20:09:57.258707] T [MSGID: 0] [defaults.c:2606:default_fallocate]
0-stack-trace: stack-address: 0x7ff138001940, winding from media-trash to
media-posix
[2017-10-23 20:09:57.258729] D [MSGID: 0] [posix.c:1038:_posix_fallocate]
0-stack-trace: stack-address: 0x7ff138001940, media-posix returned -1 error:
Operation not supported [Operation not supported]
[2017-10-23 20:09:57.258747] D [MSGID: 0] [posix.c:4282:pl_fallocate_cbk]
0-stack-trace: stack-address: 0x7ff138001940, media-locks returned -1 error:
Operation not supported [Operation not supported]
[2017-10-23 20:09:57.258764] D [MSGID: 0] [leases.c:736:leases_fallocate_cbk]
0-stack-trace: stack-address: 0x7ff138001940, media-leases returned -1 error:
Operation not supported [Operation not supported]
[2017-10-23 20:09:57.258781] D [MSGID: 0] [upcall.c:1464:up_fallocate_cbk]
0-stack-trace: stack-address: 0x7ff138001940, media-upcall returned -1 error:
Operation not supported [Operation not supported]
[2017-10-23 20:09:57.258798] D [MSGID: 0]
[defaults.c:1290:default_fallocate_cbk] 0-stack-trace: stack-address:
0x7ff138001940, media-io-threads returned -1 error: Operation not supported
[Operation not supported]
[2017-10-23 20:09:57.258826] D [MSGID: 0] [marker.c:2142:marker_fallocate_cbk]
0-stack-trace: stack-address: 0x7ff138001940, media-marker returned -1 error:
Operation not supported [Operation not supported]
[2017-10-23 20:09:57.258843] D [MSGID: 0]
[io-stats.c:2465:io_stats_fallocate_cbk] 0-stack-trace: stack-address:
0x7ff138001940, media-io-stats returned -1 error: Operation not supported
[Operation not supported]
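
To pull just the failing fallocate calls out of a large trace, a small filter
like the one below may help. It is a sketch that assumes each entry sits on a
single line in the actual log file (the excerpt above was wrapped by the mail
client); fallocate_failures is a hypothetical helper name.

```shell
# Print log entries in which a translator returned
# "Operation not supported" for a fallocate call.
# Usage: fallocate_failures <logfile>...
fallocate_failures() {
    grep 'fallocate.*returned -1 error: Operation not supported' "$@"
}
```

It would be pointed at the brick logs, which usually live under
/var/log/glusterfs/bricks/ (the default location, not something this report
confirms).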

Expected results:

sys_fallocate() should work on ZFS on Linux as it does on the native Linux
filesystems.

Additional info:

root@sulley:~# cd /tmp/
root@sulley:/tmp# touch fallocate-test
root@sulley:/tmp# fallocate -n -l 50M fallocate-test
root@sulley:/tmp# ls -alh fallocate-test
-rw-r--r-- 1 root root 0 Oct 23 19:17 fallocate-test
root@sulley:/tmp# du -Sh fallocate-test
50M     fallocate-test
root@sulley:/tmp# cd /gluster
root@sulley:/gluster# touch fallocate-test
root@sulley:/gluster# fallocate -n -l 50M fallocate-test
fallocate: fallocate failed: keep size mode is unsupported
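
The manual check above can be wrapped in a small probe to compare directories
on different filesystems. This is a sketch built on the same "fallocate -n"
call from util-linux; supports_keep_size is a hypothetical helper name, and
the 1M probe size is arbitrary.

```shell
# Return 0 if a file in directory $1 accepts fallocate in keep-size mode
# (FALLOC_FL_KEEP_SIZE, i.e. "fallocate -n"), non-zero otherwise.
# Returns 2 if the probe file cannot be created at all.
supports_keep_size() {
    probe=$(mktemp "$1/fallocate-probe.XXXXXX" 2>/dev/null) || return 2
    fallocate -n -l 1M "$probe" 2>/dev/null
    rc=$?
    rm -f "$probe"
    return "$rc"
}
```

On the machine above, "supports_keep_size /tmp" would be expected to succeed
and "supports_keep_size /gluster" to fail, matching the transcript.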

* Fallocate support for ZFS on Linux -
 https://github.com/zfsonlinux/zfs/issues/326
* Original commit that includes the compile-time logic for posix_fallocate -
 https://github.com/gluster/glusterfs/commit/8e57090f7da4027c46176c9786372a00e22df69d
* GlusterFS ZFS documentation -
 http://docs.gluster.org/en/latest/Administrator%20Guide/Gluster%20On%20ZFS/
