[Gluster-Maintainers] [Gluster-devel] Release 4.0: Unable to complete rolling upgrade tests

Raghavendra G raghavendra at gluster.com
Fri Mar 9 05:25:37 UTC 2018


On Fri, Mar 9, 2018 at 10:38 AM, Raghavendra Gowdappa <rgowdapp at redhat.com>
wrote:

>
>
> On Fri, Mar 2, 2018 at 11:01 AM, Ravishankar N <ravishankar at redhat.com>
> wrote:
>
>>
>> On 03/02/2018 10:11 AM, Ravishankar N wrote:
>>
>>> + Anoop.
>>>
>>> It looks like clients on the old (3.12) nodes are not able to talk to
>>> the upgraded (4.0) node. I see messages like these on the old clients:
>>>
>>>  [2018-03-02 03:49:13.483458] W [MSGID: 114007]
>>> [client-handshake.c:1197:client_setvolume_cbk] 0-testvol-client-2:
>>> failed to find key 'clnt-lk-version' in the options
>>>
>> I see this in a 2x1 plain distribute also. I see ENOTCONN for the
>> upgraded brick on the old client:
>>
>> [2018-03-02 04:58:54.559446] E [MSGID: 114058]
>> [client-handshake.c:1571:client_query_portmap_cbk] 0-testvol-client-1:
>> failed to get the port number for remote subvolume. Please run 'gluster
>> volume status' on server to see if brick process is running.
>> [2018-03-02 04:58:54.559618] I [MSGID: 114018]
>> [client.c:2285:client_rpc_notify] 0-testvol-client-1: disconnected from
>> testvol-client-1. Client process will keep trying to connect to glusterd
>> until brick's port is available
>> [2018-03-02 04:58:56.973199] I [rpc-clnt.c:1994:rpc_clnt_reconfig]
>> 0-testvol-client-1: changing port to 49152 (from 0)
>> [2018-03-02 04:58:56.975844] I [MSGID: 114057]
>> [client-handshake.c:1484:select_server_supported_programs]
>> 0-testvol-client-1: Using Program GlusterFS 3.3, Num (1298437), Version
>> (330)
>> [2018-03-02 04:58:56.978114] W [MSGID: 114007]
>> [client-handshake.c:1197:client_setvolume_cbk] 0-testvol-client-1:
>> failed to find key 'clnt-lk-version' in the options
>> [2018-03-02 04:58:46.618036] E [MSGID: 114031]
>> [client-rpc-fops.c:2768:client3_3_opendir_cbk] 0-testvol-client-1:
>> remote operation failed. Path: / (00000000-0000-0000-0000-000000000001)
>> [Transport endpoint is not connected]
>> The message "W [MSGID: 114031] [client-rpc-fops.c:2577:client3_3_readdirp_cbk]
>> 0-testvol-client-1: remote operation failed [Transport endpoint is not
>> connected]" repeated 3 times between [2018-03-02 04:58:46.609529] and
>> [2018-03-02 04:58:46.618683]
>>
>> Also, mkdir fails on the old mount with EIO, even though the directory is
>> physically created on both bricks. Can the rpc folks offer a helping hand?
>>
>
>
> Sometimes glusterfs returns a wrong ia_type (IA_IFIFO, to be precise) in the
> response to mkdir. This is the reason for the failure. Note that the mkdir
> response from glusterfs says it succeeded, but carries a wrong iatt. That's
> why we still see the directories created on the bricks.
>
> Debugging further in dht_selfheal_dir_xattr_cbk, which gets executed
> as part of mkdir in dht:
>
> (gdb)
> 677             ret = dict_get_bin (xdata, DHT_IATT_IN_XDATA_KEY, (void
> **) &stbuf);
> (gdb)
> 692             LOCK (&frame->lock);
> (gdb)
> 694                     dht_iatt_merge (this, &local->stbuf, stbuf,
> subvol);
> (gdb) p stbuf
> $16 = (struct iatt *) 0x7f84e405aaf0
> (gdb) p *stbuf
> $17 = {ia_ino = 6143, ia_gfid = "\222\064\301\225~6v\242\021\b\000\000\000\000\000",
> ia_dev = 0, ia_type = IA_IFIFO, ia_prot = {suid = 0 '\000', sgid = 0
> '\000', sticky = 0 '\000', owner = {read = 0 '\000',
>       write = 0 '\000', exec = 0 '\000'}, group = {read = 0 '\000', write
> = 0 '\000', exec = 0 '\000'}, other = {read = 0 '\000', write = 0 '\000',
> exec = 0 '\000'}}, ia_nlink = 2, ia_uid = 0, ia_gid = 0,
>   ia_rdev = 0, ia_size = 1520570685, ia_blksize = 1520570529, ia_blocks =
> 1520570714, ia_atime = 0, ia_atime_nsec = 0, ia_mtime = 172390349,
> ia_mtime_nsec = 475585538, ia_ctime = 626110118, ia_ctime_nsec = 0}
> (gdb) p local->stbuf
> $18 = {ia_ino = 11706604198702429330, ia_gfid =
> "e\223\246pH\005F\226\242v6~\225\301\064\222", ia_dev = 2065, ia_type =
> IA_IFDIR, ia_prot = {suid = 0 '\000', sgid = 0 '\000', sticky = 0 '\000',
> owner = {
>       read = 1 '\001', write = 1 '\001', exec = 1 '\001'}, group = {read =
> 1 '\001', write = 0 '\000', exec = 1 '\001'}, other = {read = 1 '\001',
> write = 0 '\000', exec = 1 '\001'}}, ia_nlink = 2, ia_uid = 0,
>   ia_gid = 0, ia_rdev = 0, ia_size = 4096, ia_blksize = 4096, ia_blocks =
> 8, ia_atime = 1520570529, ia_atime_nsec = 475585538, ia_mtime = 1520570529,
> ia_mtime_nsec = 475585538, ia_ctime = 1520570529,
>   ia_ctime_nsec = 475585538}
> (gdb) n
> 696             UNLOCK (&frame->lock);
> (gdb) p local->stbuf
> $19 = {ia_ino = 6143, ia_gfid = "\222\064\301\225~6v\242\021\b\000\000\000\000\000",
> ia_dev = 0, ia_type = IA_IFIFO, ia_prot = {suid = 0 '\000', sgid = 0
> '\000', sticky = 0 '\000', owner = {read = 0 '\000',
>       write = 0 '\000', exec = 0 '\000'}, group = {read = 0 '\000', write
> = 0 '\000', exec = 0 '\000'}, other = {read = 0 '\000', write = 0 '\000',
> exec = 0 '\000'}}, ia_nlink = 2, ia_uid = 0, ia_gid = 0,
>   ia_rdev = 0, ia_size = 1520574781, ia_blksize = 1520570529, ia_blocks =
> 1520570722, ia_atime = 1520570529, ia_atime_nsec = 475585538, ia_mtime =
> 1520570529, ia_mtime_nsec = 475585538, ia_ctime = 1520570529,
>   ia_ctime_nsec = 475585538}
>
> So, we got the correct iatt during mkdir, but a wrong one while trying to
> set the layout on the directory.
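>
> To make the gdb session above easier to follow, the relevant part of
> dht_selfheal_dir_xattr_cbk looks roughly like this (a simplified sketch
> reconstructed from the frames we stepped through, not a verbatim copy of
> dht-selfheal.c):
>
>     struct iatt *stbuf = NULL;
>
>     /* Pull the per-subvolume iatt that the brick put into xdata. */
>     ret = dict_get_bin (xdata, DHT_IATT_IN_XDATA_KEY, (void **) &stbuf);
>
>     LOCK (&frame->lock);
>     {
>             /* Merge it into the aggregated stbuf for the directory. If
>              * the blob we just pulled out already carries a bogus
>              * ia_type (IA_IFIFO here), the bad type is merged in and
>              * ends up in the iatt that dht returns for the mkdir. */
>             dht_iatt_merge (this, &local->stbuf, stbuf, subvol);
>     }
>     UNLOCK (&frame->lock);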
>
> Debugging further,
>
>
> (gdb) p *stbuf
> $26 = {ia_ino = 6143, ia_gfid = "L\rk\212\367\275\"\256\021\b\000\000\000\000\000",
> ia_dev = 0, ia_type = IA_IFIFO, ia_prot = {suid = 0 '\000', sgid = 0
> '\000', sticky = 0 '\000', owner = {read = 0 '\000',
>       write = 0 '\000', exec = 0 '\000'}, group = {read = 0 '\000', write
> = 0 '\000', exec = 0 '\000'}, other = {read = 0 '\000', write = 0 '\000',
> exec = 0 '\000'}}, ia_nlink = 2, ia_uid = 0, ia_gid = 0,
>   ia_rdev = 0, ia_size = 1520571192, ia_blksize = 1520571192, ia_blocks =
> 1520571192, ia_atime = 0, ia_atime_nsec = 0, ia_mtime = 87784021,
> ia_mtime_nsec = 87784021, ia_ctime = 92784143, ia_ctime_nsec = 0}
> (gdb) up
> #1  0x00007f84eae8ead1 in client3_3_setxattr_cbk (req=0x7f84e0008130,
> iov=0x7f84e0008170, count=1, myframe=0x7f84e0008d80) at
> client-rpc-fops.c:1013
> 1013            CLIENT_STACK_UNWIND (setxattr, frame, rsp.op_ret,
> op_errno, xdata);
> (gdb) p this->name
> $27 = 0x7f84e4009190 "testvol-client-1"
>
> Breakpoint 12, dht_selfheal_dir_xattr_cbk (frame=0x7f84dc006a00,
> cookie=0x7f84e4007c50, this=0x7f84e400ce80, op_ret=0, op_errno=0,
> xdata=0x7f84e00017a0) at dht-selfheal.c:685
> 685             for (i = 0; i < layout->cnt; i++) {
> (gdb) p *stbuf
> $28 = {ia_ino = 12547800382684466508, ia_gfid =
> "\020{mk\200\067Kq\256\"\275\367\212k\rL", ia_dev = 2065, ia_type =
> IA_IFDIR, ia_prot = {suid = 0 '\000', sgid = 0 '\000', sticky = 0 '\000',
> owner = {
>       read = 1 '\001', write = 1 '\001', exec = 1 '\001'}, group = {read =
> 1 '\001', write = 0 '\000', exec = 1 '\001'}, other = {read = 1 '\001',
> write = 0 '\000', exec = 1 '\001'}}, ia_nlink = 2, ia_uid = 0,
>   ia_gid = 0, ia_rdev = 0, ia_size = 6, ia_blksize = 4096, ia_blocks = 0,
> ia_atime = 1520571192, ia_atime_nsec = 90026323, ia_mtime = 1520571192,
> ia_mtime_nsec = 90026323, ia_ctime = 1520571192,
>   ia_ctime_nsec = 94026420}
> (gdb) up
> #1  0x00007f84eae8ead1 in client3_3_setxattr_cbk (req=0x7f84e000a5f0,
> iov=0x7f84e000a630, count=1, myframe=0x7f84e000aa00) at
> client-rpc-fops.c:1013
> 1013            CLIENT_STACK_UNWIND (setxattr, frame, rsp.op_ret,
> op_errno, xdata);
> (gdb) p this->name
> $29 = 0x7f84e4008810 "testvol-client-0"
>
> As can be seen above, it's always the new brick (testvol-client-1) that
> returns a wrong iatt with ia_type IA_IFIFO; the old brick (testvol-client-0)
> returns a correct iatt.
>
> We need to debug further into what on client-1 (whose brick is now running
> 4.0) resulted in the wrong iatt. Note that the iatt is obtained from the
> dictionary, so the dictionary changes in 4.0 are one suspect.
>

I just checked on the bricks. storage/posix is setting the correct iatt, and
getting the iatt back from the dictionary (once it is set) works correctly.
So, I suspect the (un)serialization of the dictionary in the rpc layer is what
is corrupting the iatt.
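
Just to illustrate the kind of failure mode I am suspecting (this is a
contrived, self-contained example, not the actual glusterfs dict/XDR code;
the struct layouts, the toy ia_type enum and the value 6 below are all made
up for the demo): if the value travels inside the serialized dict as an
opaque binary blob, and the sender and receiver disagree about that blob's
layout, the receiver reads ia_type from the wrong offset and sees garbage:

#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* Toy enum mimicking ia_type_t ordering; only for this demo. */
enum ia_type { IA_INVAL = 0, IA_IFREG, IA_IFDIR, IA_IFLNK,
               IA_IFBLK, IA_IFCHR, IA_IFIFO, IA_IFSOCK };

struct old_iatt {                /* layout the receiver assumes */
        uint64_t ia_ino;
        uint8_t  ia_gfid[16];
        uint64_t ia_dev;
        uint32_t ia_type;        /* sits at offset 32 here */
};

struct new_iatt {                /* layout the sender actually wrote */
        uint8_t  ia_gfid[16];
        uint64_t ia_ino;
        uint64_t ia_dev;
        uint64_t ia_flags;       /* extra field shifts everything below */
        uint32_t ia_type;        /* sits at offset 40 here */
};

int main (void)
{
        struct new_iatt sent = { .ia_ino = 6143, .ia_flags = 6,
                                 .ia_type = IA_IFDIR };

        /* "Serialize": the dict value is just the raw bytes of the struct. */
        char wire[sizeof (sent)];
        memcpy (wire, &sent, sizeof (sent));

        /* "Unserialize" on a peer that assumes the old layout. */
        struct old_iatt got;
        memcpy (&got, wire, sizeof (got));

        /* On a little-endian host this prints 2 (IA_IFDIR) for the sender
         * but 6 (IA_IFIFO in this toy enum) for the receiver, because the
         * receiver reads ia_type from what is really ia_flags. */
        printf ("sender ia_type=%u, receiver sees ia_type=%u\n",
                (unsigned) sent.ia_type, (unsigned) got.ia_type);
        return 0;
}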


> Thanks to Ravi for providing a live setup, which made my life easy :).
>
>>
>> -Ravi
>>
>> Is there something more to be done on BZ 1544366?
>>>
>>> -Ravi
>>> On 03/02/2018 08:44 AM, Ravishankar N wrote:
>>>
>>>>
>>>> On 03/02/2018 07:26 AM, Shyam Ranganathan wrote:
>>>>
>>>>> Hi Pranith/Ravi,
>>>>>
>>>>> So, to keep a long story short: after upgrading 1 node in a 3-node 3.13
>>>>> cluster, self-heal is not able to catch up with the heal backlog. This is
>>>>> a very simple synthetic test, but the end result is that upgrade testing
>>>>> is failing.
>>>>>
>>>>
>>>> Let me try this now and get back. I had done something similar when
>>>> testing the FIPS patch, and the rolling upgrade had worked.
>>>> Thanks,
>>>> Ravi
>>>>
>>>>>
>>>>> Here are the details,
>>>>>
>>>>> - Using
>>>>> https://hackmd.io/GYIwTADCDsDMCGBaArAUxAY0QFhBAbIgJwCMySIwJmAJvGMBvNEA#
>>>>> I set up 3 server containers to install 3.13 first as follows (within the
>>>>> containers):
>>>>>
>>>>> (inside the 3 server containers)
>>>>> yum -y update; yum -y install centos-release-gluster313; yum install
>>>>> glusterfs-server; glusterd
>>>>>
>>>>> (inside centos-glfs-server1)
>>>>> gluster peer probe centos-glfs-server2
>>>>> gluster peer probe centos-glfs-server3
>>>>> gluster peer status
>>>>> gluster v create patchy replica 3 centos-glfs-server1:/d/brick1
>>>>> centos-glfs-server2:/d/brick2 centos-glfs-server3:/d/brick3
>>>>> centos-glfs-server1:/d/brick4 centos-glfs-server2:/d/brick5
>>>>> centos-glfs-server3:/d/brick6 force
>>>>> gluster v start patchy
>>>>> gluster v status
>>>>>
>>>>> Create a client container as per the document above, mount the above
>>>>> volume, and create 1 file, 1 directory, and a file within that directory.
>>>>>
>>>>> Now we start the upgrade process (as laid out for 3.13 here
>>>>> http://docs.gluster.org/en/latest/Upgrade-Guide/upgrade_to_3.13/ ):
>>>>> - killall glusterfs glusterfsd glusterd
>>>>> - yum install
>>>>> http://cbs.centos.org/kojifiles/work/tasks/1548/311548/centos-release-gluster40-0.9-1.el7.centos.x86_64.rpm
>>>>> - yum upgrade --enablerepo=centos-gluster40-test glusterfs-server
>>>>>
>>>>> < Go back to the client and edit the contents of one of the files and
>>>>> change the permissions of a directory, so that there are things to heal
>>>>> when we bring up the newly upgraded server>
>>>>>
>>>>> - gluster --version
>>>>> - glusterd
>>>>> - gluster v status
>>>>> - gluster v heal patchy
>>>>>
>>>>> The above starts failing as follows,
>>>>> [root at centos-glfs-server1 /]# gluster v heal patchy
>>>>> Launching heal operation to perform index self heal on volume patchy
>>>>> has
>>>>> been unsuccessful:
>>>>> Commit failed on centos-glfs-server2.glfstest20. Please check log file
>>>>> for details.
>>>>> Commit failed on centos-glfs-server3. Please check log file for
>>>>> details.
>>>>>
>>>>> From here, if further files or directories are created from the client,
>>>>> they just get added to the heal backlog, and heal does not catch up.
>>>>>
>>>>> As is obvious, I cannot proceed, as the upgrade procedure is broken. The
>>>>> issue itself may not be in the self-heal daemon but something around
>>>>> connections; either way, since the process fails here, I'm looking to you
>>>>> guys to unblock this as soon as possible, as we are already running a
>>>>> day's slip in the release.
>>>>>
>>>>> Thanks,
>>>>> Shyam
>>>>>
>>>>
>>>>
>>>
>> _______________________________________________
>> maintainers mailing list
>> maintainers at gluster.org
>> http://lists.gluster.org/mailman/listinfo/maintainers
>>
>
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-devel
>



-- 
Raghavendra G