[Gluster-Maintainers] Build failed in Jenkins: regression-test-burn-in #203

Xavier Hernandez xhernandez at datalab.es
Thu Jan 7 12:30:46 UTC 2016


The problem seems to be that the inode is not valid (i.e. a previous 
lookup has not fully completed) when a setattr fop is called. As a 
result, some information needed by the fop is not available, and the 
core is generated by a failed GF_ASSERT().

EC needs additional information to handle the encoding/decoding of 
regular files. This information is stored in special xattrs that are only 
retrieved for inodes with ia_type == IA_IFREG.

I've seen that sometimes an inode corresponding to a regular file is 
received with its ia_type == IA_INVAL. Sometimes the type is set while 
processing the request, in which case a log message is written ("Unable 
to get size xattr") and the fop fails. However, in other cases the 
ia_type is set later (or not set at all), causing a later assert to fail.
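
The failure mode described above can be modelled roughly as follows. 
This is only a minimal sketch to illustrate the idea, not the EC code 
and not the patch under review: every name prefixed with sk_ or SK_ is 
hypothetical, and the choice of error codes is an assumption. It shows 
a pre-check that rejects an inode whose type is still unresolved instead 
of letting a later assert fire.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for this sketch; NOT the GlusterFS definitions. */
typedef enum {
    SK_IA_INVAL = 0,   /* type not known yet (lookup not completed) */
    SK_IA_IFREG,       /* regular file */
    SK_IA_IFDIR        /* directory */
} sk_ia_type_t;

typedef struct {
    sk_ia_type_t ia_type;  /* SK_IA_INVAL until a lookup has completed */
    int have_size;         /* nonzero once the size xattr has been read */
} sk_inode_t;

/* Only regular files carry the extra encoding/decoding metadata. */
int sk_needs_size_xattr(const sk_inode_t *inode)
{
    return inode->ia_type == SK_IA_IFREG;
}

/*
 * Pre-check before a setattr-like operation.
 * Returns 0 if the operation may proceed, or a negative errno value.
 * The specific errno values are an assumption made for this sketch.
 */
int sk_setattr_precheck(const sk_inode_t *inode)
{
    if (inode == NULL)
        return -EINVAL;

    /* Unresolved inode: fail the operation instead of asserting later. */
    if (inode->ia_type == SK_IA_INVAL)
        return -EINVAL;

    /* Regular file whose size xattr was never retrieved. */
    if (sk_needs_size_xattr(inode) && !inode->have_size)
        return -EIO;

    return 0;
}
```

With this model, an inode that arrives with type SK_IA_INVAL is rejected 
up front, a regular file without its size information fails with an 
error, and directories pass because they need no size xattr.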

There's a patch that could solve this particular problem 
(http://review.gluster.org/13039), but it's only a hack that works 
around a deeper problem, which could reappear in other places.

I talked with Pranith about this, and we agreed that a better solution 
should be implemented. It doesn't seem acceptable for any fop to receive 
an inode that has invalid information.

Xavi

On 07/01/16 03:08, Vijay Bellur wrote:
> On 01/05/2016 09:07 PM, Vijay Bellur wrote:
>> On 01/05/2016 04:55 PM, jenkins at build.gluster.org wrote:
>>> See <http://build.gluster.org/job/regression-test-burn-in/203/>
>>
>>> ++ ls /build/install/cores/core.9754
>>> + CORELIST=/build/install/cores/core.9754
>>> + for corefile in '$CORELIST'
>>> + getliblistfromcore /build/install/cores/core.9754
>>> + rm -f /build/install/cores/gdbout.txt
>>> + gdb -c /build/install/cores/core.9754 -q -ex 'info sharedlibrary'
>>> -ex q
>>> + set +x
>>> + rm -f /build/install/cores/gdbout.txt
>>> + sort /build/install/cores/liblist.txt
>>> + uniq
>>> + cat /build/install/cores/liblist.txt.tmp
>>> + grep -v /build/install
>>> + tar -cf
>>> /archives/archived_builds/build-install-20160105:20:48:04.tar
>>> /build/install/sbin /build/install/bin /build/install/lib
>>> /build/install/libexec /build/install/cores
>>> tar: Removing leading `/' from member names
>>> + tar -rhf
>>> /archives/archived_builds/build-install-20160105:20:48:04.tar -T
>>> /build/install/cores/liblist.txt
>>> tar: Removing leading `/' from member names
>>> + bzip2 /archives/archived_builds/build-install-20160105:20:48:04.tar
>>> + rm -f /build/install/cores/liblist.txt
>>> + rm -f /build/install/cores/liblist.txt.tmp
>>> + echo Cores and build archived in
>>> http://slave21.cloud.gluster.org/archived_builds/build-install-20160105:20:48:04.tar.bz2
>>>
>>>
>>> Cores and build archived in
>>> http://slave21.cloud.gluster.org/archived_builds/build-install-20160105:20:48:04.tar.bz2
>>>
>>>
>>> + echo Open core using the following command to get a proper stack...
>>> Open core using the following command to get a proper stack...
>>> + echo Example: From root of extracted tarball
>>> Example: From root of extracted tarball
>>> + echo 'gdb -ex '\''set sysroot ./'\'' -ex '\''core-file
>>> ./build/install/cores/core.xxx'\'' <target, say
>>> ./build/install/sbin/glusterd>'
>>> gdb -ex 'set sysroot ./' -ex 'core-file
>>> ./build/install/cores/core.xxx' <target, say
>>> ./build/install/sbin/glusterd>
>>> + RET=1
>>> + '[' 1 -ne 0 ']'
>>> + filename=logs/glusterfs-logs-20160105:20:48:04.tgz
>>> + tar -czf /archives/logs/glusterfs-logs-20160105:20:48:04.tgz
>>> /var/log/glusterfs /var/log/messages /var/log/messages-20151129
>>> /var/log/messages-20151206 /var/log/messages-20151213
>>> /var/log/messages-20160104
>>> tar: Removing leading `/' from member names
>>> + echo Logs archived in
>>> http://slave21.cloud.gluster.org/logs/glusterfs-logs-20160105:20:48:04.tgz
>>>
>>>
>>> Logs archived in
>>> http://slave21.cloud.gluster.org/logs/glusterfs-logs-20160105:20:48:04.tgz
>>>
>>>
>>> + exit 1
>>> + RET=1
>>> + '[' 1 = 0 ']'
>>> + V=-1
>>> + VERDICT=FAILED
>>
>>
>> This run has failed due to a core in ec.
>>
>> Pranith, Xavi - can you please take a look?
>>
>
> Another regression run failed due to this core:
>
> https://build.gluster.org/job/regression-test-burn-in/206/consoleFull
>
> Can we please expedite resolution of this crash?
>
> Thanks,
> Vijay
>

