[Gluster-Maintainers] Build failed in Jenkins: regression-test-burn-in #203

Xavier Hernandez xhernandez at datalab.es
Thu Jan 7 13:04:56 UTC 2016



On 07/01/16 13:51, Pranith Kumar Karampuri wrote:
>
>
> On 01/07/2016 06:00 PM, Xavier Hernandez wrote:
>> The problem seems to be that the inode is not valid (i.e. a previous
>> lookup has not fully completed) when a setattr fop is called. As a
>> result, some information needed by the fop is not available, and the
>> core is generated by a failed GF_ASSERT().
>>
>> EC needs additional information to handle the encoding/decoding of
>> regular files. This information is stored in special xattrs that are
>> only retrieved for inodes with ia_type == IA_IFREG.
>>
>> I've seen that sometimes an inode corresponding to a regular file is
>> received with its ia_type == IA_INVAL. Sometimes the type is set while
>> processing the request, in which case a log message is written
>> ("Unable to get size xattr") and the fop fails. However in other cases
>> the ia_type is set later (or not set at all), causing a later assert
>> to fail.
>>
>> There's a patch that could solve this particular problem
>> (http://review.gluster.org/13039) but it's only a hack to avoid a
>> worse problem that could reappear in other places.
>>
>> I talked with Pranith about this, and we agreed that a better solution
>> should be implemented. It doesn't seem acceptable that any fop
>> receives an inode that has invalid information.
> Xavi, that solution will take a bit of time to implement, IMO. I will
> definitely start the conversation on gluster-devel about what we
> discussed. Meanwhile, can we accept this patch?

We can accept the patch. However, I'm not sure whether the same problem
can appear in other places or even in other xlators.
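
As an illustration of the failure mode described in the quoted text, here
is a minimal, self-contained sketch of an early validity check that
returns an error instead of reaching a later assert. It is not GlusterFS
code and not the patch under review: every sk_* name, the struct layout
and the chosen errno values are hypothetical stand-ins.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; NOT the real GlusterFS types. */
typedef enum {
    SK_IA_INVAL = 0, /* type unknown, e.g. lookup not completed yet */
    SK_IA_IFREG,     /* regular file */
    SK_IA_IFDIR      /* directory */
} sk_ia_type_t;

typedef struct {
    sk_ia_type_t ia_type;
    int          have_size; /* non-zero once the size info was loaded */
    uint64_t     size;      /* only meaningful when have_size is set */
} sk_inode_t;

/*
 * Early check run before a fop uses the inode.
 * Returns 0 if the fop may proceed, or a negative errno value:
 *   -EINVAL  inode missing or its type is still unknown
 *   -EIO     regular file whose size info has not been loaded
 */
int sk_fop_precheck(const sk_inode_t *inode)
{
    if (inode == NULL || inode->ia_type == SK_IA_INVAL)
        return -EINVAL;
    if (inode->ia_type == SK_IA_IFREG && !inode->have_size)
        return -EIO;
    return 0;
}

/*
 * Stands in for a later code path that assumes the info is present.
 * Without the early check above, this assert is what would abort.
 */
uint64_t sk_file_size(const sk_inode_t *inode)
{
    assert(sk_fop_precheck(inode) == 0 && inode->ia_type == SK_IA_IFREG);
    return inode->size;
}
```

A caller would run sk_fop_precheck() first and fail the fop on a non-zero
result, so sk_file_size() is only reached with a valid regular-file
inode. Such a check would have to be repeated in every place that uses
the inode, which is why a general solution is preferable.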

I've triggered regression tests. Once they pass, I'll merge the patch.

Xavi

>
> Raghavendra G,
>              How long do you think it will take for us to implement the
> solution we talked about?
>
> Pranith
>
>>
>> Xavi
>>
>> On 07/01/16 03:08, Vijay Bellur wrote:
>>> On 01/05/2016 09:07 PM, Vijay Bellur wrote:
>>>> On 01/05/2016 04:55 PM, jenkins at build.gluster.org wrote:
>>>>> See <http://build.gluster.org/job/regression-test-burn-in/203/>
>>>>
>>>>> ++ ls /build/install/cores/core.9754
>>>>> + CORELIST=/build/install/cores/core.9754
>>>>> + for corefile in '$CORELIST'
>>>>> + getliblistfromcore /build/install/cores/core.9754
>>>>> + rm -f /build/install/cores/gdbout.txt
>>>>> + gdb -c /build/install/cores/core.9754 -q -ex 'info sharedlibrary'
>>>>> -ex q
>>>>> + set +x
>>>>> + rm -f /build/install/cores/gdbout.txt
>>>>> + sort /build/install/cores/liblist.txt
>>>>> + uniq
>>>>> + cat /build/install/cores/liblist.txt.tmp
>>>>> + grep -v /build/install
>>>>> + tar -cf
>>>>> /archives/archived_builds/build-install-20160105:20:48:04.tar
>>>>> /build/install/sbin /build/install/bin /build/install/lib
>>>>> /build/install/libexec /build/install/cores
>>>>> tar: Removing leading `/' from member names
>>>>> + tar -rhf
>>>>> /archives/archived_builds/build-install-20160105:20:48:04.tar -T
>>>>> /build/install/cores/liblist.txt
>>>>> tar: Removing leading `/' from member names
>>>>> + bzip2 /archives/archived_builds/build-install-20160105:20:48:04.tar
>>>>> + rm -f /build/install/cores/liblist.txt
>>>>> + rm -f /build/install/cores/liblist.txt.tmp
>>>>> + echo Cores and build archived in
>>>>> http://slave21.cloud.gluster.org/archived_builds/build-install-20160105:20:48:04.tar.bz2
>>>>>
>>>>>
>>>>>
>>>>> Cores and build archived in
>>>>> http://slave21.cloud.gluster.org/archived_builds/build-install-20160105:20:48:04.tar.bz2
>>>>>
>>>>>
>>>>>
>>>>> + echo Open core using the following command to get a proper stack...
>>>>> Open core using the following command to get a proper stack...
>>>>> + echo Example: From root of extracted tarball
>>>>> Example: From root of extracted tarball
>>>>> + echo 'gdb -ex '\''set sysroot ./'\'' -ex '\''core-file
>>>>> ./build/install/cores/core.xxx'\'' <target, say
>>>>> ./build/install/sbin/glusterd>'
>>>>> gdb -ex 'set sysroot ./' -ex 'core-file
>>>>> ./build/install/cores/core.xxx' <target, say
>>>>> ./build/install/sbin/glusterd>
>>>>> + RET=1
>>>>> + '[' 1 -ne 0 ']'
>>>>> + filename=logs/glusterfs-logs-20160105:20:48:04.tgz
>>>>> + tar -czf /archives/logs/glusterfs-logs-20160105:20:48:04.tgz
>>>>> /var/log/glusterfs /var/log/messages /var/log/messages-20151129
>>>>> /var/log/messages-20151206 /var/log/messages-20151213
>>>>> /var/log/messages-20160104
>>>>> tar: Removing leading `/' from member names
>>>>> + echo Logs archived in
>>>>> http://slave21.cloud.gluster.org/logs/glusterfs-logs-20160105:20:48:04.tgz
>>>>>
>>>>>
>>>>>
>>>>> Logs archived in
>>>>> http://slave21.cloud.gluster.org/logs/glusterfs-logs-20160105:20:48:04.tgz
>>>>>
>>>>>
>>>>>
>>>>> + exit 1
>>>>> + RET=1
>>>>> + '[' 1 = 0 ']'
>>>>> + V=-1
>>>>> + VERDICT=FAILED
>>>>
>>>>
>>>> This run has failed due to a core in ec.
>>>>
>>>> Pranith, Xavi - can you please take a look?
>>>>
>>>
>>> Another regression run failed due to this core:
>>>
>>> https://build.gluster.org/job/regression-test-burn-in/206/consoleFull
>>>
>>> Can we please expedite resolution of this crash?
>>>
>>> Thanks,
>>> Vijay
>>>
>

