[Gluster-users] total outage - almost

Bernhard Dübi 1linuxengineer at gmail.com
Mon Jun 19 17:11:36 UTC 2017


Hi,


I just remembered that I once filed a bug with Red Hat:

https://bugzilla.redhat.com/show_bug.cgi?id=1434000

Could this be the same problem? This time, though, it's not a few files but
hundreds of thousands.
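
To put a number on that, something like the following rough sketch could count how many distinct GFIDs a brick log flags as bad. It is only an illustration: the regular expression is derived from the log excerpt quoted further down, and the names count_bad_objects and sample_log are hypothetical, not part of any gluster tool.

```python
import re
from collections import Counter

# Pattern derived from the quoted brick log excerpt (an assumption, not an
# official log format). \s+ tolerates the line wrapping added by mail clients.
BAD_OBJECT_RE = re.compile(
    r"bitrot-stub:\s+([0-9a-f-]{36})\s+is\s+a\s+bad\s+object"
)


def count_bad_objects(log_text: str) -> Counter:
    """Return how often each GFID is reported as a bad object."""
    return Counter(BAD_OBJECT_RE.findall(log_text))


# Three entries taken from the quoted log, unwrapped to one line each.
sample_log = (
    "[2017-06-19 13:42:33.554875] E [MSGID: 116020] "
    "[bit-rot-stub.c:566:br_stub_check_bad_object] "
    "0-Server_Standard_05-bitrot-stub: c75016a9-95c1-4819-b24a-e5d77107c4ba "
    "is a bad object. Returning\n"
    "[2017-06-19 13:42:33.554931] E [MSGID: 115081] "
    "[server-rpc-fops.c:1201:server_fstat_cbk] 0-Server_Standard_05-server: "
    "21461: FSTAT -2 (c75016a9-95c1-4819-b24a-e5d77107c4ba) ==> "
    "(Input/output error) [Input/output error]\n"
    "[2017-06-19 13:42:33.555655] E [MSGID: 116020] "
    "[bit-rot-stub.c:566:br_stub_check_bad_object] "
    "0-Server_Standard_05-bitrot-stub: c75016a9-95c1-4819-b24a-e5d77107c4ba "
    "is a bad object. Returning\n"
)

bad_counts = count_bad_objects(sample_log)
print(len(bad_counts), "distinct GFIDs flagged as bad")
for gfid, hits in bad_counts.most_common():
    print(gfid, hits)
```

Reading a real brick log into a string and passing it to count_bad_objects would give the per-GFID totals; the FSTAT lines are deliberately not counted, since they repeat the same GFID.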


BTW: I tried disabling bitrot, but it didn't help.
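
Regarding the hex values from my earlier mail (quoted below), here is a minimal sketch for decoding them. The helper decode_hex_xattr is a hypothetical name, and treating the last 32 bytes of trusted.bit-rot.signature as the checksum is my assumption based on the value's length, not something confirmed from the gluster sources.

```python
def decode_hex_xattr(value: str) -> bytes:
    """Turn a `getfattr -e hex` value such as '0x3100' into raw bytes."""
    if not value.startswith("0x"):
        raise ValueError("expected a 0x-prefixed hex string")
    return bytes.fromhex(value[2:])


# Values copied from the getfattr output on the two bricks.
bad_file = decode_hex_xattr("0x3100")
sig_a = decode_hex_xattr(
    "0x011400000000000000"
    "ee3e3ac6a79b8efc42d0904ca431cb20d01890d300c041e905d9d78a562bf276"
)
sig_b = decode_hex_xattr(
    "0x011300000000000000"
    "ee3e3ac6a79b8efc42d0904ca431cb20d01890d300c041e905d9d78a562bf276"
)

# 0x31 is ASCII "1" and 0x00 is a trailing NUL terminator.
bad_file_text = bad_file.rstrip(b"\x00").decode("ascii")

# Assumption: the trailing 32 bytes hold the stored checksum.
same_checksum = sig_a[-32:] == sig_b[-32:]

print("bad-file flag:", repr(bad_file_text))
print("stored checksums identical on both bricks:", same_checksum)
```

So 0x3100 is just the string "1", which looks like a flag marking the file as bad, and both bricks carry the same stored checksum bytes even though the version prefix differs (0x14 versus 0x13).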

Best Regards
Bernhard


2017-06-19 16:51 GMT+02:00 Bernhard Dübi <1linuxengineer at gmail.com>:
> Hi,
>
> I checked the attributes of one of the files with I/O errors:
>
> root@chastcvtprd04:~# getfattr -d -e hex -m -
> /data/glusterfs/Server_Standard/1I-1-14/brick/Server_Standard/CV_MAGNETIC/V_1050932/CHUNK_11126559/SFILE_CONTAINER_014
> getfattr: Removing leading '/' from absolute path names
> # file: data/glusterfs/Server_Standard/1I-1-14/brick/Server_Standard/CV_MAGNETIC/V_1050932/CHUNK_11126559/SFILE_CONTAINER_014
> trusted.afr.dirty=0x000000000000000000000000
> trusted.bit-rot.bad-file=0x3100
> trusted.bit-rot.signature=0x011400000000000000ee3e3ac6a79b8efc42d0904ca431cb20d01890d300c041e905d9d78a562bf276
> trusted.bit-rot.version=0x14000000000000005841bb3c000ac813
> trusted.gfid=0x1427a79086f14ed2902e3c18e133d02b
>
>
>
>
> root@chglbcvtprd04:~# getfattr -d -e hex -m -
> /data/glusterfs/Server_Standard/1I-1-14/brick/Server_Standard/CV_MAGNETIC/V_1050932/CHUNK_11126559/SFILE_CONTAINER_014
> getfattr: Removing leading '/' from absolute path names
> # file: data/glusterfs/Server_Standard/1I-1-14/brick/Server_Standard/CV_MAGNETIC/V_1050932/CHUNK_11126559/SFILE_CONTAINER_014
> trusted.afr.dirty=0x000000000000000000000000
> trusted.bit-rot.bad-file=0x3100
> trusted.bit-rot.signature=0x011300000000000000ee3e3ac6a79b8efc42d0904ca431cb20d01890d300c041e905d9d78a562bf276
> trusted.bit-rot.version=0x13000000000000005841b921000c222f
> trusted.gfid=0x1427a79086f14ed2902e3c18e133d02b
>
>
>
> The "dirty" value is 0 on both bricks; that's good, isn't it?
> What does "trusted.bit-rot.bad-file=0x3100" mean?
>
> Best Regards
> Bernhard Dübi
>
> BTW: I saved all the logs; maybe I can upload them somewhere.
>
> 2017-06-19 15:55 GMT+02:00 Bernhard Dübi <1linuxengineer at gmail.com>:
>> Hi,
>>
>> We use a number of replicated gluster volumes as the backend for our
>> backups. Yesterday I noticed that some synthetic backups failed because
>> of I/O errors.
>>
>> Today I ran "find /gluster_vol -type f | xargs md5sum" and got loads
>> of I/O errors.
>> The brick log file shows the errors below:
>>
>> [2017-06-19 13:42:33.554875] E [MSGID: 116020]
>> [bit-rot-stub.c:566:br_stub_check_bad_object]
>> 0-Server_Standard_05-bitrot-stub: c75016a9-95c1-4819-b24a-e5d77107c4ba
>> is a bad object. Returning
>> [2017-06-19 13:42:33.554923] E [MSGID: 116020]
>> [bit-rot-stub.c:566:br_stub_check_bad_object]
>> 0-Server_Standard_05-bitrot-stub: c75016a9-95c1-4819-b24a-e5d77107c4ba
>> is a bad object. Returning
>> [2017-06-19 13:42:33.554931] E [MSGID: 115081]
>> [server-rpc-fops.c:1201:server_fstat_cbk] 0-Server_Standard_05-server:
>> 21461: FSTAT -2 (c75016a9-95c1-4819-b24a-e5d77107c4ba) ==>
>> (Input/output error) [Input/output error]
>> [2017-06-19 13:42:33.554940] E [MSGID: 115081]
>> [server-rpc-fops.c:1201:server_fstat_cbk] 0-Server_Standard_05-server:
>> 21462: FSTAT -2 (c75016a9-95c1-4819-b24a-e5d77107c4ba) ==>
>> (Input/output error) [Input/output error]
>> [2017-06-19 13:42:33.555655] E [MSGID: 116020]
>> [bit-rot-stub.c:566:br_stub_check_bad_object]
>> 0-Server_Standard_05-bitrot-stub: c75016a9-95c1-4819-b24a-e5d77107c4ba
>> is a bad object. Returning
>> [2017-06-19 13:42:33.555697] E [MSGID: 115081]
>> [server-rpc-fops.c:1201:server_fstat_cbk] 0-Server_Standard_05-server:
>> 21463: FSTAT -2 (c75016a9-95c1-4819-b24a-e5d77107c4ba) ==>
>> (Input/output error) [Input/output error]
>> [2017-06-19 13:42:33.555950] E [MSGID: 116020]
>> [bit-rot-stub.c:566:br_stub_check_bad_object]
>> 0-Server_Standard_05-bitrot-stub: c75016a9-95c1-4819-b24a-e5d77107c4ba
>> is a bad object. Returning
>> [2017-06-19 13:42:33.555983] E [MSGID: 115081]
>> [server-rpc-fops.c:1201:server_fstat_cbk] 0-Server_Standard_05-server:
>> 21464: FSTAT -2 (c75016a9-95c1-4819-b24a-e5d77107c4ba) ==>
>> (Input/output error) [Input/output error]
>> [2017-06-19 13:42:33.556604] E [MSGID: 116020]
>> [bit-rot-stub.c:566:br_stub_check_bad_object]
>> 0-Server_Standard_05-bitrot-stub: c75016a9-95c1-4819-b24a-e5d77107c4ba
>> is a bad object. Returning
>>
>>
>>
>>
>> Any idea what's wrong?
>>
>>
>> BTW: I'm running gluster 3.8.12 on Ubuntu 16.04 - 4.4.0-79
>>
>> Many thanks for your help
>> Bernhard

