[Gluster-users] Fwd: Replica brick not working
Miloš Čučulović - MDPI
cuculovic at mdpi.com
Thu Dec 8 15:32:51 UTC 2016
1. No, at the moment the old server (storage2) volume is mounted on some
other servers, so all files are created there. If I check the new brick,
there are no files.
2. On the storage2 server (old brick):
getfattr: Removing leading '/' from absolute path names
# file: data/data-cluster
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x0226135726f346bcb3f8cb73365ed382
On the storage server (new brick):
getfattr: Removing leading '/' from absolute path names
# file: data/data-cluster
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x0226135726f346bcb3f8cb73365ed382
3. Backtrace of the self-heal daemon on the old brick (thread apply all bt):
Thread 8 (Thread 0x7fad832dd700 (LWP 30057)):
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at
../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1 0x00007fad88834f3e in __afr_shd_healer_wait () from
/usr/lib/x86_64-linux-gnu/glusterfs/3.7.6/xlator/cluster/replicate.so
#2 0x00007fad88834fad in afr_shd_healer_wait () from
/usr/lib/x86_64-linux-gnu/glusterfs/3.7.6/xlator/cluster/replicate.so
#3 0x00007fad88835aa0 in afr_shd_index_healer () from
/usr/lib/x86_64-linux-gnu/glusterfs/3.7.6/xlator/cluster/replicate.so
#4 0x00007fad8df4270a in start_thread (arg=0x7fad832dd700) at
pthread_create.c:333
#5 0x00007fad8dc7882d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Thread 7 (Thread 0x7fad83ade700 (LWP 30056)):
#0 0x00007fad8dc78e23 in epoll_wait () at
../sysdeps/unix/syscall-template.S:84
#1 0x00007fad8e808a58 in ?? () from
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#2 0x00007fad8df4270a in start_thread (arg=0x7fad83ade700) at
pthread_create.c:333
#3 0x00007fad8dc7882d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Thread 6 (Thread 0x7fad894a5700 (LWP 30055)):
#0 0x00007fad8dc78e23 in epoll_wait () at
../sysdeps/unix/syscall-template.S:84
#1 0x00007fad8e808a58 in ?? () from
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#2 0x00007fad8df4270a in start_thread (arg=0x7fad894a5700) at
pthread_create.c:333
#3 0x00007fad8dc7882d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Thread 5 (Thread 0x7fad8a342700 (LWP 30054)):
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at
../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1 0x00007fad8e7ecd98 in syncenv_task () from
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#2 0x00007fad8e7ed970 in syncenv_processor () from
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#3 0x00007fad8df4270a in start_thread (arg=0x7fad8a342700) at
pthread_create.c:333
#4 0x00007fad8dc7882d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Thread 4 (Thread 0x7fad8ab43700 (LWP 30053)):
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at
../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1 0x00007fad8e7ecd98 in syncenv_task () from
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#2 0x00007fad8e7ed970 in syncenv_processor () from
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#3 0x00007fad8df4270a in start_thread (arg=0x7fad8ab43700) at
pthread_create.c:333
#4 0x00007fad8dc7882d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Thread 3 (Thread 0x7fad8b344700 (LWP 30052)):
#0 do_sigwait (sig=0x7fad8b343e3c, set=<optimized out>) at
../sysdeps/unix/sysv/linux/sigwait.c:64
#1 __sigwait (set=<optimized out>, sig=0x7fad8b343e3c) at
../sysdeps/unix/sysv/linux/sigwait.c:96
#2 0x00000000004080bf in glusterfs_sigwaiter ()
#3 0x00007fad8df4270a in start_thread (arg=0x7fad8b344700) at
pthread_create.c:333
#4 0x00007fad8dc7882d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Thread 2 (Thread 0x7fad8bb45700 (LWP 30051)):
#0 0x00007fad8df4bc6d in nanosleep () at
../sysdeps/unix/syscall-template.S:84
#1 0x00007fad8e7ca744 in gf_timer_proc () from
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#2 0x00007fad8df4270a in start_thread (arg=0x7fad8bb45700) at
pthread_create.c:333
#3 0x00007fad8dc7882d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Thread 1 (Thread 0x7fad8ec66780 (LWP 30050)):
#0 0x00007fad8df439dd in pthread_join (threadid=140383309420288,
thread_return=0x0) at pthread_join.c:90
#1 0x00007fad8e808eeb in ?? () from
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#2 0x0000000000405501 in main ()
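
The outputs in items 2 and 3 can also be checked by script rather than by eye. Below is a minimal Python sketch, assuming the getfattr and gdb text is pasted in as strings. The helper names (`parse_getfattr`, `differing_xattrs`, `frames_by_thread`) are made up for this illustration and are not part of any Gluster tooling. The sample data is copied from this mail, with the backtrace shortened to threads 8 and 7. The script only reorganizes the text and does not diagnose anything.

```python
import re
import uuid

# getfattr dump from item 2 (identical on both bricks in this mail).
OLD_BRICK = """\
# file: data/data-cluster
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x0226135726f346bcb3f8cb73365ed382
"""
NEW_BRICK = OLD_BRICK

# First two threads of the backtrace from item 3, shortened for the example.
BACKTRACE = """\
Thread 8 (Thread 0x7fad832dd700 (LWP 30057)):
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at
../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1 0x00007fad88834f3e in __afr_shd_healer_wait () from
/usr/lib/x86_64-linux-gnu/glusterfs/3.7.6/xlator/cluster/replicate.so
#2 0x00007fad88834fad in afr_shd_healer_wait () from
/usr/lib/x86_64-linux-gnu/glusterfs/3.7.6/xlator/cluster/replicate.so
#3 0x00007fad88835aa0 in afr_shd_index_healer () from
/usr/lib/x86_64-linux-gnu/glusterfs/3.7.6/xlator/cluster/replicate.so
Thread 7 (Thread 0x7fad83ade700 (LWP 30056)):
#0 0x00007fad8dc78e23 in epoll_wait () at
../sysdeps/unix/syscall-template.S:84
#1 0x00007fad8e808a58 in ?? () from
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0
"""

THREAD_RE = re.compile(r"Thread (\d+) \(")
# A frame line is "#N [0xADDR in ]FUNCTION (args...)"; capture FUNCTION.
FRAME_RE = re.compile(r"#\d+\s+(?:0x[0-9a-f]+ in )?(\S+) \(")


def parse_getfattr(text):
    """Parse `getfattr -d -e hex` output into {xattr_name: hex_string}."""
    attrs = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip the "# file:" header, getfattr notices and blank lines.
        if "=" not in line or line.startswith(("#", "getfattr:")):
            continue
        name, _, value = line.partition("=")
        attrs[name] = value
    return attrs


def differing_xattrs(a, b):
    """Names whose values differ, or that exist on one brick only."""
    return sorted(k for k in a.keys() | b.keys() if a.get(k) != b.get(k))


def frames_by_thread(backtrace):
    """Map each gdb thread number to its function names, innermost first."""
    threads, current = {}, None
    for line in backtrace.splitlines():
        line = line.strip()
        thread = THREAD_RE.match(line)
        if thread:
            current = int(thread.group(1))
            threads[current] = []
            continue
        frame = FRAME_RE.match(line)
        if frame and current is not None:
            threads[current].append(frame.group(1))
    return threads


old, new = parse_getfattr(OLD_BRICK), parse_getfattr(NEW_BRICK)
print("differing xattrs:", differing_xattrs(old, new))
print("volume-id:", uuid.UUID(hex=old["trusted.glusterfs.volume-id"][2:]))
for tid, frames in frames_by_thread(BACKTRACE).items():
    print("thread", tid, ":", " <- ".join(frames))
```

With the data above, the xattr comparison finds no differences between the bricks, and thread 8 (the one running afr_shd_index_healer) shows up as waiting in afr_shd_healer_wait.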
- Kindest regards,
Milos Cuculovic
IT Manager
---
MDPI AG
Postfach, CH-4020 Basel, Switzerland
Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
Tel. +41 61 683 77 35
Fax +41 61 302 89 18
Email: cuculovic at mdpi.com
Skype: milos.cuculovic.mdpi
On 08.12.2016 16:17, Ravishankar N wrote:
> On 12/08/2016 06:53 PM, Atin Mukherjee wrote:
>>
>>
>> On Thu, Dec 8, 2016 at 6:44 PM, Miloš Čučulović - MDPI
>> <cuculovic at mdpi.com> wrote:
>>
>> Ah, damn! I found the issue. On the storage server, the storage2
>> IP address was wrong; I had swapped two digits in the /etc/hosts
>> file, sorry for that :(
>>
>> I was able to add the brick now and I started the heal, but there is
>> still no data transfer visible.
>>
> 1. Are the files getting created on the new brick though?
> 2. Can you provide the output of `getfattr -d -m . -e hex
> /data/data-cluster` on both bricks?
> 3. Is it possible to attach gdb to the self-heal daemon on the original
> (old) brick and get a backtrace?
> `gdb -p <pid of self-heal daemon on the original brick>`
> `thread apply all bt` (share this output)
> then quit gdb.
>
>
> -Ravi
>>
>> @Ravi/Pranith - can you help here?
>>
>>
>>
>> Running gluster volume status, I get:
>>
>> Status of volume: storage
>> Gluster process                     TCP Port  RDMA Port  Online  Pid
>> ------------------------------------------------------------------------------
>> Brick storage2:/data/data-cluster   49152     0          Y       23101
>> Brick storage:/data/data-cluster    49152     0          Y       30773
>> Self-heal Daemon on localhost       N/A       N/A        Y       30050
>> Self-heal Daemon on storage         N/A       N/A        Y       30792
>>
>>
>> Any idea?
>>
>> On storage I have:
>> Number of Peers: 1
>>
>> Hostname: 195.65.194.217
>> Uuid: 7c988af2-9f76-4843-8e6f-d94866d57bb0
>> State: Peer in Cluster (Connected)
>>
>>
>> On 08.12.2016 13:55, Atin Mukherjee wrote:
>>
>> Can you resend the attachment as a zip? I am unable to extract the
>> content. We shouldn't have 0 info file. What does the output of
>> gluster peer status say?
>>
>> On Thu, Dec 8, 2016 at 4:51 PM, Miloš Čučulović - MDPI
>> <cuculovic at mdpi.com> wrote:
>>
>> I hope you received my last email Atin, thank you!
>>
>> On 08.12.2016 10:28, Atin Mukherjee wrote:
>>
>>
>> ---------- Forwarded message ----------
>> From: Atin Mukherjee <amukherj at redhat.com>
>> Date: Thu, Dec 8, 2016 at 11:56 AM
>> Subject: Re: [Gluster-users] Replica brick not working
>> To: Ravishankar N <ravishankar at redhat.com>
>> Cc: Miloš Čučulović - MDPI <cuculovic at mdpi.com>,
>> Pranith Kumar Karampuri <pkarampu at redhat.com>,
>> gluster-users <gluster-users at gluster.org>
>>
>>
>>
>>
>> On Thu, Dec 8, 2016 at 11:11 AM, Ravishankar N
>> <ravishankar at redhat.com> wrote:
>>
>> On 12/08/2016 10:43 AM, Atin Mukherjee wrote:
>>
>> From the log snippet:
>>
>> [2016-12-07 09:15:35.677645] I [MSGID: 106482] [glusterd-brick-ops.c:442:__glusterd_handle_add_brick] 0-management: Received add brick req
>> [2016-12-07 09:15:35.677708] I [MSGID: 106062] [glusterd-brick-ops.c:494:__glusterd_handle_add_brick] 0-management: replica-count is 2
>> [2016-12-07 09:15:35.677735] E [MSGID: 106291] [glusterd-brick-ops.c:614:__glusterd_handle_add_brick] 0-management:
>>
>> The last log entry indicates that we hit the code path in
>> gd_addbr_validate_replica_count ()
>>
>>     if (replica_count == volinfo->replica_count) {
>>             if (!(total_bricks % volinfo->dist_leaf_count)) {
>>                     ret = 1;
>>                     goto out;
>>             }
>>     }
>>
>>
>> It seems unlikely that this snippet was hit, because we print the E
>> [MSGID: 106291] in the above message only if ret == -1.
>> gd_addbr_validate_replica_count() returns -1 without populating
>> err_str only when volinfo->type doesn't match any of the known
>> volume types, so perhaps volinfo->type is corrupted?
>>
>>
>> You are right, I missed that ret is set to 1 in the above snippet.
>>
>> @Milos - Can you please provide us the volume info file from
>> /var/lib/glusterd/vols/<volname>/ on all three nodes so that we can
>> continue the analysis?
>>
>>
>>
>> -Ravi
>>
>> @Pranith, Ravi - Milos was trying to convert a dist (1 X 1) volume
>> to a replicate (1 X 2) volume using add-brick and hit this issue,
>> where add-brick failed. The cluster is running 3.7.6. Could you
>> help identify in which scenario this code path can be hit? One
>> straightforward issue I see here is the missing err_str in this path.
>>
>>
>>
>>
>>
>>
>> --
>>
>> ~ Atin (atinm)
>>
>
>