[Gluster-devel] cp taking 100% cpu and never terminating
Mickey Mazarick
mic at digitaltadpole.com
Mon May 12 11:32:20 UTC 2008
Heh, yes, sorry. On the server side I'm seeing errors like:
2008-05-11 17:02:22 E [posix.c:1982:posix_setdents] system-ns: Error creating file /mnt/gluster/system-ns/scripts/drbl/drblupdateusr.sh with mode (0100755)
2008-05-11 17:02:22 E [posix.c:1982:posix_setdents] system-ns: Error creating file /mnt/gluster/system-ns/scripts/drbl/drblrebu.swp with mode (0100644)
2008-05-11 17:02:22 E [posix.c:1982:posix_setdents] system-ns: Error creating file /mnt/gluster/system-ns/scripts/drbl/getexefiles.sh with mode (0100755)
2008-05-11 17:39:33 E [posix.c:1990:posix_setdents] system-ns: error creating symlink /mnt/gluster/system-ns/usr/lib64/perl5/5.8.2/x86_64-linux-thread-multi/CORE/libperl.so
2008-05-11 17:39:44 E [posix.c:1990:posix_setdents] system-ns: error creating symlink /mnt/gluster/system-ns/usr/lib64/perl5/5.8.1/x86_64-linux-thread-multi/CORE/libperl.so
2008-05-11 18:48:32 E [protocol.c:271:gf_block_unserialize_transport] server: EOF from peer (192.168.1.204:1013)
2008-05-11 18:48:32 E [protocol.c:271:gf_block_unserialize_transport] server: EOF from peer (192.168.1.204:1015)
The times don't correspond to the errors on the client. This is from the
storage brick "system1" mentioned in the client logs below.
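
For anyone who wants to line up the two sides, here is a rough sketch of how the server and client timestamps could be compared. It assumes each log entry sits on a single line (not wrapped by the mailer) and follows the "date time level [file:line:function] name: message" layout seen in this thread. The names parse_line and LOG_LINE are made up for illustration and are not part of GlusterFS, and calling the field before the message "volume" is my own guess at what it holds.

```python
import re
from datetime import datetime

# Pattern for the log layout seen in this thread (an assumption, not an
# official GlusterFS format definition):
#   2008-05-11 17:02:22 E [posix.c:1982:posix_setdents] system-ns: message
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]) "
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\] "
    r"(?P<volume>[^:]+): (?P<msg>.*)$"
)


def parse_line(line):
    """Return a dict for one log entry, or None if the line does not match."""
    m = LOG_LINE.match(line.strip())
    if m is None:
        return None
    entry = m.groupdict()
    entry["ts"] = datetime.strptime(entry["ts"], "%Y-%m-%d %H:%M:%S")
    entry["line"] = int(entry["line"])
    return entry


if __name__ == "__main__":
    # One server-side and one client-side entry copied from this thread.
    server = parse_line(
        "2008-05-11 17:02:22 E [posix.c:1982:posix_setdents] system-ns: "
        "Error creating file /mnt/gluster/system-ns/scripts/drbl/getexefiles.sh "
        "with mode (0100755)"
    )
    client = parse_line(
        "2008-05-11 17:02:32 E [client-protocol.c:1238:client_flush] "
        "system1: : returning EBADFD"
    )
    # Offset in seconds between the two entries, as logged on each machine.
    print((client["ts"] - server["ts"]).total_seconds())
```

The offset only means something if the clocks on the client and the server are in sync, which I have not checked here.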
Thanks!
-Mickey Mazarick
Raghavendra G wrote:
> Hi Mickey,
> Is it possible to provide server side logs?
>
> regards,
>
> On Mon, May 12, 2008 at 1:43 AM, Mickey Mazarick
> <mic at digitaltadpole.com> wrote:
>
> Something odd is happening when I run a shell script with cp
> commands in it. It happens infrequently, but when it does I have to
> reboot the system to get my processor back. I'm never tarring or
> copying more than 50 MB of data.
>
> It either hangs on a command like:
> cp --reply=yes /usr/src/linux-${kernver}/.config
> /tftpboot/node_root/boot/config-${kernver}
> or
> tar cf - etc | gzip > /tftpboot/node_root/drbl_ssi/template_etc.tgz
>
> When I run top I see:
>   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
>  1603 root  20   0 54160 1616  508 R  100  0.0 33:02.72 cp
> (100% CPU time)
>
> I'm unable to kill that process in any way, but I can kill the
> shell script that spawned it. The cp process keeps running afterwards.
>
> I see the below errors on the client:
> 2008-05-11 17:02:32 E [client-protocol.c:1238:client_flush] system1: : returning EBADFD
> 2008-05-11 17:02:32 E [afr.c:2623:afr_flush_cbk] afr1: (path=/scripts/gluster/afrheal.sh child=system1) op_ret=-1 op_errno=77
> 2008-05-11 17:02:32 W [client-protocol.c:1296:client_close] system1: no valid fd found, returning
> 2008-05-11 17:02:32 W [client-protocol.c:1296:client_close] system-ns1: no valid fd found, returning
>
> My client and server specs are identical to:
> http://www.gluster.org/docs/index.php/Simple_High_Availability_Storage_with_GlusterFS_1.3
>
> This happens equally over ib-verbs and tcp transports.
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at nongnu.org
> http://lists.nongnu.org/mailman/listinfo/gluster-devel
>
> --
> Raghavendra G
>
> A centipede was happy quite, until a toad in fun,
> Said, "Pray, which leg comes after which?",
> This raised his doubts to such a pitch,
> He fell flat into the ditch,
> Not knowing how to run.
> -Anonymous