[GEDI] [PATCH-for-9.1 v2 2/3] migration: Remove RDMA protocol handling

Jinpu Wang jinpu.wang at ionos.com
Mon May 13 07:30:49 UTC 2024


Hi Peter, Hi Chuan,

On Thu, May 9, 2024 at 4:14 PM Peter Xu <peterx at redhat.com> wrote:
>
> On Thu, May 09, 2024 at 04:58:34PM +0800, Zheng Chuan via wrote:
> > It's good news to see the socket abstraction for RDMA!
> > When I was developing the series above, the biggest pain was that RDMA migration has no QIOChannel abstraction, so I had to use a 'fake channel'
> > for it, which is awkward to implement.
> > So, as far as I know, we can do this as follows:
> > i. First, we need to evaluate whether rsocket is good enough to satisfy our fundamental QIOChannel abstraction.
> > ii. If it works, we will then see whether it lets us hide the details of the RDMA protocol
> >     inside rsocket by removing most of the code in rdma.c and also some hacks in the main migration process.
> > iii. Implement advanced features like multi-fd and multi-uri for RDMA migration.
> >
> > Since I am not familiar with rsocket, I need some time to look at it and do some quick verification of RDMA migration based on rsocket.
> > But yes, I am willing to be involved in this refactoring work and to see if we can make this migration feature better :)
>
> Based on what we have now, it looks like we'd better halt the deprecation
> process a bit, so I think we shouldn't need to rush it at least in 9.1
> then, and we'll need to see how it goes on the refactoring.
>
> It'll be perfect if rsocket works, otherwise supporting multifd with little
> overhead / exported APIs would also be a good thing in general with
> whatever approach.  And obviously all based on the facts that we can get
> resources from companies to support this feature first.
>
> Note that so far nobody has compared RDMA vs. NIC performance, so if
> any of us can provide some test results, please do so.  Many people are
> saying RDMA is better, but I haven't yet seen any numbers comparing it with
> modern TCP networks.  I don't want to have old impressions floating around
> even if things might have changed.  When we have consolidated results, we
> should share them and also reflect that in QEMU's migration docs when an
> RDMA document page is ready.
I also ran some tests with a Mellanox ConnectX-6 100G RoCE NIC. The
results are mixed: with fewer than 3 streams native Ethernet (TCP) is
faster, and with more than 3 streams rsocket performs better.

root@x4-right:~# iperf -c 1.1.1.16 -P 1
------------------------------------------------------------
Client connecting to 1.1.1.16, TCP port 5001
TCP window size:  325 KByte (default)
------------------------------------------------------------
[  3] local 1.1.1.15 port 44214 connected with 1.1.1.16 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3] 0.0000-10.0000 sec  52.9 GBytes  45.4 Gbits/sec
root@x4-right:~# iperf -c 1.1.1.16 -P 2
[  3] local 1.1.1.15 port 33118 connected with 1.1.1.16 port 5001
[  4] local 1.1.1.15 port 33130 connected with 1.1.1.16 port 5001
------------------------------------------------------------
Client connecting to 1.1.1.16, TCP port 5001
TCP window size: 4.00 MByte (default)
------------------------------------------------------------
[ ID] Interval       Transfer     Bandwidth
[  3] 0.0000-10.0001 sec  45.0 GBytes  38.7 Gbits/sec
[  4] 0.0000-10.0000 sec  43.9 GBytes  37.7 Gbits/sec
[SUM] 0.0000-10.0000 sec  88.9 GBytes  76.4 Gbits/sec
[ CT] final connect times (min/avg/max/stdev) = 0.172/0.189/0.205/0.172 ms (tot/err) = 2/0
root@x4-right:~# iperf -c 1.1.1.16 -P 4
------------------------------------------------------------
Client connecting to 1.1.1.16, TCP port 5001
TCP window size:  325 KByte (default)
------------------------------------------------------------
[  5] local 1.1.1.15 port 50748 connected with 1.1.1.16 port 5001
[  4] local 1.1.1.15 port 50734 connected with 1.1.1.16 port 5001
[  6] local 1.1.1.15 port 50764 connected with 1.1.1.16 port 5001
[  3] local 1.1.1.15 port 50730 connected with 1.1.1.16 port 5001
[ ID] Interval       Transfer     Bandwidth
[  6] 0.0000-10.0000 sec  24.7 GBytes  21.2 Gbits/sec
[  3] 0.0000-10.0004 sec  23.6 GBytes  20.3 Gbits/sec
[  4] 0.0000-10.0000 sec  27.8 GBytes  23.9 Gbits/sec
[  5] 0.0000-10.0000 sec  28.0 GBytes  24.0 Gbits/sec
[SUM] 0.0000-10.0000 sec   104 GBytes  89.4 Gbits/sec
[ CT] final connect times (min/avg/max/stdev) = 0.104/0.156/0.204/0.124 ms (tot/err) = 4/0
root@x4-right:~# iperf -c 1.1.1.16 -P 8
[  4] local 1.1.1.15 port 55588 connected with 1.1.1.16 port 5001
[  5] local 1.1.1.15 port 55600 connected with 1.1.1.16 port 5001
------------------------------------------------------------
Client connecting to 1.1.1.16, TCP port 5001
TCP window size:  325 KByte (default)
------------------------------------------------------------
[ 10] local 1.1.1.15 port 55628 connected with 1.1.1.16 port 5001
[ 15] local 1.1.1.15 port 55648 connected with 1.1.1.16 port 5001
[  7] local 1.1.1.15 port 55620 connected with 1.1.1.16 port 5001
[  3] local 1.1.1.15 port 55584 connected with 1.1.1.16 port 5001
[ 14] local 1.1.1.15 port 55644 connected with 1.1.1.16 port 5001
[  6] local 1.1.1.15 port 55610 connected with 1.1.1.16 port 5001
[ ID] Interval       Transfer     Bandwidth
[  6] 0.0000-10.0015 sec  8.47 GBytes  7.27 Gbits/sec
[  4] 0.0000-10.0011 sec  8.62 GBytes  7.40 Gbits/sec
[  7] 0.0000-10.0000 sec  18.1 GBytes  15.5 Gbits/sec
[ 14] 0.0000-10.0000 sec  8.69 GBytes  7.46 Gbits/sec
[  5] 0.0000-10.0006 sec  18.5 GBytes  15.9 Gbits/sec
[ 10] 0.0000-10.0006 sec  16.1 GBytes  13.9 Gbits/sec
[  3] 0.0000-10.0000 sec  17.1 GBytes  14.6 Gbits/sec
[ 15] 0.0000-10.0016 sec  8.54 GBytes  7.34 Gbits/sec
[SUM] 0.0000-10.0017 sec   104 GBytes  89.4 Gbits/sec
[ CT] final connect times (min/avg/max/stdev) = 0.049/0.095/0.213/0.062 ms (tot/err) = 8/0

root@x4-right:~# LD_PRELOAD=/usr/lib/x86_64-linux-gnu/rsocket/librspreload.so iperf -c 1.1.1.16 -P 1
------------------------------------------------------------
Client connecting to 1.1.1.16, TCP port 5001
TCP window size:  128 KByte (default)
------------------------------------------------------------
[  3] local 1.1.1.15 port 45596 connected with 1.1.1.16 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3] 0.0000-10.0000 sec  37.8 GBytes  32.5 Gbits/sec
root@x4-right:~# LD_PRELOAD=/usr/lib/x86_64-linux-gnu/rsocket/librspreload.so iperf -c 1.1.1.16 -P 2
[  4] local 1.1.1.15 port 46782 connected with 1.1.1.16 port 5001
------------------------------------------------------------
Client connecting to 1.1.1.16, TCP port 5001
TCP window size:  128 KByte (default)
------------------------------------------------------------
[  3] local 1.1.1.15 port 43237 connected with 1.1.1.16 port 5001
[ ID] Interval       Transfer     Bandwidth
[  4] 0.0000-10.0000 sec  37.5 GBytes  32.2 Gbits/sec
[  3] 0.0000-10.0000 sec  40.7 GBytes  34.9 Gbits/sec
[SUM] 0.0000-10.0000 sec  78.2 GBytes  67.2 Gbits/sec
[ CT] final connect times (min/avg/max/stdev) = 5.819/6.579/7.340/7.340 ms (tot/err) = 2/0
root@x4-right:~# LD_PRELOAD=/usr/lib/x86_64-linux-gnu/rsocket/librspreload.so iperf -c 1.1.1.16 -P 4
[  4] local 1.1.1.15 port 60385 connected with 1.1.1.16 port 5001
[  7] local 1.1.1.15 port 55203 connected with 1.1.1.16 port 5001
[  6] local 1.1.1.15 port 35084 connected with 1.1.1.16 port 5001
------------------------------------------------------------
Client connecting to 1.1.1.16, TCP port 5001
TCP window size:  128 KByte (default)
------------------------------------------------------------
[  3] local 1.1.1.15 port 37253 connected with 1.1.1.16 port 5001
[ ID] Interval       Transfer     Bandwidth
[  6] 0.0000-10.0000 sec  28.4 GBytes  24.4 Gbits/sec
[  4] 0.0000-10.0000 sec  28.3 GBytes  24.3 Gbits/sec
[  7] 0.0000-10.0000 sec  28.4 GBytes  24.4 Gbits/sec
[  3] 0.0000-10.0001 sec  28.2 GBytes  24.3 Gbits/sec
[SUM] 0.0000-10.0001 sec   113 GBytes  97.3 Gbits/sec
[ CT] final connect times (min/avg/max/stdev) = 5.311/7.579/10.019/4.165 ms (tot/err) = 4/0
root@x4-right:~# LD_PRELOAD=/usr/lib/x86_64-linux-gnu/rsocket/librspreload.so iperf -c 1.1.1.16 -P 8
[  8] local 1.1.1.15 port 33684 connected with 1.1.1.16 port 5001
[ 10] local 1.1.1.15 port 40620 connected with 1.1.1.16 port 5001
------------------------------------------------------------
Client connecting to 1.1.1.16, TCP port 5001
TCP window size:  128 KByte (default)
------------------------------------------------------------
[  3] local 1.1.1.15 port 56988 connected with 1.1.1.16 port 5001
[  4] local 1.1.1.15 port 51139 connected with 1.1.1.16 port 5001
[ 12] local 1.1.1.15 port 44712 connected with 1.1.1.16 port 5001
[  5] local 1.1.1.15 port 50838 connected with 1.1.1.16 port 5001
[  6] local 1.1.1.15 port 51334 connected with 1.1.1.16 port 5001
[  9] local 1.1.1.15 port 40611 connected with 1.1.1.16 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3] 0.0000-10.0000 sec  13.8 GBytes  11.9 Gbits/sec
[  5] 0.0000-10.0001 sec  13.9 GBytes  11.9 Gbits/sec
[ 12] 0.0000-10.0001 sec  13.8 GBytes  11.9 Gbits/sec
[ 10] 0.0000-10.0001 sec  13.9 GBytes  11.9 Gbits/sec
[  9] 0.0000-10.0000 sec  13.8 GBytes  11.9 Gbits/sec
[  6] 0.0000-10.0000 sec  13.9 GBytes  11.9 Gbits/sec
[  8] 0.0000-10.0000 sec  13.8 GBytes  11.9 Gbits/sec
[  4] 0.0000-10.0001 sec  13.8 GBytes  11.9 Gbits/sec
[SUM] 0.0000-10.0001 sec   111 GBytes  95.1 Gbits/sec
[ CT] final connect times (min/avg/max/stdev) = 5.973/10.699/15.943/4.251 ms (tot/err) = 8/0
root@x4-right:~# LD_PRELOAD=/usr/lib/x86_64-linux-gnu/rsocket/librspreload.so iperf -c 1.1.1.16 -P 1
------------------------------------------------------------
Client connecting to 1.1.1.16, TCP port 5001
TCP window size:  128 KByte (default)
------------------------------------------------------------
[  3] local 1.1.1.15 port 36960 connected with 1.1.1.16 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3] 0.0000-10.0000 sec  41.1 GBytes  35.3 Gbits/sec
root@x4-right:~# LD_PRELOAD=/usr/lib/x86_64-linux-gnu/rsocket/librspreload.so iperf -c 1.1.1.16 -P 2
[  3] local 1.1.1.15 port 32799 connected with 1.1.1.16 port 5001
------------------------------------------------------------
Client connecting to 1.1.1.16, TCP port 5001
TCP window size:  128 KByte (default)
------------------------------------------------------------
[  4] local 1.1.1.15 port 35912 connected with 1.1.1.16 port 5001
[ ID] Interval       Transfer     Bandwidth
[  4] 0.0000-10.0000 sec  36.6 GBytes  31.4 Gbits/sec
[  3] 0.0000-10.0000 sec  36.6 GBytes  31.4 Gbits/sec
[SUM] 0.0000-10.0000 sec  73.2 GBytes  62.9 Gbits/sec
[ CT] final connect times (min/avg/max/stdev) = 5.172/5.842/6.512/6.512 ms (tot/err) = 2/0
root@x4-right:~# LD_PRELOAD=/usr/lib/x86_64-linux-gnu/rsocket/librspreload.so iperf -c 1.1.1.16 -P 4
[  4] local 1.1.1.15 port 53311 connected with 1.1.1.16 port 5001
------------------------------------------------------------
Client connecting to 1.1.1.16, TCP port 5001
TCP window size:  128 KByte (default)
------------------------------------------------------------
[  3] local 1.1.1.15 port 37243 connected with 1.1.1.16 port 5001
[  7] local 1.1.1.15 port 60801 connected with 1.1.1.16 port 5001
[  6] local 1.1.1.15 port 49694 connected with 1.1.1.16 port 5001
[ ID] Interval       Transfer     Bandwidth
[  6] 0.0000-10.0000 sec  28.2 GBytes  24.2 Gbits/sec
[  7] 0.0000-10.0000 sec  28.2 GBytes  24.3 Gbits/sec
[  3] 0.0000-10.0000 sec  28.2 GBytes  24.2 Gbits/sec
[  4] 0.0000-10.0000 sec  28.2 GBytes  24.2 Gbits/sec
[SUM] 0.0000-10.0000 sec   113 GBytes  96.9 Gbits/sec
[ CT] final connect times (min/avg/max/stdev) = 5.570/7.762/10.045/4.265 ms (tot/err) = 4/0
root@x4-right:~#
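
To make the comparison easier to read, here is a minimal sketch (not part of the test setup itself) that tabulates the aggregate bandwidth figures copied from the iperf output above and reports which transport was faster per stream count. The names TCP_GBITS, RSOCKET_GBITS, mean() and compare() are hypothetical and only for illustration. Where a stream count was run twice over rsocket, the two results are averaged, which is an assumption about how to summarize them.

```python
# Aggregate bandwidth in Gbits/sec, copied from the iperf output above.
# Keys are the number of parallel streams (-P). With -P 1 iperf prints no
# [SUM] line, so the single stream's own figure is used instead.
TCP_GBITS = {1: [45.4], 2: [76.4], 4: [89.4], 8: [89.4]}
RSOCKET_GBITS = {1: [32.5, 35.3], 2: [67.2, 62.9], 4: [97.3, 96.9], 8: [95.1]}


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def compare(tcp, rsocket):
    """Return one (streams, tcp_mean, rsocket_mean, faster) row per stream count."""
    rows = []
    for streams in sorted(tcp):
        t = mean(tcp[streams])
        r = mean(rsocket[streams])
        rows.append((streams, t, r, "tcp" if t > r else "rsocket"))
    return rows


if __name__ == "__main__":
    for streams, t, r, faster in compare(TCP_GBITS, RSOCKET_GBITS):
        print(f"-P {streams}: tcp {t:5.1f}  rsocket {r:5.1f}  faster: {faster}")
```

This only restates the numbers already shown; it says nothing about QEMU migration throughput itself, which would need a separate measurement.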


>
> Chuan, please check the whole thread discussion, it may help to understand
> what we are looking for on rdma migrations [1].  Meanwhile please feel free
> to sync with Jinpu's team and see how to move forward with such a project.
We are happy to work with the community to improve RDMA migration.

>
> [1] https://lore.kernel.org/qemu-devel/87frwatp7n.fsf@suse.de/
>
> Thanks,
Regards!
>
> --
> Peter Xu
>

