LD_PRELOAD fails while application samples work #923
Comments
Yes, it seems that an override function in the MTL preload library breaks the mlx5_common init process. An MTL override function may return something different from what the default libc function would return.
The list of override functions can be found at https://github.com/OpenVisualCloud/Media-Transport-Library/blob/main/ld_preload/udp/udp_preload.h#L80. I guess we need to find which override function breaks mlx5_common init and then check what causes it.
Hi @frankdjx, thank you very much for the reply and the reference to the code! I traced the mlx5_common code, and it fails during the IB device init here: https://github.com/DPDK/dpdk/blob/c15902587b538ff02cfb0fbb4dd481f1503d936b/drivers/common/mlx5/linux/mlx5_glue.c#L82 So if I understand correctly, you expect that it fails because ibv_open_device(..) can internally call one of the socket functions intercepted by MTL, which changes the standard/expected POSIX behaviour. I'll try to change the LD_PRELOAD init code so that the mlx5 PMD is initialized before the socket functions are overridden, and I'll also add debug prints to each intercepted call to check this. I will keep you posted.
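To make the debugging plan above concrete, here is a minimal sketch in C of an "init in progress" guard with a debug print. Everything named `sketch_*` is hypothetical and not part of MTL. The sketch assumes a single global flag is enough, while a real preload library might need a per-thread flag. A real wrapper would be named `close` and would reach libc's version through `dlsym(RTLD_NEXT, "close")`; the sketch calls `close()` directly so that it builds as an ordinary program.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical flag: nonzero while the preload library is still running its
 * own initialization (for example, EAL and PMD probing). */
static int sketch_in_init;

/* Counts calls that were passed straight to libc because init was running. */
static unsigned sketch_passthrough_calls;

/* Stand-in for an intercepted close(). */
int sketch_traced_close(int fd) {
    if (sketch_in_init) {
        /* Driver code called us during init: log it and do not interfere. */
        sketch_passthrough_calls++;
        fprintf(stderr, "[preload] close(%d) during init, using libc\n", fd);
        return close(fd);
    }
    /* After init, the usual interception logic would run here; this sketch
     * only logs the call and forwards it to libc. */
    fprintf(stderr, "[preload] close(%d) after init\n", fd);
    return close(fd);
}
```

With a guard like this, any intercepted call made while the driver is probing shows up in stderr, which should help narrow down which override is involved.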
Yes. Every MTL-intercepted socket function first checks whether the file descriptor (fd) was created by an MTL socket function. If the fd was not created by MTL, the function redirects to the standard libc path. We have tested this ld_preload on the Intel E810 NIC, and it performs well. However, mlx5 appears to be more complex and may involve scenarios that the current flow does not handle properly.
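The fd check described above can be sketched roughly as follows. This is only an illustration of the dispatch idea, not MTL's actual code: the `sketch_*` names, the private fd range starting at 10000, and the fixed-size ownership table are all assumptions. In a real preload library the wrapper would carry the libc name (for example `close`) and would resolve the original with `dlsym(RTLD_NEXT, "close")`.

```c
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical bookkeeping: fds handed out by the user-space stack live in a
 * private range. MTL's real bookkeeping may differ. */
#define SKETCH_FD_BASE 10000
#define SKETCH_FD_MAX  16

static bool sketch_owned[SKETCH_FD_MAX];

/* Stand-in for an intercepted socket(): allocate an fd from the private range. */
int sketch_socket(void) {
    for (int i = 0; i < SKETCH_FD_MAX; i++) {
        if (!sketch_owned[i]) {
            sketch_owned[i] = true;
            return SKETCH_FD_BASE + i;
        }
    }
    return -1; /* table full */
}

/* True only for fds that sketch_socket() handed out and that are still open. */
bool sketch_is_owned(int fd) {
    int idx = fd - SKETCH_FD_BASE;
    return idx >= 0 && idx < SKETCH_FD_MAX && sketch_owned[idx];
}

/* Stand-in for an intercepted close(). */
int sketch_close(int fd) {
    if (sketch_is_owned(fd)) {
        /* Our fd: handle it in user space. */
        sketch_owned[fd - SKETCH_FD_BASE] = false;
        return 0;
    }
    /* Not ours: fall through to the standard libc path. */
    return close(fd);
}
```

If a driver's init path ever passes an fd that the ownership check misclassifies, or depends on a return value that the user-space branch does not reproduce, this is the place where behaviour would diverge from plain libc.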
Regarding the E810 NIC, did you test the UDP stack with a 100 Gbit/s port?
Below is the TX data we have; the RX improvement is similar.
Dear MTL developers,
I want to test the MTL user-space UDP implementation with NVIDIA BlueField-2 NICs.
I was able to successfully build MTL with DPDK v23.11 (and some minor changes to disable AVX-related code to compile on ARM).
The built-in UfdServerSample application starts and waits for data without any problems (e.g., EAL init passes successfully), but when I try to run my UDP "helloworld" example with libmtl_udp_preload.so loaded via LD_PRELOAD, the mlx5 EAL fails to initialize.
Is it possible that some code around MUFD in the LD_PRELOAD wrapper breaks things? Do you have any idea where to look?
I'm attaching two logs collected as root at the maximum EAL/PMD logging level, together with the UFD JSON config, for you to analyse:
Thank you very much in advance!