Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve polling for available file descriptors #2365

Open
wants to merge 1 commit into
base: noetic-devel
Choose a base branch
from

Conversation

Hemofektik
Copy link

The polling of all file descriptors can take a lot of time especially on modern OSs where each process can open 1<<20 FDs.
This PR addresses this issue and stops polling for FDs once the desired number of available FDs has been found.
The heap memory needed for holding the FDs is also much smaller and no longer persistent throughout the lifetime of the XmlRpcServer.

TL;DR

  • Remove 8MB of memory for each socket created
  • Reduce available file descriptor polling to minimum necessary
    • roughly a 1000 times faster
    • old duration: 34'571'539 ns
    • new duration: 34'464 ns

Copy link
Contributor

@peci1 peci1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks interesting.

For the PR to proceed, you should revert the whitespace-only changes and force-push a new commit without them.

How did you measure the time? What is the influence on a real running system?

@Hemofektik
Copy link
Author

Hemofektik commented Feb 13, 2024

For the PR to proceed, you should revert the whitespace-only changes and force-push a new commit without them.

I will do that thanks.

How did you measure the time? What is the influence on a real running system?

I saw this issue actually using nsight profiler when one of our nodes tried to create around 50 subscribers/publishers which takes forever. I then checked this code piece individually and measured its duration. 35 ms is not much for a single pub/sub but it can add up. So this change could save us 1-2 seconds in some nodes of startup time. This is probably also visible when using nodelets heavily.

@peci1
Copy link
Contributor

peci1 commented Feb 13, 2024

Interesting. Would it be possible to write a simple script (pair of scripts?) that would demonstrate this?

@Hemofektik
Copy link
Author

Interesting. Would it be possible to write a simple script (pair of scripts?) that would demonstrate this?

You mean as part of the tests? I could try to add a test that is calling this code and measures it.
In my particular case I copied the enoughFreeFDs() method to my test program to measure it individually before and after.

@peci1
Copy link
Contributor

peci1 commented Feb 13, 2024

Probably not as part of unit tests. The buildfarm performance varies too wildly to perform any kind of performance testing that would be able to measure such differences.

I meant something we could use manually when reviewing the PR.

@Hemofektik
Copy link
Author

I meant something we could use manually when reviewing the PR.

Here is a hybrid of the original code and the final code in the PR I used for validating this.
fd_test.zip
You can replace the content of enoughFreeFDs() with the original parts from the ros_comm to see its original performance. You need to change the initFDs() function then also to resize the vector to the original 8 MB size to capture the 1<<20 FDs.

@@ -125,10 +125,6 @@ namespace XmlRpc {

// Minimum number of free file descriptors before rejecting clients.
static const int FREE_FD_BUFFER;
#ifndef _WINDOWS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, and you definitely can't remove anything from header files. That would change API and ABI which is forbidden for released ROS versions.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Copy link
Contributor

@peci1 peci1 Feb 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I think you should drop the local poll_fds variable and use the one from the header instead. It is a protected variable, so there might be downstream code expecting to find some FDs in it. Would that make sense?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh you are right, it is a protected member. Will fix it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no not yet

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now it should be correct

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But actually right now it is not used to hold available file descriptors as this can also change over time

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although this member is protected and could in theory be used in successor classes it would not make any sense. These FDs would not be usable in any way. There is no guarantee that they are still available later. Their only purpose was to count the number of available FDs at the time of creation.
This code piece was very questionable from the beginning. The last change to it was supposed to do exactly what it does now with this PR but actually missed the point.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, the possibility of using this member is so low that I don't think it is worth it doing anything else...

@Hemofektik Hemofektik force-pushed the noetic-devel branch 3 times, most recently from f09e5d0 to b439358 Compare February 13, 2024 18:24
 - Remove 8MB of memory for each socket created
 - Reduce available file descriptor polling to minimum necessary
   - roughly a 1000 times faster
   - old duration: 34'571'539 ns
   - new duration: 34'464 ns
@peci1
Copy link
Contributor

peci1 commented Feb 14, 2024

I've tested the file you sent and indeed the proposed change makes it run much faster with higher ulimit -n values.

Relevant issues and PRs mentioned that most systems have this limit set to 1024, so the speedup would not be noticeable. But in docker, the limit is 1048576 by default and there it starts to be visible.

I'm still thinking about how to create a real-world (but self-contained and minimal) ROS testcase that would showcase the problem using roscpp. Would you be able to adjust the old example from https://github.com/peci1/ros_infinite_loop/tree/docker to demonstrate the problem?

@Hemofektik
Copy link
Author

Would you be able to adjust the old example from https://github.com/peci1/ros_infinite_loop/tree/docker to demonstrate the problem?

Sure, but it would take some days until I have some spare time available.

@peci1
Copy link
Contributor

peci1 commented Feb 14, 2024

I'm nevertheless not a maintainer of this repo, so don't take my comments as something required. I just try to help getting this PR in a form that would be easy to grasp for maintainers so that it could eventually get merged. There's not much maintainership effort left for ROS 1, so the official maintainers need this kind of community help.

@peci1
Copy link
Contributor

peci1 commented Feb 14, 2024

I've tried running our teaching autonomous exploration pipeline with Trutlebot in Gazebo with this PR. Everything works as I'm used to. So at least practically, it seems be okay!

rosgraph

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants