stage_ros crashes #66
Comments
same problem, any feedback? |
Hi, I'm using this fork https://github.com/CodeFinder2/Stage and crashes are gone. But it's not maintained anymore |
I have a very-refactored version with much better performance and no known bugs at my GitHub. It has been used in my lab for a few months now and will be the upstream release soon. Check it out.
On Aug 21, 2018, at 8:22 AM, Jorge Santos Simón wrote:
Hi, I'm using this fork https://github.com/CodeFinder2/Stage and crashes are gone.
I have also noticed that @rakeshshrestha31 (https://github.com/rakeshshrestha31) has some fixes in his fork.
It would be really nice if they could PR the fixes upstream.
|
https://github.com/rtv/stage_ros
It’s very much more efficient and works properly with a simulation clock. The original version was poorly designed.
|
The fixes in my fork deal with problems that arise when the Stage world pointer has to be deleted and then reallocated. I needed this for my project (which doesn't use stage_ros). stage_ros does not release this memory, so I doubt that my changes will affect this issue. |
Thanks for the information; yes, I also noticed that @rtv's fork is well ahead of upstream. Looking forward to the next release! |
@corot @rtv |
Hi, are you sure you are compiling stage_ros against the forked Stage? Catkin will always prefer the version installed on /opt/ros, so you must make it take the new version, e.g.:
|
Yes. I removed the stage in /opt/ros/kinetic/ when I compiled it from the forked Stage. |
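One way to double-check which Stage library a built binary actually loads is to inspect its dynamic dependencies with `ldd`. The following is only a sketch, assuming a Linux system, that the shared library name starts with `libstage`, and Python 3.7 or newer; the function names are made up for illustration and are not part of Stage or stage_ros.

```python
import subprocess


def stage_libraries(ldd_output):
    """Return what each libstage entry in `ldd` output resolves to."""
    resolved = []
    for line in ldd_output.splitlines():
        # Typical ldd line: "libfoo.so.1 => /some/dir/libfoo.so.1 (0x...)"
        name, sep, rest = line.strip().partition(" => ")
        if sep and name.startswith("libstage"):
            # Drop the trailing load address; "not found" is kept as-is.
            resolved.append(rest.split(" (")[0].strip())
    return resolved


def linked_stage(binary_path):
    """Run ldd on a binary and list the Stage libraries it resolves to."""
    result = subprocess.run(
        ["ldd", binary_path], check=True, capture_output=True, text=True
    )
    return stage_libraries(result.stdout)
```

Calling `linked_stage()` with the path to the built stage_ros executable should show whether the library comes from /opt/ros or from the directory where the fork was installed.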
Interesting... actually, my first workaround for the crashes was to increase the priority of the stage process, so the crash seems tightly related to system resources.
Do the crashes become less common when using the fork?
On Thu, Aug 23, 2018 at 2:38 PM LEI TAI wrote:
Yes. I removed the stage in /opt/ros/kinetic/ when I compiled it from the
forked Stage.
It seems that if I only run one stage, it is fine.
But when I try to run two different stage worlds in two roscores (with
different ports) in one machine, it will crash after several hours.
|
Nothing changed. It still happens frequently. The longest run so far lasted 7 hours. |
It seems that the more frequently I call positionmodel->setpose(), the more frequently it happens. |
That's interesting. I've never come across this problem and I've used my fork (and also the upstream) for long running experiments. |
@rakeshshrestha31 @corot @rtv Thanks for the information. Further tests showed that it is fine to call the reset_positions service repeatedly without sending any motion commands to the robots. But if I let the robots move for several steps, then call "reset_positions", and repeat this whole process, the bug soon appears in the service callback function. I tested it with https://github.com/rtv/stage_ros and 30 robots (positionmodels) in one stage world, in both GUI and headless modes.
Sometimes it is that:
and
This is the raytrace error when I use rtv/Stage.
|
Thanks for the detailed logs. I have also used ROS services to teleport the robot to a specific pose, but I haven't seen these issues. I see that an AVX optimization caused a memory access error in one of the traces. Maybe try disabling optimization altogether by using the -O0 option instead of -O2 in the CMakeLists.txt file. Multiple AVX-related issues have been documented for different packages, and I've had such trouble myself. (Side note: the debug mode seems to use an optimization option implicitly, so you may need to set the optimization option for debug mode explicitly. Also try clearing the build files before rebuilding.) The other failure cases might stem from the same root cause (AVX optimization). Just a guess, though. As for the specifics of the Stage library (like the region count), I guess @rtv would be able to answer better. |
Thanks. Further tests showed that it is fine to call the reset_positions service repeatedly without sending any motion commands to the robots. But if I let the robots move for several steps, then call "reset_positions", and repeat this whole process, the bug soon appears in the service callback function. I tested it with https://github.com/rtv/stage_ros and 30 robots (positionmodels) in one stage world, in both GUI and headless modes. |
Further test: after several actions, I sleep for a few seconds before resetting the positions of the robots. |
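The reproduction procedure described in this thread (drive the robots for a few steps, optionally pause, call reset_positions, repeat) could be scripted roughly as follows. This is only a sketch: the `robot_<i>/cmd_vel` topic layout, the `std_srvs/Empty` service type, the node name, and all timing values are assumptions, and the helper names are made up for illustration.

```python
import time


def stress_cycle(send_velocity, reset_positions, cycles=100,
                 steps_per_cycle=20, step_period=0.1,
                 settle_seconds=0.0, sleep=time.sleep):
    """Drive for a few steps, optionally pause, reset, and repeat.

    Returns the number of completed cycles. Exceptions from the
    callables (for example a failed service call after the simulator
    dies) are not caught, so they end the loop.
    """
    completed = 0
    for _ in range(cycles):
        for _ in range(steps_per_cycle):
            send_velocity()
            sleep(step_period)
        if settle_seconds > 0:
            # Optional pause before the reset, as in the later test.
            sleep(settle_seconds)
        reset_positions()
        completed += 1
    return completed


def run_with_ros(num_robots=30):
    """Wire the loop to a running stage_ros node (requires ROS)."""
    # Imported lazily so stress_cycle can be exercised without ROS.
    import rospy
    from geometry_msgs.msg import Twist
    from std_srvs.srv import Empty

    rospy.init_node("stage_reset_stress")  # hypothetical node name
    # Assumed topic layout for a multi-robot world.
    pubs = [rospy.Publisher("robot_%d/cmd_vel" % i, Twist, queue_size=1)
            for i in range(num_robots)]
    rospy.wait_for_service("reset_positions")
    reset = rospy.ServiceProxy("reset_positions", Empty)

    cmd = Twist()
    cmd.linear.x = 0.5  # arbitrary forward speed

    def send_velocity():
        for pub in pubs:
            pub.publish(cmd)

    return stress_cycle(send_velocity, reset, sleep=rospy.sleep)
```

Keeping the loop independent of ROS makes it easy to vary `steps_per_cycle` and `settle_seconds` and compare how long each configuration runs before a crash.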
Hi, I have the same problem you described above. I simulate 4 robots, publish to /cmd_vel, and call reset_position from the command line. It crashes at an unpredictable moment (and sooner when the simulation is sped up). Do you have a solution yet? |
I experience a variety of crashes while running Stage within the stage_ros node. I put some of them here. I tried to compile the code in debug mode to provide better traces, but then stage_ros crashes at startup in the libGLU.so library, so the stack traces are not very meaningful, sorry. My wild guess is that it's all about memory management: one crash (I didn't record the backtrace) mentioned "doubly freed memory", and failures also tend to increase when my PC has been working for a while (and so RAM gets low). I'll try to provide more information, but meanwhile... did anyone experience similar problems?
Thanks!
EXAMPLE CRASHES
This is the most common:
I saw this one sometimes:
And this only once: