Mysterious error regarding tf timeout

derektan1995
Hi Mathieu,

Hope you are doing well. I am currently facing a weird error regarding tf timeout, as shown in the picture below. I never had any problems with this before, but it started appearing when I ran RTABMap in localization mode. The issue went away when I ran localization mode with another .db file. Unfortunately, I have no clue why changing the .db file would affect this. Furthermore, in mapping mode, things were fine for the first minute or so before I encountered the issue again. Another observation: the divergence between the timestamp of the incoming tf message and the current system time increases over time, up to tens of seconds.

More context: from an external mini PC, I am retrieving the odom message and the odom->base_link tf from my robot's internal PC. I made sure both computers are synced using Chrony, and this had been working well for the past few months. A few months back, I was configuring RTABMap to work with robots individually (not namespaced), and all was good. In recent weeks, as I namespaced my robots' topics and nodes to perform multi-robot mapping (as discussed in this post), I began having this problem. I confirmed that the tf tree is connected using rosrun tf view_frames. I highly doubt I made any mistakes in the namespacing process.

What stumped me most was that changing the .db file averted the problem altogether. I'm tempted to set 'wait_for_transform' to a high value, and maybe set 'odom_sensor_sync' to true, though I don't think that addresses the crux of the issue. Do you have any insight into this?

Regards,
Derek Tan

Re: Mysterious error regarding tf timeout

matlabbe
Administrator
Hi,

This warning appears when rtabmap looks up the odom->base_link TF at the stamp of the received odometry topic, and TF either has not yet been updated to that time or the stamp is too old (which seems to be the case here). For example, if TF on the external PC is currently updating at time 30 sec but it receives an odometry topic stamped at 20 sec, which is 10 sec in the past, we get this error. I think the default TF buffer size is 5 seconds, so the old TFs have already been flushed before the topic is processed. I don't think it is a database issue, but rather a time synchronization issue. You can try to echo your odometry topic on the remote PC and check the logs at the same time to see if there is a large difference.
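
To make the 30 sec / 20 sec example concrete, here is a minimal Python sketch. It is not the actual tf or rtabmap code: the function name is hypothetical, and the 5-second buffer length is only the guess mentioned above, passed in as an assumption.

```python
def tf_lookup_status(latest_tf_stamp, msg_stamp, buffer_length=5.0):
    """Classify a TF lookup at msg_stamp (all values in seconds).

    latest_tf_stamp: stamp of the newest transform held in the buffer.
    buffer_length:   how far back the buffer keeps transforms
                     (5.0 is an assumption, not a verified default).
    """
    oldest_kept = latest_tf_stamp - buffer_length
    if msg_stamp > latest_tf_stamp:
        # TF has not caught up to the message yet; waiting can help here.
        return "not yet available"
    if msg_stamp < oldest_kept:
        # The transform was already flushed; waiting longer cannot help.
        return "too old"
    return "ok"


# TF is at 30 sec, the odometry message is stamped at 20 sec.
print(tf_lookup_status(30.0, 20.0))
```

In the "too old" case, a longer wait for the transform would not change the outcome, which is why the delay itself has to be tracked down.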

cheers,
Mathieu

Re: Mysterious error regarding tf timeout

derektan1995
Hi Mathieu,

Thanks for your insights. Just to clarify, do you mean that the timestamp of the odom->base_link tf must be exactly synchronized with that of the odometry message itself? I followed this tutorial to publish both the tf and the odom topic message at the exact same rate from the robot's internal computer; both are then subscribed to by the rtabmap node on the external computer.

http://wiki.ros.org/navigation/Tutorials/RobotSetup/Odom

On a side note, I made sure that both the internal and external computers are synced using Chrony.

Very interestingly, I isolated the issue to the point cloud input into rtabmap. If I significantly downsample my point cloud (Ouster OS1-32) using passthrough and voxel filters, the error does not occur. I only face it when I input the point cloud with minimal filtering (many points). I find this very weird. I definitely did not face this problem when using the Velodyne VLP16.

Thanks,
Derek

Re: Mysterious error regarding tf timeout

matlabbe
Administrator
Hi,

Are you using PTP time sync with the Ouster? By default, Ouster point clouds are stamped starting from 0 sec instead of with the computer's time. This could create a problem if odometry is stamped with this time while the other TFs on your computer are published with the computer's time.

Yes, the odom TF and the odometry topic should be stamped with the same time.
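
A quick way to spot a mismatched time base is to compare a message stamp against the clock of the computer receiving it. Below is a minimal Python sketch of that comparison. The function names and the 1-second tolerance are hypothetical, and in a real ROS node the two inputs would come from the message header stamp and the node's current time.

```python
def stamp_offset(msg_stamp_sec, now_sec):
    """Seconds by which the message stamp lags the local clock."""
    return now_sec - msg_stamp_sec


def same_time_base(msg_stamp_sec, now_sec, tolerance_sec=1.0):
    """True if the stamp is within tolerance of the local clock.

    tolerance_sec is an arbitrary illustrative threshold, not a value
    taken from rtabmap or tf.
    """
    return abs(stamp_offset(msg_stamp_sec, now_sec)) <= tolerance_sec


now = 1_650_000_000.0  # example wall-clock time in seconds since epoch

# A sensor counting from 0 sec at power-up is nowhere near wall-clock time.
print(same_time_base(12.5, now))

# A stamp taken 50 ms ago on a synchronized clock passes the check.
print(same_time_base(now - 0.05, now))
```

An offset that keeps growing over time, rather than staying constant, points more toward processing delay than toward a clock mismatch.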

cheers,
Mathieu

Re: Mysterious error regarding tf timeout

derektan1995
Hi

Yup, I just checked that the Ouster lidar is PTP time synced. I played around with some rtabmap parameters and realised that I had set queue_size to 100, even though my camera (30 Hz), lidar (10 Hz), and odom (15 Hz) frequencies are all rather low. Setting queue_size to around 10 resolves the issue.

Once this problem was resolved, I quickly noticed another issue. In SLAM mode, after perhaps 200 s of the robot moving around at 0.5 m/s, mapping slows down significantly (the update time goes from ~0.03 s to 4 s). This is despite the rtabmap update rate being low (around 0.5 s). From some of the forum posts, I saw that this might be a time sync issue. It's weird because the problem only blows out of proportion once the robot has moved for 200 s. Being stationary is fine, presumably because no mapping is done when stationary. Setting parameters like odom_sensor_sync freezes rtabmap. I wonder if you have any insights? Thanks!
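
For anyone hitting the same thing, here is a rough back-of-the-envelope sketch in Python of why the queue size matters. It only computes how old the oldest message in a full queue can be at a given publish rate; it assumes the queue actually fills up and is drained oldest-first, and it is not a description of rtabmap's internal synchronization. The function name is hypothetical.

```python
def queue_backlog_seconds(queue_size, rate_hz):
    """Age in seconds of the oldest message in a full queue."""
    return queue_size / rate_hz


# Publish rates from my setup, in Hz.
rates = {"camera": 30.0, "lidar": 10.0, "odom": 15.0}

for queue_size in (100, 10):
    for name, hz in rates.items():
        backlog = queue_backlog_seconds(queue_size, hz)
        print(f"queue_size={queue_size:3d} {name:6s} up to {backlog:.1f} s old")
```

With queue_size at 100, a full lidar queue can hold data up to 10 s old, which is longer than the 5-second TF buffer guessed earlier in this thread. With queue_size at 10, the worst case drops to about 1 s.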




Re: Mysterious error regarding tf timeout

matlabbe
Administrator
Hi,

I can't see the screenshot well (the resolution is too low), but if mapping is taking 4 sec, there is a problem. Which computer are you using? Also, depending on the configuration, you may have to downsample the Ouster point cloud. There is probably some tuning to do to use less computing power. You can take a look at the velodyne or ouster examples (you can skip the icp_odometry part if you don't use it):
https://github.com/introlab/rtabmap_ros/blob/master/launch/tests/test_velodyne.launch
https://github.com/introlab/rtabmap_ros/blob/master/launch/tests/test_ouster_gen2.launch

cheers,
Mathieu

Re: Mysterious error regarding tf timeout

derektan1995
Hi Mathieu,

The links you provided were really helpful. I adopted some of your recommended parameters that work well with the Ouster lidar, namely reducing Grid/RangeMax and Grid/CellSize. Map update timing improved in general. I believe we are getting close to the heart of the issue.

However, I still notice that rtabmap sometimes gets stuck on specific iterations. As you can see from the image below, the map update can spike to 7 s. I noticed this happens whenever the global costmap from the navigation stack automatically resizes, which is an important feature for autonomous exploration (see the pane above rtabmap's pane, where the cursor is). I believe the global costmap size is linked to rtabmap via the grid_size rtabmap parameter. How is rtabmap internally linked to the ROS navigation stack's global costmap? Is there a way for rtabmap to be independent of the global costmap size?

Some more background on my use case: I am using an Intel NUC as an external PC connected to my robot. It runs all nodes, from perception to rtabmap to the ROS navigation stack. I'm using it for autonomous navigation, as described in your guide here. I am using the Ouster lidar to build the map, a Realsense camera to perform global loop closures, and odometry coming from the robot itself. Thanks.

Re: Mysterious error regarding tf timeout

derektan1995
Hmm, I seem to have stopped the costmap resizing issue by setting grid_size to a large number (150). The intermittent 4-7 s map update issue still persists. I will continue to play around with the parameters.