There are a lot of really good monocular SLAM systems coming out nowadays. Is there any plan to make RTAB-Map work with a single camera?
Example of outdoor SLAM with a drone: https://www.skydio.com/developer/
Hi jacksonkr,
RTAB-Map is focused on real-time mapping for autonomous applications, for which the scale of the environment should always be known while mapping. This is very difficult with a single camera: without an additional sensor providing the scale, there will always be scale drift. It is however possible to feed RTAB-Map with VIO (visual-inertial odometry) to recover the scale with a single camera, though only the trajectory will be estimated in real time, not a 3D map. To get a 3D map online, at least an RGB-D camera, a stereo camera or a lidar is required.

RTAB-Map could work on a setup more like their configuration using 6 stereo cameras: https://www.skydio.com/videos/#Gh5pAT1o2V8

Skydio has indeed an impressive drone. If they opened their SDK to work under ROS, it would be a very interesting platform to develop on from a research point of view! Interesting link: https://www.reddit.com/r/robotics/comments/83gdqp/im_the_cofounder_and_ceo_of_skydio_we_make_r1_a/

cheers,
Mathieu
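To illustrate the scale ambiguity Mathieu describes, here is a minimal sketch (hypothetical pinhole intrinsics and points, not RTAB-Map code): scaling the whole scene and the camera baseline by the same factor s leaves every pixel projection unchanged, which is why images alone cannot recover the metric scale.

```python
# Sketch of monocular scale ambiguity: scene and baseline scaled
# together produce identical projections (assumed numbers throughout).
import numpy as np

K = np.array([[525.0, 0.0, 320.0],   # hypothetical pinhole intrinsics
              [0.0, 525.0, 240.0],
              [0.0, 0.0, 1.0]])

def project(points, t):
    """Project 3D points seen from a camera translated by t (identity rotation)."""
    p_cam = points - t                # world -> camera frame
    uv = (K @ p_cam.T).T
    return uv[:, :2] / uv[:, 2:3]     # perspective division

points = np.array([[0.5, 0.2, 4.0], [-0.3, 0.1, 6.0], [0.8, -0.4, 5.0]])
t = np.array([0.1, 0.0, 0.0])         # camera baseline between two frames

for s in (1.0, 2.0, 10.0):
    # Scaling scene and baseline by s cancels out in the projection,
    # so every s prints exactly the same pixel coordinates.
    print(s, project(s * points, s * t))
```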
Hi matlabbe,
Could you please tell us a little more about the use of visual-inertial odometry? In the case of a monocular camera, would it be inertial odometry with rotation-matrix corrections from the camera? Why is it not possible to obtain a 3D map in real time? If a map can be obtained offline, is it possible to localize in it later using a monocular camera? How good an idea is it to use a monocular camera together with an IMU on a quadcopter?
Hi,
Disclaimer: I am not an expert in implementing VIO algorithms; I am mainly a user of them. From my understanding of how it works, a tightly coupled VIO uses the IMU as the main source of pose estimation and adds the tracked visual features to the optimization. To know the depth of the features: if you know approximately how much you moved between two keyframes (from IMU integration for the scale, combined with epipolar geometry), it is possible to roughly triangulate the same feature seen by the two frames. As more keyframes are added (in a local bundle adjustment style), the 3D position estimate of each feature becomes more and more accurate. I recommend finding a popular VIO library and reading the papers it references.

About a real-time 3D map: it is possible, but the results would not be as accurate as with a stereo camera or an RGB-D camera. You may search for terms like "real-time dense stereo disparity from motion". For the offline case, photogrammetry software can do it pretty well, because for the same pixel you can have more than two cameras looking at it, so the triangulation can be relatively accurate.

About localizing afterwards: probably. In RTAB-Map, it is possible to feed the VIO pose and the 3D tracked features, and a map containing only features will be created (no dense 3D map), in which we can localize with a single camera afterwards. The best example is the RTAB-Map iOS app on an iPhone/iPad without lidar (or with the lidar turned off in the mapping settings): it is basically doing VIO SLAM (loop closures are detected and the graph is updated), though without the dense 3D reconstruction.

About a monocular camera with an IMU on a quadcopter: I think it can be pretty good, at least for pose estimation. I am not sure about obstacle avoidance though (unless it is combined with another camera for stereo, or a ToF camera).
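To make the triangulation step concrete, here is a minimal sketch under assumed values (made-up pinhole intrinsics, a 0.2 m baseline standing in for what IMU integration would provide, and plain linear DLT triangulation rather than any particular VIO library's implementation):

```python
# Two-view triangulation sketch: with a metric baseline between two
# keyframes, the same tracked feature can be recovered in metric units.
import numpy as np

K = np.array([[525.0, 0.0, 320.0],   # hypothetical pinhole intrinsics
              [0.0, 525.0, 240.0],
              [0.0, 0.0, 1.0]])

def triangulate(uv1, uv2, P1, P2):
    """Linear (DLT) triangulation of one feature from two projections."""
    A = np.vstack([
        uv1[0] * P1[2] - P1[0],
        uv1[1] * P1[2] - P1[1],
        uv2[0] * P2[2] - P2[0],
        uv2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]               # dehomogenize

# Keyframe 1 at the origin; keyframe 2 translated 0.2 m along x
# (the metric baseline is what the IMU integration provides).
t = np.array([0.2, 0.0, 0.0])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), -t.reshape(3, 1)])

X_true = np.array([0.4, -0.1, 5.0])  # hypothetical 3D feature
uv1 = (K @ X_true)[:2] / X_true[2]   # pixel in keyframe 1
p2 = X_true - t
uv2 = (K @ p2)[:2] / p2[2]           # pixel in keyframe 2
print(triangulate(uv1, uv2, P1, P2)) # ~ [0.4, -0.1, 5.0]
```

With exact, noise-free inputs the recovered point matches the true one; in a real VIO pipeline the estimate is refined as more keyframes observe the same feature in the local bundle adjustment.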