Hi,
The TF of the camera is wrong: it seems the camera is looking down. Look how the ground in the map is not aligned with the ground plane:
The walls look thick because the ZED depth estimation interpolates a lot in areas of the image where there is no texture. Look how the wall on the left and the ceiling are deformed:
You may try having rtabmap subscribe to the left and right images instead (stereo mode), so that depth estimation is done with cv::StereoBM, which doesn't interpolate textureless areas of the image (it only gives depth on edges). There may also be some parameters on the ZED camera driver that could be tuned to make it interpolate less in textureless areas.
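To illustrate why a block matcher leaves textureless walls empty instead of filling them in, here is a toy sketch in plain Python. It is not the cv::StereoBM implementation, just a 1-D SAD matcher on a single scanline with a texture check similar in spirit to StereoBM's textureThreshold parameter. The function name, the pixel values and the thresholds are all made up for the example.

```python
def match_scanline(left, right, max_disp=8, half_win=2, texture_threshold=10):
    """Toy 1-D SAD block matcher (hypothetical helper, not OpenCV code).

    Returns one disparity per pixel, or None where the pixel is rejected.
    """
    n = len(left)
    out = [None] * n
    for x in range(half_win + max_disp, n - half_win):
        window = left[x - half_win : x + half_win + 1]
        # Texture measure: total absolute horizontal gradient inside the window.
        texture = sum(abs(window[i + 1] - window[i]) for i in range(len(window) - 1))
        if texture < texture_threshold:
            continue  # textureless: leave the pixel invalid instead of guessing
        costs = []
        for d in range(max_disp + 1):
            cand = right[x - d - half_win : x - d + half_win + 1]
            costs.append(sum(abs(a - b) for a, b in zip(window, cand)))
        # Keep the disparity with the lowest sum of absolute differences.
        out[x] = min(range(len(costs)), key=costs.__getitem__)
    return out


# Made-up scanline: flat wall, a textured patch, then flat wall again.
left = [100] * 12 + [10, 200, 50, 180, 30, 220] + [100] * 12
true_disp = 3
# In the right image the same content appears shifted by the disparity.
right = left[true_disp:] + [100] * true_disp

disp = match_scanline(left, right)
print(disp)
```

Pixels on the textured patch and on its borders get a disparity, while pixels in the middle of the flat wall stay None. That is the behaviour you want here: fewer points on the walls, but no invented depth.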
cheers,
Mathieu