Hi,
The TF between the base and camera frames doesn't need to be fixed, but it should be known (and very precise) so that images can be correctly transformed into the base frame for visual odometry.
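To illustrate why the base-to-camera TF has to be accurate, here is a minimal sketch of how a motion measured in the camera frame gets re-expressed in the base frame. The mount values and the function name `camera_motion_to_base` are made up for the example, and this is not the actual odometry code, just the underlying math with 4x4 homogeneous transforms:

```python
import numpy as np

def camera_motion_to_base(T_base_cam, T_cam_motion):
    """Re-express a motion measured in the camera frame as a motion of the base.

    T_base_cam   : 4x4 pose of the camera in the base frame (the TF).
    T_cam_motion : 4x4 motion of the camera between two images, in camera frame.
    """
    return T_base_cam @ T_cam_motion @ np.linalg.inv(T_base_cam)

# Hypothetical mount: camera optical frame (z forward, x right, y down)
# placed 0.2 m ahead of and 0.3 m above the base (x forward, y left, z up).
T_base_cam = np.eye(4)
T_base_cam[:3, :3] = [[0, 0, 1],
                      [-1, 0, 0],
                      [0, -1, 0]]
T_base_cam[:3, 3] = [0.2, 0.0, 0.3]

# The camera sees itself move 0.1 m along its optical axis (camera z).
T_cam_motion = np.eye(4)
T_cam_motion[:3, 3] = [0.0, 0.0, 0.1]

T_base_motion = camera_motion_to_base(T_base_cam, T_cam_motion)
print(np.round(T_base_motion[:3, 3], 3))  # translation of the base, in base frame
```

With this mount, the 0.1 m motion along the camera's optical axis comes out as roughly 0.1 m along the base x axis. Any error in `T_base_cam` (in particular in its rotation) goes straight into the odometry expressed in the base frame.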
The maximum robot speed depends on the frame rate at which odometry can be computed. If odometry can be computed at 30 Hz, you can move fast. If it can only be computed at 3-5 Hz, you may need to move slowly (in particular, rotate slowly). There are also some other conditions that can affect odometry; see the "Lost Odometry" section on this
page.
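As a rough back-of-the-envelope check, you can compute how much the robot moves between two consecutive odometry updates. The speeds below and the helper name `motion_per_frame` are only illustrative assumptions, not limits taken from the odometry node:

```python
def motion_per_frame(linear_speed_mps, angular_speed_dps, odom_rate_hz):
    """Return (meters, degrees) travelled between two odometry updates."""
    return linear_speed_mps / odom_rate_hz, angular_speed_dps / odom_rate_hz

# Same hypothetical robot speed (0.5 m/s, 60 deg/s) at two odometry rates.
for rate_hz in (30.0, 3.0):
    dist_m, rot_deg = motion_per_frame(0.5, 60.0, rate_hz)
    print(f"{rate_hz:.0f} Hz: {dist_m:.3f} m and {rot_deg:.1f} deg between frames")
```

At 30 Hz the view changes by about 2 degrees between frames, while at 3 Hz it changes by about 20 degrees, which leaves much less overlap between consecutive images to match features on. That is why rotation speed matters most at low rates.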
You can also look at the robot_localization package for sensor fusion.
cheers