.. _rig-support:

Rig Support
===========

COLMAP has native support for modeling sensor rigs during the reconstruction
process. The sensors in a rig are assumed to have fixed relative poses between
each other, with one reference sensor defining the origin of the rig. A frame
defines a specific instance of the rig with all or a subset of its sensors
exposed at the same time. For example, in a stereo camera rig, one camera would
be defined as the reference sensor and have an identity ``sensor_from_rig``
pose, whereas the second camera would be posed relative to the reference
camera. Each frame would then usually be composed of two images, one measured
by each of the two cameras at the same time.

Workflow
--------

By default, when running the standard reconstruction pipeline, each camera is
modeled as a separate rig and thus each frame contains only a single image. To
model rigs, the recommended workflow is to organize the images by rigs and
cameras in a folder structure as follows (ensure that images corresponding to
the same frame have identical filenames across all folders)::

    rig1/
        camera1/
            image0001.jpg
            image0002.jpg
            ...
        camera2/
            image0001.jpg  # same frame as camera1/image0001.jpg
            image0002.jpg  # same frame as camera1/image0002.jpg
            ...
        ...
    rig2/
        camera1/
            ...
        ...
    ...

As a next step, we extract features using::

    colmap feature_extractor \
        --image_path $DATASET_PATH/images \
        --database_path $DATASET_PATH/database.db \
        --ImageReader.single_camera_per_folder 1

At this point, the resulting database contains a separate rig for each camera
and a separate frame for each image. We must therefore adjust the relationships
in the database to match the desired rig configuration. This is done using::

    colmap rig_configurator \
        --database_path $DATASET_PATH/database.db \
        --rig_config_path $DATASET_PATH/rig_config.json

where the ``rig_config.json`` could look as follows, if the relative sensor
poses in the rig are known a priori::

    [
      {
        "cameras": [
          {
            "image_prefix": "rig1/camera1/",
            "ref_sensor": true
          },
          {
            "image_prefix": "rig1/camera2/",
            "cam_from_rig_rotation": [
              0.7071067811865475,
              0.0,
              0.7071067811865476,
              0.0
            ],
            "cam_from_rig_translation": [
              0,
              0,
              0
            ]
          }
        ]
      },
      {
        "cameras": [
          {
            "image_prefix": "rig2/camera1/",
            "ref_sensor": true
          },
          ...
        ]
      },
      ...
    ]

Notice that this modifies the rig and frame configuration in the database,
which contains the full specification of rigs and frames that is later fed as
input to the downstream processing steps. With known, calibrated camera
parameters, each camera can optionally also specify the ``camera_model_name``
and ``camera_params`` fields. (A small script that generates such a
configuration file from the folder layout is sketched at the end of this
section.) For more fine-grained configuration of rigs and frames, the most
convenient option is to configure the database programmatically using
pycolmap, either through the ``apply_rig_config`` function or, for the most
flexibility, by individually adding the desired rig and frame objects to the
reconstruction.

Next, we run standard feature matching. Note that it is important to configure
the rigs before sequential feature matching, as images in consecutive frames
will be automatically matched against each other.

Finally, we can reconstruct the scene using the standard ``mapper`` command,
with the option of keeping the relative poses in the rig fixed using
``--Mapper.ba_refine_sensor_from_rig 0``.
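
For rigs with many cameras, writing the ``rig_config.json`` by hand can be
tedious. The following is a minimal sketch (not part of COLMAP) that generates
a configuration skeleton from the folder layout shown above; the placeholder
image path and the choice of the first camera folder as the reference sensor
are assumptions, and any known ``cam_from_rig_*`` poses or camera parameters
would still have to be filled in manually::

    import json
    from pathlib import Path

    # Placeholder path to the images organized as rig*/camera*/ shown above.
    images_root = Path("path/to/images")

    configs = []
    for rig_dir in sorted(p for p in images_root.iterdir() if p.is_dir()):
        cameras = []
        for idx, cam_dir in enumerate(
            sorted(p for p in rig_dir.iterdir() if p.is_dir())
        ):
            camera = {"image_prefix": f"{rig_dir.name}/{cam_dir.name}/"}
            if idx == 0:
                # Arbitrarily pick the first camera folder as the reference sensor.
                camera["ref_sensor"] = True
            # Known "cam_from_rig_rotation" / "cam_from_rig_translation" values
            # and optional "camera_model_name" / "camera_params" go here.
            cameras.append(camera)
        configs.append({"cameras": cameras})

    with open("rig_config.json", "w") as f:
        json.dump(configs, f, indent=2)
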
Unknown rig sensor poses
------------------------

If the relative poses of the sensors in the rig are not known a priori and we
only know that a specific set of sensors is rigidly mounted and exposed at the
same time, one can attempt the following two-step reconstruction approach.

Before starting, make sure to organize your images as detailed above and
perform feature extraction with the
``--ImageReader.single_camera_per_folder 1`` option.

Next, reconstruct the scene without rig constraints by modeling each camera as
its own rig (the default behavior of COLMAP without further configuration).
Note that this can be a partial reconstruction from a subset of the full set of
input images. The only requirement is that each camera has at least one
registered image in the same frame as a registered image of the reference
camera. If the reconstruction was successful and the relative poses between
registered images look roughly correct, we can proceed with the next step.

The ``rig_configurator`` can also work without ``cam_from_rig_*``
transformations. Given an existing (partial) reconstruction of the scene, it
computes the average relative rig sensor poses from all registered images (see
the sketch at the end of this section for the intuition behind this
averaging)::

    colmap rig_configurator \
        --database_path $DATASET_PATH/database.db \
        --input_path $DATASET_PATH/sparse-model-without-rigs-and-frames \
        --rig_config_path $DATASET_PATH/rig_config.json \
        [ --output_path $DATASET_PATH/sparse-model-with-rigs-and-frames ]

The provided ``rig_config.json`` must simply omit the respective
``cam_from_rig_rotation`` and ``cam_from_rig_translation`` fields.

Now, we can either run rig bundle adjustment on the (optional) output
reconstruction with configured rigs and frames::

    colmap bundle_adjuster \
        --input_path $DATASET_PATH/sparse-model-with-rigs-and-frames \
        --output_path $DATASET_PATH/bundled-sparse-model-with-rigs-and-frames

or alternatively start the reconstruction process from scratch with rig
constraints, which may lead to more accurate reconstruction results::

    colmap mapper \
        --image_path $DATASET_PATH/images \
        --database_path $DATASET_PATH/database.db \
        --output_path $DATASET_PATH/sparse-model-with-rigs-and-frames
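
To give an intuition for the averaging step, the sketch below shows one way to
estimate an average relative pose from the per-frame poses of a camera and the
reference camera in a reconstruction built without rig constraints. This is an
illustration only and not necessarily the exact scheme implemented by the
``rig_configurator``; the function and its inputs (one 4x4 world-to-camera
matrix per frame for each of the two cameras) are hypothetical::

    import numpy as np
    from scipy.spatial.transform import Rotation as R

    def average_cam_from_ref(cam_from_world_list, ref_from_world_list):
        """Average per-frame relative poses, given 4x4 world-to-camera matrices."""
        quats_xyzw, translations = [], []
        for cam_from_world, ref_from_world in zip(
            cam_from_world_list, ref_from_world_list
        ):
            # Relative pose of this camera w.r.t. the reference sensor in this frame.
            cam_from_ref = cam_from_world @ np.linalg.inv(ref_from_world)
            quats_xyzw.append(R.from_matrix(cam_from_ref[:3, :3]).as_quat())
            translations.append(cam_from_ref[:3, 3])
        # Rotation averaging via the dominant eigenvector of the quaternion
        # outer-product sum (insensitive to the quaternion sign ambiguity).
        A = sum(np.outer(q, q) for q in quats_xyzw)
        avg_q_xyzw = np.linalg.eigh(A)[1][:, -1]
        avg_rotation = R.from_quat(avg_q_xyzw)
        avg_translation = np.mean(translations, axis=0)
        # COLMAP stores rotations as scalar-first quaternions, i.e. [w, x, y, z].
        x, y, z, w = avg_rotation.as_quat()
        return [w, x, y, z], avg_translation.tolist()
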
Example
-------

The following shows an end-to-end example of how to reconstruct one of the
ETH3D rig datasets using COLMAP's rig support::

    wget https://www.eth3d.net/data/terrains_rig_undistorted.7z
    7zz x terrains_rig_undistorted.7z

    colmap feature_extractor \
        --database_path terrains/database.db \
        --image_path terrains/images \
        --ImageReader.single_camera_per_folder 1

The ETH3D dataset conveniently comes with a groundtruth COLMAP reconstruction,
which we use to configure the sensor rig poses as well as the camera models
using::

    colmap rig_configurator \
        --database_path terrains/database.db \
        --rig_config_path terrains/rig_config.json \
        --input_path terrains/rig_calibration_undistorted

with the ``rig_config.json``::

    [
      {
        "cameras": [
          {
            "image_prefix": "images_rig_cam4_undistorted/",
            "ref_sensor": true
          },
          {
            "image_prefix": "images_rig_cam5_undistorted/"
          },
          {
            "image_prefix": "images_rig_cam6_undistorted/"
          },
          {
            "image_prefix": "images_rig_cam7_undistorted/"
          }
        ]
      }
    ]

Notice that we do not specify the sensor poses, because we use an existing
reconstruction (in this case the groundtruth, but it could also be a
reconstruction without rig constraints, as explained in the previous section)
to automatically infer the average rig extrinsics and camera parameters.

Next, we sequentially match the frames, since they were captured as a video::

    colmap sequential_matcher --database_path terrains/database.db

Finally, we reconstruct the scene using the mapper while keeping the
groundtruth sensor rig poses and camera parameters fixed::

    mkdir -p terrains/sparse
    colmap mapper \
        --database_path terrains/database.db \
        --Mapper.ba_refine_sensor_from_rig 0 \
        --Mapper.ba_refine_focal_length 0 \
        --Mapper.ba_refine_extra_params 0 \
        --output_path terrains/sparse

Reconstruction from 360° spherical images
-----------------------------------------

COLMAP can handle collections of 360° panoramas by rendering virtual pinhole
images (similar to a cubemap) and treating them as a camera rig. Since the rig
extrinsics and camera intrinsics are known, the reconstruction process is more
robust. We provide an example Python script to reconstruct a 360° collection::

    python python/examples/panorama_sfm.py \
        --input_image_path image_directory \
        --output_path output_directory
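
As an illustration of the underlying idea (and not of the actual implementation
of ``panorama_sfm.py``), the virtual pinhole views rendered from a panorama are
simply rotated copies of a reference view and can therefore be described by
``cam_from_rig`` rotations, just like the cameras of a physical rig. The sketch
below prints such rotations as scalar-first ``[w, x, y, z]`` quaternions,
COLMAP's convention, for four horizontally yawed virtual cameras; the face
names and yaw angles are made up for the example::

    from scipy.spatial.transform import Rotation as R

    # Hypothetical virtual cameras: yaw (degrees) of each face relative to the
    # reference view, about the vertical axis (y points down in COLMAP's
    # camera convention).
    FACE_YAWS = {"front": 0.0, "right": 90.0, "back": 180.0, "left": -90.0}

    for name, yaw_deg in FACE_YAWS.items():
        # rig_from_cam: the virtual camera is yawed by yaw_deg within the rig frame.
        rig_from_cam = R.from_euler("y", yaw_deg, degrees=True)
        cam_from_rig = rig_from_cam.inv()
        x, y, z, w = cam_from_rig.as_quat()  # scipy returns [x, y, z, w]
        print(f"{name}: cam_from_rig_rotation = [{w:.4f}, {x:.4f}, {y:.4f}, {z:.4f}]")
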