Camera Pose Estimation from a Stereo Setup


Lab

Investigators: Sébastien Gilbert, Robert Laganière and Gerhard Roth



Abstract

This project addresses the problem of three-dimensional registration of a moving rigid object or, alternatively, the computation of the camera motion with respect to a fixed environment. Matching, tracking and 3D reconstruction of feature points by a stereoscopic vision setup allow the computation of the homogeneous transformation matrix linking two consecutive scene captures. Robustness to errors is provided by the scene rigidity constraint. Accumulated error is compensated through loop detection in the calculated camera positions.



Camera Calibration

The cameras are calibrated using a checkerboard calibration pattern precisely positioned with respect to the table top. The correspondences between known 3D points on the pattern and their image points allow the computation of each camera's projection matrix. The cameras are close enough to the pinhole camera model that radial distortion can be neglected.
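As an illustration of how a projection matrix can be obtained from 3D-to-2D correspondences, the following is a minimal sketch of the direct linear transformation (DLT). It is not necessarily the procedure used in this project. The function names are hypothetical, and the sketch assumes at least six correspondences whose 3D points are not all coplanar, with no coordinate normalization or distortion model.

```python
import numpy as np

def estimate_projection_matrix(points_3d, points_2d):
    """Estimate a 3x4 projection matrix P from n >= 6 non-coplanar correspondences."""
    rows = []
    zero = np.zeros(4)
    for (X, Y, Z), (u, v) in zip(points_3d, points_2d):
        Xh = np.array([X, Y, Z, 1.0])
        # Each correspondence gives two linear equations in the 12 entries of P:
        # p1.X - u * p3.X = 0  and  p2.X - v * p3.X = 0
        rows.append(np.concatenate([Xh, zero, -u * Xh]))
        rows.append(np.concatenate([zero, Xh, -v * Xh]))
    A = np.array(rows)
    # The least-squares solution is the right singular vector
    # associated with the smallest singular value of A.
    _, _, vt = np.linalg.svd(A)
    P = vt[-1].reshape(3, 4)
    return P / np.linalg.norm(P)

def project(P, points_3d):
    """Project 3D points with P and return pixel coordinates."""
    points_3d = np.asarray(points_3d, dtype=float)
    Xh = np.hstack([points_3d, np.ones((len(points_3d), 1))])
    x = Xh @ P.T
    return x[:, :2] / x[:, 2:3]
```

The recovered matrix is defined only up to scale, which is why the sketch normalizes it; reprojecting the calibration points with `project` gives a quick check of the fit.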



Matching and Tracking

Matching is the task of identifying corresponding feature points in a stereo pair. The search for high correlation values is guided by the fundamental matrix, obtained from the calibration parameters, which limits the search area to a thin corridor around the epipolar line of a feature point. The tracking algorithm also searches for high correlation values between feature points, but performs its search in a disk instead of along an epipolar line.
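The epipolar corridor can be sketched as follows. This is an illustrative example, not the project's code: the function names and the `half_width` parameter (in pixels) are assumptions, and the fundamental matrix is derived from two projection matrices using the standard relation F = [e2]x P2 P1+, where P1+ is the pseudo-inverse of P1 and e2 is the epipole in the second image.

```python
import numpy as np

def fundamental_from_projections(P1, P2):
    """Fundamental matrix F such that x2^T F x1 = 0 for corresponding points."""
    # The centre of camera 1 is the null vector of P1.
    _, _, vt = np.linalg.svd(P1)
    C1 = vt[-1]
    # Epipole in image 2: projection of camera 1's centre.
    e2 = P2 @ C1
    e2_cross = np.array([[0.0, -e2[2], e2[1]],
                         [e2[2], 0.0, -e2[0]],
                         [-e2[1], e2[0], 0.0]])
    return e2_cross @ P2 @ np.linalg.pinv(P1)

def epipolar_distance(F, x1, x2):
    """Distance in pixels from x2 (image 2) to the epipolar line of x1 (image 1)."""
    line = F @ np.array([x1[0], x1[1], 1.0])
    return abs(line @ np.array([x2[0], x2[1], 1.0])) / np.hypot(line[0], line[1])

def candidates_in_corridor(F, x1, candidates, half_width=1.5):
    """Keep only the candidate points lying inside the corridor around the line."""
    return [c for c in candidates if epipolar_distance(F, x1, c) <= half_width]
```

Only the candidates returned by `candidates_in_corridor` would then need to be compared by correlation, which reduces both the computation and the number of false matches.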



3D Reconstruction

3D reconstruction is the task of computing the 3D coordinates of a feature point from its image coordinates in at least two images and the cameras' calibration parameters. Having more than one projection equation of the form u = lambda P X, where u is the image point in homogeneous coordinates, P a projection matrix and lambda a scale factor, allows the computation of the least-squares solution for X, the 3D coordinates in homogeneous form.
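A minimal sketch of this least-squares reconstruction is given below, using linear triangulation. The function name `triangulate` is hypothetical, and the sketch minimizes an algebraic error rather than the reprojection error, which may differ from the project's implementation.

```python
import numpy as np

def triangulate(projections, image_points):
    """Least-squares 3D point from two or more views.

    projections:  list of 3x4 projection matrices P
    image_points: list of (u, v) pixel coordinates, one per view
    """
    rows = []
    for P, (u, v) in zip(projections, image_points):
        P = np.asarray(P, dtype=float)
        # Eliminating the scale factor from u = lambda P X gives
        # two linear equations per view in the homogeneous point X.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.array(rows)
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    Xh = vt[-1]
    return Xh[:3] / Xh[3]
```

With a stereo pair, each feature point contributes four equations for three unknowns, so the system is overdetermined and the solution averages out small localization errors.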


Robust Registration

While the object moves in front of the cameras, feature points are matched and tracked, so that a cloud of 3D points can be computed before and after the motion. The rigid transformation the cameras have experienced with respect to the object can be computed from the two clouds of 3D points. A RANSAC algorithm can be included to add robustness to outliers resulting from erroneous matching and tracking, exploiting the scene rigidity constraint.
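The registration step can be sketched as below. This is an assumption-laden illustration rather than the project's implementation: the rigid transformation is estimated with the SVD-based (Kabsch) method, and the RANSAC loop, its iteration count and its distance `threshold` (in the units of the point clouds) are hypothetical choices.

```python
import numpy as np

def rigid_transform(A, B):
    """Least-squares rotation R and translation t such that B ~ R A + t."""
    ca, cb = A.mean(axis=0), B.mean(axis=0)
    H = (A - ca).T @ (B - cb)
    U, _, Vt = np.linalg.svd(H)
    # Force a proper rotation (determinant +1), not a reflection.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    return R, cb - R @ ca

def ransac_rigid(A, B, n_iter=200, threshold=0.01, seed=0):
    """Robust rigid registration of corresponding point clouds A and B (n x 3)."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(A), dtype=bool)
    for _ in range(n_iter):
        # Three correspondences are the minimal sample for a rigid motion.
        idx = rng.choice(len(A), size=3, replace=False)
        R, t = rigid_transform(A[idx], B[idx])
        residuals = np.linalg.norm(A @ R.T + t - B, axis=1)
        inliers = residuals < threshold
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers.sum() < 3:
        raise ValueError("not enough inliers to estimate a rigid transformation")
    # Refit on all points consistent with the rigidity constraint.
    R, t = rigid_transform(A[best_inliers], B[best_inliers])
    return R, t, best_inliers
```

Points that were mismatched or mistracked do not move rigidly with the rest of the cloud, so they fall outside the threshold and are excluded from the final fit. R and t can be assembled into a 4x4 homogeneous transformation matrix if needed.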


Error Correction

Since every new camera position is computed with respect to the previous one, error will accumulate. In order to compensate for error accumulation, it is desirable to detect whenever a camera is pointing in a direction in which it has pointed before. This detection allows tracking between non-consecutive image captures and the resetting of the calculated camera positions.
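One possible way to detect such a revisit is sketched below. The text does not specify the detection criterion, so everything here is an assumption: the sketch compares optical-axis directions derived from world-to-camera rotation matrices, and the names `find_loop_candidate`, `max_angle_deg` and `min_gap` are hypothetical.

```python
import numpy as np

def viewing_direction(R):
    """Optical axis in world coordinates for a world-to-camera rotation R."""
    return R.T @ np.array([0.0, 0.0, 1.0])

def find_loop_candidate(rotations, current, max_angle_deg=5.0, min_gap=10):
    """Index of an earlier capture whose viewing direction is close to the
    current one, or None. Captures closer than min_gap frames are ignored,
    since they are already linked by ordinary tracking."""
    d_cur = viewing_direction(rotations[current])
    best, best_angle = None, max_angle_deg
    for i in range(0, current - min_gap + 1):
        cos_angle = np.clip(d_cur @ viewing_direction(rotations[i]), -1.0, 1.0)
        angle = np.degrees(np.arccos(cos_angle))
        if angle <= best_angle:
            best, best_angle = i, angle
    return best
```

When a candidate is found, features can be tracked directly between that earlier capture and the current one, and the resulting transformation used to reset the accumulated camera position.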



Result of tracking between non-consecutive image captures

Experimental Results

The availability of the corrected projection matrices allows the computation of an upper limit to the volume occupancy of the object through shape-from-silhouette techniques, assuming the background can be extracted.
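The silhouette intersection can be illustrated with a simple voxel-carving sketch. It assumes that binary silhouette masks have already been extracted and that candidate voxel centres are given; the function name `carve` and the nearest-pixel lookup are illustrative choices, not the project's implementation.

```python
import numpy as np

def carve(voxels, projections, masks):
    """Return a boolean array marking voxels that project inside every silhouette.

    voxels:      n x 3 array of voxel centres
    projections: list of 3x4 projection matrices
    masks:       list of 2D boolean arrays (True inside the silhouette)
    """
    voxels = np.asarray(voxels, dtype=float)
    Xh = np.hstack([voxels, np.ones((len(voxels), 1))])
    keep = np.ones(len(voxels), dtype=bool)
    for P, mask in zip(projections, masks):
        x = Xh @ np.asarray(P, dtype=float).T
        in_front = x[:, 2] > 1e-12
        u = np.full(len(voxels), -1)
        v = np.full(len(voxels), -1)
        u[in_front] = np.round(x[in_front, 0] / x[in_front, 2]).astype(int)
        v[in_front] = np.round(x[in_front, 1] / x[in_front, 2]).astype(int)
        h, w = mask.shape
        inside = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        # A voxel survives a view only if it lands on a silhouette pixel.
        hit = np.zeros(len(voxels), dtype=bool)
        hit[inside] = mask[v[inside], u[inside]]
        keep &= hit
    return keep
```

The surviving voxels form an upper bound on the object's volume: carving can only remove space that is provably empty in at least one view, so concavities hidden from every silhouette remain filled.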



Result of silhouette intersection of the duck toy

In augmented reality applications, it is important to maintain the registration between the camera and the fixed environment, so that virtual objects are displayed realistically with respect to their surroundings. The proposed method makes it possible to track the camera motion and therefore to display virtual objects realistically.

A moving Russian headstock with its attached reference frame


Publications

Sébastien Gilbert, Robert Laganière and Gerhard Roth. Stereo Motion from Feature Matching and Tracking, IEEE Instrumentation and Measurement Technology Conference, pp. 1246-1250, Ottawa, Ontario, 2005.



Copyright © 2001-2004 VIVA Lab