Online Estimation of the Trifocal Tensor
The trifocal tensor has been used in
many video applications such as image-based rendering, 3D modeling and
augmented reality. Most existing techniques compute a chain of tensors from a
video sequence, which is decomposed into sets of view triplets. Generally, the
actual computation of the tensor is not part of the real-time processing loop.
These methods also suffer from error accumulation over long sequences. We
propose a keyframe-based approach for online estimation of the tensor in live
video. It works with a single camera that moves freely inside the scene of
interest. Image features taken from an initial triplet set are tracked across a
video sequence. Then, as the camera is moving, the tensor associated with each
frame is estimated online.
The paper:
Li J., Laganière R., Roth G.,
"Online Estimation of Trifocal Tensors for Augmenting Live Video,"
IEEE/ACM Symposium on Mixed and Augmented Reality,
Arlington, VA, pp. 182-190, Nov. 2004.
Application to augmented reality
We demonstrate the applicability of our approach to augmented reality. The goal is
to automatically insert into live video a computer generated model of an object
that is not physically present in the scene.
Our proposed approach is illustrated in the above figure. The system takes as input three camera views, denoted V1, V2 and V3. They contain a square pattern that is purposely placed inside the scene at capture time. Note that this pattern no longer needs to be present once the three keyframes have been obtained.
The initialization step consists of obtaining both an initial estimate of the tensor and a large set of matched triplets. Several alternatives can be envisaged to achieve this, including tensor-based guided matching and the PVT tool developed by Dr. Gerhard. The feature points of the obtained triplet set that belong to one reference view constitute the initial set of points to be tracked. Matched pairs between the other two fixed views serve as a match pool that is used, during the process, to update the list of points to be tracked.
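As an illustration of how a tensor estimate can be obtained from a set of matched triplets, here is a minimal sketch of the standard linear algorithm: each triplet (x, x', x'') contributes the point-point-point constraint [x']_x (sum_i x^i T_i) [x'']_x = 0, and the 27 tensor entries are recovered as the null vector of the stacked system. This is not necessarily the estimator used in the paper. It assumes at least 7 triplets, and it omits both coordinate normalization and robust outlier rejection. The function names and the T[i, j, k] index layout are choices made for this example.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x, so that skew(v) @ w equals np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def estimate_tensor_linear(pts1, pts2, pts3):
    """Linear estimate of the trifocal tensor T[i, j, k] from point triplets.

    pts1, pts2, pts3: (N, 2) arrays of corresponding points in views 1, 2, 3,
    with N >= 7. Returns a (3, 3, 3) array of unit norm, defined up to sign.
    """
    pts1, pts2, pts3 = (np.asarray(p, dtype=float) for p in (pts1, pts2, pts3))
    if len(pts1) < 7:
        raise ValueError("at least 7 triplets are needed")
    rows = []
    for p1, p2, p3 in zip(pts1, pts2, pts3):
        x1 = np.append(p1, 1.0)           # homogeneous point in view 1
        s2 = skew(np.append(p2, 1.0))     # [x']_x
        s3 = skew(np.append(p3, 1.0))     # [x'']_x
        # Coefficient of T[i, j, k] in equation (r, s): x1[i] * s2[r, j] * s3[k, s]
        coeff = np.einsum('i,rj,ks->rsijk', x1, s2, s3)
        rows.append(coeff.reshape(9, 27))
    system = np.vstack(rows)
    # The right singular vector of the smallest singular value minimizes |system @ t|.
    _, _, vt = np.linalg.svd(system, full_matrices=False)
    return vt[-1].reshape(3, 3, 3)
```

In practice such a linear solve would typically be wrapped in a robust scheme (for example, repeated estimation on random subsets) so that mismatched triplets do not corrupt the result.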
Once the initialization process is completed, the online tensor estimation and augmentation process can start. The detected points in one reference view are tracked from one frame to the next. This yields new positions for the points, for which we still have the correspondences in the two fixed views. Using this updated triplet set, a robust and fast estimate of the tensor is computed. Once a new tensor is obtained, the square pattern specified in the two fixed reference views, V1 and V3, is transferred into the moving camera view to generate a virtual image of this pattern, to which the ARToolKit method is applied to embed the virtual object.
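The transfer of the square pattern into the moving view can be sketched with the usual tensor point-transfer relation x''^k = x^i l'_j T_i^jk, where l' is any line through x' other than the epipolar line. The code below is an illustrative sketch, not the authors' implementation. It assumes the two fixed views play the roles of views 1 and 2 of a tensor stored as T[i, j, k], with the moving view as view 3. The function names are hypothetical, and choosing between a vertical and a horizontal line through x' is a simplification made for this example.

```python
import numpy as np

def transfer_point(T, x1, x2, eps=1e-12):
    """Transfer a match (x1 in view 1, x2 in view 2) into view 3.

    T is a (3, 3, 3) array indexed T[i, j, k]; x1 and x2 are (x, y) pairs.
    Returns the (x, y) position in view 3, or None if the transfer is degenerate.
    """
    x1h = np.array([x1[0], x1[1], 1.0])
    # M[j, k] = sum_i x1[i] * T[i, j, k]
    M = np.einsum('i,ijk->jk', x1h, T)
    # Two lines through x2: the vertical one and the horizontal one.
    lines = (np.array([1.0, 0.0, -x2[0]]), np.array([0.0, 1.0, -x2[1]]))
    # Each line gives x3 up to scale; an epipolar line gives the zero vector,
    # so keep the candidate with the larger norm.
    candidates = [M.T @ line for line in lines]
    best = max(candidates, key=np.linalg.norm)
    if abs(best[2]) < eps:
        return None
    return best[:2] / best[2]

def transfer_pattern(T, corners1, corners2):
    """Transfer the matched pattern corners of the two fixed views into view 3."""
    return [transfer_point(T, a, b) for a, b in zip(corners1, corners2)]
```

The four transferred corners define the virtual image of the pattern in the current frame, which can then be handed to a marker-based pose routine.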
As points are tracked over time, more and more features are unavoidably lost, and if nothing is done the tracked set will eventually vanish. To overcome this problem, the match set is updated after each tensor estimation. Using the pool of matched pairs available in the two fixed views, new points can be transferred onto the current image with the newly estimated tensor. This last step ensures the long-term viability of the estimation process. In a multi-camera implementation, points from views close to the current reference views would also be transferred, thus allowing the identification of the view toward which the moving camera is transitioning.
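The update step described above might be organized as follows. This is a hedged sketch only: the data layout (dictionaries keyed by a match identifier), the `transfer` callable, and the image-border `margin` are assumptions made for this example and are not taken from the paper. The `transfer` argument stands for any tensor-based point transfer that maps a matched pair from the two fixed views to a position in the current frame.

```python
def replenish_tracked_set(tracked, pool, transfer, width, height, margin=5.0):
    """Add pool points that fall inside the current frame to the tracked set.

    tracked  : dict {match_id: (x, y)}, points currently tracked in the moving view
    pool     : dict {match_id: (pt_a, pt_b)}, matched pairs in the two fixed views
    transfer : callable (pt_a, pt_b) -> (x, y) or None, e.g. a point transfer
               that uses the newly estimated tensor (hypothetical interface)
    Returns a new dict; the input `tracked` is left unchanged.
    """
    updated = dict(tracked)
    for match_id, (pt_a, pt_b) in pool.items():
        if match_id in updated:
            continue                      # still tracked, nothing to do
        pt = transfer(pt_a, pt_b)
        if pt is None:
            continue                      # degenerate transfer, skip this pair
        x, y = float(pt[0]), float(pt[1])
        # Keep only points that land inside the image, away from the border.
        if margin <= x < width - margin and margin <= y < height - margin:
            updated[match_id] = (x, y)
    return updated
```

Calling this after each tensor estimate keeps the tracked set from shrinking to nothing, since pairs whose tracks were lost are reintroduced as soon as they transfer back into the field of view.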
In this experiment, a pattern was pasted on the wall when the three reference images were captured.