NAVIRE
Virtual Navigation in Image-Based Representations of Real World Environments
Images of the environment are acquired using the Point Grey Ladybug spherical digital video camera. This multi-sensor system has six cameras: one points up, the other five point out horizontally in a circular configuration. Each sensor is a single Bayer-mosaicked CCD with 1024 × 768 pixels. The sensors are configured such that the pixels of the CCDs map approximately onto a sphere, with roughly 80 pixels of overlap between adjacent sensors.
These images must then be combined to produce a panoramic image. A 360° panorama is formed by collecting all light incident on a single point in space; that is, a 2D plenoptic function must be built from the intensity values extracted from the Ladybug sensors. The resulting plenoptic function can then be reprojected onto any type of surface. We use here a cubic representation, which has the advantage of being easy to manipulate and of rendering very efficiently on standard graphics hardware. In addition, since such a cubic panorama is effectively made of six identical planar faces, each acting as a standard perspective camera with a 90° field of view, the representation is very convenient to handle: all standard linear projective geometry concepts remain applicable.
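As an illustration, the following sketch resamples one cube face from a stitched panorama stored in equirectangular form (an assumption about the intermediate format; the face naming and axis conventions are likewise illustrative, not the project's actual code):

```python
import numpy as np

def cube_face_from_equirect(pano, face, size=512):
    """Resample one 90°-FOV cube face from an equirectangular panorama.

    pano: H x W x 3 array holding the stitched spherical image.
    face: 'front', 'back', 'left', 'right', 'up' or 'down'.
    """
    # Pixel grid of the face in the canonical [-1, 1] image plane.
    u, v = np.meshgrid(np.linspace(-1, 1, size), np.linspace(-1, 1, size))
    one = np.ones_like(u)
    # Ray direction for each face pixel (perspective camera, 90° FOV).
    x, y, z = {
        'front': (u, -v, one),   'back':  (-u, -v, -one),
        'right': (one, -v, -u),  'left':  (-one, -v, u),
        'up':    (u, one, v),    'down':  (u, -one, -v),
    }[face]
    # Ray direction -> spherical coordinates -> panorama pixel.
    lon = np.arctan2(x, z)
    lat = np.arcsin(y / np.sqrt(x * x + y * y + z * z))
    h, w = pano.shape[:2]
    col = ((lon / (2 * np.pi) + 0.5) * (w - 1)).astype(int)
    row = ((0.5 - lat / np.pi) * (h - 1)).astype(int)
    return pano[row, col]   # nearest-neighbour; bilinear would be smoother
```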
In order to produce high-quality panoramas, several problems must be addressed, notably the de-mosaicking of the Bayer-sampled sensor images and the Retinex-based enhancement of image colour and contrast:
[Figure: Bilinear Bayer de-mosaicking]
[Figure: New adaptive de-mosaicking method]
[Figure: Original image]
[Figure: Retinex-enhanced image]
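As a point of reference, the two de-mosaicking variants can be approximated with OpenCV's built-in conversion modes; the edge-aware mode below is only an available adaptive analogue, not the frequency-domain algorithm developed in the project, and the file name and Bayer pattern are assumptions:

```python
import cv2

# Raw Bayer-mosaicked frame from one Ladybug sensor (file name and
# Bayer pattern are assumptions for illustration).
raw = cv2.imread('ladybug_sensor0.pgm', cv2.IMREAD_GRAYSCALE)

# Baseline: bilinear interpolation of the missing colour samples.
bilinear = cv2.cvtColor(raw, cv2.COLOR_BayerBG2BGR)

# Edge-aware variant, shown only as an adaptive stand-in; it is not
# the project's frequency-domain method.
adaptive = cv2.cvtColor(raw, cv2.COLOR_BayerBG2BGR_EA)
```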
Mark Fiala, "Immersive Panoramic Imagery", in Proc. Canadian Conference on Computer and Robot Vision, pp. 386-391, Halifax, Canada, May 2005.
Eric Dubois, "Frequency-domain methods for demosaicking of Bayer-sampled color images", IEEE Signal Processing Letters, vol. 12, pp. 847-850, Dec. 2005.
To build a complete image-based representation of a given environment, multiple panoramas must be captured. To be able to navigate from one panorama to an adjacent one, the position of each of these panoramas must be known.
In the case of outdoor environments, one simple solution consists in using a GPS device during the capture process such that each image is associated with an absolute geo-position.
These panoramas must then be connected to produce a navigation graph specifying which panoramas are reachable from a given point of view. The GPS solution is, however, not always applicable: accuracy can be insufficient, the satellite signal may be lost, and GPS generally does not work indoors.
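When geo-positions are available, building the navigation graph reduces to linking nearby capture points. A minimal sketch, assuming the GPS fixes have already been projected to local metric coordinates and using an illustrative distance threshold:

```python
import numpy as np

def build_navigation_graph(positions, max_dist=10.0):
    """Link panoramas whose capture points lie within max_dist metres.

    positions: N x 2 array of local metric (x, y) coordinates, e.g.
    GPS fixes projected onto a local tangent plane.
    Returns an adjacency list: node index -> list of reachable nodes.
    """
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    graph = {}
    for i in range(len(positions)):
        neighbors = np.nonzero((dist[i] < max_dist) & (dist[i] > 0))[0]
        graph[i] = neighbors.tolist()
    return graph
```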
Image Matching
When no positioning devices are used, the camera positions can be estimated from the observations. To do so, correspondences between the views must be obtained. When a large number of images taken from various viewpoints need to be matched, scale-invariant feature matching constitutes an excellent approach. The Hessian-Laplace operator combined with the SIFT descriptor has been selected here. In our comparative studies, this approach has shown both good repeatability and matching reliability.
In addition, the matches between views help determine the locations where the panoramic sequences cross each other, as well as the orientation of each panorama.
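A matching stage along these lines can be sketched with OpenCV. Since OpenCV does not ship a Hessian-Laplace detector, the example below substitutes the SIFT detector, keeping the SIFT descriptor and a standard ratio test:

```python
import cv2

def match_views(img1, img2, ratio=0.75):
    """Scale-invariant feature matching between two views."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img1, None)
    kp2, des2 = sift.detectAndCompute(img2, None)
    # Lowe's ratio test rejects ambiguous nearest-neighbour matches.
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
            if m.distance < ratio * n.distance]
    return kp1, kp2, good
```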
3D Reconstruction
Once an image set has been matched, bundle adjustment techniques can be used to compute the camera positions. To guarantee the convergence of the estimation process, an iterative procedure is proposed here.
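The core of such an estimation step is the minimization of reprojection error. The sketch below refines a single camera pose with SciPy's least_squares over synthetic data; a full bundle adjustment would jointly optimize all camera poses and 3D point positions:

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def reprojection_residuals(params, pts3d, pts2d, focal):
    """Residuals between observed and predicted image points for one
    camera; params is [rotation vector | translation]."""
    rot = Rotation.from_rotvec(params[:3]).as_matrix()
    cam = pts3d @ rot.T + params[3:]           # world -> camera frame
    proj = focal * cam[:, :2] / cam[:, 2:3]    # pinhole projection
    return (proj - pts2d).ravel()

# Synthetic check: project points with a known pose, then recover it
# from a zero initial guess.
rng = np.random.default_rng(0)
pts3d = rng.uniform(-1.0, 1.0, (20, 3)) + [0.0, 0.0, 5.0]
true_pose = np.array([0.10, -0.05, 0.02, 0.20, -0.10, 0.30])
pts2d = reprojection_residuals(true_pose, pts3d,
                               np.zeros((20, 2)), 800.0).reshape(-1, 2)
result = least_squares(reprojection_residuals, np.zeros(6),
                       args=(pts3d, pts2d, 800.0))
```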
One of the difficulties that occurs when navigating between adjacent panoramas is ensuring that these panoramas are consistently oriented. Indeed, when a user looks in a particular direction and moves to the next panorama, the navigator must display the correct section of the new panorama: the one that corresponds to what the user should see when looking in that same direction.
Panorama alignment therefore consists in determining the pan angle between two adjacent cubes. This can be done using feature point correspondences between cubes. Indeed, each feature defines a direction in each cube's reference frame. When the translation between the two cubes can be neglected, the rotation angle can be found through a least-squares minimization that aligns the directional vectors given by a set of feature matches.
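This least-squares alignment is the classical absolute-orientation problem. A minimal sketch using SciPy, assuming the matched feature directions are given as unit vectors in each cube's frame:

```python
import numpy as np
from scipy.spatial.transform import Rotation

def align_cubes(dirs_a, dirs_b):
    """Least-squares rotation aligning matched viewing directions.

    dirs_a, dirs_b: N x 3 arrays of unit vectors, one row per feature
    match, expressed in each cube's own reference frame.  Valid when
    the translation between the two capture points is negligible.
    """
    rot, rmsd = Rotation.align_vectors(dirs_a, dirs_b)
    # Extract the pan component, assuming z is the vertical axis of
    # the cube frame (an assumption; adapt to the actual convention).
    pan = rot.as_euler('zyx')[0]
    return rot, pan, rmsd
```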
In a more general setup, the essential matrix between a cube pair can be used to align the cubes. Once this essential matrix has been estimated from the set of matches, its rotational and translational components can be obtained by decomposition. This information can also be used to rectify the cubes: in the context of cubic panoramas, rectification means regenerating the cubic representation by applying rotations that make the corresponding faces of the rectified cubes parallel.
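A sketch of this pose recovery using OpenCV, assuming pixel correspondences between matching cube faces; for a 90°-FOV face of side s, the focal length is s/2:

```python
import numpy as np
import cv2

def relative_pose(pts1, pts2, size):
    """Rotation and translation between two cubes from face matches.

    pts1, pts2: N x 2 float arrays of matched pixel coordinates on
    corresponding 90°-FOV cube faces of side `size`.
    """
    f = size / 2.0                      # 90° FOV: focal = side / 2
    K = np.array([[f, 0, size / 2.0],
                  [0, f, size / 2.0],
                  [0, 0, 1.0]])
    E, inliers = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC,
                                      threshold=1.0)
    # Decompose E, keeping the (R, t) that puts points in front of
    # both cameras; t is recovered only up to scale.
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)
    return R, t
```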
[Figure: Original cube images]
[Figure: Rectified cube images]
Mark Fiala, "Automatic Alignment and Graph Map Building of Panoramas", in Proc. IEEE International Workshop on Haptic Audio Visual Environments and their Applications, pp. 103-108, Oct. 2005.
Florian Kangni and Robert Laganière, "Epipolar Geometry for the Rectification of Cubic Panoramas", in Proc. Canadian Conference on Computer and Robot Vision, June 2006.
L. Zhang, D. Wang, and A. Vincent, "Adaptive Reconstruction of Intermediate Views from Stereoscopic Images", IEEE Trans. on Circuits and Systems for Video Technology, 2005.
If the user of the navigation system is located remotely over a network, the video sequence corresponding to the virtual camera output must be efficiently compressed for transmission. The compression can rely on international video coding standards so that standard software modules can perform the decompression. However, the virtual video has many special features that the compression system can exploit.
Cubic panoramas are displayed in a Cube Viewer, which uses accelerated graphics hardware to render the six aligned images of the cube sides. The 360° view orientation is controlled in real time using standard input devices (i.e., mouse and keyboard) or an InterSense inertial tracker mounted on a Sony i-glasses HMD. The current pitch and heading of the user are displayed on the interface, as well as a 2D map of the environment, when available. The environment consists of a number of panorama locations connected in a graph topology.
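The core lookup such a viewer performs is mapping the current view direction to a cube face and a position on it. A minimal sketch, with axis conventions assumed for illustration:

```python
import numpy as np

def face_and_uv(heading, pitch):
    """Map a view direction (radians) to the cube face it hits and the
    [-1, 1] coordinates on that face.  Axis convention assumed here:
    +z is 'front', +y is 'up' (illustrative, not the viewer's code).
    """
    d = np.array([np.cos(pitch) * np.sin(heading),
                  np.sin(pitch),
                  np.cos(pitch) * np.cos(heading)])
    axis = int(np.argmax(np.abs(d)))     # dominant axis picks the face
    names = [('right', 'left'), ('up', 'down'), ('front', 'back')]
    face = names[axis][0 if d[axis] > 0 else 1]
    # The remaining two components, divided by the dominant one, give
    # coordinates in [-1, 1] on that face.
    uv = np.delete(d, axis) / abs(d[axis])
    return face, uv
```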
[Figure: … inside the remote environment]
[Figure: … can move in this direction]
[Figure: … one panorama to another]
The objective here is to add the capability of inserting virtual objects into the NAVIRE panoramic scenes.
Derek Bradley, Alan Brunton, Mark Fiala, and Gerhard Roth, "Image-based Navigation in Real Environments Using Panoramas", in Proc. IEEE International Workshop on Haptic Audio Visual Environments and their Applications, Oct. 2005.
Humans have more difficulty navigating and learning in virtual environments than in the real world. The rendering of virtual environments may be inadequate to represent the subtle visual cues required for efficient navigation and learning. In this study, we empirically assess navigation and learning ability in two virtual environments that differ in the quality of their visual rendering.