Stereo Vision

When we look at the world with our two eyes, our brain combines the two slightly different views, one from each eye, to produce three-dimensional (3-D) perception. Having these three dimensions to work with is useful because it allows us to make judgements about distances, angles, shapes and volumes.

Most machine vision algorithms work on 2-D images. For industrial applications, there are many ways of obtaining three-dimensional information about the world, e.g., using special-purpose sensors such as acoustic sensors, radar, or laser range finders. Another commonly used technique, stereo vision, is similar in concept to human binocular vision: two cameras capture two images from which distance information can be obtained. Compared to the alternatives mentioned above, stereo vision has the advantage that it acquires 3-D data without emitting energy and without moving parts. For any particular application, the key issue in making stereo vision practical is to find the most suitable combination of algorithms that will provide reliable estimates of distance.

Machine stereo vision recovers the third dimension by finding the same features in each of the two images, and then measuring the distances to the objects containing these features by triangulation; that is, by intersecting the lines of sight from each camera to the object. Finding points or other kinds of features in the two images that are projections of the same point in the scene is called matching, and it is the fundamental computational task underlying stereo vision. Performing this matching at each pixel in the image leads to a distance map.
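The matching and triangulation steps can be illustrated with a minimal sketch. It assumes a rectified camera pair (parallel optical axes, so that a scene point appears on the same row in both images), and it uses a simple sum of absolute differences (SAD) window cost, which is only one of many possible matching costs and is not the fast algorithm referred to later on this page. Under that assumption, intersecting the two lines of sight reduces to depth = focal length × baseline / disparity. The function names and parameters are hypothetical and chosen for illustration only.

```python
def match_scanline(left_row, right_row, x, half_window=2, max_disparity=16):
    """Find the disparity of pixel x in left_row by SAD window matching.

    Assumes rectified images: the match for column x in the left row lies
    at column x - d in the right row, for some disparity d >= 0.
    """
    if x - half_window < 0 or x + half_window >= len(left_row):
        raise ValueError("window around x falls outside the row")

    best_d, best_cost = 0, float("inf")
    for d in range(max_disparity + 1):
        xr = x - d
        if xr - half_window < 0:
            break  # candidate window would leave the right image
        # Sum of absolute differences between the two windows.
        cost = sum(
            abs(left_row[x + k] - right_row[xr + k])
            for k in range(-half_window, half_window + 1)
        )
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Triangulate depth for a rectified pair: Z = f * B / d."""
    if disparity_px <= 0:
        return float("inf")  # zero disparity: point at infinity
    return focal_length_px * baseline_m / disparity_px


# Illustrative data: the same intensity pattern, shifted by 3 pixels.
right_row = [0, 0, 0, 0, 10, 50, 90, 40, 20, 0, 0, 0, 0, 0, 0, 0]
left_row = [0, 0, 0] + right_row[:-3]

d = match_scanline(left_row, right_row, x=9, half_window=2, max_disparity=6)
z = depth_from_disparity(d, focal_length_px=600.0, baseline_m=0.1)
print(d, z)
```

Repeating the search for every pixel of the left image gives a dense disparity map, and converting each disparity to depth gives the distance map. The focal length (in pixels) and the baseline (in metres) used here are made-up values; in practice they come from camera calibration.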


As shown in the figure, two images are obtained from the left and right cameras observing a common scene. This pair of stereo images allows us to obtain the 3-D information about the object. The example shown in the figure is a bent circuit board.

Once we have obtained a distance map of the scene, we can then measure the shape and volume of objects in the scene or even view them from virtual or imaginary camera angles. The models obtained can be output in various formats allowing integration with other applications.
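As a rough illustration of how a distance map becomes measurable 3-D geometry, the sketch below back-projects each pixel into a 3-D point using a pinhole camera model. It assumes that the map stores depth along the optical axis, that pixels with no valid match are marked with zero or a non-finite value, and that the focal length and principal point (cx, cy) are known from calibration. All names are hypothetical, and the bounding box is only the crudest possible shape measurement.

```python
import math


def depth_map_to_points(depth_map, focal_length_px, cx, cy):
    """Back-project a depth map (list of rows) into a list of (X, Y, Z) points.

    Pinhole model: X = (u - cx) * Z / f, Y = (v - cy) * Z / f,
    where (u, v) is the pixel column and row.
    """
    points = []
    for v, row in enumerate(depth_map):
        for u, z in enumerate(row):
            if not math.isfinite(z) or z <= 0:
                continue  # skip pixels without a valid depth
            x = (u - cx) * z / focal_length_px
            y = (v - cy) * z / focal_length_px
            points.append((x, y, z))
    return points


def bounding_box(points):
    """Return the (min corner, max corner) of the axis-aligned bounding box."""
    xs, ys, zs = zip(*points)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


# Illustrative 2x2 depth map in metres; 0.0 marks a pixel with no match.
depth_map = [[2.0, 2.0],
             [2.0, 0.0]]
points = depth_map_to_points(depth_map, focal_length_px=2.0, cx=0.5, cy=0.5)
print(points)
print(bounding_box(points))
```

The resulting point list is the kind of data that can be meshed, measured, re-rendered from a virtual camera position, or exported to other applications.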

Possible application areas of stereo vision are:

  • industrial inspection for 3-D objects (quality control, deformation analysis, food inspection, printed web defect analysis)
  • 3-D sensing (three-dimensional measurement of objects), 3-D growth monitoring
  • Z-keying
  • novel view synthesis, image-based rendering, virtual environments
  • autonomous vehicles, robotics
  • medical, biomedical and bioengineering (stereoendoscopy, stereoradiographs, automatic creation of three-dimensional models of a human face or dental structure from stereo images)
  • scanning electron microscopy
  • surveillance (motion tracking and object tracking to measure paths)
  • transport (traffic scene analysis)
  • digital photogrammetry, remote sensing (generating Digital Elevation Models, surveying, cartography)
  • 3-D database for urban and town planning
  • stereolithography, stereosculpting (automatic acquisition of digital 3-D information for use in CAD-CAM systems; this information can be fed into computer-controlled milling machines for rapid solid modelling)
  • asset monitoring and management
  • 3-D model creation for e-commerce or on-line shopping

Fast Stereo Matching Demo

We have developed fast algorithms for dense stereo matching, the results of which are then used to generate 3-D data. You can test the computational speed and reliability of the algorithms on our demo page.