Vision is probably the most widely used, complex and flexible sense that biological systems use to gather information about their environment. Biological vision systems can extract many different types of information. For example, some can determine colour, some can see in the infrared part of the spectrum, some can detect changes in the polarisation of light passing through the atmosphere, while others use multiple eyes to determine depth.
However, there is one source of visual information that is believed to be exploited by all biological vision systems, and that is motion information.
Artificial vision systems have become important in many industrial and commercial applications; however, limitations in computational power have largely restricted these systems to using still images. The rapid increase in cheap computing power is now making the development of real-time motion processing systems practical.
The Quantitative Imaging Group in the CSIRO Division of Mathematics, Informatics and Statistics is commencing work in the area of image motion analysis. This page discusses some of the potential areas of interest.
There are a number of applications where motion information offers significant advantages over still image processing.
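Motion compensation for video compression is probably the most common form of image motion processing in commercial use today. Because consecutive frames are usually very similar, an encoder can predict each frame from its neighbours using estimated motion and store only the prediction error, which improves the compression ratio.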
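As a rough illustration of how motion can be estimated for compensation, the sketch below uses exhaustive block matching with a sum-of-absolute-differences cost. This is one common textbook approach and is not taken from any particular codec. Frames are assumed to be lists of rows of grey levels, and the function names and the `size` and `search` parameters are hypothetical.

```python
def sad(prev, curr, top, left, dy, dx, size):
    """Sum of absolute differences between a block in curr and a shifted block in prev."""
    total = 0
    for r in range(size):
        for c in range(size):
            total += abs(curr[top + r][left + c] - prev[top + r + dy][left + c + dx])
    return total


def best_motion_vector(prev, curr, top, left, size, search):
    """Find the offset (dy, dx) into prev that best predicts a block of curr.

    Assumption: frames are equally sized lists of rows of numbers.
    Returns ((dy, dx), cost); ties keep the first candidate found.
    """
    height, width = len(prev), len(prev[0])
    best = (0, 0)
    best_cost = sad(prev, curr, top, left, 0, 0, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidate blocks that fall outside the reference frame.
            if top + dy < 0 or top + dy + size > height:
                continue
            if left + dx < 0 or left + dx + size > width:
                continue
            cost = sad(prev, curr, top, left, dy, dx, size)
            if cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best, best_cost
```

An encoder built along these lines would store the motion vector plus the (usually small) residual for each block rather than the raw pixels. Practical systems typically replace the exhaustive search with faster strategies.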
Motion is a very powerful cue for segmentation. In fact, the human visual system is capable of isolating objects by using only motion information. Consider the picture below. This is an example of perfect camouflage, since the object has a texture that is identical to the background. Try stopping the animation and notice how quickly it becomes impossible to segment the object from the background. This is an example of a classical psychological test used to demonstrate the importance of motion perception in human vision.
Motion-based segmentation can be used as a preprocessing step for many conventional forms of image processing. Examples include face recognition and vehicle identification.
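The simplest form of motion-based segmentation is frame differencing: pixels whose brightness changes between two frames are marked as moving. The sketch below is a minimal illustration under the assumption of a static camera and frames stored as lists of rows of grey levels; the function names and the `threshold` parameter are hypothetical, and real systems generally need something more robust.

```python
def motion_mask(prev, curr, threshold):
    """Mark pixels whose grey level changed by more than threshold between frames."""
    return [
        [abs(c - p) > threshold for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def bounding_box(mask):
    """Return (top, left, bottom, right) of the moving pixels, or None if nothing moved."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, moved in enumerate(row) if moved]
    if not rows:
        return None
    return (min(rows), min(cols), max(rows), max(cols))
```

The resulting bounding box could be passed to a later stage, such as a face or vehicle recogniser, so that it only examines the region that moved.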
There is a significant amount of interaction between different areas of the human visual system. For example, information is shared between the stereo and motion subsystems, and between the stereo and line completion subsystems. It is therefore likely that robust artificial vision systems will combine motion processing with other forms of visual processing.
Visual motion information is vital to mobile creatures for tasks such as navigation, obstacle avoidance and limb coordination. Artificial motion processing systems capable of performing these tasks are likely to be useful in similar roles in robotic systems that interact with complex and changing environments.
Motion in the visual field that is caused by the motion of the observer can convey important information about the observer’s motion. This information can be used to stabilise the observer (i.e. to help maintain balance), to avoid obstacles and to identify the observer’s direction of heading. If the observer’s motion is known, then the visual information can be used to infer information about the structure of the environment. Systems capable of performing these kinds of tasks are likely to be useful for self-navigating robots.
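One simple case of inferring structure from known observer motion is motion parallax. Assuming an idealised pinhole camera that translates sideways (parallel to the image plane) through a static scene, a point at depth Z shifts in the image by f·b/Z pixels, where f is the focal length in pixels and b is the distance moved. The sketch below inverts that relation; the function and parameter names are hypothetical.

```python
import math


def depth_from_parallax(focal_px, baseline_m, shift_px):
    """Estimate the depth (in metres) of a static point from its image shift.

    Assumptions: pinhole camera, pure sideways translation of baseline_m
    metres, focal length focal_px in pixels, image shift shift_px in pixels.
    """
    if shift_px == 0:
        # No measurable shift: the point is effectively infinitely far away.
        return math.inf
    return focal_px * baseline_m / abs(shift_px)
```

Nearby points shift more than distant ones, so a larger measured shift yields a smaller estimated depth. Rotation of the observer or independently moving objects would violate the assumptions used here.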
Hand-eye (or limb-eye) coordination in humans and animals is fundamental to many forms of interaction with the environment. Vision helps to produce a closed-loop control system between the limb and the target. Artificial “visual servo” systems may help to control more complex, flexible and lower-cost robot limbs.
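The closed-loop idea can be sketched with a toy proportional controller: at each step the remaining error between limb and target is measured in the image, and the limb is moved by a fraction of that error. This is a minimal simulation under strong assumptions (the limb and target positions are observed perfectly and the limb moves exactly as commanded); the function name and the `gain`, `tolerance` and `max_steps` parameters are hypothetical.

```python
def visual_servo(start, target, gain=0.5, tolerance=0.5, max_steps=100):
    """Simulate a limb moving towards a target using image-plane feedback.

    Returns the list of (x, y) positions visited, starting with start.
    """
    x, y = start
    path = [(x, y)]
    for _ in range(max_steps):
        # Error as it would be measured in the image at this step.
        ex, ey = target[0] - x, target[1] - y
        if (ex * ex + ey * ey) ** 0.5 <= tolerance:
            break
        # Proportional correction: move a fraction of the observed error.
        x += gain * ex
        y += gain * ey
        path.append((x, y))
    return path
```

Because the error is measured afresh at every step, inaccuracies in the limb's movement are corrected by later steps, which is one reason such feedback may allow less precise and cheaper mechanisms to be used.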
Motion information is also a useful cue for some forms of classification. In security systems it is desirable to eliminate false alarms that might be caused by animals or wind. Distinguishing between shapes in still images can be very difficult, but if motion is used as an additional cue then more information is available. It seems that humans are very good at identifying things in this way.
Consider the picture below: this set of eleven points could be almost anything. Now click on the image to view the animated version, and it becomes immediately obvious that the points belong to a moving human figure.
Some forms of communication, such as sign language and lip reading, obviously require motion processing. Both could be important in human-computer interaction.
For more information about this work, contact Richard Beare.