Spectroscopy is a commonly used technique for analysing the composition of materials. In industry it is used to analyse ores, soils, chemicals, polymers, pharmaceuticals and many other materials. It is also a valuable technique for predicting the performance of manufactured products based on the properties of raw materials. At CSIRO, we have developed a range of statistical methods for analysing spectroscopic data and used them to solve a number of important industrial problems. Our methods can add value to your analyses by extracting more useful information from spectroscopic data.
Spectroscopy is the study of the interaction of electromagnetic radiation with a chemical substance. Modern spectroscopic methods are based on the phenomena of absorption, reflectance, fluorescence, emission or scattering.
For example, infrared spectroscopy is one of the most commonly used analytical methods in industry today. The fraction of radiation reflected by a sample is measured over many contiguous wavelengths. This produces the characteristic reflectance spectrum of the sample. Some examples of infrared spectra are shown in Figure 1 (b) below.
Spectroscopic data present unique challenges because the measurements at neighbouring wavelengths are highly correlated. Sophisticated statistical methods need to be applied to such data, either to identify the components of a mixture (and their concentrations) or for prediction purposes. Techniques used include partial least squares, principal component regression, penalised discriminant analysis and neural networks.
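As a rough illustration of one of the techniques named above, the following Python sketch applies principal component regression to synthetic spectra using NumPy. The function names (`fit_pcr`, `predict_pcr`), the two Gaussian "pure component" spectra and the noise level are assumptions made purely for this example; it is not a description of the CSIRO methods.

```python
import numpy as np

def fit_pcr(X, y, n_components):
    """Fit principal component regression; return (x_mean, y_mean, coef)."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    # SVD of the centred spectra: rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T                  # (n_bands, n_components)
    scores = Xc @ V                          # uncorrelated component scores
    # Ordinary least squares on a few scores instead of many correlated bands.
    gamma, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)
    coef = V @ gamma                         # map back to one weight per band
    return x_mean, y_mean, coef

def predict_pcr(model, X):
    """Predict the response for new spectra X (one spectrum per row)."""
    x_mean, y_mean, coef = model
    return (X - x_mean) @ coef + y_mean

# Synthetic example data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
n_samples, n_bands = 60, 200
wavelengths = np.linspace(1960, 2490, n_bands)
pure = np.stack([
    np.exp(-0.5 * ((wavelengths - 2200) / 40) ** 2),
    np.exp(-0.5 * ((wavelengths - 2330) / 60) ** 2),
])
conc = rng.uniform(0, 1, size=(n_samples, 2))
X = conc @ pure + rng.normal(0, 0.01, size=(n_samples, n_bands))
y = conc[:, 0]                               # concentration of component 1

# Train on the first 40 spectra, test on the remaining 20.
model = fit_pcr(X[:40], y[:40], n_components=2)
pred = predict_pcr(model, X[40:])
rmse = float(np.sqrt(np.mean((pred - y[40:]) ** 2)))
print(f"test RMSE: {rmse:.4f}")
```

Regressing on a small number of component scores, rather than on all 200 bands directly, is what lets the method cope with the strong correlation between bands. In practice the number of components would normally be chosen by cross-validation.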
A hyperspectral image records a spectrum (represented by several hundred numbers) at each pixel. While greyscale or colour images can discriminate between, say, rocks and vegetation, hyperspectral images can discriminate between different types of rock or vegetation.
The major airborne applications of hyperspectral imaging are in mineral exploration, environmental monitoring and military surveillance. Major airborne hyperspectral scanners include NASA’s AVIRIS (224 channels) and HyMap™ (128 channels). Both scanners record their images at visible and infrared wavelengths. Figure 1 (a) shows 54 AVIRIS shortwave infrared images (between 1960 and 2490 nanometres (nm)) of Oatman, formerly the site of a gold mine in Arizona. The images have been atmospherically corrected and normalised in a certain way.
Fig. 1 (b) shows the 54 values at 6 pixels in Fig. 1 (a) as 6 reflectance spectra. They are shown as a “stackplot”, one above the other. This can be done because in many applications it is the shape of a spectrum (and not its brightness) which enables its identification. The six spectra in Fig. 1 (b) have been chosen for their spectral distinctness.
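The relationship between the image bands and the stacked spectra can be sketched in a few lines of Python. The example below assumes the image is held as a NumPy array of shape (rows, columns, bands); the random cube, the pixel coordinates and the helper names `extract_spectra` and `stack_spectra` are hypothetical stand-ins, not the actual Oatman data or the plotting code behind the figure.

```python
import numpy as np

def extract_spectra(cube, pixels):
    """Return the spectrum at each (row, col) pixel of a (rows, cols, bands) cube."""
    rows, cols = zip(*pixels)
    return cube[list(rows), list(cols), :]   # shape (n_pixels, n_bands)

def stack_spectra(spectra, gap=0.1):
    """Shift each spectrum vertically so the curves sit one above the other.

    A vertical shift changes a spectrum's brightness but not its shape,
    which is why the stacked display remains usable for identification.
    """
    stacked = np.empty_like(spectra, dtype=float)
    base = 0.0
    for i, s in enumerate(spectra):
        stacked[i] = s - s.min() + base      # lowest point sits at `base`
        base = stacked[i].max() + gap        # next curve starts above this one
    return stacked

# Hypothetical 54-band cube and six pixel locations.
rng = np.random.default_rng(1)
cube = rng.uniform(0.2, 0.8, size=(50, 40, 54))
pixels = [(3, 5), (10, 22), (17, 8), (25, 30), (33, 12), (48, 39)]

spectra = extract_spectra(cube, pixels)
stacked = stack_spectra(spectra)
print(stacked.shape)
```

Each row of `stacked` can then be plotted against wavelength with any plotting library to give a display of this kind.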
“Terrestrial” applications of hyperspectral image analysis are also beginning to emerge, in areas such as cancer detection, pharmaceuticals and food inspection.
One of the most important challenges is the volume of data generated, especially in airborne surveys. It is now not uncommon for a single survey to generate hundreds of gigabytes of data. There is therefore a major need for information extraction algorithms that are automated, fast and reliable. In particular, there is a need to identify, automatically and quickly, the mineral or vegetation species that spectra such as those in Fig. 1 (b) represent. We have developed a spatial version of our spectral identification package, The Spectral Assistant, for use with hyperspectral images. It is called The Spatial Spectral Assistant (TSSA).
However, airborne hyperspectral image data present additional difficulties:
1. The width of a pixel in images such as those in Fig. 1 (a) is between 5 and 30 metres. Therefore most pixels will contain a mixture of materials, and so there is a need to “unmix” the spectrum recorded at each pixel into the spectra of its constituent materials. Unmixing is also sometimes called supervised mixture decomposition and can be thought of as an extension of classification. TSSA is designed to do this given a library of pure spectra, representing all the materials in a given hyperspectral image.
2. Atmospheric gases, viewing geometry and topography significantly distort the spectra recorded by hyperspectral scanners. These spectra need to be “corrected” so that they can be matched against a suitable spectral library. The data in Figs. 1 (a) and 1 (b) have been corrected with a leading correction package. Unfortunately, correction is difficult and existing packages are still not sufficiently reliable for use with spectral libraries. For instance, 5 of the 6 spectra in Fig. 1 (b) have an “absorption feature” just above 2000 nm. This is due to residual carbon dioxide.
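The unmixing step described in item 1 can be illustrated with a minimal Python sketch. It assumes a linear mixing model, in which a pixel's spectrum is a non-negative weighted sum of the library spectra, and estimates the weights by projected gradient descent. This is a simple stand-in chosen for the example and should not be read as the algorithm used in TSSA; the function name `unmix_nonnegative`, the three synthetic library spectra and the noise level are all hypothetical.

```python
import numpy as np

def unmix_nonnegative(library, spectrum, n_iter=5000):
    """Estimate non-negative abundances a minimising ||library.T @ a - spectrum||^2.

    library  : (n_materials, n_bands) array of pure spectra
    spectrum : (n_bands,) mixed spectrum recorded at one pixel
    """
    A = library.T                                  # (n_bands, n_materials)
    AtA = A.T @ A
    Atb = A.T @ spectrum
    # Step size 1/L, where L is the largest eigenvalue of A^T A.
    step = 1.0 / np.linalg.eigvalsh(AtA).max()
    a = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = AtA @ a - Atb                       # gradient of 0.5 * ||A a - b||^2
        a = np.maximum(a - step * grad, 0.0)       # project back onto a >= 0
    return a

# Hypothetical library: three reflectance spectra, each with one absorption feature.
wavelengths = np.linspace(1960, 2490, 54)
centres = [2100.0, 2250.0, 2400.0]
library = np.stack([
    1.0 - 0.5 * np.exp(-0.5 * ((wavelengths - c) / 40.0) ** 2) for c in centres
])

# A pixel containing 60% of material 0, 40% of material 1 and none of material 2.
true_abundance = np.array([0.6, 0.4, 0.0])
rng = np.random.default_rng(2)
pixel = true_abundance @ library + rng.normal(0, 0.002, size=wavelengths.size)

estimate = unmix_nonnegative(library, pixel)
print(np.round(estimate, 3))
```

The sketch only enforces non-negativity. A fuller treatment might also require the abundances to sum to one, and would need to handle library spectra that are nearly collinear, where the estimates become sensitive to noise and to residual atmospheric effects such as those described in item 2.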