Automated pneumoconiosis detection: cascade machine learning model
Pneumoconiosis is an incurable respiratory illness caused by long-term inhalation of respirable dust such as coal, asbestos, and silica. The form caused by coal dust is commonly known as black lung.
About 25,000 people died of pneumoconiosis globally in 2013. In Queensland, Australia, about 100 cases of mine dust lung diseases have been diagnosed since 1984. But with the recent re-emergence of pneumoconiosis, more cases are feared to have been missed.
Poor dust control and patchy medical screening are to blame for the resurgence of this potentially deadly disease, even in developed countries. To date, detecting pneumoconiosis and assessing its progression in individual coal miners has relied on expert radiologists rather than on systematic, automated, and objective systems.
Because the incidence of pneumoconiosis is low and the sharing of patient data is restricted, too few pneumoconiosis X-rays are available. The resulting datasets are imbalanced, which makes training deep learning models significantly harder.
The Imaging and Computer Vision Group at CSIRO’s Data61 is working on the use of both real and synthetic pneumoconiosis radiographs to train a cascade machine learning model for the automated detection of pneumoconiosis.
Early detection of pneumoconiosis through routine medical screening is critical to preventing complications including death.
Fig. 1 above shows the architecture of the proposed cascade learning model, which includes:
- Machine learning based lung field segmentation – we used a pixel-based machine learning algorithm that employs Pixel Classification (PC) to distinguish between lung and non-lung areas in a radiograph.
- Cycle-Consistent Adversarial Networks (CycleGAN) image generator – we trained a CycleGAN using 56 normal and 56 pneumoconiosis lung fields to generate 1,000 normal and pneumoconiosis lung field images, respectively.
- Image augmentation – all images are normalized. Training images are shifted to zero mean and divided by their standard deviation, then randomly zoomed, flipped horizontally, and sheared.
- Convolutional Neural Network (CNN) based image classifier – the classifier is composed of 15 neural network layers, including 8 convolutional layers to extract feature maps, 4 pooling layers, and 3 dense layers.
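The normalization and horizontal flipping described in the image augmentation item above can be illustrated with a minimal sketch. It is not the project's code: the function names `standardize` and `random_horizontal_flip`, the flip probability of 0.5, and the use of plain nested lists for an image are all assumptions made for illustration. Random zoom and shear are left out to keep the example short.

```python
import random
import statistics


def standardize(image):
    """Shift pixel values to zero mean and scale them to unit standard deviation.

    `image` is a list of rows, each row a list of pixel intensities
    (an assumed representation, chosen to avoid external dependencies).
    """
    pixels = [p for row in image for p in row]
    mean = statistics.fmean(pixels)
    std = statistics.pstdev(pixels)
    if std == 0:
        std = 1.0  # a constant image cannot be scaled; leave it centred only
    return [[(p - mean) / std for p in row] for row in image]


def random_horizontal_flip(image, rng, probability=0.5):
    """Mirror the image left to right with the given probability.

    The probability value is an assumption; the source only says that
    training images are flipped horizontally at random.
    """
    if rng.random() < probability:
        return [row[::-1] for row in image]
    return [row[:] for row in image]


# Example: prepare one tiny, made-up training image.
rng = random.Random(42)
toy_image = [[10.0, 20.0, 30.0],
             [40.0, 50.0, 60.0]]
augmented = random_horizontal_flip(standardize(toy_image), rng)
```

Passing an explicit `random.Random` instance keeps the augmentation reproducible, which helps when comparing training runs.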
In automated detection of pneumoconiosis from chest X-rays, our proposed method outperforms the other methods it was compared with, achieving a sensitivity of 93.33%, a specificity of 88.46%, and an overall accuracy of 90.24%.
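For readers less familiar with these measures, the sketch below shows how sensitivity, specificity, and accuracy are derived from a confusion matrix. The counts used (14 of 15 pneumoconiosis cases and 23 of 26 normal cases classified correctly) are an assumption: they are not stated in this article and were chosen only because they reproduce the reported percentages. The function name `screening_metrics` is likewise hypothetical.

```python
def screening_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp: diseased cases correctly flagged     fn: diseased cases missed
    tn: normal cases correctly cleared       fp: normal cases wrongly flagged
    """
    sensitivity = tp / (tp + fn)   # share of diseased cases detected
    specificity = tn / (tn + fp)   # share of normal cases correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Assumed counts, chosen to be consistent with the reported figures.
sens, spec, acc = screening_metrics(tp=14, fn=1, tn=23, fp=3)
print(f"sensitivity={sens:.2%} specificity={spec:.2%} accuracy={acc:.2%}")
# prints: sensitivity=93.33% specificity=88.46% accuracy=90.24%
```

In a screening setting, sensitivity is usually the figure to watch most closely, since a missed case (a false negative) delays diagnosis.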
We hope this technology can be used to pre-screen for occupational lung diseases, and to address both the variability in identifying pneumoconiosis and the shortage of B-readers (physicians certified to classify chest radiographs for pneumoconiosis).
The cascade learning model could potentially be used in other medical imaging applications where the training dataset is imbalanced or lacks diversity.
Our highly skilled team of world class researchers and engineers is open to partnerships and collaborations for research, development, and commercialisation.