Synthetic Arabidopsis Dataset

We introduce a synthetic dataset of 10,000 top-down images of Arabidopsis plants, each with leaf instance segmentation labels. The dataset was designed to accompany the real dataset provided with the Leaf Segmentation Challenge of the Computer Vision Problems in Plant Phenotyping (CVPPP) workshop. Our dataset can be downloaded at: . We also release a pre-trained leaf instance segmentation model based on the Mask-RCNN architecture.

Contact Dr. Peyman Moghadam for further information.

Detailed documentation and instructions are available in the Synthetic-arabidopsis-dataset.

A visualisation of selected images from the synthetic Arabidopsis dataset with their corresponding leaf segmentation labels.
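
Instance segmentation labels of this kind are often consumed as one binary mask per leaf. The sketch below shows one way to split a label image into such masks. It assumes, hypothetically, that a label has already been loaded as a 2-D integer array in which 0 is background and every leaf has its own id; the actual file layout is described in the dataset documentation, and colour-coded labels would first need to be mapped to integer ids. The function name `label_image_to_masks` is illustrative and not part of the released code.

```python
import numpy as np


def label_image_to_masks(label_image, background=0):
    """Split an integer label image into one boolean mask per leaf.

    Assumption: each pixel stores the id of the leaf it belongs to,
    and `background` marks pixels that belong to no leaf.
    """
    label_image = np.asarray(label_image)
    leaf_ids = np.unique(label_image)
    leaf_ids = leaf_ids[leaf_ids != background]
    # Broadcasting compares every pixel against every leaf id at once,
    # giving an array of shape [height, width, number_of_leaves].
    masks = label_image[..., None] == leaf_ids
    return masks, leaf_ids


# Tiny hand-made example: two "leaves" (ids 1 and 2) on a 3x4 background.
toy_label = np.array([
    [0, 1, 1, 0],
    [0, 1, 2, 2],
    [0, 0, 2, 0],
])
toy_masks, toy_ids = label_image_to_masks(toy_label)
print(toy_masks.shape, toy_ids.tolist())
```

The [height, width, instances] layout matches the mask format that Matterport's Mask-RCNN dataset classes return from `load_mask`, which is why it is used here.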

Pre-trained Model and Code

A leaf segmentation model trained on this synthetic data is also available, together with a Python script to run it, at: . The model requires Matterport’s implementation of Mask-RCNN. Instructions for setting up and running the code can be found in the README file of the code repository.
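
Matterport's Mask-RCNN returns its detections for an image as a dictionary that includes a boolean `masks` array of shape [height, width, instances] and a `scores` array. To compare predictions against label images such as those in this dataset, the per-leaf masks can be merged into a single integer image. The following is a minimal sketch of that post-processing step only; it does not load or run the model, and `masks_to_label_image` is a hypothetical helper, not a function from the released repository.

```python
import numpy as np


def masks_to_label_image(masks, scores):
    """Merge per-instance masks [H, W, N] into one integer label image.

    Background is 0. Leaves are numbered 1..N in descending score order,
    and a pixel claimed by several leaves goes to the highest-scoring one.
    """
    masks = np.asarray(masks, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    label_image = np.zeros(masks.shape[:2], dtype=np.int32)
    # Paint from lowest to highest score so confident leaves end up on top.
    order = np.argsort(scores)
    n_leaves = len(order)
    for rank, idx in enumerate(order):
        label_image[masks[..., idx]] = n_leaves - rank
    return label_image


# Two overlapping toy masks on a 2x3 image; the second one scores higher.
mask_a = np.array([[1, 1, 0],
                   [0, 0, 0]], dtype=bool)
mask_b = np.array([[0, 1, 1],
                   [0, 0, 1]], dtype=bool)
toy_masks = np.stack([mask_a, mask_b], axis=-1)
toy_labels = masks_to_label_image(toy_masks, scores=[0.6, 0.9])
print(toy_labels)
```

Resolving overlaps by score is one reasonable choice among several; the repository README remains the reference for how the released script handles its output.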


To attribute this dataset, please include the following citations:

D. Ward and P. Moghadam. Synthetic Arabidopsis Dataset. v4. CSIRO. Data Collection, 2018.

D. Ward, P. Moghadam, and N. Hudson. Deep Leaf Segmentation Using Synthetic Data. In Proceedings of the British Machine Vision Conference (BMVC) Workshop on Computer Vision Problems in Plant Phenotyping (CVPPP), 2018.


The Synthetic Arabidopsis Dataset is licensed under the CSIRO Data Licence.

The leaf_segmenter_public repository is licensed under the CSIRO Open Source Software License Agreement (a variation of the BSD / MIT License).

Supporting Publication: 

D. Ward, P. Moghadam, and N. Hudson. Using synthetic data to boost automated image-based plant phenotyping. The 5th International Plant Phenotyping Symposium (IPPS 2018), Adelaide, Australia, October 2018.