Deep learning for real‑time fruit detection and orchard fruit load estimation: benchmarking of ‘MangoYOLO’
Article Title | Deep learning for real‑time fruit detection and orchard fruit load estimation: benchmarking of ‘MangoYOLO’ |
---|---|
ERA Journal ID | 5325 |
Article Category | Article |
Authors | Koirala, A., Walsh, K. B., Wang, Z. and McCarthy, C. |
Journal Title | Precision Agriculture |
Journal Citation | 20 (6), pp. 1107-1135 |
Number of Pages | 28 |
Year | 2019 |
Publisher | Springer |
Place of Publication | United States |
ISSN | 1385-2256, 1573-1618 |
Digital Object Identifier (DOI) | https://doi.org/10.1007/s11119-019-09642-0 |
Web Address (URL) | https://link.springer.com/article/10.1007/s11119-019-09642-0 |
Abstract | The performance of six existing deep learning architectures was compared for the task of detecting mango fruit in images of tree canopies. Images of trees (n = 1 515) from across five orchards were acquired at night using a 5 megapixel RGB digital camera and 720 W of LED flood lighting in a rig mounted on a farm utility vehicle operating at 6 km/h. The two-stage deep learning architectures Faster R-CNN(VGG) and Faster R-CNN(ZF) and the single-stage techniques YOLOv2, YOLOv2(tiny) and SSD were trained with both original-resolution and 512 × 512 pixel versions of 1 300 training tiles, while YOLOv3 was run only with 512 × 512 pixel images, giving a total of eleven models. A new architecture was also developed, based on features of YOLOv3 and YOLOv2(tiny), on the design criteria of accuracy and speed for the current application. This architecture, termed ‘MangoYOLO’, was trained using: (i) the 1 300 tile training set, (ii) the COCO dataset before training on the mango training set, and (iii) a daytime image training set of a previous publication, to create the MangoYOLO models ‘s’, ‘pt’ and ‘bu’, respectively. Average Precision plateaued with use of around 400 training tiles. MangoYOLO(pt) achieved an F1 score of 0.968 and Average Precision of 0.983 on a test set independent of the training set, outperforming other algorithms, with a detection speed of 8 ms per 512 × 512 pixel image tile while using just 833 MB of GPU memory per image on the NVIDIA GeForce GTX 1070 Ti GPU used for in-field application. The MangoYOLO model also outperformed other models in processing of full images, requiring just 70 ms per image (2 048 × 2 048 pixels) (i.e., capable of processing ~ 14 fps) with use of 4 417 MB of GPU memory. The model was robust in use with images of other orchards, cultivars and lighting conditions. MangoYOLO(bu) achieved an F1 score of 0.89 on a daytime mango image dataset.
With use of a correction factor estimated from the ratio of human count of fruit in images of the two sides of sample trees per orchard and a hand harvest count of all fruit on those trees, MangoYOLO(pt) achieved orchard fruit load estimates that differed from packhouse fruit counts by between 4.6 and 15.2% for the five orchards considered. The labelled images (1 300 training, 130 validation and 300 test) of this study are available for comparative studies. |
Keywords | deep learning, fruit detection, mango, yield estimation |
Contains Sensitive Content | Does not contain sensitive content |
ANZSRC Field of Research 2020 | 300802. Horticultural crop growth and development; 460304. Computer vision; 400799. Control engineering, mechatronics and robotics not elsewhere classified |
Public Notes | Files associated with this item cannot be displayed due to copyright restrictions |
Byline Affiliations | Central Queensland University; Centre for Agricultural Engineering |
Institution of Origin | University of Southern Queensland |
https://research.usq.edu.au/item/q5882/deep-learning-for-real-time-fruit-detection-and-orchard-fruit-load-estimation-benchmarking-of-mangoyolo
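
The arithmetic behind two figures in the abstract can be sketched in Python. This is a minimal illustration, not the authors' code. The function names are hypothetical, and the direction of the correction-factor ratio (hand harvest count divided by human image count) is an assumption, since the abstract states only that a ratio of the two counts was used.

```python
def frames_per_second(latency_ms: float) -> float:
    """Convert a per-image processing time in milliseconds to images per second."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return 1000.0 / latency_ms


def correction_factor(harvest_count: float, image_count: float) -> float:
    """Ratio of fruit harvested from sample trees to fruit counted in their images.

    Assumption: the factor scales image counts up to whole-tree counts,
    compensating for fruit hidden from the camera.
    """
    if image_count <= 0:
        raise ValueError("image_count must be positive")
    return harvest_count / image_count


def estimate_fruit_load(detected_count: float, factor: float) -> float:
    """Scale an orchard-wide machine-vision count by the correction factor."""
    return detected_count * factor


def percent_difference(estimate: float, reference: float) -> float:
    """Absolute difference between estimate and reference, as a percentage of reference."""
    if reference <= 0:
        raise ValueError("reference must be positive")
    return abs(estimate - reference) / reference * 100.0
```

With the timings reported in the abstract, `frames_per_second(70.0)` gives about 14.3, consistent with the stated ~ 14 fps for full images, and `frames_per_second(8.0)` gives 125 tiles per second. The other three functions would be applied per orchard, using sample-tree counts for `correction_factor` and the packhouse count as the `reference` in `percent_difference`.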