Picking apples is a highly repetitive task that can be automated using AI. Such tasks highlight the importance of portable, low-cost devices: because the robot must cover large distances outdoors, carrying large and expensive machines would be unproductive and could create hazards. The robot must be able to locate apples, which can be done with machine learning algorithms such as neural networks; convolutional neural networks are commonly used for image-based tasks of this kind. Among portable devices designed for such workloads is the Jetson Nano, a device created by Nvidia. We trained multiple models on a dataset of apples that includes depth values, and we augmented the data to help the models generalize better. We found that using depth data consistently improves the F1-score of the models' predictions by a few percentage points. Augmentation, however, did not yield such conclusive results: overall, the difference in performance between models trained on RGB data and on augmented RGB data was negligible. When testing the models on the Jetson Nano, we found that inference often took about 15x longer than on more powerful, non-portable machines. The usability of the Jetson Nano therefore depends entirely on the inference time a given task can tolerate. This paper can serve as a basis for deciding whether the Jetson Nano is suitable for a particular object detection task.