Vision Guided Robots Pick Fruit

POSTED 05/15/2018  | By: Winn Hardin, Contributing Editor

While developing a vision-based robotic system for use in a constrained factory environment might be considered a relatively straightforward task, building a comparable system for use in an unconstrained agricultural environment such as crop picking is considerably more complex.

Vision-based robotic systems designed for agricultural purposes must be able to capture images and identify crops of various colors, shapes, sizes, and textures, discriminating the fruit from the rest of the scene even as illumination conditions change. Having done so, they must then compute the location of the fruit in three-dimensional space and instruct one or more robotic grippers to pick the fruit from the tree at that location.

Figure 1: A project dubbed SWEEPER will use the technology developed in the CROPS project to create a robotic solution for harvesting sweet pepper under real-world conditions.

To make the automatic harvesting robot useful in a real application, the fruits must be identified accurately and rapidly. To do so, the crop identification rates for the systems must be as high as possible, a significant challenge for developers of such systems.

Despite the availability of a wide range of imaging sensors, many systems developed in the past were inaccurate for reasons such as inefficient illumination, occlusion, or inaccurate distance estimation. Researchers are now racing to overcome these problems and make crop-picking robotic systems a reality.

Developers of such systems can choose from a wide variety of image sensors (3D cameras, time-of-flight cameras, NIR imagers, and hyperspectral cameras) to collect enough data from a scene for the system software to distinguish fruit from the background and locate it accurately. In some applications, a limited number of sensor inputs may be all that is required, while in others, multiple sensor inputs may be needed to determine the location of the produce to be harvested.

In the EU-FP7-project CROPS, research into the development of a sweet pepper harvesting robot showed how capturing images of fruit using an Allied Vision Technology (Stadtroda, Germany) Stingray F201C camera at multiple camera positions and viewing angles improved the system's ability to detect fruit. Now, a project dubbed SWEEPER will use the technology developed in the CROPS project to create a robotic solution for harvesting sweet pepper under real-world conditions (Figure 1).

While capturing multiple images of a particular crop can add to the effectiveness of agricultural vision guided robotic systems, so too can capturing additional range data from a scene. “2D cameras with efficiency inside the NIR range can already deliver a lot of information for picking the right fruit. In addition, 3D technology (either by stereo or laser or time of flight) can deliver the depth information for the robot,” said René von Fintel, the Head of Product Management at Basler (Ahrensburg, Germany).
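As a rough illustration of how stereo imaging can turn a pixel into a 3D position for the robot, the sketch below applies the standard pinhole relation: depth equals focal length times baseline divided by disparity. It assumes a rectified stereo pair with square pixels. The function name and all camera parameters (focal length, baseline, principal point) are hypothetical values chosen for the example, not figures from any system described in this article.

```python
from typing import Tuple


def disparity_to_point(u: float, v: float, disparity_px: float,
                       focal_px: float, baseline_m: float,
                       cx: float, cy: float) -> Tuple[float, float, float]:
    """Back-project a pixel with known stereo disparity to camera-frame XYZ in metres."""
    if disparity_px <= 0:
        # Zero disparity means no stereo match or a point at infinity.
        raise ValueError("disparity must be positive")
    z = focal_px * baseline_m / disparity_px   # depth from similar triangles
    x = (u - cx) * z / focal_px                # horizontal offset from the optical axis
    y = (v - cy) * z / focal_px                # vertical offset from the optical axis
    return x, y, z


# Hypothetical camera: 1000 px focal length, 0.12 m baseline,
# principal point at pixel (640, 480).
point = disparity_to_point(u=740.0, v=480.0, disparity_px=60.0,
                           focal_px=1000.0, baseline_m=0.12,
                           cx=640.0, cy=480.0)
print(point)
```

With these assumed numbers, a fruit seen 100 pixels to the right of the image center with a 60-pixel disparity lies about 2 m from the camera and about 0.2 m to the right of its optical axis. A time-of-flight or laser sensor would supply the depth directly, replacing only the first computation.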

Indeed, to automate the branch shaking operation commonly used when harvesting cherries from trees, researchers at the Biological Systems Engineering Department at Washington State University developed a vision-based system to locate shaking positions for automated cherry harvesting that used both RGB and 3D image capture techniques.

Figure 2: Researchers at the Biological Systems Engineering Department at Washington State University developed a vision-based system to locate shaking positions for automated cherry harvesting. The circle below the largest cherry cluster represents the shaking position. Reference: Amatya, S.; Karkee, M.; Zhang, Q.; Whiting, M.D. Automated Detection of Branch Shaking Locations for Robotic Cherry Harvesting Using Machine Vision. Robotics 2017, 6, 31.

The color images of the trees were acquired using a Bumblebee XB3 (FLIR, Point Grey Research, Richmond, BC, Canada) stereo-vision camera. The RGB images captured by the central camera were used for detecting branches and cherries, while a PMD Technologies (San Jose, CA) time-of-flight (ToF) based 3D CamCube camera was employed to capture the depth information. First, branch and cherry regions were located in the 2D RGB images; the depth information provided by the 3D camera was then mapped onto them. As a result, the vision-based system was able to decide on the number and location of shaking positions and estimate their 3D location in the canopy (Figure 2).
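The idea of mapping depth onto a region found in a 2D image can be sketched as follows. This is a minimal illustration, not the Washington State University implementation: it assumes the depth map has already been registered to the RGB image (same resolution and pixel alignment), that a depth of 0 marks an invalid reading, and that the region is given as a boolean mask. The function name `locate_region` is hypothetical.

```python
from statistics import median
from typing import List, Optional, Tuple


def locate_region(mask: List[List[bool]],
                  depth_m: List[List[float]]
                  ) -> Optional[Tuple[float, float, float]]:
    """Return (u, v, depth) for a region: its pixel centroid plus its median depth."""
    us, vs, depths = [], [], []
    for v, (mask_row, depth_row) in enumerate(zip(mask, depth_m)):
        for u, (inside, d) in enumerate(zip(mask_row, depth_row)):
            if inside:
                us.append(u)
                vs.append(v)
                if d > 0:              # assumption: 0 marks an invalid depth pixel
                    depths.append(d)
    if not us or not depths:
        return None                    # empty region or no usable depth
    return sum(us) / len(us), sum(vs) / len(vs), median(depths)


# Toy 3x4 scene: a 2x2 region in the middle columns, one invalid depth pixel.
mask = [[False, True,  True,  False],
        [False, True,  True,  False],
        [False, False, False, False]]
depth_m = [[2.0, 1.5, 1.6, 2.0],
           [2.0, 0.0, 1.7, 2.0],
           [2.0, 2.0, 2.0, 2.0]]

result = locate_region(mask, depth_m)
print(result)
```

The median is used rather than the mean so that a few background pixels leaking into the mask, or isolated sensor dropouts, do not pull the depth estimate away from the object. The resulting pixel position and depth can then be converted to a 3D point using the camera's calibration.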

When an RGB camera provides insufficient contrast to enable effective object classification, additional spectral bands can be captured from a scene using either multispectral or hyperspectral imaging. Indeed, this was the approach taken by researchers at the Centre for Automation and Robotics in Madrid, Spain, who developed a vision-based system to detect and localize fruits from different kinds of crops.
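One common way to exploit an extra spectral band is a normalized difference index between two bands, which can raise contrast between materials that look similar in RGB. The sketch below is a generic illustration under that assumption, not the method used by the Madrid researchers. The band values, the choice of NIR and red as the two bands, and the 0.3 threshold are all hypothetical.

```python
from typing import List

Band = List[List[float]]


def normalized_difference(band_a: Band, band_b: Band) -> Band:
    """Per-pixel (a - b) / (a + b); pixels where both bands are zero map to 0.0."""
    index = []
    for row_a, row_b in zip(band_a, band_b):
        index_row = []
        for a, b in zip(row_a, row_b):
            total = a + b
            index_row.append((a - b) / total if total else 0.0)
        index.append(index_row)
    return index


# Hypothetical reflectance values for a one-row, two-pixel image.
nir = [[0.8, 0.2]]
red = [[0.2, 0.2]]

index = normalized_difference(nir, red)

# Hypothetical threshold: keep pixels whose index exceeds 0.3 as candidates.
candidate_mask = [[value > 0.3 for value in row] for row in index]
print(index, candidate_mask)
```

Because the index is a ratio, it is less sensitive to overall brightness changes than a raw band value, which matters outdoors where illumination varies.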

The multisensory system consisted of an Allied Vision Technology Prosilica GC2450 high-resolution CCD color camera, a multispectral imaging system, and a Mesa SwissRanger SR-4000 TOF 3D camera. The RGB camera and multispectral imaging system provided the input data required for the detection and characterization of areas of interest that could belong to fruits, whereas the TOF 3D camera simultaneously supplied fast acquisition of accurate distance and intensity images of targets, enabling the localization of the fruit.

While acquiring a variety of data types can aid in the classification process, developers of vision-based robotic systems also have a number of options for processing that data once it has been acquired. Suitable image processing methods for distinguishing the fruit from the background include classification through deep learning technologies, segmentation by hyperspectral imaging, and shape-based matching.

Of these analysis methods, deep learning techniques are currently garnering the most attention because they can identify fruit and vegetables precisely and differentiate them from other objects in a scene robustly and at great speed.

With MVTec Software’s (Munich, Germany) HALCON software, for example, users are able to train their own classifier to identify a particular type of produce using a Convolutional Neural Network (CNN). Training is performed by presenting a sufficient number of labeled training images to the CNN. The software then analyzes these images and automatically learns which features can be used to identify the given produce. Once the network has been trained, the newly created CNN classifier can then be used to classify new image data.
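To make the notion of a learned "feature" more concrete, the sketch below runs the forward pass of a single convolutional filter followed by a ReLU and a global maximum, written in plain Python. It does not use or represent the HALCON API. The kernel here is picked by hand to respond to a dark-to-bright vertical edge, whereas training a CNN would adjust such weights automatically from the labeled images. All names and values are hypothetical.

```python
from typing import List

Image = List[List[float]]


def conv2d_valid(image: Image, kernel: Image) -> Image:
    """Slide the kernel over the image with stride 1 and no padding.

    Like most CNN libraries, this does not flip the kernel (cross-correlation).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map: Image) -> Image:
    """Clamp negative responses to zero."""
    return [[max(0.0, x) for x in row] for row in feature_map]


def global_max(feature_map: Image) -> float:
    """Summarize a feature map by its strongest response."""
    return max(max(row) for row in feature_map)


# Hand-picked filter that fires on a dark-to-bright vertical edge.
edge_kernel = [[-1.0, 1.0],
               [-1.0, 1.0]]

# Toy image: dark left half, bright right half.
image = [[0.0, 0.0, 1.0, 1.0],
         [0.0, 0.0, 1.0, 1.0],
         [0.0, 0.0, 1.0, 1.0]]

feature_map = conv2d_valid(image, edge_kernel)
response = global_max(relu(feature_map))
print(feature_map, response)
```

A real CNN stacks many such filters in several layers and feeds their responses to a final classification stage, so the network as a whole learns which combinations of edges, colors, and textures indicate a given type of produce.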

“This is a big advantage compared to all previous classification methods that required skilled engineers with programming and vision knowledge,” said Thomas Huenerfauth, Product Owner, HALCON Library at MVTec.

Indeed, working on a project known as DeepFruits, researchers at the Queensland University of Technology (Brisbane, Australia) have already demonstrated how CNNs can be employed to distinguish fruit in imagery obtained from RGB and near-infrared cameras. One of the advantages of the system they developed is that it could be retrained to detect new types of fruit very quickly, with the entire process taking just four hours per fruit.

In the development of vision-based robotic harvesting machines, capturing and processing images to determine the location of produce is only part of a multifaceted problem. Developers must also build a control system that works in concert with the vision system to actuate a robot gripper to pick the produce.

The type of gripper or actuator employed in such systems is highly dependent on the nature of the produce to be harvested. This has led to the development of many different types of grippers and cutting devices. While soft fruit may demand a soft-touch gripper that does no more damage to the fruit than a human picker would, harder fruit can be picked by an actuator that grasps the fruit by vacuum and then rotates it to sever it from the tree.

As an automation specialist, Festo (Islandia, NY) helped develop two such gripping and cutting devices in the EU-FP7-project CROPS with partners at the Technical University of Munich. The gripping elements on their pepper gripper were passively adaptive FinRay fingers, which were used previously on Festo’s Bionic Handling Assistant, a flexible gripper arm modeled on an elephant’s trunk, and were paired with a knife to cut the peppers off the plant. On the apple gripper, the gripping element consisted of membrane jaws, which separated the fruit from the plant either by twisting and pulling or by using a third finger that pressed on the stalk of the apple.

Because so many disparate crops are currently harvested by hand, and because they are grown under so many different conditions, a universal robot that can harvest multiple types of crops may still be some way off. But over the next few years, many vision guided robotic harvesting systems will undoubtedly be deployed that can harvest specific types of produce reasonably effectively.