3D Machine Vision: One Small Step for Robots, A Giant Leap for Factory Automation

POSTED 06/20/2019

Robots play an essential role in automating many processes on the factory floor. As manufacturers forge ahead with automation, some tasks are proving more difficult for robots than others. Random bin picking is one of them.

Selecting and picking parts from an unstructured pile requires a machine that can see, make decisions, and digitally instruct the robot, a multifunctional challenge that has perplexed engineers for years.

3-D machine vision systems overcome these hurdles, enabling machines to "see," analyze, and make decisions in an unpredictable environment. Designed specifically for factory work, these new machines meet production requirements for high accuracy, high speed, and low maintenance while also being durable enough to withstand factory conditions.

Automating random bin picking can reduce costs, increase productivity and capacity, and reduce workplace injuries by assuming tasks that pose danger to humans (e.g., the handling of hot, sharp, or heavy products). Perhaps more significantly, this new technology demonstrates the ability for machine vision to adapt to changing environments which has the potential to open the floodgates for automation.


Bin picking is a low-skill, highly monotonous job for humans. The job entails taking a part from a bin and delivering it to the next step in a process, often a production or assembly line. When parts are positioned predictably in a bin, the picking process is easier to automate.

To date, robots have mainly been used in predictable work processes. But, with random bin picking, parts placement in a bin is not predictable. To complicate the situation, each time a part is removed from the bin, the remaining parts may shift in position and orientation. Thus, the environment is subject to change every time a part is picked.

Let’s consider how random bin picking works for a human. When a person looks at a bin full of identical, overlapping parts, they see the individual parts. Deciding which part to pick is effortless, and the fine motor skills required for gripping are far more refined in humans than in robots. Even young children are fully capable of this seemingly simple task of unloading a bin of randomly piled parts.

Yet automating this process requires teaching a machine to "see" individual parts (including their position and orientation), then to "analyze" and "decide" which part to pick next and how the robot should approach it for picking. All of this is more complex than it sounds.


Today, random bin picking is done either by humans or by 2-D picking systems with human assistance.

While 2-D vision can recognize the outline of individual parts in a pile, it cannot perceive their depth. For example, 2-D vision cannot determine whether a part sits on top of or underneath another part. This limitation creates the need for a human to remove the parts from the bin and lay them on a flat surface; the robot can then pick up and distribute the parts from there.


Now, imagine a robot able to pick individual parts from a pile of hundreds in a bin, from top to bottom. That requires 3-D vision to perceive the depth of the parts, as well as software to decide which part is best to pick next. Machine vision then tells the robot how to position itself to pick that part.

The integration between the vision system and the robot is one of the more challenging aspects of 3-D bin picking. Once a part is selected as the next to be picked, and its position and orientation are known, the robot’s hand must be moved into the proper “pick position” so the part can be picked up successfully on the first attempt. Transferring these digital instructions requires integrating the vision hardware and software with the robot.
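One small but essential piece of that integration is converting a part pose from the vision system's camera frame into the robot's base frame. The sketch below shows the idea with a fixed, already-calibrated hand-eye transform; the matrix values and function name are illustrative placeholders, not Canon's actual interface.

```python
# Sketch: mapping a point from the camera frame to the robot base frame
# with a 4x4 homogeneous transform. Values are placeholders, not a real
# calibration.

def apply_transform(T, p):
    """Apply a 4x4 homogeneous transform T (row-major nested lists)
    to a 3-D point p, returning the transformed (x, y, z)."""
    x, y, z = p
    v = (x, y, z, 1.0)
    return tuple(sum(T[r][c] * v[c] for c in range(4)) for r in range(3))

# Hypothetical example: camera mounted 0.8 m above the robot base,
# with axes aligned (identity rotation).
T_base_from_cam = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.8],
    [0.0, 0.0, 0.0, 1.0],
]
part_in_base = apply_transform(T_base_from_cam, (0.1, -0.05, 0.4))
print(part_in_base)  # the same point expressed in the robot base frame
```

In a real cell, the rotation part of the matrix would come from a hand-eye calibration routine rather than being written by hand.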

Canon’s 3-D RV series machine vision hardware consists of a projector and a camera. The hardware creates 2-D and 3-D images, collects the data, and sends it to the software.

The brain of any system is its software. The software receives the data, analyzes it, and based on the parameters given, recognizes the parts and decides which part to pick next.

The software includes a unique library (or dictionary) for each part. The library helps the software to identify patterns, recognize part position and orientation in a pile, and then decide which part to pick next based on criteria for ease and accuracy.
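As a rough illustration of that decision step, the "ease and accuracy" criteria might be combined into a single score per recognized part. The criteria and weights below are assumptions made for illustration, not Canon's actual selection logic.

```python
# Hypothetical pick-selection step: rank recognized parts and return the
# most promising candidate. The scoring criteria (height in the pile,
# recognition confidence, clearance around the part) are illustrative
# assumptions.

def choose_next_part(candidates):
    """candidates: list of dicts with 'z_top' (height of the part's top
    surface, m), 'confidence' (0..1 recognition score), and 'clearance'
    (free space around the part, m). Returns the best candidate, or
    None if the bin appears empty."""
    if not candidates:
        return None

    def score(c):
        # Prefer parts near the top of the pile, confidently recognized,
        # with room around them for the gripper.
        return c["z_top"] + 2.0 * c["confidence"] + c["clearance"]

    return max(candidates, key=score)
```

A real system would fold in many more factors, such as graspable surface area and approach angle, but the structure (score candidates, pick the maximum) is the same.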

The software is also programmed to help the robot with decisions such as what task to do when the bin runs out or optimizing the safest path for the part to be delivered to its destination without interfering with other parts or bin walls.


To meet the needs of factory customers, the 3-D machine vision system needed to be compact in size, maintenance-free, and dust- and water-resistant. These customer requirements were built into the design. For instance, to meet the need for a maintenance-free machine, Canon opted for natural air cooling instead of a built-in fan. This decision was made intentionally to avoid production stoppages which would result from mechanical fan breakdowns.

How does the 3-D machine vision process work? The system follows five steps.

[1] Various patterns are projected onto randomly piled parts.

[2] The distance between the randomly piled parts and the sensor is measured.

[3] The pre-registered pattern dictionary and 3D CAD models are used to recognize part positioning and orientation.

[4] The system determines whether the robot hand can grasp the part without making contact with other parts.

[5] Data is sent to the robot controller.
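In highly simplified form, the five steps above amount to a loop like the one sketched below. The object interfaces, function names, and data shapes are hypothetical, not Canon's actual API.

```python
# Illustrative sketch of one bin-picking cycle, following steps [1]-[5].
# All interfaces here are assumptions for illustration.

from dataclasses import dataclass

@dataclass
class PartPose:
    x: float      # position in meters
    y: float
    z: float
    roll: float   # orientation in radians
    pitch: float
    yaw: float

def bin_picking_cycle(sensor, dictionary, robot) -> bool:
    """Run one pick cycle; return True if a part was sent for picking."""
    images = sensor.project_and_capture()           # [1] project patterns
    point_cloud = sensor.measure_depth(images)      # [2] range measurement
    candidates = dictionary.recognize(point_cloud)  # [3] pose recognition
    for pose in candidates:                         # [4] graspability check
        if robot.can_grasp_without_collision(pose, point_cloud):
            robot.send_pick_command(pose)           # [5] hand off to controller
            return True
    return False  # nothing graspable this cycle
```

The key structural point is that steps [3] and [4] can reject candidates, so the loop falls through to the next-best part rather than failing outright.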

To help the system recognize the shapes and calculate their distances, Canon's system uses the active stereo method to image parts. A projector beams light onto the targeted object, the reflection of light is measured with the camera, and the object’s position is computed.
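At the core of this triangulation is the pinhole relation between disparity (the apparent shift of a projected feature between projector and camera) and depth. A minimal sketch, using illustrative numbers rather than Canon specifications:

```python
# Sketch of depth-from-disparity in an active stereo setup, based on the
# pinhole model z = f * b / d. The focal length, baseline, and disparity
# values below are illustrative, not Canon specifications.

def depth_from_disparity(focal_px: float, baseline_m: float,
                         disparity_px: float) -> float:
    """Depth (m) from focal length (px), baseline (m), disparity (px)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# A projected stripe shifted by 200 px between projector and camera,
# with a 1400 px focal length and 12 cm baseline:
z = depth_from_disparity(focal_px=1400.0, baseline_m=0.12,
                         disparity_px=200.0)
print(z)  # → 0.84 (meters)
```

The inverse relationship explains why accuracy matters so much here: small errors in measuring the disparity translate directly into depth errors, which grow with distance.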

For Canon’s optical designers, the biggest challenge was ensuring high accuracy. The optical performance required for 3-D machine vision systems is quite different from that of conventional lenses. Conventional lenses must minimize image distortion while delivering high resolution and brightness. In contrast, to ensure high speed and accuracy, 3-D machine vision uses the stereo ranging method, an approach based on the principle of the pinhole camera.

To achieve a balance of high speed and high accuracy, Canon’s software engineers developed new algorithms. 3-D machine vision operations can be divided into three basic steps: pattern projection, distance measurement, and part recognition. These steps enable the system to recognize the shape of each part and to pick it out of a mound of randomly piled parts. For production lines to operate at high speed, Canon’s system can measure the distance to a part and recognize it in approximately 1.8 to 2.5 seconds.

Simple set-up for the customer was another important goal for Canon. The new system only requires the customer to take five pictures of the randomly piled parts and to press the “Create Dictionary” button. Even users lacking any specialized knowledge can add new parts to their system.


1. High recognition accuracy

High accuracy parts recognition is what enables robots to precisely pick up small parts, thin parts, or complicated shapes. Precise picking requires a high accuracy camera and a good projector, along with software to analyze the data that comes from the hardware.

2. Simple installation and setup

With conventional 3-D machine vision, setup is very difficult. One of the more challenging aspects of setup can be populating the library of parts. The latest systems have a much simpler user interface for ease of use.

Calibration is also easier to maintain on the newer software. After the initial set-up, the software monitors the hardware movement over time. If the hardware shifts beyond the recommended amount, the software signals the user to manually recalibrate to maintain precise bin picking.
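A drift monitor of this kind might, in simplified form, compare recent measurement error against a threshold and flag the operator when it is exceeded. The threshold and window size below are illustrative assumptions, not Canon's values.

```python
# Hypothetical calibration drift monitor: flag manual recalibration when
# the recent mean reprojection error exceeds a threshold. The 0.5 px
# threshold and 20-sample window are illustrative assumptions.

def needs_recalibration(errors_px, threshold_px=0.5, window=20):
    """errors_px: history of per-cycle reprojection errors in pixels.
    Returns True when the mean over the last `window` samples drifts
    past `threshold_px`."""
    recent = errors_px[-window:]
    if not recent:
        return False
    return sum(recent) / len(recent) > threshold_px
```

Averaging over a window, rather than reacting to a single bad measurement, keeps the system from nagging the user about transient noise while still catching a genuine mechanical shift.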

3. One camera system

Some 3-D vision systems require an additional 2-D camera to identify part orientation once a part is placed by a robot after 3-D recognition. A one camera system eliminates the need for alignment and programming of multiple cameras.


Robotic bin picking can improve cost efficiency, productivity, and capacity, and has the potential to reduce workplace injuries. The cost of a robot equipped with machine vision is about $60,000 to $80,000.

A prime application for 3-D machine vision is the automotive industry. Much of the bin picking for automotive parts today is done by humans, but it is a mundane job subject to human limits on speed and consistency. Automotive assembly is a likely application in the near future, and automotive inspection procedures are also targeted for automation.

The significance of this improved machine vision and decision making ability is the potential for robots to perform a broader array of jobs — including tasks that require adapting to changing environments.

This technology sets the stage for a giant leap in factory automation.