Industry Insights
Theory vs. Commercial Reality in Machine Vision Algorithm Development
POSTED 03/22/2011
| By: Winn Hardin, Contributing Editor
Industrial machine vision is a unique technology that benefits greatly from developments in signal processing and computer hardware without significantly driving the direction of that technology. This is mainly because in terms of market size, machine vision falls behind broadcast and consumer imaging markets in both total hardware and software utilization.
However, as the machine vision market grows into new areas, such as human-machine interaction similar to Microsoft’s Xbox Kinect 3D vision system, automated classification and data mining of images on the Internet and in personal image libraries, automotive guidance systems, cell phones, and other high-volume applications, the gap between benefiting from other markets and driving those markets is shrinking.
While this gives machine vision a seat closer to the head of the technology ‘table,’ machine vision – like all of these non-industrial applications – must bow to the fiscal balancing act of what is possible versus what is sustainable. In machine vision terms, this means weighing new software features against what industrial applications can actually support, given the capabilities of designers, integrators, plant engineers, and operators – few of whom are trained computer scientists or optical engineers able to exploit the most powerful capabilities of the latest machine vision software. The result is a compromise: continuous improvement of widely used machine vision image processing algorithms within the limits of the latest hardware, tempered by smart HMIs and graphical programming interfaces that give specialists access to low-level code while still allowing generalists to develop powerful machine vision solutions.
MV Software: Then and Now
During the past 30 years, the universe of machine vision software has added a few powerful new tools, but the bulk of machine vision software product enhancements from year to year have come from optimizing existing tools and simplifying programming interfaces to make the technology accessible to greater numbers of system developers.
Machine vision still focuses on feature classification and extraction, image processing, and pattern recognition. As sensor arrays have grown larger, multi-scale signal analysis has become more critical. And as machine vision has moved from the 2D inspection of parts on a conveyor to 3D robot guidance and inspection of complex parts, understanding projection and 3D ray tracing has become more critical. The industry still depends heavily on principal and independent component analysis, Markov models, various compression techniques, and other powerful mathematical tools.
Growth in microprocessor speeds has opened the machine vision universe to color applications that are driving research into hardware-independent colorimetry, improved chrominance and luminance decompositions, and demosaicking algorithms. More recently, designers find that they have a greater variety of silicon platforms beyond the traditional Intel or AMD CPU, such as ASICs, DSPs, and now graphics processing units (GPUs) and field programmable gate arrays (FPGAs). Even more recently, the first vision systems on a chip are emerging for higher-volume, low-complexity machine vision applications, finally putting the power of the silicon fab behind the machine vision industry rather than ahead of it.
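To make one of these building blocks concrete, the short sketch below shows a textbook BT.601-style luminance/chrominance decomposition in NumPy. It is a generic illustration of the kind of color decomposition mentioned above, not code from any vendor’s library, and the random “frame” simply stands in for a camera image.

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Split an RGB image (float values in [0, 1]) into luminance (Y)
    and chrominance (Cb, Cr) planes using BT.601 coefficients."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  =  0.299 * r + 0.587 * g + 0.114 * b               # luminance
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 0.5     # blue-difference chroma
    cr =  0.5 * r - 0.418688 * g - 0.081312 * b + 0.5     # red-difference chroma
    return np.stack([y, cb, cr], axis=-1)

# Hypothetical example: a random array stands in for a captured color frame.
frame = np.random.rand(480, 640, 3)
ycbcr = rgb_to_ycbcr(frame)
```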
But how are leading machine vision companies commercializing this growing toolbox of machine vision signal processing tools in their products?
Adjusting to a 3D World
Cognex Corporation (Natick, Massachusetts) has recently introduced a new 3D-Locate tool that offers a useful illustration of how machine vision companies are adding powerful image processing tools while keeping financial, computational, and customer constraints in mind.
Unlike 2D machine vision applications that assume a fixed distance between the camera and target object, 3D machine vision applications often include multiple moving cameras whose distances to the part change constantly. To accommodate these changing parameters, Cognex’s 3D-Locate software tool begins with a calibration routine that takes into account intrinsic elements, such as the sensor and lens combination, and extrinsic elements, such as the cameras’ and (if present) robot’s physical relationship.
“Calibration is the first part of the solution,” explains John Petry, Vision Software Marketing Manager at Cognex. “You need a powerful calibration routine that can handle any number of cameras simultaneously. Then you need a powerful 2D feature detection capability to find features that you can use for 3D triangulation. And finally, you need solid triangulation code to come up with the part’s pose in 3D space.”
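Petry’s third step – triangulating matched 2D features from calibrated cameras into a position in space – can be illustrated with a minimal sketch using OpenCV’s cv2.triangulatePoints. The intrinsics, projection matrices, and feature coordinates below are placeholder assumptions, and the sketch is not Cognex’s 3D-Locate code; it only shows how two calibrated views pin down a feature’s 3D location.

```python
import numpy as np
import cv2

# Hypothetical shared intrinsics (focal length and principal point in pixels).
K = np.array([[1000.0,    0.0, 320.0],
              [   0.0, 1000.0, 240.0],
              [   0.0,    0.0,   1.0]])

# 3x4 projection matrices P = K [R | t] from a prior calibration step.
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])                   # camera 1 at the origin
P2 = K @ np.hstack([np.eye(3), np.array([[-0.2], [0.0], [0.0]])])   # camera 2 offset 200 mm

# Matched 2D feature locations (2xN: x coordinates on row 0, y on row 1),
# as produced by a 2D feature-detection / pattern-matching step.
pts1 = np.array([[320.0, 400.0],
                 [240.0, 260.0]])
pts2 = np.array([[300.0, 380.0],
                 [240.0, 260.0]])

# Triangulate: result is 4xN homogeneous coordinates; divide by w for 3D points.
X_h = cv2.triangulatePoints(P1, P2, pts1, pts2)
X = (X_h[:3] / X_h[3]).T    # N x 3 feature positions in the world frame
print(X)
```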
To convert 3D ray tracing and object location theory into a viable commercial product, Cognex’s software automates the calibration routine, whether the system uses static cameras or robot-mounted cameras. For the highest accuracy, the process includes moving a calibration plate around the work area and taking multiple pictures with each camera, extracting fiducial features in each image using the company’s pattern matching software, and triangulating features from multiple cameras to determine camera-object relationships with six degrees of freedom. For automation engineers who value simplicity above maximum accuracy, the procedure can be as easy as acquiring images from one or more robot-mounted cameras moving in front of a fixed plate.
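As a rough analogue of that plate-based procedure, the sketch below uses OpenCV’s checkerboard functions as a stand-in for Cognex’s fiducial extraction and pattern matching. The image file names, board geometry, and plate pitch are hypothetical, and the code illustrates only the generic per-camera fitting step, not Cognex’s multi-camera routine.

```python
import glob
import numpy as np
import cv2

board = (9, 6)     # inner corners of a hypothetical checkerboard calibration plate
square = 0.010     # 10 mm pitch (assumed)

# Plate coordinates of the fiducials (the plate is planar, so z = 0).
obj = np.zeros((board[0] * board[1], 3), np.float32)
obj[:, :2] = np.mgrid[0:board[0], 0:board[1]].T.reshape(-1, 2) * square

obj_pts, img_pts = [], []
for path in glob.glob("plate_*.png"):   # images of the plate moved around the work area
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, board)
    if found:                            # fiducial extraction (stand-in for pattern matching)
        obj_pts.append(obj)
        img_pts.append(corners)

# Fit the intrinsic model (sensor + lens) and per-view extrinsics; rms is the
# reprojection error in pixels, i.e., the kind of accuracy feedback described below.
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_pts, img_pts, gray.shape[::-1], None, None)
print("reprojection error (pixels):", rms)
```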
To create a precise model of the camera(s)-object relationship, the vision system must start with an accurate model of the camera’s image sensor and lens. “The accuracy of the overall 3D system begins first and foremost with a good mathematical model of the hardware,” says Cy Marrion, Senior Consulting Software Engineer at Cognex. “To create the camera model, you need an automated process where you acquire a number of images that fill in the model parameters. We simplify this by providing standard models for different types of lenses: short focal length, medium to long focal length, and telecentric. After you acquire the images, the software gives you feedback. For instance, if you use a middle focal length model on a short focal length lens, the software will tell you that your pixel accuracy is only one-half to one pixel. The idea here is to enable integrators to be self-sufficient, and everything starts with the accuracy of the camera model.”
A stored camera model also simplifies preventive maintenance and error checking should a camera get bumped on the plant floor, which can result in misdirecting the robot. “We have a real-time calibration check procedure that only requires a single image per camera to verify that the camera is still in calibration,” concludes Cognex’s Petry. “No need to call the integrator back for an expensive fix.”
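A single-image calibration check of the kind Petry describes can be approximated by reprojecting the fixed plate’s known fiducials through the stored camera model and measuring how far they land from freshly detected corners. The helper below is a hedged sketch of that idea only; the function name, tolerance, and board size are assumptions, not Cognex’s procedure.

```python
import numpy as np
import cv2

def calibration_drift(gray, K, dist, rvec, tvec, plate_pts, board=(9, 6), tol=0.5):
    """Return (ok, error): error is the mean pixel distance between fiducials
    predicted by the stored camera model and those detected in one new image
    of the fixed reference plate. A large error suggests the camera was bumped."""
    found, corners = cv2.findChessboardCorners(gray, board)
    if not found:
        return False, None
    predicted, _ = cv2.projectPoints(plate_pts, rvec, tvec, K, dist)
    error = np.mean(np.linalg.norm(
        predicted.reshape(-1, 2) - corners.reshape(-1, 2), axis=1))
    return error < tol, error
```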
For high-magnification applications using telecentric lenses, multi-camera 3D machine vision requires a different tack to work correctly. Unlike perspective lenses where near objects appear larger than far objects, telecentric lenses do not yield visual clues to an object’s distance from a camera. This makes determining a pattern’s point of origin in 3D space more difficult, typically resulting in lower accuracy for the overall system. This is a particular problem in semiconductor, MEMS, and other electronics manufacturing where the vision system has to work at very high magnifications and short distances.
Cognex’s Principal Software Engineer, Lifeng Liu, recently developed a method to improve the location accuracy for a point of origin between two cameras viewing a planar feature, optimizing the math behind the models to that end. The result is a 3D measurement accuracy approaching 0.1 pixel versus the typical 1- to 2-pixel accuracy of high-magnification 3D systems. “The part can have higher or lower features – it can be a true 3D part – but the individual features are assumed to be planar and parallel,” Petry said. “We’ve actually developed higher resolution calibration plates down to 100 micron pitch just for these types of high-magnification applications.”
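The underlying geometry can be sketched without reference to Cognex’s proprietary math: a telecentric lens maps every pixel to a viewing ray parallel to the optical axis, so a single camera gives no depth, but the least-squares intersection of rays from two calibrated cameras recovers the feature’s 3D position. The NumPy sketch below illustrates only that textbook step, not Liu’s accuracy improvements; the ray origins and directions are hypothetical.

```python
import numpy as np

def closest_point_between_rays(p1, d1, p2, d2):
    """Least-squares 3D point nearest to two viewing rays p + t*d.
    With a telecentric lens, each pixel maps to a ray parallel to the
    optical axis, so depth comes only from combining two such views."""
    d1, d2 = d1 / np.linalg.norm(d1), d2 / np.linalg.norm(d2)
    # Each constraint says the point's offset from the ray origin has no
    # component perpendicular to the ray direction: (I - d d^T)(x - p) = 0.
    A = np.vstack([np.eye(3) - np.outer(d1, d1),
                   np.eye(3) - np.outer(d2, d2)])
    b = np.concatenate([(np.eye(3) - np.outer(d1, d1)) @ p1,
                        (np.eye(3) - np.outer(d2, d2)) @ p2])
    return np.linalg.lstsq(A, b, rcond=None)[0]

# Two hypothetical telecentric cameras viewing the same planar feature.
point = closest_point_between_rays(np.array([0.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0]),
                                   np.array([1.0, 0.0, 0.0]), np.array([-1.0, 0.0, 1.0]))
print(point)   # approximately (0, 0, 1)
```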
New Hardware, New Solutions
Most 3D systems that use multiple cameras benefit from the trend toward multicore processors. Depending on the ability of the system designer, a programmer can choose to use a single core per image, or – when images are large and the filter convolutions relatively simple – split an image across multiple cores to accelerate the image processing. But it’s important to remember that using four cores doesn’t automatically generate a 4X improvement in speed or throughput; much depends on the skills of the programmer.
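A minimal sketch of that split-across-cores approach appears below, using Python’s process pool and a simple SciPy filter. The strip count, filter size, and halo handling are illustrative assumptions; as noted above, real speedups depend on overheads such as process startup and data copying.

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor
from scipy.ndimage import uniform_filter

KERNEL = 15                     # filter window; strip overlap must cover its radius

def filter_strip(strip):
    return uniform_filter(strip, size=KERNEL)

def filter_multicore(image, workers=4):
    """Split a large image into horizontal strips (with halo rows so the
    filter sees its full neighborhood), filter them in parallel processes,
    and stitch the valid regions back together."""
    halo = KERNEL // 2
    bounds = np.linspace(0, image.shape[0], workers + 1, dtype=int)
    strips = [image[max(a - halo, 0): b + halo]
              for a, b in zip(bounds[:-1], bounds[1:])]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(filter_strip, strips))
    out = []
    for (a, b), res in zip(zip(bounds[:-1], bounds[1:]), results):
        top = a - max(a - halo, 0)          # halo rows actually present above the strip
        out.append(res[top: top + (b - a)])
    return np.vstack(out)

if __name__ == "__main__":
    img = np.random.rand(4000, 4000).astype(np.float32)   # stand-in for a large image
    filtered = filter_multicore(img)
```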
Today, GPUs, FPGAs, and ASICs are all being used to accelerate low-level machine vision algorithms for high-throughput or large-image systems. “These new chips aren’t the general-purpose savior of the machine vision industry, but they are tremendously powerful for a certain class of machine vision cases,” says Cognex’s Petry. “If you have large images and predictable operations, you can benefit from GPU acceleration. However, many of our tools perform a lot of decision making along the way, so the cost of the image transfer time to load the GPU can outweigh the processing benefits.”
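Petry’s transfer-cost caveat is easy to see by timing the host-to-device copy separately from the computation. The sketch below uses CuPy – an assumption for illustration, not a library named in the article – to time the upload, a predictable data-parallel filter, and the download on a large image.

```python
import time
import numpy as np
import cupy as cp
from cupyx.scipy import ndimage as cundi

image = np.random.rand(4096, 4096).astype(np.float32)    # stand-in for a large inspection image

t0 = time.perf_counter()
gpu_image = cp.asarray(image)                             # host -> device copy
cp.cuda.Device().synchronize()
t1 = time.perf_counter()
filtered = cundi.gaussian_filter(gpu_image, sigma=3)      # predictable, data-parallel work
cp.cuda.Device().synchronize()
t2 = time.perf_counter()
result = cp.asnumpy(filtered)                             # device -> host copy
t3 = time.perf_counter()

print(f"upload {t1 - t0:.4f}s  filter {t2 - t1:.4f}s  download {t3 - t2:.4f}s")
```

When the upload and download times dominate the filter time, the decision-heavy tools Petry describes gain little from offloading to the GPU.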
3D image processing tools are just one driver behind the growth of machine vision, particularly in vision-guided robotics. As robots and machine vision systems team up to tackle more and more industrial applications, it is entirely likely that the lines separating the two systems will blur, raising questions about who will run the workcell. Today, the robot controller is typically in charge of the operation because of its embedded OS and the determinism that is critical to industrial and worker safety systems. However, machine vision systems now often have greater processing power, raising the question: when will multicore and similar approaches be used to run embedded operating systems on machine vision systems, opening up powerful new data processing capabilities for a new generation of 3D robot applications? Stay tuned; this show’s not over.