The Dos and Don’ts of Modern Search Algorithms
| By: Winn Hardin, Contributing Editor
Look up machine vision definitions and you will read about cameras collecting an image and computers running specialized software to analyze the image, helping machines make an automated decision based on that image.
After the flash of the strobe and collecting photons to create an image, the first step down the yellow brick road to automated decision-making is for software to find something in the image. And to find, one must search (or get lucky, and luck has a low repeatability score).
Search algorithms have come a long way since the 1980s. Computationally intensive correlation functions have in large part given way to scale-pyramid and geometric-pattern search methods for many search functions – but not in every case. Sometimes, old school is still the best school when it comes to image processing search techniques.
2 Dos, 1 Don’t
According to Ben Dawson, Director of Strategic Development at Teledyne DALSA (Waterloo, Ontario, Canada), search is usually called upon to perform one of three functions – but only two of them are a good idea.
“Presence-absence is the first function,” Dawson says. “It’s the algorithmic equivalent of a photocell. You don’t care what it is, or where it is exactly, just if something is present. The classic example is counting items moving on a conveyor.”
The next application for search is locate. “A simple method is to use projections. You sum up rows or columns and look for sums that indicate where something is,” Dawson says. “This simple method is easily confused by background items, so modern search uses sophisticated algorithms, such as matching object outlines. Most modern search methods can also identify a limited range of objects. For more difficult identifications, you might use blob analysis, look at object curvature, color, and other features.”
Interestingly, the human brain has two different pathways for handling object location and identification. For example, if something’s coming at your head, you don’t care what it is, only that you need to duck. In those cases, a fast, subcortical pathway gets you out of the way. A second, slower cortical system identifies objects.
Pushing search algorithms to their breaking point happens when designers try to use search to do part inspections — not just if the part is there or whether it’s a nut or bolt, but if it’s the right bolt. This isn’t recommended because search algorithms try to ignore minor variations on a product to make search more robust.
Will It Correlate?
Search algorithms must overcome several challenges when locating a learned object in an image, such as perceptual changes to the image caused by changes in lighting. But they also must account for differences in scale, rotation, and orientation between the learned part and the part under test caused by changing spatial relationships between object and camera. These criteria – along with application-specific conditions such as image resolution, defect size, line speed, and so on – are the designer’s primary considerations when choosing a search technique.
“The traditional search technique is a variety of correlation, such as greyscale or normalized correlation,” explains Arnaud Lina, Software Engineering Group Manager for Matrox Electronic Systems Ltd. (Dorval, Quebec, Canada).
Correlation techniques can be easily parallelized “depending on how many degrees of freedom you’re looking for and if you have multiple objects, both of which challenge correlation techniques,” Lina says. “One of the main advantages of correlation techniques is that they are robust to large-scale, uniform changes in illumination.”
Correlation is also robust to parts of the image dropping out of focus. Geometric searches depend on edge detection, which sometimes can be adversely effected by lack of focus. In applications such as high-resolution printed circuit board (PCB) inspection, the height of a surface-mounted component may extend beyond the depth of field and therefore become blurry.
“While MVTec Halcon’s edge-based, geometric-search routines can deal with a large amount of blur robustly and, therefore, can be used in virtually all search applications, in rare cases correlation techniques are preferable for finding shapes in these types of applications,” says Carsten Steger, Research & Development Director at MVTec Software GmbH (Munich, Germany).
The problem with correlation techniques, according to Teledyne DALSA’s Dawson, is that they can be slow and not very robust. “It doesn’t have much rotational [+/-5 degrees] or scale tolerance,” he says. “Normalized correlation has pretty good tolerance, to changes from overall light and contrast. Teledyne DALSA provides both correlation and geometric search methods.”
Software engineers have developed numerous techniques for speeding up correlation techniques, from numeric shortcuts to liberal use of thresholds to quickly discard failed-match convolutions. Today’s cheaper microprocessors reduce the dollar-per-calculation price, but they cannot completely eliminate the additional time required.
Pyramids and Other Geometrics
The 1980s brought the next big breakthrough in search techniques, usually referred to as the scale-space or pyramid technique. This technique starts with full-resolution images of a golden part and the part under test at the bottom of the pyramid. Then each image is smoothed and its resolution is cut in half (sub-sampled). Repeating this process gives a “pyramid” of images going from full resolution up through smaller, lower-resolution images.
By searching for the target feature or object starting with the low-resolution image at the top of the pyramid and progressing downward to higher resolutions, system designers can quickly use correlation and geometric search to find their targets. According to MVTec’s Steger, the vast majority of search applications use the geometric search.
Geometric search techniques can define the object in multiple ways: the distance from the centroid to the edge; the curvature of the outer edge; and other factors. These methods are robust against non-uniform lighting changes as well as scale and rotation.
Combining geometric and correlation techniques with scale-space processing speeds search functions, which is one of the top priorities for image processing software developers. While software designers continue to find ways to accelerate 2D search techniques, one of the hottest new areas involves 3D search.
“We often use our registration search module for 3D searches,” says Matrox’s Lina. “Our 2D registration module can be used for stitching together images or for locating features inside an image. When it comes to 3D registration, we typically project the 3D map into a 2D space and get rough 3D location of the object before refining the 3D position. This speeds the overall search process for many applications.”
MVTec released its first 3D search algorithm using monocular 2D images from a calibrated camera in 2007. More recently, MVTec has introduced matching techniques for deformable objects, such as foil packages or agricultural products like bananas, based on 2D images and on point clouds.
“Since changes in perspective in a 2D image can be regarded as a special kind of deformation, this has also allowed us to introduce efficient perspective matching techniques for planar objects,” Steger says.
MVTec’s latest Halcon release (v13) makes the 3D search process more robust by using both 2D images and 3D point clouds when available. Furthermore, Halcon’s 2D shape-based matching has been speeded up substantially.
Today, both hardware and software companies are pulling out all the stops to make 3D imaging more accessible, from dedicated 3D sensors to wizards that help programmers decide how many scale levels their image can support, or the best features to use for geometric searches based on unprocessed images. These efforts are already yielding fruit, opening up new horizons (and needs) for search algorithm optimization.