Pattern Matching Speeds Object Location, Reduces Image-Processing Overhead
By Winn Hardin, Contributing Editor
In many machine vision systems, it is necessary to locate objects or features of objects as rapidly as possible so that further image-processing algorithms can extract additional features. For example, finding the correct orientation of a part within 2D or 3D space can speed up robotic-based pick-and-place applications. In food and beverage applications, pattern matching techniques allow for the reading and examination of specific characters or patterns, reducing the processing power needed to extract further data from an image.
There are two main approaches to pattern matching: those based on correlation, and geometric pattern matching. While they are technically dissimilar, both approaches rely on first locating a region or regions of a template image to provide reference data. Once extracted, this data is compared with a newly captured unknown image to find matching characteristics. Establishing a correspondence between the reference template image and the newly captured image allows for the location of objects in the newly captured image.
Historically, cross-correlation (CC) was one of the first statistical approaches used for pattern matching. In this rather brute-force approach, a simple sum of pairwise multiplications of corresponding pixel values of the template image and regions of the same size in the captured image is computed, yielding a similarity value between the images. However, because this value is subject to changes in reflectivity or illumination in the captured image, the approach has been replaced by normalized gray-scale correlation (NGC), in which the correlation value is invariant to global brightness changes. A detailed explanation of this process can be found on the website of Adaptive Vision.
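The difference between plain CC and NGC can be illustrated with a minimal Python sketch. It assumes the common zero-mean formulation of NGC, which may differ in detail from what any particular vendor implements. The function names and the 3 x 3 pixel values are made up for illustration.

```python
import math

def cross_correlation(template, patch):
    """Plain CC: sum of pairwise products of corresponding pixel values."""
    return sum(t * p
               for t_row, p_row in zip(template, patch)
               for t, p in zip(t_row, p_row))

def ngc(template, patch):
    """Zero-mean normalized gray-scale correlation, in the range [-1, 1]."""
    t = [v for row in template for v in row]
    p = [v for row in patch for v in row]
    t_mean = sum(t) / len(t)
    p_mean = sum(p) / len(p)
    num = sum((a - t_mean) * (b - p_mean) for a, b in zip(t, p))
    den = math.sqrt(sum((a - t_mean) ** 2 for a in t) *
                    sum((b - p_mean) ** 2 for b in p))
    # A perfectly flat patch has no contrast; treat it as "no match".
    return num / den if den else 0.0

# Hypothetical 3 x 3 template (illustrative gray values).
template = [[10, 50, 10],
            [50, 90, 50],
            [10, 50, 10]]

# The same pattern under brighter illumination: gain of 2, offset of 20.
brighter = [[2 * v + 20 for v in row] for row in template]

cc_same = cross_correlation(template, template)
cc_bright = cross_correlation(template, brighter)
ngc_same = ngc(template, template)
ngc_bright = ngc(template, brighter)

print("CC :", cc_same, "vs", cc_bright)
print("NGC:", round(ngc_same, 6), "vs", round(ngc_bright, 6))
```

In this sketch the CC value changes when the same pattern is imaged with a different gain and offset, while the NGC value stays at 1 (up to floating-point rounding).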
While traditional CC and NGC are not invariant to large degrees of rotation, translation, and scale, such methods can be improved by rotating, translating, and scaling the template image and then using each transformed version of the template to perform pattern matching.
“If geometrical transformations such as rotation, size changes, tilting, and changes in illumination have been taught during the teaching phase, the correlation value between the template and the captured image increases,” says Dr. Jonathan Vickers, Common Vision Blox Product Manager at Stemmer Imaging in Puchheim, Germany. However, such methods are more computationally intensive since numerous templates need to be correlated with the captured image.
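The idea of correlating several transformed templates can be sketched in Python as follows. This is only an illustration under simplifying assumptions: it uses rotations in 90-degree steps (which need no interpolation), a zero-mean NGC score, and a made-up 3 x 3 template and 6 x 6 image. All function names are hypothetical and do not correspond to any vendor's API.

```python
import math

def ngc(template, patch):
    """Zero-mean normalized gray-scale correlation of two equal-size 2D lists."""
    t = [v for row in template for v in row]
    p = [v for row in patch for v in row]
    t_mean, p_mean = sum(t) / len(t), sum(p) / len(p)
    num = sum((a - t_mean) * (b - p_mean) for a, b in zip(t, p))
    den = math.sqrt(sum((a - t_mean) ** 2 for a in t) *
                    sum((b - p_mean) ** 2 for b in p))
    return num / den if den else 0.0

def rotate90(m):
    """Rotate a 2D list 90 degrees clockwise."""
    return [list(row) for row in zip(*m[::-1])]

def best_match(image, template):
    """Slide the template over the image; return (best score, (x, y))."""
    th, tw = len(template), len(template[0])
    best_score, best_pos = -2.0, None
    for y in range(len(image) - th + 1):
        for x in range(len(image[0]) - tw + 1):
            patch = [row[x:x + tw] for row in image[y:y + th]]
            score = ngc(template, patch)
            if score > best_score:
                best_score, best_pos = score, (x, y)
    return best_score, best_pos

# Hypothetical L-shaped template (illustrative gray values).
template = [[90, 90, 90],
            [90, 10, 10],
            [90, 10, 10]]

# Captured image: flat background with the part rotated by 90 degrees,
# placed with its top-left corner at x=2, y=1.
image = [[30] * 6 for _ in range(6)]
for dy, row in enumerate(rotate90(template)):
    for dx, value in enumerate(row):
        image[1 + dy][2 + dx] = value

# "Teach" one template per rotation angle.
variants, current = {}, template
for angle in (0, 90, 180, 270):
    variants[angle] = current
    current = rotate90(current)

# Each extra template costs one more full correlation pass over the image.
results = {angle: best_match(image, tpl) for angle, tpl in variants.items()}
best_angle = max(results, key=lambda a: results[a][0])
best_score, best_pos = results[best_angle]
unrotated_score = results[0][0]

print("unrotated template:", round(unrotated_score, 3))
print("best variant:", best_angle, "degrees at", best_pos,
      "score", round(best_score, 3))
```

Only the 90-degree variant reaches a score of about 1, at the position where the part was placed. The price is four correlation passes instead of one, which is the extra computational load described above.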
To reduce this computational overhead, a pyramid-based approach can be used. In this method, both the template and the captured image are subsampled a number of times, in effect building two pyramids whose resolution decreases at each higher level. Correlation is first performed at the top (coarsest) level of the pyramid, and the result is used as an initial estimate of where a match may lie at the next level down. This process is repeated level by level, at successively higher resolution and in the areas indicated by the previous estimate, until a suitable correlation coefficient is determined. The method is more fully explained by W. James MacLean and John K. Tsotsos in their paper “Fast Pattern Recognition Using Normalized Grey-Scale Correlation in a Pyramid Image Representation.” This method can be used with both correlation-based and geometric pattern matching methods (Figure A).
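A coarse-to-fine search of this kind can be sketched in a few lines of Python. The sketch below is a simplified illustration, not the algorithm from the MacLean and Tsotsos paper or from any commercial package: it assumes only two pyramid levels, 2 x 2 block averaging for subsampling, a zero-mean NGC score, and a refinement window of plus or minus one pixel. The function names and the synthetic image are hypothetical.

```python
import math

def ngc(template, patch):
    """Zero-mean normalized gray-scale correlation of two equal-size 2D lists."""
    t = [v for row in template for v in row]
    p = [v for row in patch for v in row]
    t_mean, p_mean = sum(t) / len(t), sum(p) / len(p)
    num = sum((a - t_mean) * (b - p_mean) for a, b in zip(t, p))
    den = math.sqrt(sum((a - t_mean) ** 2 for a in t) *
                    sum((b - p_mean) ** 2 for b in p))
    return num / den if den else 0.0

def downsample(img):
    """Build the next pyramid level by averaging each 2 x 2 block."""
    return [[(img[y][x] + img[y][x + 1] +
              img[y + 1][x] + img[y + 1][x + 1]) / 4.0
             for x in range(0, len(img[0]) - 1, 2)]
            for y in range(0, len(img) - 1, 2)]

def search(image, template, xs, ys):
    """Score the candidate offsets; return (score, x, y, windows evaluated)."""
    th, tw = len(template), len(template[0])
    best, count = (-2.0, None, None), 0
    for y in ys:
        for x in xs:
            patch = [row[x:x + tw] for row in image[y:y + th]]
            score = ngc(template, patch)
            count += 1
            if score > best[0]:
                best = (score, x, y)
    return best + (count,)

def exhaustive_search(image, template):
    """Score every possible offset at the given resolution."""
    xs = range(len(image[0]) - len(template[0]) + 1)
    ys = range(len(image) - len(template) + 1)
    return search(image, template, xs, ys)

def pyramid_search(image, template):
    """Full scan at half resolution, then a +/-1 pixel refinement at full resolution."""
    _, cx, cy, n_coarse = exhaustive_search(downsample(image),
                                            downsample(template))
    max_x = len(image[0]) - len(template[0])
    max_y = len(image) - len(template)
    # One coarse pixel covers two fine pixels, so refine around (2*cx, 2*cy).
    xs = range(max(0, 2 * cx - 1), min(max_x, 2 * cx + 1) + 1)
    ys = range(max(0, 2 * cy - 1), min(max_y, 2 * cy + 1) + 1)
    score, x, y, n_fine = search(image, template, xs, ys)
    return score, x, y, n_coarse + n_fine

# Synthetic data (illustrative only): an arbitrary 8 x 8 textured template
# pasted into a flat 24 x 24 image with its top-left corner at x=10, y=6.
template = [[(37 * x + 91 * y + 13 * x * y) % 200 + 20 for x in range(8)]
            for y in range(8)]
image = [[50] * 24 for _ in range(24)]
for ty, row in enumerate(template):
    for tx, value in enumerate(row):
        image[6 + ty][10 + tx] = value

e_score, ex, ey, n_exhaustive = exhaustive_search(image, template)
p_score, px, py, n_pyramid = pyramid_search(image, template)
print("exhaustive:", (ex, ey), "windows evaluated:", n_exhaustive)
print("pyramid:   ", (px, py), "windows evaluated:", n_pyramid)
```

Both searches locate the template at the same position, but the pyramid version evaluates fewer windows, and most of them on the smaller, subsampled images. A practical implementation would typically use more levels and keep several candidates per level rather than a single best estimate.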
Geometric Pattern Matching
In addition to their limited rotational, translational, and scale invariance, standard correlation methods perform poorly when the part being inspected is partially occluded. To overcome this problem, geometric pattern matching techniques extract geometric features, such as shapes, dimensions, angles, and arcs, from a template image. The spatial relationships among these features are then used to find correspondences within a captured image.
This is the approach pioneered in 1999 by Bill Silver, cofounder of Cognex (Natick, Massachusetts). The company’s geometric pattern matching technology is marketed under the brand name PatMax. (See “Machine Vision Technology: Part Location with PatMax”) Its principles have been adopted by numerous other companies, including National Instruments (NI; Austin, Texas).
“There are two modes within Cognex’s PatFind,” says Kyle Voosen, Section Manager of Product Marketing at NI, “an area model that uses standard cross-correlation to find matches in an image and an edge model that uses edges, outlines, and shapes to find matches. This is very similar to the pattern matching and geometric matching methods used by NI.”
Voosen continues, “For most applications, pattern matching is often preferred because it is more straightforward to configure. However, geometric matching is useful for applications where there is inconsistent lighting, objects that change size, occlusion, or overlapping parts and perspective distortion.” (See “Geometric Matching Technique”)
Like NI, Matrox (Dorval, Quebec), MVTec (Munich, Germany), Stemmer Imaging, and Teledyne DALSA (Waterloo, Ontario) all offer NGC and geometric pattern matching tools. Matrox’s Imaging Library includes a pattern matching tool based on NGC and a tool that uses geometric features to find objects within images.
Teledyne DALSA offers an NGC tool in its Sapera Vision Software, and tools that support patterns defined by pixel intensity or geometric shape in its Sherlock software package. A Foundation Package from Stemmer Imaging’s Common Vision Blox contains a series of correlation tools.
MVTec offers both correlation-based and shape-based matching with its HALCON and MERLIC software. Its HALCON package uses a descriptor-based matching technique to locate planar objects with texture within images. To speed the process in its shape-based matching technology, MVTec employs an image pyramid approach. Both the template and the captured image are searched on the different pyramid levels.
Trees and Randomly Generated Templates
As well as geometric pattern matching with its CVB ShapeFinder software, Stemmer Imaging provides a number of other innovative pattern matching software products. While NGC and geometric pattern matching usually need one template image, the generation of a classifier based on decision trees requires several images per class. An example of this approach is the company’s CVB Minos software, which uses a learning algorithm to extract single features or the absence of features from gray-scale images and stores them in a tree structure.
According to Vickers, grouping the single features in a tree structure enables a search whose time does not increase with the size of the learning set and that can be used to differentiate two similar patterns. “It is fast. A small number of distinguishing features are being searched for rather than enough features to characterize a shape [geometric pattern matching] or comparing all pixels in the image [as with correlation methods],” says Vickers.
Perhaps even more impressive is Stemmer’s CVB Polimago pattern matching tool. Polimago uses a series of training images to characterize the variation in a target and then internally generates thousands of random views to synthetically create a much larger training set (Figure B). (See “Pattern Matching”)
“While many companies have adopted both NGC and geometric pattern matching methods, each one uses different algorithms to find salient features within an image, so performance characteristics can vary widely. Thus it may be difficult to determine which product is best suited for any particular application. Thus, when evaluating such software, systems integrators should carefully examine the accuracy, speed, and pattern training required with each method,” says Ron Pulicari, Senior Marketing Manager of Cognex.