Deep Learning: Welcome to the Second Machine Vision Revolution
By Winn Hardin, Contributing Editor
Insiders say that deep learning is bringing about a second machine vision revolution, enabling designers to create part specifications — and therefore develop successful machine vision solutions — that simply were not feasible before.
“We’ve been seriously investing in this technology the last five years, but it’s only in the last two to three years that it’s been viable,” explains Andy Long, CEO of automation integrator Cyth Systems. “But the acceleration in the demand for deep learning during the last 18 months is staggering.”
Unlike many significant machine vision technologies, such as smart cameras and 3D sensors, where adoption was driven by engineers interested in the technology, deep learning champions are often located in the C-level suite, with interest driven by capability as much as technology. “Executives are saying, we need to be invested in this technology and see what it can do,” adds Long.
A New Way to Do Machine Vision
Deep learning machine vision software essentially allows machines to learn from data representations (in this case, images that human inspectors have already tagged) instead of relying on task-specific algorithms. Using a software-based neural network, deep learning programs learn much as children do, eventually telling “good” from “bad” after seeing thousands of images that have been tagged as one or the other.
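To make the idea of learning from tagged examples concrete, here is a minimal Python sketch. It is not a deep neural network and not any vendor's method: it swaps in a single-layer logistic classifier, and it assumes each image has already been reduced to two hypothetical features (mean brightness and edge density). The feature names, value ranges, and training settings are all illustrative assumptions.

```python
import math
import random

def train_classifier(examples, epochs=200, lr=0.5):
    """Fit a single-layer logistic classifier by stochastic gradient descent.

    examples: list of (features, tag) pairs, where tag is "good" or "bad".
    """
    n_features = len(examples[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        for features, tag in examples:
            target = 1.0 if tag == "bad" else 0.0
            z = sum(w * x for w, x in zip(weights, features)) + bias
            p_bad = 1.0 / (1.0 + math.exp(-z))  # predicted probability of "bad"
            error = p_bad - target
            # Nudge the weights and bias to reduce the error on this example.
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias

def predict(weights, bias, features):
    """Return the tag the trained classifier assigns to one feature vector."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return "bad" if z > 0 else "good"

# Synthetic stand-in for inspector-tagged images (assumed, not real data):
# each image is summarized as [mean_brightness, edge_density].
rng = random.Random(0)
examples = []
for _ in range(50):
    examples.append(([rng.uniform(0.4, 0.6), rng.uniform(0.0, 0.3)], "good"))
    examples.append(([rng.uniform(0.4, 0.6), rng.uniform(0.6, 1.0)], "bad"))
rng.shuffle(examples)

weights, bias = train_classifier(examples)
print(predict(weights, bias, [0.5, 0.05]))  # a smooth, low-edge part
print(predict(weights, bias, [0.5, 0.95]))  # a part with heavy edge content
```

Nothing in the code states what a defect looks like. The decision rule comes entirely from the tags, which is the property the deep learning tools described here exploit at a much larger scale.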
“This reminds me of where the machine vision market was 30 years ago,” says John Petry, Director of Marketing, Vision Software, at Cognex Corporation. “Today, all of our customers are familiar with traditional machine vision in some capacity. They can pick up a machine vision tool, quickly learn how an alignment tool works, and solve an application. But with deep learning, we’re having technical discussions with the full automation team about where to use it, how to train the system, how to evaluate samples and defects, how fast a deep learning system can run, and if management can trust the results. These are the types of conversations we used to have 30 years ago.”
Despite making amazing progress on deep learning, software providers like Cognex and MVTec Software GmbH are quick to point out that the technology isn’t suitable for every machine vision application. For example, MVTec’s initial deep learning algorithm, released in November 2016, focused on optical character recognition (OCR) applications. The ability of deep learning algorithms to learn new fonts, account for skewed text and changes in 3D perspective, and much more made OCR a primary target, so much so that both companies now offer pretrained OCR neural networks.
“With our latest HALCON offering, our deep learning algorithms can basically solve any application, but for applications like locating an object based on a defined shape, we offer highly dedicated algorithms that have been developed over 50 years,” says Johannes Hiltner, Product Manager for HALCON image processing software at MVTec. “Deep learning works best for applications where there is not a defined shape, such as determining whether a scratch is acceptable or not acceptable. The downside is that deep learning for finding arbitrary shapes is better at classification than traditional algorithms, but not highly optimized for speed.”
Just this month, Cognex announced that its ViDi deep learning suite is now available through the company’s flagship image processing library, VisionPro, which includes the Cognex Designer object-based programming front end. “People think of cosmetic inspection when they think of deep learning, such as finding scratches,” says Cognex’s Petry. “But it’s also great at assembly verification, such as confirming that a piece of surgical tubing is part of a medical kit regardless of where the tubing is located or its perspective to the camera. PatMax [Cognex’s geometric search algorithm] can give you sub-pixel accuracy on a manufactured part, but when it comes to food, small electronics, and other objects that can appear very different based on the camera’s perspective, deep learning is better. And when you add a new capacitor to an existing production line, retraining the vision system is just a matter of showing it images of the new capacitor. The same ability to recognize an object from various perspectives also makes deep learning very effective at OCR, which is why we offer a pretrained library for that.”
Training, Testing, and Deep Learning
As stated earlier, deep learning in machine vision relies on software analyzing a “supervised” data set, one in which every image carries a human-assigned tag, to learn what constitutes a good or bad part, grouping, or assembly. Traditional machine vision software analyzing two images (one of a scratch and another of a scribed line) has no way of knowing which image shows a defect and which shows a design feature. Deep learning software learns to differentiate between scratches and designs by reviewing thousands of images of each and reading the tags stored in the images’ metadata headers.
But while machine vision integrators have accumulated huge image libraries, many of those images are the property of the customer and cannot be used to train a new neural network as part of a deep learning solution. “There are publicly available data sets available today through Caffe and TensorFlow and other open-source deep learning programs, but most are not available for commercial projects,” says MVTec’s Hiltner. “As part of our offering, we’re providing pretrained networks that are optimized for a number of common industrial machine vision applications. By using our pretrained networks, customers can refine their application using a relatively small set of tagged images instead of tens or hundreds of thousands of images.”
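The idea of refining a pretrained network with a small tagged set can be sketched in Python as a frozen feature extractor plus a small trainable "head." Everything here is an assumption for illustration: `frozen_features` is a hypothetical hand-written stand-in for a pretrained network's fixed layers, the head is a simple nearest-centroid classifier rather than a neural layer, and the 4x4 images are synthetic. It does not represent HALCON or any other product's API.

```python
def frozen_features(image):
    """Hypothetical stand-in for a pretrained network's frozen layers.

    Maps a grayscale image (rows of floats in 0..1) to two features:
    mean intensity and mean absolute horizontal gradient.
    """
    pixels = [p for row in image for p in row]
    grads = [abs(row[i + 1] - row[i]) for row in image for i in range(len(row) - 1)]
    return (sum(pixels) / len(pixels), sum(grads) / len(grads))

def fit_head(tagged_images):
    """Train the small head: one centroid per tag in the frozen feature space."""
    sums, counts = {}, {}
    for image, tag in tagged_images:
        features = frozen_features(image)
        total = sums.setdefault(tag, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[tag] = counts.get(tag, 0) + 1
    return {tag: [v / counts[tag] for v in total] for tag, total in sums.items()}

def classify(head, image):
    """Assign the tag whose centroid is closest to the image's features."""
    features = frozen_features(image)
    return min(
        head,
        key=lambda tag: sum((a - b) ** 2 for a, b in zip(features, head[tag])),
    )

def flat(level, size=4):
    """Synthetic defect-free surface of uniform brightness."""
    return [[level] * size for _ in range(size)]

def scratched(level, column, size=4):
    """Synthetic surface with a bright vertical scratch in one column."""
    image = flat(level, size)
    for row in image:
        row[column] = 1.0
    return image

# Only six tagged images are needed because the extractor is not retrained.
small_tagged_set = [
    (flat(0.4), "good"), (flat(0.5), "good"), (flat(0.6), "good"),
    (scratched(0.4, 2), "bad"), (scratched(0.5, 2), "bad"), (scratched(0.6, 2), "bad"),
]
head = fit_head(small_tagged_set)
print(classify(head, flat(0.45)))
print(classify(head, scratched(0.55, 1)))
```

Because only the head is fitted, a handful of tagged images is enough in this toy case. That is the same reason a pretrained network can reduce the number of images a customer has to supply.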
Cognex doesn’t offer large libraries of trained neural networks beyond the OCR tool. Instead, Petry explains, its software breaks down the process into smaller pieces, each requiring only 20 to 50 image sets. “This lets us run on commercial CPUs and GPUs, and you can train the system in five minutes versus hours. One of the biggest benefits to deep learning is that an engineer can determine if an application can be solved in minutes rather than spending weeks trying to solve a problem only to determine at the very end that it is impossible with today’s technology,” says Petry.
Experienced machine vision integrators are developing processes for helping customers evaluate deep learning and generate a viable data set for their applications. “When we don’t have enough control over the part or can’t set sufficient boundaries around the specification, that’s when we consider using deep learning,” says Steve Wardell, Director of Imaging at ATS Automation.
To develop a data set that represents the production line without interfering too much with existing production, ATS suggests a hybrid approach to candidate applications. Instead of having human inspectors evaluate the actual products as they come off the line, ATS inserts a camera and monitor between the inspector and the product. The inspector views each product on the monitor and tags its image appropriately. The tagged images can then be fed into a deep learning program to check the efficacy of a proposed solution.
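One way to check efficacy against the inspector's tags is to compare them with a candidate model's predictions image by image. The Python sketch below is an assumption about how such a comparison might be scored, not ATS's actual procedure. The metric names (`agreement`, `escape_rate`, `false_reject_rate`) and the sample tags are hypothetical.

```python
from collections import Counter

def evaluate(inspector_tags, model_tags):
    """Score model predictions against inspector tags ("good" or "bad")."""
    if len(inspector_tags) != len(model_tags):
        raise ValueError("each image needs one inspector tag and one model tag")
    counts = Counter(zip(inspector_tags, model_tags))
    total = len(inspector_tags)
    good_total = counts[("good", "good")] + counts[("good", "bad")]
    bad_total = counts[("bad", "good")] + counts[("bad", "bad")]
    return {
        # Share of images where the model matches the inspector.
        "agreement": (counts[("good", "good")] + counts[("bad", "bad")]) / total,
        # Share of inspector-rejected parts that the model would have passed.
        "escape_rate": counts[("bad", "good")] / bad_total if bad_total else 0.0,
        # Share of inspector-accepted parts that the model would have rejected.
        "false_reject_rate": counts[("good", "bad")] / good_total if good_total else 0.0,
    }

# Hypothetical tags for eight images captured on the line.
inspector = ["good", "good", "bad", "good", "bad", "bad", "good", "good"]
model = ["good", "bad", "bad", "good", "good", "bad", "good", "good"]

report = evaluate(inspector, model)
for name, value in report.items():
    print(f"{name}: {value:.2f}")
```

Reporting escapes separately from false rejects matters in regulated settings, where passing a defective part usually costs far more than rejecting a good one.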
“A lot of these projects are from our life sciences and pharmaceutical clients,” says Wardell. “These industries have a lot of regulation and validation requirements. We feel this hybrid approach is one way to really allow for the level of process validation that these industries require. Even if the deep learning software isn’t successful, we’re able to provide the customer with production data that’s invaluable to them, leading to process improvement.”
Cyth Systems uses its Neural Vision deep learning platform to capture images from production environments and to send the resulting tagged data sets to the cloud for offline processing. “We believe that the inspectors today are the people who should be training the next-generation machine vision system,” explains Cyth’s Long. “What we’re really talking about here is the democratization of machine vision. We designed Neural Vision so the user never needs to know about heterogeneous computational platforms. The only thing they need to know is: That’s my part. I need it to look this way, not that way, and divert it or not.
“That’s the goal behind all the projects we develop at Cyth,” Long continues. “We’re removing the golden handcuffs that have constrained machine vision growth in the past. Right now, there are too many skills needed to program a machine vision system. We’re working so that you won’t need those skills at all. You don’t need to understand machine vision terminology. For me, technology is the driver, and the technology is accelerating faster than ever before. It’s a very exciting time.”