Vision & Imaging Blog

Machines Creating Predictive Videos from Photographs

By Vision Online Marketing Team

11/15/2016

What if all it took to generate a high quality video was a series of photographs?

What if that video could effectively predict the future based on the still shots?

While it’s long been easy to break down an existing video into its component frames, the reverse has not been true. Extrapolating even a short video from individual frames is a processing challenge that requires AI to “fill in the blanks.”

However, researchers at MIT are now developing advanced machine intelligence approaches that may make it possible.

Predictive Video – Can Machines Understand Cause and Effect?

When shown a photo, humans can intuitively determine what will happen next based on the motion it implies. For example, someone on a skateboard will probably continue moving in the same direction at roughly the same speed.
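To make the skateboard intuition concrete, here is a minimal sketch of a constant-velocity guess. It is only an illustration, not anything from the MIT project: the function name, the coordinates, and the assumption that two earlier positions of the skater are already known are all hypothetical.

```python
def extrapolate_position(p_prev, p_curr, steps=1):
    """Constant-velocity guess: assume the last observed displacement repeats.

    p_prev and p_curr are (x, y) positions from two consecutive observations.
    Returns a list of predicted (x, y) positions, one per future step.
    """
    vx = p_curr[0] - p_prev[0]  # displacement per step along x
    vy = p_curr[1] - p_prev[1]  # displacement per step along y
    return [(p_curr[0] + vx * k, p_curr[1] + vy * k) for k in range(1, steps + 1)]


# Hypothetical skater moving 4 pixels to the right between two observations.
future = extrapolate_position((10, 50), (14, 50), steps=3)
print(future)  # [(18, 50), (22, 50), (26, 50)]
```

A rule like this works only while nothing changes. It knows nothing about ramps, obstacles, or the skater deciding to stop, which is exactly the contextual knowledge people bring without thinking about it.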

This depends on an enormous amount of contextual information that humans easily take for granted because they’re exposed to thousands of examples of it in everyday life. People may not know the mathematical underpinnings of gravity or inertia, but they know them when they see them!

The MIT project, Generating Videos with Scene Dynamics, uses sophisticated neural networks and is showing early promise. Researchers have begun to strengthen artificial intelligence in two key areas:

Generating short videos that adequately resemble existing footage;
Predicting future “frames” in terms of how pixels might change.
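As a rough illustration of what predicting a frame “in terms of how pixels might change” can mean, the sketch below uses a deliberately naive baseline: it assumes each pixel keeps changing by the same amount it changed between the last two frames. This is not the MIT approach, which relies on trained neural networks. The function name and the tiny 2x2 grayscale frames are assumptions made up for the example.

```python
def predict_next_frame(prev_frame, curr_frame):
    """Naive baseline: extend each pixel's most recent change by one more step.

    Frames are lists of rows of 8-bit grayscale values (0 to 255).
    """
    predicted = []
    for prev_row, curr_row in zip(prev_frame, curr_frame):
        row = []
        for p, c in zip(prev_row, curr_row):
            value = c + (c - p)                  # linear extrapolation per pixel
            row.append(max(0, min(255, value)))  # clamp to the valid 8-bit range
        predicted.append(row)
    return predicted


# Hypothetical 2x2 frames from two consecutive moments.
prev_frame = [[0, 100], [200, 250]]
curr_frame = [[10, 90], [220, 255]]
next_frame = predict_next_frame(prev_frame, curr_frame)
print(next_frame)  # [[20, 80], [240, 255]]
```

A baseline like this treats every pixel independently, so it has no notion of objects or scenes. Learning how whole scenes tend to evolve is the harder problem the research is aimed at.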

Some Limitations Still to Be Overcome

After being trained on more than 2 million short videos, the experimental AI is able to generate short clips similar to the footage it encounters. It’s important to realize, however, that this doesn’t reflect a deep “understanding” of events in the footage.

Current systems are capable of constructing only a few seconds of low-resolution footage in response to input. The output often includes significant distortion – it’s easy for a human onlooker to tell the original and generated video apart.

For now, research is focused on providing systems with the capabilities to generate “plausible” futures rather than the correct ones. Experimental machine vision would have to be combined with a deep understanding of physics, processed in real time, to make “correct” predictions.

Directions for the Future: Transportation, Media, and More

With time, however, more advanced predictive capabilities could be integrated into a wide range of applications. Perhaps most interesting, these same concepts could be applied to refine the systems autonomous vehicles use to identify and avoid moving obstacles. There’s also a wide variety of possibilities in virtual reality, entertainment, and general media production.

For now, however, work is continuing apace at MIT. With untold terabytes of raw video out on the Web that can be used for experimental purposes, AI could be developing predictive visual intelligence sooner than anyone might think.
