The future of image recognition technology is deep learning

Unlike typical machine learning, deep learning builds neural networks loosely modeled on the workings of the human brain, then uses them to interpret and analyze data such as images, video and text.

Pepper, the first humanoid robot that uses AI to recognize faces and human emotions, was introduced in 2014.

(Photo by Pexels user Alex Knight, used via a Creative Commons license)

The image-recognition technology behind smartphones, self-driving cars and diagnostic imaging in healthcare has made massive strides of late. These applications all rely on systems that make sense of the objects in front of them, hence the term “computer vision”: The computers are able to make sense of what they “see.”

During a recent Data Lab meetup at CompassRed in downtown Wilmington, Delaware, Chandra Kambhamettu, professor and director of the Video/Image Modeling and Synthesis Lab in the Department of Computer and Information Sciences at the University of Delaware, and Dave Wallin, manager of innovations at The Archer Group, offered a high-level explanation of how image technology works along with the deep learning technology that powers it.

Much of the innovation in image recognition relies on deep learning technology, an advanced type of machine learning and artificial intelligence. Typical machine learning takes in data, pushes it through algorithms and then makes a prediction, making it appear that the computer is “thinking” and coming to its own conclusions.

Deep learning, on the other hand, works by building deep neural networks loosely modeled on the workings of the human brain, then using them to interpret and analyze data such as images, video and text.

This is especially important for image recognition: “You’d want something like a self-driving car to be able to tell the difference between a signpost and a pedestrian,” Wallin said.

The key difference lies in how data is presented to the system. Machine learning algorithms “learn” by working from structured or labeled data, then apply what they have learned to produce output on new sets of data. When the output is not the desired one, however, they need to be retrained with human intervention.

Deep learning networks do not require that kind of human intervention: Their nested layers of algorithms pass the data through a hierarchy of concepts, and the network eventually learns from its own mistakes.
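To make "learning from its own mistakes" concrete, here is a minimal sketch of a two-layer network trained by gradient descent on toy data. It is an illustration only, not anything shown at the meetup: the 2x2 "images," the stripe labels, the network size and every function name are assumptions made up for this example.

```python
import math
import random

# Hypothetical toy data: 2x2 "images" flattened to 4 pixels.
# Label 1 = vertical stripe, label 0 = horizontal stripe.
EXAMPLES = [
    ([1, 0, 1, 0], 1),
    ([0, 1, 0, 1], 1),
    ([1, 1, 0, 0], 0),
    ([0, 0, 1, 1], 0),
]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(net, pixels):
    """Pass the pixels through the hidden layer, then the output layer."""
    hidden = [sigmoid(sum(w * p for w, p in zip(weights, pixels)) + bias)
              for weights, bias in zip(net["w_hidden"], net["b_hidden"])]
    output = sigmoid(sum(w * h for w, h in zip(net["w_out"], hidden)) + net["b_out"])
    return hidden, output

def loss(net, examples):
    """Mean squared error between predictions and labels."""
    return sum((forward(net, x)[1] - y) ** 2 for x, y in examples) / len(examples)

def train(examples, hidden_units=3, epochs=2000, learning_rate=0.5, seed=0):
    rng = random.Random(seed)
    n_inputs = len(examples[0][0])
    net = {
        "w_hidden": [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                     for _ in range(hidden_units)],
        "b_hidden": [0.0] * hidden_units,
        "w_out": [rng.uniform(-1, 1) for _ in range(hidden_units)],
        "b_out": 0.0,
    }
    initial_loss = loss(net, examples)
    step = learning_rate / len(examples)  # average the gradient over the batch
    for _ in range(epochs):
        g_w_hidden = [[0.0] * n_inputs for _ in range(hidden_units)]
        g_b_hidden = [0.0] * hidden_units
        g_w_out = [0.0] * hidden_units
        g_b_out = 0.0
        for pixels, label in examples:
            hidden, output = forward(net, pixels)
            # The "mistake": how far the output is from the label,
            # scaled by the slope of the output sigmoid.
            delta_out = 2 * (output - label) * output * (1 - output)
            g_b_out += delta_out
            for j in range(hidden_units):
                g_w_out[j] += delta_out * hidden[j]
                # Push the error back to the hidden layer (backpropagation).
                delta_hidden = delta_out * net["w_out"][j] * hidden[j] * (1 - hidden[j])
                g_b_hidden[j] += delta_hidden
                for i in range(n_inputs):
                    g_w_hidden[j][i] += delta_hidden * pixels[i]
        # Nudge every weight in the direction that reduces the error.
        net["b_out"] -= step * g_b_out
        for j in range(hidden_units):
            net["w_out"][j] -= step * g_w_out[j]
            net["b_hidden"][j] -= step * g_b_hidden[j]
            for i in range(n_inputs):
                net["w_hidden"][j][i] -= step * g_w_hidden[j][i]
    return net, initial_loss, loss(net, examples)

net, initial_loss, final_loss = train(EXAMPLES)
print(f"loss before training: {initial_loss:.4f}, after: {final_loss:.4f}")
```

No single pixel distinguishes the two stripe patterns, which is why the sketch uses a hidden layer rather than one flat formula. Production systems apply the same idea with far more layers and far more data.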

Face recognition leverages computer vision to extract discriminative information from facial images and deep learning techniques to model the appearances of faces, classify them and transmit the data. Algorithms extract facial features and compare them to a database to find the best match.
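The matching step can be sketched in a few lines. This is a simplified illustration under stated assumptions: the three-number "feature vectors," the names in the database and the 0.95 threshold are all hypothetical, and a real system would derive much longer vectors from a trained network.

```python
import math

def cosine_similarity(a, b):
    """Similarity of two feature vectors: 1.0 means they point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def best_match(query, database, threshold=0.95):
    """Return (name, score) for the closest enrolled face.

    The name is None when even the best score falls below the threshold,
    meaning the face is treated as unknown.
    """
    best_name, best_score = None, -1.0
    for name, features in database.items():
        score = cosine_similarity(query, features)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score

# Hypothetical enrolled faces, each reduced to a made-up feature vector.
database = {
    "alice": [0.9, 0.1, 0.3],
    "bob": [0.2, 0.8, 0.5],
    "carol": [0.4, 0.4, 0.9],
}

name, score = best_match([0.85, 0.15, 0.35], database)
print(name)  # the query sits closest to the "alice" entry
```

The threshold is a design choice: without it, the function would always name someone, even for a face that is not in the database at all.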


UD’s Kambhamettu offered a demonstration of a data delivery method called surface aggregates in topology, aka SAINT, in which the frames of a video sequence are clustered to produce representative data points that are then used for effective recognition.
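The article does not describe how SAINT works internally, so the following is not that method. It is a generic k-means sketch of the broader idea of clustering video frames and keeping one representative frame per cluster. The two-number frame features, the function names and the choice of k-means are all assumptions for illustration.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def representative_frames(frame_features, k, iterations=20):
    """Cluster frames with k-means; return the index of the frame
    closest to each cluster center."""
    n = len(frame_features)
    dim = len(frame_features[0])
    # Deterministic start: evenly spaced frames serve as initial centers.
    centroids = [list(frame_features[i * n // k]) for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assign every frame to its nearest center.
        clusters = [[] for _ in range(k)]
        for idx, features in enumerate(frame_features):
            nearest = min(range(k),
                          key=lambda c: squared_distance(features, centroids[c]))
            clusters[nearest].append(idx)
        # Move each center to the mean of the frames assigned to it.
        for c, members in enumerate(clusters):
            if members:
                centroids[c] = [
                    sum(frame_features[i][d] for i in members) / len(members)
                    for d in range(dim)
                ]
    representatives = []
    for c, members in enumerate(clusters):
        if members:
            representatives.append(
                min(members,
                    key=lambda i: squared_distance(frame_features[i], centroids[c])))
    return representatives

# Hypothetical features for six frames: three from one scene, three from another.
frames = [
    (0.0, 0.1), (0.1, 0.0), (0.1, 0.2),
    (5.0, 5.1), (5.2, 4.9), (4.9, 5.0),
]

print(representative_frames(frames, k=2))  # one frame index per scene
```

Six frames are reduced to two, so a recognizer downstream would compare against two data points instead of the whole sequence.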

Deep learning is suited to cases where there are boatloads of data to analyze or complex problems to solve. In addition to security and healthcare applications, Wallin noted, the technology has a place in augmented reality and in image recognition for retail operations and online shopping. Companies can also use it to see how often their products appear on social media platforms, he said.
