Computer vision
What is computer vision?
Computer vision is a field of artificial intelligence that trains computers to interpret and understand visual stimuli. Through digital images and deep learning models, machines can identify and classify objects, and react appropriately to what they 'see'. (SAS, n.d.)
Computer vision as a jigsaw puzzle
When you begin putting together a jigsaw puzzle, you might approach it by first distinguishing the individual pieces of the image, then identifying the edges, before modelling the subcomponents.
That's how a computer assembles visual images with its neural networks. By filtering and using deep network layers, computers can piece together all the parts of the image. However, computers don't get the final image to guide them the way we have a complete image on the top of a puzzle box. They are simply fed hundreds or thousands of related images to be trained to recognise specific objects and learn their differing features.
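The "identifying the edges" idea can be illustrated with a hand-written filter. The following is a minimal sketch, not how any particular network is implemented. It assumes a grayscale image stored as a list of rows of brightness values, and the function name `horizontal_edges` is made up for this example. A trained network learns its filter weights from data, whereas a fixed difference filter is assumed here.

```python
def horizontal_edges(image):
    """Return the absolute brightness difference between horizontal neighbours.

    `image` is a list of equal-length rows of grayscale values (assumed 0-255).
    Large output values mark places where brightness changes sharply,
    which is a crude stand-in for a learned edge filter.
    """
    return [
        [abs(row[x + 1] - row[x]) for x in range(len(row) - 1)]
        for row in image
    ]


# Hypothetical 3x4 image: dark on the left, bright on the right.
image = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]

edges = horizontal_edges(image)
for row in edges:
    print(row)  # the middle value of each row is the large one
```

The output has one fewer column than the input because each value compares a pixel with its right-hand neighbour.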
History of computer vision
1950s
Experimentation
Early experimentation in the field began when some of the first neural networks were used to detect the edges of an object and sort simple objects into categories.
1970s
First commercial use
The first commercial use of computer vision interpreted typed or handwritten text using optical character recognition. This advancement was used to interpret written text for the blind.
1990s
Facial recognition
With the maturation of the internet, large sets of images became available online for analysis. This led to the increasing popularity of facial recognition programs, as the growing datasets allowed machines to identify specific people in photos and videos.
Today
Convergence
Several factors have converged in today's field of computer vision. Examples include mobile technology with built-in cameras, affordable and accessible computing power, availability of computer vision hardware, and new algorithms.
Computer vision works in 3 basic steps:
Acquiring an image in real time through video, photos or 3D technology.
Processing the image through deep learning models, which are often trained by being fed thousands of labelled images.
Understanding the image, where an object is identified or classified.
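The three steps above can be sketched as a tiny pipeline. This is a toy illustration under stated assumptions: the "image" is a hard-coded grid of brightness values rather than a camera frame, the "model" is a single hand-picked brightness threshold rather than a trained deep learning model, and the names `acquire`, `process` and `understand` are hypothetical.

```python
def acquire():
    """Step 1: acquire an image (here a hard-coded 2x3 grayscale grid)."""
    return [
        [220, 230, 210],
        [240, 225, 235],
    ]


def process(image):
    """Step 2: reduce the image to a feature (mean brightness).

    A real system would run a trained model here; the mean is a stand-in.
    """
    pixels = [value for row in image for value in row]
    return sum(pixels) / len(pixels)


def understand(mean_brightness, threshold=128):
    """Step 3: turn the feature into a label using an assumed threshold."""
    return "bright scene" if mean_brightness >= threshold else "dark scene"


feature = process(acquire())
label = understand(feature)
print(label)
```

Keeping the steps as separate functions mirrors the description: the acquisition source or the model could be swapped without touching the other stages.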
Types of computer vision:
Image classification: grouping images into different categories
Object detection: locating and identifying specific objects in an image
Semantic segmentation: classifying each pixel within an image
Image analysis: extracting a variety of visual features and meaningful information from an image
Face detection: localising and identifying faces in images
Optical character recognition: converting printed or handwritten text into a digital format
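The difference between some of these task types shows up in the shape of their output. The sketch below contrasts image classification (one label per image), object detection (a box around an object) and semantic segmentation (one label per pixel) on a toy grayscale grid. The threshold rule and the function names `classify_image`, `detect_object` and `segment_image` are assumptions for illustration; real systems use trained models rather than a fixed threshold.

```python
def classify_image(image, threshold=128):
    """Image classification (toy): assign one class to the whole image."""
    has_object = any(value >= threshold for row in image for value in row)
    return "contains object" if has_object else "empty"


def detect_object(image, threshold=128):
    """Object detection (toy): return one bounding box around bright pixels.

    The box is (left, top, right, bottom) in pixel indices, or None if
    no pixel passes the threshold.
    """
    hits = [
        (x, y)
        for y, row in enumerate(image)
        for x, value in enumerate(row)
        if value >= threshold
    ]
    if not hits:
        return None
    xs = [x for x, _ in hits]
    ys = [y for _, y in hits]
    return (min(xs), min(ys), max(xs), max(ys))


def segment_image(image, threshold=128):
    """Semantic segmentation (toy): assign a class to every pixel."""
    return [
        ["object" if value >= threshold else "background" for value in row]
        for row in image
    ]


# Hypothetical 3x4 image with a bright 2x2 patch.
image = [
    [0, 0, 0, 0],
    [0, 200, 210, 0],
    [0, 190, 220, 0],
]

print(classify_image(image))
print(detect_object(image))
for row in segment_image(image):
    print(row)
```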
Examples
Self-driving vehicles
The concept of self-driving vehicles was first introduced by General Motors at the 1939 World's Fair. It was a model of an electric vehicle that relied on radio-controlled electromagnetic fields, generated by magnetised spikes fixed in the roadway. This vision became a reality in 1958, when a vehicle could turn left or right guided by the currents flowing through wires embedded in the road.
Since then, the self-driving car has evolved into an autonomous vehicle that relies on sensors, artificial intelligence, radar, and cameras. However, ensuring the safety of its passengers involves many considerations, and the autonomous vehicle is still on its way to becoming fully developed. (Aventior, 2021; Tomorrow's World Today, n.d.)
Recognising people and objects is a significant feature of self-driving vehicles. Assisted by sensor technology, the vehicle can identify people, cars, and other objects on the road to help avoid accidents and collisions while driving. The data collected by sensors and cameras can also be used to create 3D maps. These maps can help identify objects on the road and assess the risk level of the driving space, allowing the car to opt for alternate routes.
There are other ways in which computer vision can make autonomous vehicles safe and reliable:
Lane line detection uses deep learning segmentation techniques to keep the car in its lane while self-driving.
Airbags can be deployed ahead of impact when the car's reading of its surroundings indicates a collision or probable accident.
Tracking cars on the road can help detect and predict the behavioural patterns of other drivers.
Low-light driving mode automatically adjusts image capture to the lighting conditions of the surroundings.
Data for deep learning model training is captured by self-driving cars to improve their situational awareness while driving (Aventior, 2021).
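As one illustration of the list above, lane keeping can be sketched from a segmentation result. The example assumes a model has already produced a binary mask in which 1 marks lane-line pixels. It then measures how far the image centre (assumed to be the car's position) sits from the midpoint between the outermost lane-line pixels in the row nearest the car. The function name `lane_offset` and the mask row are hypothetical, and a real system would also need camera calibration and more than one row.

```python
def lane_offset(mask_row):
    """Return the car's offset from the lane centre, in pixels.

    `mask_row` is one row of a binary segmentation mask (1 = lane line).
    Positive means the car is right of the lane centre, negative means left.
    Returns None if fewer than two lane-line pixels are found.
    """
    lane_pixels = [x for x, value in enumerate(mask_row) if value == 1]
    if len(lane_pixels) < 2:
        return None
    lane_centre = (lane_pixels[0] + lane_pixels[-1]) / 2
    car_position = (len(mask_row) - 1) / 2  # assumed: camera is centred
    return car_position - lane_centre


# Hypothetical bottom row of a mask: lane lines at x=1 and x=7.
row = [0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0]
offset = lane_offset(row)
print(offset)  # 1.0: the car sits one pixel right of the lane centre
```

A steering controller could use the sign of the offset to decide which way to correct, although that step is outside the scope of this sketch.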