Research has shown that 84% of UK adults own a smartphone. As a result, taking a photo or recording a video and sharing it with friends has never been easier. Whether people share directly with friends on the popular messaging app WhatsApp or upload to booming social media platforms such as Instagram, TikTok, or YouTube, the digital world is more visual than ever before.
Internet algorithms index and search text with ease. When you use Google to search for something, chances are the results are fairly accurate or answer your question. However, images and videos aren’t indexed or searchable in the same way.
When uploading an image or video, the owner has the option to add a meta description. This is a text string which isn’t visible on screen but which tells algorithms what is in that particular piece of media. However, not all rich media has an associated meta description, and those that exist aren’t always accurate.
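The limitation described above can be illustrated with a minimal Python sketch. It assumes a hypothetical media library (the file names and descriptions are invented) and a simple word-matching search. It is not how any real search engine works, only a way to show that a text-only search depends entirely on the descriptions it is given.

```python
# Hypothetical media library: file names and descriptions are invented for illustration.
media_library = [
    {"file": "IMG_0001.jpg", "description": "golden retriever playing on a beach"},
    {"file": "IMG_0002.jpg", "description": ""},        # no description supplied
    {"file": "IMG_0003.jpg", "description": "sunset"},  # inaccurate: imagine it shows a dog
]


def search_by_description(library, query):
    """Return the files whose description contains every word of the query."""
    words = query.lower().split()
    return [
        item["file"]
        for item in library
        if all(word in item["description"].lower().split() for word in words)
    ]


# Only the file with an accurate description can be found by a text search.
print(search_by_description(media_library, "retriever"))
print(search_by_description(media_library, "dog"))
```

In this toy example the second and third files can never be found by searching for what they show, because the search only ever looks at the text and never at the pixels.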
Computer vision is the field of study that aims to make computers see. It develops methods that reproduce the capability of human vision so that computers can understand the content of digital images. It is a multidisciplinary field encompassing artificial intelligence, machine learning, statistical methods, and other engineering and computer science fields.
How computer vision applications operate
Many computer vision applications try to identify and classify objects in image data. Each of the following tasks answers a particular question.
- Object classification: What broad category of object is in this photograph?
- Object identification: Which type of a given object is in this photograph?
- Object verification: Is the object in the photograph?
- Object detection: Where are the objects in the photograph?
- Object landmark detection: What are the key points for the object in the photograph?
- Object segmentation: What pixels belong to the object in the image?
- Object recognition: What objects are in this photograph and where are they?
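To make the difference between some of these questions concrete, here is a minimal Python sketch on a toy image. The pixel values, the `THRESHOLD` cut-off, and the function names are all assumptions made for illustration. Real applications typically use trained machine learning models, not a fixed brightness threshold.

```python
# A toy 6x6 greyscale image (0 = dark background, 9 = bright object).
# The values are invented for illustration.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 9, 9, 0, 0],
    [0, 0, 9, 9, 9, 0],
    [0, 0, 9, 9, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
THRESHOLD = 5  # assumed cut-off between background and object


def segment(img, threshold=THRESHOLD):
    """Segmentation: which pixels belong to the object? Returns a 0/1 mask."""
    return [[1 if px > threshold else 0 for px in row] for row in img]


def verify(mask):
    """Verification: is an object in the image at all?"""
    return any(any(row) for row in mask)


def detect(mask):
    """Detection: where is the object? Returns (top, left, bottom, right) or None."""
    coords = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))


mask = segment(image)
print(verify(mask))  # True
print(detect(mask))  # (1, 2, 3, 4)
```

The same image yields three different kinds of answer: a yes or no (verification), a bounding box (detection), and a per-pixel mask (segmentation).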
Other methods of analysis used in computer vision include:
- video motion analysis to estimate the velocity of objects in a video or the camera itself;
- image segmentation where algorithms partition an image into multiple segments, or sets of pixels;
- scene reconstruction which creates a 3D model of a scene inputted through image or video; and
- image restoration where blurring is removed from photos using machine learning filters.
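As a rough illustration of the first item in this list, the Python sketch below estimates an object's velocity from two video frames by tracking how far its centre moves. The frames, the half-second interval, and the function names are assumptions for illustration. Production systems generally rely on more robust techniques such as optical flow.

```python
def centroid(frame):
    """Average (row, col) position of all non-zero pixels in a frame."""
    coords = [(r, c) for r, row in enumerate(frame) for c, v in enumerate(row) if v]
    if not coords:
        raise ValueError("frame contains no object pixels")
    n = len(coords)
    return (sum(r for r, _ in coords) / n, sum(c for _, c in coords) / n)


def estimate_velocity(frame_a, frame_b, seconds_between_frames):
    """Displacement of the centroid per second, as (rows/s, cols/s)."""
    (r0, c0), (r1, c1) = centroid(frame_a), centroid(frame_b)
    return ((r1 - r0) / seconds_between_frames, (c1 - c0) / seconds_between_frames)


# Two invented binary frames: a 2x2 object moves two pixels to the right.
frame_1 = [
    [0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
frame_2 = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

velocity = estimate_velocity(frame_1, frame_2, seconds_between_frames=0.5)
print(velocity)  # (0.0, 4.0): no vertical motion, 4 pixels per second to the right
```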
Why computer vision is difficult to solve
Early experiments in computer vision began in the 1950s. Since then, the field has spanned robotics and mobile robot navigation, military intelligence, human-computer interaction, image retrieval in digital libraries, and the rendering of realistic scenes in computer graphics.
Despite decades of research, computer vision remains an unsolved problem. While strides have been made, specialists have yet to give computers the level of visual ability that is innate in humans.
For fully-sighted humans, seeing and understanding what we’re looking at is effortless. Because of this ease, computer vision engineers originally believed that reproducing this behaviour within machines would also be a fairly simple problem to solve. That, it turns out, has not been the case.
While human vision feels simple to us, psychologists and biologists don’t yet fully understand why and how it works so easily. There is still a knowledge gap in explaining the complete workings of our eyes and how our brains interpret what our eyes see.
As humans, we are also able to interpret what we see under a variety of different conditions – different lighting, angles, and distances. With a range of variables, we can still reach the same conclusion and correctly identify an object.
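A small Python sketch shows why such variation is hard for a machine. It assumes an invented four-pixel pattern and a deliberately naive pixel-by-pixel comparison. The mean-subtraction step only cancels a uniform change in brightness and does nothing for changes in angle or distance, which are much harder to handle.

```python
def pixel_difference(a, b):
    """Sum of absolute differences between two equally sized pixel lists."""
    return sum(abs(x - y) for x, y in zip(a, b))


def normalise(pixels):
    """Subtract the mean brightness so that a uniform lighting shift cancels out."""
    mean = sum(pixels) / len(pixels)
    return [p - mean for p in pixels]


# The same (invented) four-pixel pattern, captured in dim and in bright light.
dim = [10, 50, 10, 50]
bright = [p + 100 for p in dim]  # uniform brightness increase

# A naive comparison treats the two captures as very different...
print(pixel_difference(dim, bright))  # 400

# ...while comparing brightness-normalised pixels shows the pattern is unchanged.
print(pixel_difference(normalise(dim), normalise(bright)))  # 0.0
```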
Without understanding the complexities of human vision as a whole, it’s difficult to replicate or adapt it in computer vision.
Recent progress in computer vision
While the problem of computer vision doesn’t yet have a complete solution, progress has been made in the field due to innovations in artificial intelligence – particularly in deep learning and neural networks.
As the amount of data generated every day continues to grow, so do the capabilities in computer vision. Visual data is booming, with over three billion images being shared online per day, and computer science advancements mean the computing power to analyse this data is now available. Computer vision algorithms and hardware have evolved in their complexity, resulting in higher accuracy rates for object identification.
In recent years, facial recognition has become a key way of unlocking our smartphones, a success which is down to computer vision.
Other problems which have been solved in this vast field include:
- optical character recognition (OCR) which allows software to read the text from within an image, PDF, or a handwritten scanned document
- 3D model building, or photogrammetry, which may be a stepping stone to reproducing the identification of images from different angles
- safety in autonomous vehicles, or self-driving cars, where lane line and object detection have been developed
- medical image analysis, which is revolutionising healthcare by detecting symptoms in medical imaging and X-rays
- augmented reality and mixed reality, which use object tracking in the real world to determine the location of a virtual object on the device’s display
The ultra-fast computing machines available today, together with quick and reliable internet connections and cloud networks, make the process of deciphering an image using computer vision much faster than when this field was first being investigated. Plus, with companies like Meta, Google, IBM and Microsoft sharing their artificial intelligence research through open sourcing, it’s certain that computer vision research and discoveries will progress more quickly than in the past.
The computer vision and hardware market is expected to be worth $48.6 billion, making it a lucrative industry where the pace of change is accelerating.
Specialise in artificial intelligence
If you have an interest in computer vision, expanding your skills and knowledge in artificial intelligence is the place to start. With this grounding, you could be the key that solves many unanswered questions in computer vision – a field with potential for huge growth.
The University of York’s online MSc Computer Science with Artificial Intelligence will set you up for success. Study entirely online and part-time around current commitments. Whether you already have experience in computer science or you’re looking to change career into this exciting industry, this master’s degree is for you.