The next step in machine learning: deep learning

What is deep learning?

Deep learning is a subfield of artificial intelligence (AI) concerned with building artificial neural networks: computer structures loosely modelled on the highly complex neural networks of the human brain. Because of this, it is also sometimes referred to as deep neural learning, and the models themselves as deep neural networks (DNNs).

Deep learning is a subset of machine learning. The artificial neural networks it uses can sift through far more information from large data sets than traditional methods, learn from it, and then use what they have learned to make decisions. The vast amounts of information that DNNs scour for patterns are sometimes referred to as big data.

Is deep learning machine learning?

Yes: deep learning is a type of machine learning. What sets it apart is that the technology brings computers closer to learning for themselves, with far less support or input from humans (and all the associated benefits and potential dangers of this).

Traditional machine learning typically requires a lot of raw data preprocessing and hand-crafted feature engineering by data scientists and analysts. This is prone to human bias and is limited by what we are able to observe and mentally compute ourselves before handing over the data to the machine. Supervised learning (learning from labelled examples), unsupervised learning (finding structure in unlabelled data), and semi-supervised learning (a mix of the two) are all ways that computers become familiar with data and learn what to do with it.

Artificial neural networks (sometimes called neural nets for short) use layer upon layer of neurons so that they can process a large amount of data quickly. As a result, they have the “brain power” to start noticing patterns nobody told them to look for, and to learn their own features from what they are “seeing”. When this happens without labelled examples it is unsupervised learning, and it can lead to technological advances that would take humans a lot longer to achieve. Generative modelling is an example of unsupervised learning.
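To make the difference between supervised and unsupervised learning concrete, here is a minimal sketch in plain Python. It is not deep learning, just an illustration: the data, the labels, and the function names are all hypothetical, and the unsupervised half uses a simple one-dimensional k-means with two clusters as a stand-in technique.

```python
def fit_supervised(values, labels):
    """Supervised: labels are provided, so average the values of each class."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    return {label: sum(vs) / len(vs) for label, vs in groups.items()}


def fit_unsupervised(values, steps=10):
    """Unsupervised: no labels, so search for two cluster centres (1-D k-means)."""
    centres = [min(values), max(values)]  # simple deterministic starting guess
    for _ in range(steps):
        clusters = [[], []]
        for value in values:
            # Assign each value to whichever centre is currently closer.
            nearest = 0 if abs(value - centres[0]) <= abs(value - centres[1]) else 1
            clusters[nearest].append(value)
        # Move each centre to the mean of its cluster (keep it if the cluster is empty).
        centres = [
            sum(c) / len(c) if c else centres[i] for i, c in enumerate(clusters)
        ]
    return centres


# Hypothetical measurements, chosen only for illustration.
values = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
labels = ["small", "small", "small", "large", "large", "large"]

class_means = fit_supervised(values, labels)   # uses the labels
cluster_centres = fit_unsupervised(values)     # never sees the labels
```

Both approaches end up describing the same two groups, but only the supervised version was told what the groups were called.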

Real-world examples of deep learning

Deep learning applications are used (and built upon) every time you do a Google search. They are also used in more complicated scenarios, such as self-driving cars and cancer diagnosis. In these scenarios, the machine is often looking for irregularities. The decisions the machine makes are based on probability: it predicts the most likely outcome. In the case of automated driving or medical testing, accuracy is even more crucial, so models are rigorously tested on data they did not see during training.

Everyday examples of deep learning rely on computer vision for object recognition and natural language processing for things like voice activation. Speech recognition is a function that we are familiar with through use of voice-activated assistants like Siri or Alexa, but a machine’s ability to recognise natural language can help in surprising ways. Replika, also referred to as “My AI Friend”, is essentially a chatbot that gets to know a user through questioning. It uses a neural network to have an ongoing one-to-one conversation with the user to gather information. Over time, Replika begins to speak like the user, giving the impression of emotion and empathy. In April 2020, at the height of the pandemic, half a million people downloaded Replika, suggesting curiosity about AI but also a need for AI, even if it does simply mirror back human traits. This is not a new idea: in 1966, computer scientist Joseph Weizenbaum created ELIZA, a program that played the role of a computer therapist and was a precursor to the chatbot.

How does deep learning work?

Deep learning algorithms make use of very large datasets, often labelled, of images, text, audio, and video in order to build knowledge. By computing over this content, scanning through and becoming familiar with it, the machine begins to recognise what to look for. Loosely like a neuron in the human brain, each artificial neuron has a role in processing data: it produces an output by applying a simple function (a weighted sum followed by an activation) to the inputs it receives. Neurons are grouped into layers, and the layers between the input and the output are called hidden layers.
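As a rough illustration of what a single neuron and a hidden layer compute, here is a minimal sketch in plain Python. The weights, biases, and inputs are made-up numbers, and the sigmoid activation is just one common choice, assumed here for simplicity.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)


def layer_output(inputs, layer_weights, layer_biases):
    """A layer is a group of neurons that all read the same inputs."""
    return [
        neuron_output(inputs, weights, bias)
        for weights, bias in zip(layer_weights, layer_biases)
    ]


# Hypothetical values, chosen only for illustration.
inputs = [0.5, -1.0, 2.0]
hidden_weights = [[0.1, 0.2, 0.3], [-0.4, 0.5, 0.6]]  # two hidden neurons
hidden_biases = [0.0, 0.1]

hidden = layer_output(inputs, hidden_weights, hidden_biases)  # hidden layer
prediction = neuron_output(hidden, [1.0, -1.0], 0.0)          # output neuron
```

The output of the hidden layer becomes the input to the next layer, which is how stacking layers produces a "deep" network.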

At the heart of machine learning algorithms is automated optimisation. The goal is to achieve the most accurate output, so we need the speed of machines to efficiently assess all the information they have and to begin detecting patterns which we may have missed. This is also core to deep learning and to how artificial neural networks are trained.
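To show what automated optimisation can look like in its simplest form, the sketch below fits a one-parameter model, y = w * x, by repeatedly nudging w to reduce the error. The data, the learning rate, and the number of steps are assumptions picked for illustration; real networks optimise millions of weights in the same spirit.

```python
def mean_squared_error(w, xs, ys):
    """Average squared gap between the predictions w * x and the targets y."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def error_gradient(w, xs, ys):
    """Slope of the error with respect to w: mean of 2 * x * (w * x - y)."""
    return sum(2 * x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data generated from y = 2x, so the best weight is 2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w = 0.0                 # start from an uninformed guess
learning_rate = 0.05    # how far to move on each step
for _ in range(200):
    # Step in the direction that reduces the error.
    w -= learning_rate * error_gradient(w, xs, ys)

final_error = mean_squared_error(w, xs, ys)
```

After the loop, w sits very close to 2 and the error is close to zero. No human chose that value; the procedure found it.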

TensorFlow is an open source platform created by Google. Its core is written largely in C++, and it is most commonly used through its Python API. A symbolic maths library, it can be utilised for many tasks, but primarily for training, transfer learning, and developing deep neural networks with many layers. It’s particularly useful for training such networks because it can calculate large numbers of gradients automatically. A gradient is the slope of the error function with respect to each of the network’s weights: it shows how the error would change if a weight were nudged. So, for example, the gradient descent algorithm is used to minimise the error function by repeatedly adjusting the weights in the downhill direction; plotted on a graph, the goal is the lowest point of the error curve, where the gradient is zero. The algorithm used to calculate the gradient of an error function is “backpropagation”, short for “backward propagation of errors”.
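Libraries such as TensorFlow compute gradients automatically, but the idea behind backpropagation can be sketched by hand. The example below uses plain Python rather than TensorFlow, and a deliberately tiny hypothetical network: one input, one hidden neuron with a tanh activation, and one linear output. It applies the chain rule backwards from the error, then checks the result against a numerical estimate.

```python
import math


def forward(x, w1, w2):
    """Run the tiny network forwards and return the hidden value and prediction."""
    h = math.tanh(w1 * x)  # hidden neuron
    y_hat = w2 * h         # linear output neuron
    return h, y_hat


def error(x, y, w1, w2):
    """Squared error between the prediction and the target."""
    return (forward(x, w1, w2)[1] - y) ** 2


def backward(x, y, w1, w2):
    """Backpropagation: push the error backwards using the chain rule."""
    h, y_hat = forward(x, w1, w2)
    d_y_hat = 2.0 * (y_hat - y)     # how the error changes with the prediction
    d_w2 = d_y_hat * h              # gradient for the output weight
    d_h = d_y_hat * w2              # error passed back to the hidden neuron
    d_w1 = d_h * (1.0 - h * h) * x  # derivative of tanh(z) is 1 - tanh(z)^2
    return d_w1, d_w2


# Hypothetical input, target, and weights.
x, y, w1, w2 = 0.5, 1.0, 0.8, -0.3
d_w1, d_w2 = backward(x, y, w1, w2)

# Numerical check: nudge each weight slightly and measure the change in error.
eps = 1e-6
numeric_d_w1 = (error(x, y, w1 + eps, w2) - error(x, y, w1 - eps, w2)) / (2 * eps)
numeric_d_w2 = (error(x, y, w1, w2 + eps) - error(x, y, w1, w2 - eps)) / (2 * eps)
```

The two sets of gradients agree closely. The hand-derived version scales to deep networks because each layer only needs the error signal handed back from the layer after it.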

Convolutional Neural Networks (CNNs) are one of the most used deep learning models, particularly for image recognition, and can learn increasingly abstract features by using deeper layers. CNNs can be accelerated by using Graphics Processing Units (GPUs), because GPUs can process many pieces of data simultaneously. They perform feature extraction directly on pixel values: colour channels for colour images, or a single brightness value per pixel in the case of grayscale.
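The core operation inside a CNN can be sketched in a few lines of plain Python. Below, a small filter (or kernel) slides across a grayscale image and responds wherever brightness jumps from dark to light. The image and the kernel values are hand-picked assumptions; in a real CNN the kernel values are learned during training.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    feature_map = []
    for i in range(out_rows):
        row = []
        for j in range(out_cols):
            total = 0
            for a in range(k_rows):
                for b in range(k_cols):
                    total += image[i + a][j + b] * kernel[a][b]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical 4x4 grayscale image: dark (0) on the left, bright (1) on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-picked kernel that responds to a dark-to-bright vertical edge.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, vertical_edge)
# feature_map is [[0, 2, 0], [0, 2, 0], [0, 2, 0]]: the edge lights up the middle column.
```

Each output position depends only on a small patch of the image, so many positions can be computed at once. That independence is what makes the operation a good fit for GPUs.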

Recurrent Neural Networks (RNNs) were among the first neural networks designed with an internal memory: they carry a hidden state from one step to the next, which lets them remember earlier input. Because of this, RNNs have been widely used in speech recognition and natural language processing, in applications like Google Translate.
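The way an RNN remembers its input can be illustrated with a single recurrent neuron. In this minimal sketch the weights are assumed values rather than trained ones, and the function name is hypothetical. The key detail is that each new state is computed from the current input and the previous state.

```python
import math


def rnn_run(inputs, w_x=1.0, w_h=0.5, h0=0.0):
    """Process a sequence one step at a time, carrying a hidden state forward."""
    h = h0
    states = []
    for x in inputs:
        # The new state mixes the current input with the previous state.
        h = math.tanh(w_x * x + w_h * h)
        states.append(h)
    return states


# Two sequences that end with the same inputs but start differently.
with_history = rnn_run([1.0, 0.0, 0.0])
without_history = rnn_run([0.0, 0.0, 0.0])
```

The final input is 0.0 in both sequences, yet the final states differ: the first sequence still carries a fading trace of the 1.0 it saw at the start.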

Can deep learning be used for regression?

Neural networks can be used for both classification and regression. However, regression models only really work well if they’re the right fit for the data, and that can affect the network architecture. Classification problems, in something like image recognition, have more of a compositional nature compared with the many variables that can make up a regression problem. Regression offers a lot more insight than simply, “Can we predict Y given X?”, because it explores the relationship between variables. Most regression models don’t fit the data perfectly, but neural networks are flexible enough to approximate many different kinds of relationship. Hidden layers can also be added to increase this flexibility, although more layers do not always improve prediction and can lead to overfitting.
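One way to see how the same network can serve both tasks is to look at the final layer. In this minimal sketch, the same features feed either a regression output (a single unbounded number, scored with squared error) or a classification output (one probability per class, scored with cross-entropy). The feature values, weights, and function names are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def regression_head(features, weights, bias):
    """Linear output: a single unbounded number, such as a price."""
    return sum(f * w for f, w in zip(features, weights)) + bias


def classification_head(features, weight_rows, biases):
    """One score per class, converted into probabilities."""
    scores = [regression_head(features, w, b) for w, b in zip(weight_rows, biases)]
    return softmax(scores)


def squared_error(prediction, target):
    """Typical error function for regression."""
    return (prediction - target) ** 2


def cross_entropy(probabilities, target_index):
    """Typical error function for classification."""
    return -math.log(probabilities[target_index])


# Hypothetical features, as if produced by the hidden layers of a network.
features = [0.2, 0.7]

predicted_value = regression_head(features, [10.0, 5.0], 1.0)
class_probabilities = classification_head(
    features, [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 0.0, 0.0]
)
most_likely_class = class_probabilities.index(max(class_probabilities))
```

Everything before the final layer can stay the same. Only the output and the error function change with the kind of question being asked.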

Knowing whether regression is the right way to solve a problem may take some research. Luckily, there are lots of tutorials online to help, such as How to Fit Regression Data with CNN Model in Python.

Ready to discover more about deep learning?

The University of York’s online MSc Computer Science with Artificial Intelligence is the ideal next step if your career ambitions lie in this exciting and fast-paced sector.

Whether you already have knowledge of machine learning algorithms or want to immerse yourself in deep learning methods, this master’s degree will equip you with the knowledge you need to get ahead.