To most people, the terms deep learning and machine learning seem like interchangeable buzzwords of the AI world. However, that’s not true. Anyone who wants to better understand the field of artificial intelligence should therefore begin by understanding the two terms and how they differ. The good news: It’s not as difficult as some articles on the topic suggest.
To break it down in a single sentence: Deep learning is a specialized subset of machine learning, which, in turn, is a subset of artificial intelligence. In other words, deep learning is machine learning.
But let’s dig a little bit deeper.
What is machine learning?
Machine learning is the general term for when computers learn from data. It describes the intersection of computer science and statistics, where algorithms perform a specific task without being explicitly programmed; instead, they recognize patterns in the data and make predictions once new data arrives.
In general, the learning process of these algorithms can be either supervised or unsupervised, depending on the data fed to them: supervised learning uses labeled examples, while unsupervised learning looks for structure in unlabeled data. If you want to dive deeper into the differences between supervised and unsupervised learning, have a read through this article.
A traditional machine learning algorithm can be something as simple as linear regression. For instance, imagine you want to predict your income given your years of higher education. In a first step, you have to define a function, e.g. income = a + b * years of education, where a is the intercept and b is the slope of the line. Then, give your algorithm a set of training data. This could be a simple table with data on some people’s years of higher education and their associated income. Next, let your algorithm draw the line that best fits this data, e.g. through an ordinary least squares (OLS) regression. Now, you can give the algorithm some new data, e.g. your personal years of higher education, and let it predict your income.
While this example sounds simple, it does count as machine learning. And yes, the driving force behind machine learning is ordinary statistics. The algorithm learned to make a prediction without being explicitly programmed, based only on patterns and inference.
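The income example can be sketched in a few lines of Python. This is a minimal illustration, assuming NumPy is available; the data points are made up for the example and the helper name `predict_income` is hypothetical.

```python
import numpy as np

# Hypothetical training data: years of higher education and annual income.
years = np.array([0.0, 2.0, 3.0, 4.0, 5.0, 6.0])
income = np.array([31000.0, 35000.0, 40000.0, 41000.0, 46000.0, 47000.0])

# Design matrix with a column of ones so the model can learn an intercept.
X = np.column_stack([np.ones_like(years), years])

# Ordinary least squares: find the coefficients that minimize squared error.
coef, *_ = np.linalg.lstsq(X, income, rcond=None)
intercept, slope = coef

def predict_income(years_of_education):
    """Predict income from years of education using the fitted line."""
    return intercept + slope * years_of_education

# New data the algorithm has not seen before.
print(predict_income(7.0))
```

Nothing in this code states how income depends on education. The intercept and slope are estimated entirely from the training table, which is what "learning from data" means here.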
So much for machine learning in general. To summarize:
- Machine learning sits at the intersection of computer science and statistics and gives computers the ability to learn without being explicitly programmed.
- There are two broad categories of machine learning problems: supervised and unsupervised learning.
- A machine learning algorithm can be something as simple as an OLS regression.
Let's now examine how the term deep learning relates to all of this.
What is deep learning?
Deep learning algorithms can be regarded as a sophisticated and mathematically complex evolution of machine learning algorithms. The field has been getting lots of attention lately and for good reason: Recent developments have led to results that were not thought to be possible before.
Deep learning describes algorithms that analyze data with a logic structure similar to how a human would draw conclusions. Note that this can happen through both supervised and unsupervised learning. To achieve this, deep learning applications use a layered structure of algorithms called an artificial neural network (ANN). The design of such an ANN is loosely inspired by the biological neural network of the human brain, leading to a process of learning that can be far more capable than that of standard machine learning models on complex tasks such as image or speech recognition.
Consider the example ANN in the image above. The leftmost layer is called the input layer, the rightmost layer the output layer. The middle layers are called hidden layers because their values aren't observable in the training set. In simple terms, hidden layers are calculated values used by the network to do its "magic". The more hidden layers a network has between the input and output layer, the deeper it is. In general, any ANN with two or more hidden layers is referred to as a deep neural network.
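To make the layer terminology concrete, here is a minimal sketch of a forward pass through a small network with two hidden layers, written with NumPy. The layer sizes, the random weights, and the choice of ReLU and sigmoid activations are assumptions made for illustration; a real network would learn its weights from training data.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Illustrative layer sizes: 4 inputs -> 5 hidden -> 3 hidden -> 1 output.
layer_sizes = [4, 5, 3, 1]

# One weight matrix and one bias vector per connection between layers.
weights = [
    rng.normal(0.0, 0.5, size=(n_in, n_out))
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]

def forward(x):
    """Return the activations of every layer for one input vector."""
    activations = [x]
    for i, (W, b) in enumerate(zip(weights, biases)):
        z = activations[-1] @ W + b
        is_output = i == len(weights) - 1
        activations.append(sigmoid(z) if is_output else relu(z))
    return activations

x = np.array([0.2, -1.0, 0.5, 0.8])   # input layer values
acts = forward(x)
hidden_layers = acts[1:-1]            # computed by the network, not in the data
output = acts[-1]                     # output layer value

print(len(hidden_layers), output)
```

The input `x` and the desired output would appear in a training set. The values in `hidden_layers` do not: they exist only as intermediate results inside the network, which is why those layers are called hidden.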
Today, deep learning is used in many fields. In automated driving, for instance, deep learning is used to detect objects, such as STOP signs or pedestrians. The military uses deep learning to identify objects from satellites, e.g. to discover safe or unsafe zones for its troops. Of course, the consumer electronics industry is full of deep learning, too. Home assistance devices such as Amazon Alexa, for example, rely on deep learning algorithms to respond to your voice and know your preferences.
How about a more concrete example? Imagine the company Tesla using a deep learning algorithm for its cars to recognize STOP signs. In the first step, the ANN would identify the relevant properties of the STOP sign, also called features. Features may be specific structures in the input image, such as points, edges, or objects. While a software engineer would have to select the relevant features in a more traditional machine learning algorithm, the ANN is capable of automatic feature engineering. The first hidden layer might learn how to detect edges, the next how to differentiate colors, and the last how to detect more complex shapes specific to the object we are trying to recognize. When fed with training data, the algorithm compares its predictions with the correct answers and adjusts itself based on its own errors.
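What "learning from its own errors" looks like in code can be sketched with a tiny network trained by gradient descent and backpropagation, the standard technique for this kind of adjustment. This is not how any particular car maker trains its models: the four-row toy dataset, the layer sizes, and the learning rate are all assumptions chosen to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy training set: inputs and the labels the network should predict.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([[0.0], [1.0], [1.0], [0.0]])

# One hidden layer with 3 units; random starting weights.
W1 = rng.normal(0.0, 1.0, size=(2, 3))
b1 = np.zeros(3)
W2 = rng.normal(0.0, 1.0, size=(3, 1))
b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

learning_rate = 0.1
losses = []

for step in range(5000):
    # Forward pass: compute a prediction for every training example.
    h = np.tanh(X @ W1 + b1)
    pred = sigmoid(h @ W2 + b2)

    # Measure the error (mean squared error between prediction and label).
    losses.append(float(np.mean((pred - y) ** 2)))

    # Backward pass: how much did each weight contribute to the error?
    d_z2 = 2.0 * (pred - y) / pred.size * pred * (1.0 - pred)
    grad_W2 = h.T @ d_z2
    grad_b2 = d_z2.sum(axis=0)
    d_z1 = (d_z2 @ W2.T) * (1.0 - h ** 2)
    grad_W1 = X.T @ d_z1
    grad_b1 = d_z1.sum(axis=0)

    # Adjust every weight a small step in the direction that reduces the error.
    W2 -= learning_rate * grad_W2
    b2 -= learning_rate * grad_b2
    W1 -= learning_rate * grad_W1
    b1 -= learning_rate * grad_b1

print(losses[0], losses[-1])
```

No features are selected by hand here. The only feedback the network receives is the size of its error, and the weights of the hidden layer are adjusted from that signal alone.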
Overall, through automatic feature engineering and their self-learning capabilities, deep learning algorithms need little human intervention. While this shows the huge potential of deep learning, there are two main reasons why it has only recently become so usable: data availability and computing power.
Firstly, deep learning requires incredibly vast amounts of data (we will get to exceptions to that rule). Tesla’s autonomous driving software, for instance, needs millions of images and video hours to function properly.
Secondly, deep learning needs substantial computing power. However, with the emergence of cloud computing infrastructure and high-performance GPUs (graphics processing units, used for faster calculations), the time for training a deep learning network could be reduced from weeks (!) to hours.
But probably one of the most important advances in the field of deep learning is the emergence of transfer learning, i.e. the use of pre-trained models. The reason: Transfer learning can be regarded as a cure for the need for large training datasets, which used to be necessary for ANNs to produce meaningful results.
These enormous data needs used to be the reason why ANN algorithms weren't considered a practical solution for many problems in the past. However, for many applications, this need for data can now be satisfied by using pre-trained models. In case you want to dig deeper, we recently published an article on transfer learning.
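The mechanics of reusing a pre-trained model can be sketched as follows: the existing layers are frozen, and only a small new output layer (often called a "head") is trained on the new, much smaller dataset. This is a simplified illustration in NumPy. The frozen weights below are random placeholders standing in for a genuinely pre-trained network, and the dataset, sizes, and names such as `extract_features` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)

# Placeholder for a pre-trained feature extractor. In practice these weights
# would come from a model trained on a large dataset; here they are random
# stand-ins (an assumption for illustration only).
W_frozen = rng.normal(0.0, 1.0, size=(3, 8))
W_frozen_before = W_frozen.copy()

def extract_features(x):
    """Frozen layers: used to compute features, never updated."""
    return np.tanh(x @ W_frozen)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Small hypothetical dataset for the new task: 20 examples, 3 inputs each.
X_small = rng.normal(0.0, 1.0, size=(20, 3))
y_small = (X_small[:, 0] + X_small[:, 1] > 0).astype(float).reshape(-1, 1)

# New trainable head on top of the frozen features.
w_head = np.zeros((8, 1))
b_head = np.zeros(1)

# The base is frozen, so its features only need to be computed once.
features = extract_features(X_small)

learning_rate = 0.5
eps = 1e-12
losses = []

for step in range(500):
    pred = sigmoid(features @ w_head + b_head)

    # Binary cross-entropy loss on the small dataset.
    loss = -np.mean(
        y_small * np.log(pred + eps) + (1.0 - y_small) * np.log(1.0 - pred + eps)
    )
    losses.append(float(loss))

    # Gradient step on the head only; W_frozen is left untouched.
    grad = (pred - y_small) / len(X_small)
    w_head -= learning_rate * (features.T @ grad)
    b_head -= learning_rate * grad.sum(axis=0)

print(losses[0], losses[-1])
```

Only 9 numbers are trained here (8 head weights and 1 bias), which is why a small dataset can be enough. With real pre-trained weights, the frozen layers would already encode useful features learned from the larger dataset.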
To sum up:
- Deep learning is a specialized subset of machine learning.
- Deep learning relies on a layered structure of algorithms called an artificial neural network.
- Deep learning has huge data needs but requires little human intervention to function properly.
- Transfer learning is a cure for the need for large training datasets.
The main differences between machine learning and deep learning
This is a common question and if you have read this far, you probably know by now that it should not be asked in that way. Deep learning algorithms are machine learning algorithms. Therefore, it might be better to think about what makes deep learning special within the field of machine learning. The answer: the ANN algorithm structure, the lower need for human intervention, and the larger data requirements.
First and foremost, while traditional machine learning algorithms have a rather simple structure, such as linear regression or a decision tree, deep learning is based on an artificial neural network. This multi-layered ANN is, like a human brain, complex and intertwined.
Secondly, deep learning algorithms require much less human intervention. Remember the Tesla example? If the STOP sign image recognition were a more traditional machine learning algorithm, a software engineer would manually choose features and a classifier to sort images, check whether the output is as required, and adjust the algorithm if this is not the case. With a deep learning algorithm, however, the features are extracted automatically, and the algorithm learns from its own errors (see image below).
Thirdly, deep learning requires much more data than a traditional machine learning algorithm to function properly. A traditional machine learning model can often work with a thousand data points, whereas deep learning oftentimes needs millions. Due to the complex multi-layer structure, a deep learning system needs a large dataset to eliminate fluctuations and make high-quality interpretations.
Got it. But what about coding?
Deep learning is still in its infancy in some areas, but its power is already enormous. It is mostly leveraged by large companies with vast financial and human resources since building deep learning algorithms used to be complex and expensive. But this is changing. We at Levity believe that everyone should be able to build their own custom deep learning solutions.
If you know how to build a TensorFlow model and run it across several TPU instances in the cloud, you probably wouldn't have read this far. If you don't, you have come to the right place, because we are building this platform for people like you: people with ideas about how AI could be put to great use but who lack the time or skills to make it work on a technical level.
I am not going to claim that I could do it within a reasonable amount of time, even though I know a fair bit about programming, deep learning and even deploying software in the cloud. So if this or any of the other articles made you hungry for more, just get in touch. We are looking for good use cases on a continuous basis and we are happy to have a chat with you!