While deep learning has received great praise in the field of artificial intelligence, it comes with huge data and time requirements. However, a new method has emerged that promises to address these drawbacks: transfer learning. Let's dive deeper into what AI guru Andrew Ng famously described as "the next driver of machine learning's success".
What is transfer learning and why should you care?
Imagine yourself back in the days when you tried to ride a bicycle for the first time. It was difficult and took time. You needed to learn everything from scratch: how to keep the balance, how to steer, how to brake.
Now back to the present: Imagine you want to learn how to ride a motorcycle. You don’t need to start from zero. It is much easier for you to learn how to keep the balance or use the brakes. Even though you are in a different setting, you can transfer the skills learned from riding a bicycle. That’s also the essence of transfer learning.
A more formal definition of transfer learning goes as follows:
“Transfer learning is [...] the improvement of learning in a new task through the transfer of knowledge from a related task that has already been learned.”
Olivas & Guerrero (2009): Handbook Of Research On Machine Learning Applications
Having learned how to keep the balance on a bicycle improves your learning of how to keep the balance on a motorcycle. Similarly, an algorithm that has learned how to recognize dogs can be trained to recognize cats with relative ease by transferring certain abstract concepts.
These are still rough concepts. Before you put this knowledge to use, we need to go one level deeper into the matter.
How conventional machine learning algorithms work
In brief, machine learning is the general term for when computers learn from data without being explicitly programmed. Instead, machine learning algorithms recognize patterns in the data and make predictions once new data arrives. If you are new to the field, we recommend that you first read about the different disciplines of artificial intelligence.
So far, conventional machine learning algorithms have been built to learn specific tasks. They are designed to work in isolation and this works well both in theory and practice. But training algorithms from scratch also has drawbacks.
As specialized algorithms, they reach high performance only in their specific area of expertise. No matter how state-of-the-art they are, they are only state-of-the-art for a specific thing. If tasked with a new problem, they would not know what to do and would make wrong predictions.
Recall the bicycle example, but now imagine that skills could not be transferred. Even if you were a world champion in trick cycling, you would have to start from scratch when learning how to ride a motorcycle. The world champion would be a rookie again.
Similarly, models have to be rebuilt from scratch in conventional machine learning. Since model training requires time and money, many problems aren’t profitable with a traditional learning approach.
Besides, most machine learning algorithms require large amounts of data. Deep learning models, in particular, often need very large datasets, sometimes millions of data points, to generate meaningful results. These data needs are often difficult to satisfy in practice.
That is also one of the primary reasons why machine learning has mainly been a privilege of large companies: smaller enterprises just haven’t had the resources required to continuously feed and train machine learning algorithms from scratch.
Enter transfer learning
Transfer learning is a technique that enables algorithms to learn a new task by using pre-trained models. Let’s see how conventional machine learning and transfer learning compare:
In traditional learning, a machine learning algorithm works in isolation. When given a large enough dataset, it learns how to perform a specific task. However, when tasked to solve a new problem, it cannot resort to any previously gained knowledge. Instead, a conventional algorithm needs a second dataset to begin a new learning process.
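In transfer learning, the learning of new tasks relies on previously learned tasks. Knowledge gained on one task, typically stored as the weights of a pre-trained model, is reused as the starting point for the next one. The model therefore does not have to learn everything from the new dataset alone.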
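To make the contrast a little more concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not a description of any particular library or product: `base_weights` is a hypothetical stand-in for a model that was trained earlier on a related task, and the four-example `target_data` set is made up. The base layer stays frozen, and only a small new "head" is trained on the new task.

```python
import copy
import math

def features(x, base_weights):
    """Frozen 'pre-trained' layer: one tanh unit per row of base_weights."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in base_weights]

def predict(x, base_weights, head):
    """Probability of class 1: frozen features fed into a logistic head."""
    f = features(x, base_weights)
    z = sum(w * fi for w, fi in zip(head["w"], f)) + head["b"]
    return 1.0 / (1.0 + math.exp(-z))

def mean_log_loss(data, base_weights, head):
    """Average cross-entropy of the model on (input, label) pairs."""
    eps = 1e-12
    losses = []
    for x, y in data:
        p = predict(x, base_weights, head)
        losses.append(-(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)))
    return sum(losses) / len(losses)

def train_head(data, base_weights, head, lr=0.5, epochs=200):
    """Gradient descent on the head only; base_weights are read, never written."""
    for _ in range(epochs):
        for x, y in data:
            f = features(x, base_weights)
            err = predict(x, base_weights, head) - y
            head["w"] = [w - lr * err * fi for w, fi in zip(head["w"], f)]
            head["b"] -= lr * err
    return head

# Hypothetical weights standing in for a model trained earlier on a related task.
base_weights = [[1.0, 0.0], [0.0, 1.0]]
base_snapshot = copy.deepcopy(base_weights)

# A deliberately tiny dataset for the new task: label is 1 when x[0] > 0.
target_data = [([2.0, 1.0], 1), ([1.0, -1.0], 1), ([-2.0, 0.5], 0), ([-1.0, -1.0], 0)]

head = {"w": [0.0, 0.0], "b": 0.0}
loss_before = mean_log_loss(target_data, base_weights, head)
train_head(target_data, base_weights, head)
loss_after = mean_log_loss(target_data, base_weights, head)

print(f"loss before: {loss_before:.3f}, loss after: {loss_after:.3f}")
print("base layer unchanged:", base_weights == base_snapshot)
```

In a real project the frozen part would be a large network pre-trained on a big dataset rather than two hand-written rows of numbers, but the division of labor is the same: the reused layers supply the features and only the small new part is learned from the new data.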
Benefits of transfer learning
This technique of transfer learning unlocks two major benefits:
First, transfer learning increases learning speed. With fewer new things to learn, the algorithm is faster to generate high-quality output. To use an analogy, an ice hockey player is likely to learn more quickly to play field hockey than an average person because certain concepts apply to both disciplines.
Second, transfer learning reduces the amount of data required. In traditional learning, an algorithm can only learn when fed with enough training data, sometimes millions of data points. This data might not be available at all, or it might be too expensive to generate and prepare for the model.
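One rough way to see where both benefits come from is to count how many parameters actually have to be learned. The sketch below assumes a hypothetical stack of fully connected layers (the sizes are made up for illustration) and compares training every layer from scratch with training only the last layer on top of frozen, pre-trained ones. Trainable parameter count is only a proxy: frozen layers still cost compute on every forward pass, and real savings depend on the model and the task.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out

def count_params(layer_sizes):
    """Parameter count per layer for a stack of dense layers."""
    return [dense_params(n_in, n_out)
            for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]

# Hypothetical architecture: 784 inputs, two hidden layers, 10 output classes.
layer_sizes = [784, 256, 128, 10]
per_layer = count_params(layer_sizes)

total = sum(per_layer)      # trained from scratch: every layer is updated
head_only = per_layer[-1]   # transfer learning with all earlier layers frozen
fraction = head_only / total

print(f"from scratch: {total} trainable parameters")
print(f"head only:    {head_only} trainable parameters ({fraction:.2%} of the total)")
```

With fewer parameters left to fit, the new task generally needs fewer examples and fewer training steps, which is the intuition behind the two benefits above.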
We hope it is clearer by now why Andrew Ng sees transfer learning as the next driver of machine learning's success: by leveraging previous knowledge, transfer learning makes projects feasible that were infeasible before, both in terms of data and budget.
Transfer learning at Levity
Our whole architecture is built around transfer learning and we constantly strive for what is commonly referred to as "state-of-the-art performance". Thanks to transfer learning, our users can train their algorithms with relatively little data and get satisfying results to start with. So whenever a user hits "train model", the best-suited model is automatically selected and trained as explained above.
While maintaining a high level of performance, we can thereby significantly reduce the need for training data, speed up the training process, and keep costs low. If you want to see it in action, just drop us a note and we will take you deeper into the rabbit hole!