Did you ever get burned by a hot object as a child and vow never to make the same mistake again? That was a critical moment of learning. Through a variety of experiences, we develop and become better at navigating the world around us. In addition to humans, this ability is also possessed by animals, plants (yes, really), and, most recently, Machines 🤖. There is a lot of marketing buzz surrounding Machine Learning, Deep Learning, and Artificial Intelligence – and a lot of nonsense. Our aim is to help you make sense of these terms with this article. So let’s get to it! 🚀
Let us begin with a generic term, one that has been around for a long time: Artificial Intelligence (AI).
AI describes the ability of a machine to interpret and respond to inputs in an intelligent way. The discipline is decades old, with the first research dating back to the 1950s. Over time, AI had its ups and downs – ‘winters’, where it was deemed a dead discipline, and ‘summers’, where every ice cream shop thought it needed to jump on the bandwagon 🚎. The most recent AI hype is mostly due to a specific method within the AI universe: Machine Learning (ML).
ML algorithms rely on data instead of hand-written rules to make their decisions. You have most probably applied an ML algorithm yourself already, even before it became ‘hip’. We’re talking about linear regression (or, more colloquially, “line-fitting”): fitting a line that best describes some (noisy) data. This is a two-parameter ML model – the slope and the intercept of the line are its trainable weights.
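To make this tangible, here is a minimal sketch of that line-fitting in plain Python (the data points are made up purely for illustration):

```python
# Fit a line y = slope * x + intercept to noisy data points.
# The two trainable parameters are the slope and the intercept.

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]  # roughly y = 2x, with some noise

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Ordinary least squares: closed-form solution for the best-fitting line
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
intercept = mean_y - slope * mean_x

print(f"y = {slope:.2f} * x + {intercept:.2f}")
```

The two numbers that come out – slope and intercept – are exactly the two trainable parameters the model ‘learned’ from the data.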
Obviously, a lot of the problems we want to solve are not describable with such a simple linear relationship. That’s where Neural Networks and Deep Learning (DL) come into play.
Neural Networks & Deep Learning
Neural Networks are a subset of ML algorithms. They have been around since the dawn of AI – some 70 years ago! But while they have lived the destiny of sleeping beauty 👸 💤 for most of their existence, in recent years they have risen to power and been at the core of the most recent AI frenzy.
Neural networks are loosely modelled after the human brain 🧠, with neurons propagating information forward, often over multiple “layers”. Stacking many of these layers on top of each other is then called ‘Deep Learning’. The power of these massive networks, which often have hundreds of millions of trainable weights (as opposed to the two weights of linear regression), is that they can model, and thus learn, almost any logic.
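To illustrate the idea of layers, here is a toy two-layer network in plain Python. The weights below are arbitrary made-up numbers; in a real network, training would adjust them:

```python
import math

def relu(x):
    return max(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases, activation):
    # Each neuron computes a weighted sum of its inputs plus a bias,
    # then applies a non-linear activation function.
    return [
        activation(sum(w * i for w, i in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]

# Two inputs -> hidden layer of three neurons -> one output neuron.
hidden = layer([0.5, -1.0],
               weights=[[0.2, 0.8], [-0.5, 0.1], [0.9, -0.3]],
               biases=[0.0, 0.1, -0.2],
               activation=relu)
output = layer(hidden,
               weights=[[1.0, -1.0, 0.5]],
               biases=[0.0],
               activation=sigmoid)[0]
print(f"network output: {output:.3f}")  # a value between 0 and 1
```

Each layer feeds its outputs forward into the next – stack dozens of them with millions of weights, and you get the ‘deep’ networks behind the current AI frenzy.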
Sentiment Analysis & Natural Language Processing
Let’s illustrate the differences with a concrete example: Sentiment Analysis.
Problem Statement: We want our software to classify the sentences we provide into “Positive”, “Neutral” or “Negative”.
By the way: This widely used method is part of the “Natural language processing” domain, a class of algorithms that deals with human language (either written or oral). Natural language processing is a subset of the AI family.
A simple rule-based system (which counts as AI as well) might have a list of words that count as “positive” or “negative”. For example, it might say that sentences containing the words “Good”, “Great” and “Happy” should be considered “Positive”. However, just adding the word “Not” before one of these positive indicators changes the sentiment entirely. So you would have to add more logic: if any of the “Positive” words is preceded by “Not”, the sentiment is inverted. But what about the sentence “Great service. Not.”? Further rules would be needed to cover this case as well. As you can see, it is a daunting and never-ending task to dissect language with fixed rules.
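Such a rule-based approach can be sketched in a few lines of Python. The word lists and the negation rule below are deliberately simplistic, purely to make the point:

```python
# A toy rule-based sentiment classifier, illustrating the fixed-rules approach.

POSITIVE = {"good", "great", "happy"}
NEGATIVE = {"bad", "terrible", "sad"}

def rule_based_sentiment(sentence):
    words = sentence.lower().replace(".", "").replace("!", "").split()
    score = 0
    for i, word in enumerate(words):
        # Rule: a sentiment word directly preceded by "not" is inverted.
        negated = i > 0 and words[i - 1] == "not"
        if word in POSITIVE:
            score += -1 if negated else 1
        elif word in NEGATIVE:
            score += 1 if negated else -1
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"

print(rule_based_sentiment("The food was great"))      # Positive
print(rule_based_sentiment("The food was not great"))  # Negative
print(rule_based_sentiment("Great service. Not."))     # wrongly Positive!
```

Note how the last example slips through: “Not” comes after “Great”, so the preceded-by-“Not” rule never fires and the classifier wrongly answers “Positive”.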
That’s why most of today’s cutting-edge sentiment analysis tools rely on ML: instead of defining which words mean what, you show the algorithm a large number of examples, from which it deduces the rules on what is positive and what is negative by itself. How does that work concretely? Let’s look at one training step:
- You pass the software an input example: “I liked the movie.”
- You ask the software to predict the sentiment of this sample. As it’s not trained yet, the output is basically random – it might answer “Negative”.
- You correct it, providing the correct answer “Positive”. Then you tell the system to adjust its internal rules, such that its prediction for this specific example would have been correct.
- Repeat. A lot.
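The steps above can be sketched as a (highly simplified) training loop. This is a perceptron-style learner on a tiny made-up dataset, not a real neural network, and we ignore “Neutral” to keep it short: each word gets one adjustable weight, and a wrong prediction nudges the weights of the words in that example toward the correct answer.

```python
from collections import defaultdict

# Tiny made-up training set: (sentence, correct label)
examples = [
    ("i liked the movie", "Positive"),
    ("i hated the movie", "Negative"),
    ("the acting was great", "Positive"),
    ("the plot was terrible", "Negative"),
]

weights = defaultdict(float)  # one adjustable weight per word

def predict(sentence):
    score = sum(weights[w] for w in sentence.split())
    return "Positive" if score >= 0 else "Negative"

# Repeat. A lot. (Here a handful of passes over the data is enough.)
for _ in range(10):
    for sentence, label in examples:
        if predict(sentence) != label:
            # Correct the model: nudge each word's weight toward the answer
            step = 1.0 if label == "Positive" else -1.0
            for w in sentence.split():
                weights[w] += step

print(predict("i liked the acting"))  # "Positive"
```

After a few passes the model classifies all training examples correctly, and even the unseen combination “i liked the acting” comes out “Positive”, because “liked” and “acting” picked up positive weights during training.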
By repeating this over and over again, showing the system 100’000+ examples, its internal rules will become a better and better model of our language. The goal is to get the model to generalize well, meaning its internal rules actually capture the essence of what we want it to understand, rather than some spurious correlation.
How Does Caplena Employ These Techniques?
Caplena uses advanced ML models like Transformers to detect topics in your data, run sentiment analysis and classify text comments into various categories. But how is it that you can customize the topics you want with no, or only very few, training examples, as opposed to the 100’000+ examples mentioned above? The answer involves Pre-Training, Transfer Learning and Augmented Intelligence. But those are topics for next time 😊
In the meantime, if you’re more interested in the practical part, you can check out how our multilingual text analysis tool works on your data during a Free Trial.
Did you enjoy this tutorial? 🚀 Feel free to suggest more topics in which you would like more guidance, and we will do our best to make it happen!
✉️ Just email our Head of Marketing: sheila [at] caplena.com
To give Caplena a try for free, click here.