AI vs ML vs DL
Inspired by the book Deep Learning with Python by François Chollet
Symbolic AI: giving the machine explicit, hand-crafted rules for manipulating knowledge (to get things done). But it solves only well-defined problems, not fuzzy ones like image classification or speech recognition. That is why machine learning arose.
Machine Learning: rather than a programmer giving explicit rules for the system to follow, the machine learns the rules from data.
Lady Ada Lovelace (19th-century mathematician) remarked, roughly, that
“The Analytical Engine has no pretensions to originate anything; it can only assist us with what we already know.”
This raised a question for Alan Turing: could computers go beyond what humans already think and know? If a machine is presented with many examples (both the questions and the answers), could it work out the rules for itself? His “imitation game” asks whether a machine can imitate intelligent behaviour based on the data it is given.
For a machine to learn, it needs three things:
- Input data points
- Examples of the expected output for those inputs
- A way to measure how well it is doing (an evaluation metric). This measurement is the feedback signal used to tune the model as it learns. A minimal sketch of these three ingredients follows.
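Here is that sketch in Python; the arrays and the accuracy function are illustrative assumptions, not from the book:

```python
import numpy as np

# 1. Input data points
X = np.array([[1.0], [2.0], [3.0], [4.0]])

# 2. Examples of the expected output for those inputs
y = np.array([0, 0, 1, 1])

# 3. A way to measure how well the algorithm is doing: the feedback
#    signal used to tune the model as it learns
def accuracy(y_true, y_pred):
    return np.mean(y_true == y_pred)

# One candidate rule the machine might try: predict 1 when x > 2.5
y_pred = (X[:, 0] > 2.5).astype(int)
print(accuracy(y, y_pred))  # 1.0 -- this rule matches all the examples
```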
So learning happens through representations: given a certain representation of the input data, the inputs are mapped to outputs, and from that mapping the algorithm derives the rules.
E.g.: picture black and white dots on the x and y axes. We can classify them by changing their coordinates so that, say, points with x > 0 are black and points with x < 0 are white. When we change how the data is represented, getting the output becomes easier, as the sketch below shows.
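A minimal sketch of this idea (the data points and the 45° rotation are illustrative assumptions, not from the book): rotating the axes turns a diagonal class boundary into a simple sign check on the new x coordinate.

```python
import numpy as np

# Points on either side of the diagonal y = x; the raw x value alone
# cannot separate them (x = 2 appears in both classes)
points = np.array([[1.0, 2.0], [2.0, 3.0],   # "white": above the diagonal
                   [2.0, 1.0], [3.0, 2.0]])  # "black": below the diagonal

# Change of representation: rotate the axes by 45 degrees
theta = np.pi / 4
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
rotated = points @ rotation.T

# In the new coordinates, the sign of x alone separates the classes
labels = np.where(rotated[:, 0] > 0, "black", "white")
print(labels)  # ['white' 'white' 'black' 'black']
```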
We search for better representations of our data to solve the problem. That is the whole concept of machine learning: the machine searches for a better representation of the data, then evaluates it to improve its learning.
But here comes the question: how does the machine know which coordinate transformations of the input data are even possible?
“Machines are not very creative in searching for transformations; they only search within a predefined hypothesis space.”
Given the raw input, the machine searches that space for candidate representations of the data; each candidate's input-to-output mapping is evaluated, and based on that feedback signal the representation is changed. A toy version of this search is sketched below.
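In this sketch the hypothesis space is a fixed set of candidate rotation angles (the angles, data, and scoring function are illustrative assumptions): the machine does not invent transformations, it only tries the ones in its predefined space and keeps the best-scoring one.

```python
import numpy as np

points = np.array([[1.0, 2.0], [2.0, 3.0], [2.0, 1.0], [3.0, 2.0]])
labels = np.array([0, 0, 1, 1])  # 0 = white, 1 = black

def accuracy(y_true, y_pred):
    return np.mean(y_true == y_pred)

# Hypothesis space: rotations by a fixed set of candidate angles
candidate_angles = np.linspace(0, np.pi, 8)

best_angle, best_score = None, -1.0
for theta in candidate_angles:
    # Transform the data, then classify by the sign of the new x coordinate
    x_new = points[:, 0] * np.cos(theta) - points[:, 1] * np.sin(theta)
    score = accuracy(labels, (x_new > 0).astype(int))
    if score > best_score:
        best_angle, best_score = theta, score

print(best_angle, best_score)  # one of the candidate angles scores 1.0
```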
Deep in deep learning: “deep” does not mean the model understands anything deeply; it points to the number of successive layers used to represent the input data.
Other machine learning models, which use only one or two layers to represent the data, are called shallow learning.
These layer representations are stacked one above another, and some say they are loosely inspired by the human brain. But there is no proof that the human brain and neural networks work the same way. https://twitter.com/AlejandroPiad/status/1342094375820668930 is one of the best threads I’ve read on why the brain analogy for neural networks is useless and dangerous.
Deep learning is the learning of mathematical representations of data through layers. It is a multistage information-distillation operation: information is fed in at one end, passes through successive filters, and comes out purified.
Each layer specifies what transformation is applied to its input data; that specification is stored in the layer's weights. Every layer of the network has weights, and initially the weights are assigned randomly.
A deep neural network sends input in at one end, passes it through the successive layers, and maps it to the target, each layer transforming its input according to its weights. Since the weights start out random, there will be a difference between the predicted output and the actual output; loss functions and optimizers are used to rectify this by updating the weights.
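A minimal sketch of such a forward pass (the layer sizes and random data are illustrative assumptions, not from the book): each layer's transformation is parameterized by its weights, which start out random, so the initial output is far from any real target.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, w, b):
    # One layer: a transformation of its input, parameterized by weights
    return np.maximum(0.0, x @ w + b)  # ReLU activation

x = rng.normal(size=(1, 4))                    # raw input representation
w1, b1 = rng.normal(size=(4, 8)), np.zeros(8)  # weights assigned randomly
w2, b2 = rng.normal(size=(8, 2)), np.zeros(2)

h = dense(x, w1, b1)   # successive filters: each layer refines the representation
out = h @ w2 + b2      # final layer maps to the target's shape
print(out)             # meaningless until the weights are trained
```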
Once there is a predicted value and a true target, a loss function (or objective function) is used to calculate the difference between the predicted and the actual value. The loss score is then passed to the optimizer (this is where backpropagation comes in), which goes back through the network and updates its weights so that the gap between the actual and predicted values shrinks.
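And a minimal sketch of that correction loop for a single linear layer (the data, the learning rate, and the choice of mean squared error are illustrative assumptions; in a deep network, backpropagation chains such gradients backwards through every layer, whereas here the single-layer gradient is written by hand):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(16, 3))          # inputs
true_w = np.array([[1.0], [-2.0], [0.5]])
y = x @ true_w                        # targets

w = rng.normal(size=(3, 1))           # weights start out random
lr = 0.1                              # optimizer: plain gradient descent

for step in range(100):
    y_pred = x @ w                            # forward pass through the layer
    loss = np.mean((y_pred - y) ** 2)         # loss: gap between predicted and actual
    grad = 2 * x.T @ (y_pred - y) / len(x)    # gradient of the loss w.r.t. the weights
    w -= lr * grad                            # update weights to shrink the loss

print(loss)  # should be near zero after training
```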
What makes deep learning different?
Classical machine learning models (probabilistic models, trees, and kernel methods) mostly use a single layer to represent the data. That is good for structured data, but not for perceptual data, which requires more complex representations to learn from.
So, why can’t we build a pipeline of shallow learning models, one after another?
The problem with that approach is that whenever the feedback changes, every model in the pipeline has to be re-tuned accordingly, stage by stage. Deep learning needs no such human intervention: through the feedback loop, every layer's representation is changed accordingly, all at once. The sketch below shows this joint, end-to-end training.
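A minimal sketch using Keras, the library the book itself uses (the data, layer sizes, and target rule are illustrative assumptions): both layers below are trained jointly from one loss signal, with no per-stage hand-tuning.

```python
import numpy as np
from tensorflow import keras

rng = np.random.default_rng(0)
x = rng.normal(size=(256, 4))
y = (x.sum(axis=1) > 0).astype("float32")  # toy stand-in for a perceptual target

# Two stacked layers of representation, learned jointly: the single loss
# signal updates every layer's weights at once via backpropagation.
model = keras.Sequential([
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="rmsprop", loss="binary_crossentropy")
model.fit(x, y, epochs=10, verbose=0)
```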
To be continued…