A description of the state of present-day machine learning, organized by the main techniques/algorithms, the domains from which the techniques originated, the history of the field, and how the search for the master algorithm continues.
1. The Master algorithm
- A single learning algorithm that could, in principle, derive all of human knowledge from data.
- Areas that might yield the master algorithm: Neuroscience, Statistics, Evolution, Physics, Computer Science
- 5 approaches to AI (and the fields that influenced them): Bayesian (Statistics), Symbolist (Logic, via inverse deduction), Analogist (Psychology, e.g. SVM), Connectionist (Neuroscience), Genetic programming/Evolutionary algorithms (Evolution)
2. Hume's inference question
- Can everything be inferred from limited knowledge? Can the past ever be used to accurately predict the future?
- Is a Master Algorithm feasible?
- Rationalism (All knowledge comes from reasoning) vs. Empiricism (All knowledge comes from experimentation/observation)
- The Symbolist approach:
- Knowledge is just the manipulation of symbols
- Inverse deduction (induction): Work backwards from specific known facts to the general rules that would let those facts be deduced
- E.g. Decision trees
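As a rough illustration of how a decision tree chooses its rules, the sketch below scores each attribute by information gain and picks the best one for the first split. The toy data and the helper names `entropy` and `information_gain` are assumptions made for this example, not something taken from the notes above.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting the data on attribute `attr`."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical toy data: should we play outside?
rows = [
    {"outlook": "sunny", "windy": False},
    {"outlook": "sunny", "windy": True},
    {"outlook": "rainy", "windy": False},
    {"outlook": "rainy", "windy": True},
]
labels = ["yes", "yes", "no", "no"]

# The attribute with the highest gain becomes the root test of the tree.
best = max(rows[0], key=lambda attr: information_gain(rows, labels, attr))
print(best)
```

Here `outlook` separates the two classes perfectly while `windy` tells us nothing, so `outlook` is chosen. A full tree learner would repeat this choice recursively on each branch.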
3. The Connectionist approach
- Neural networks:
- Theory behind how the brain learns (Hebb's rule): Neurons signal each other along axons, and the connection between two neurons is reinforced every time they fire together, developing a memory
- Perceptron: Weighted inputs + Thresholding function
- Drawback: Can classify only when there is a linear boundary
- E.g. XOR has four input cases (00, 01, 10, 11) with outputs 0, 1, 1, 0; no single straight line separates the 1s from the 0s
- E.g. Gender + Age (0/1): A perceptron cannot learn a condition that is true for Male/Young and Female/Old but false for the other two cases (the same pattern as XOR)
- Neural networks: Multilayer perceptrons trained with backpropagation, which can learn non-linear boundaries
- Others: Autoencoders, Boltzmann machines, CNNs
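The perceptron and its linear-boundary limitation can be sketched in a few lines. This is a minimal illustration, assuming the classic perceptron learning rule with a step threshold; the function names, learning rate, and epoch count are choices made for this example.

```python
def train_perceptron(samples, epochs=20, lr=1.0):
    """Perceptron learning rule on samples of the form ((x1, x2), target)."""
    w1 = w2 = bias = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            # Weighted inputs followed by a thresholding function.
            output = 1 if w1 * x1 + w2 * x2 + bias > 0 else 0
            error = target - output
            # Nudge the weights only when the prediction was wrong.
            w1 += lr * error * x1
            w2 += lr * error * x2
            bias += lr * error
    return w1, w2, bias

def accuracy(weights, samples):
    """Fraction of samples the trained perceptron classifies correctly."""
    w1, w2, bias = weights
    hits = sum((1 if w1 * x1 + w2 * x2 + bias > 0 else 0) == target
               for (x1, x2), target in samples)
    return hits / len(samples)

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

print("AND accuracy:", accuracy(train_perceptron(AND), AND))
print("XOR accuracy:", accuracy(train_perceptron(XOR), XOR))
```

AND is linearly separable, so the perceptron learns it perfectly. XOR is not, so its accuracy stays below 1.0 however long training runs, which is the gap that multilayer networks close.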
4. The Evolutionary/Genetic approach
- Based on genetic algorithms
- A population of candidate solutions is recombined (crossover) and mutated over successive iterations; at each iteration, the weakest solutions are discarded
- Iterations continue until a good enough solution is reached (an optimal one is not guaranteed).
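The loop described above can be sketched on a toy problem. This is a minimal example, assuming the "OneMax" task (maximise the number of 1s in a bit string) and simple truncation selection; the function name `evolve` and all parameter values are illustrative choices rather than anything prescribed by the notes.

```python
import random

def evolve(bits=20, pop_size=30, generations=100, mutation_rate=0.02, seed=0):
    """Toy genetic algorithm for OneMax: maximise the number of 1s in a bit list."""
    rng = random.Random(seed)
    fitness = sum  # fitness of a bit list is simply how many 1s it contains
    population = [[rng.randint(0, 1) for _ in range(bits)] for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half, discard the weakest solutions.
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        if fitness(survivors[0]) == bits:
            break  # a perfect solution has been found
        children = []
        while len(survivors) + len(children) < pop_size:
            # Crossover: splice two parents at a random cut point.
            mum, dad = rng.sample(survivors, 2)
            cut = rng.randrange(1, bits)
            child = mum[:cut] + dad[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = survivors + children
    return max(population, key=fitness)

best = evolve()
print(sum(best), "of", len(best), "bits set")
```

Because the survivors are carried over unchanged, the best fitness never decreases from one generation to the next. Real problems replace `fitness` with a domain-specific score.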
5. The Bayesian approach
- Based on Bayes' rule: The probability of a cause given an effect, computed from the probability of the cause, the probability of the effect, and the probability of the effect given the cause.
- P(cause|effect) = P(cause)*P(effect|cause)/P(effect)
- P(cause): Prior
- P(cause | effect): Posterior
- P(effect | cause)/P(effect): Support
- Imagine a Venn diagram with four regions: C only, E only, C and E, neither C nor E => Four probabilities: P(E), P(C), P(E|C), P(C|E)
- Bayesian networks: Networks of variables linked by conditional probabilities, used to infer the probabilities of complex sequences
- E.g. Hidden Markov Models, Kalman Filters (Continuous variable version of discrete HMMs)
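A worked instance of Bayes' rule may make the prior/posterior vocabulary concrete. The sketch below assumes a hypothetical diagnostic test (cause = disease, effect = positive test) with made-up numbers, and obtains P(effect) from the law of total probability; the function name `posterior` is an illustrative choice.

```python
def posterior(prior, p_effect_given_cause, p_effect_given_not_cause):
    """P(cause | effect) via Bayes' rule, with P(effect) from total probability."""
    p_effect = (p_effect_given_cause * prior
                + p_effect_given_not_cause * (1 - prior))
    return prior * p_effect_given_cause / p_effect

# Hypothetical numbers: 1% of people have the disease, the test detects it
# 90% of the time, and it gives a false positive 5% of the time.
p = posterior(prior=0.01, p_effect_given_cause=0.9, p_effect_given_not_cause=0.05)
print(round(p, 3))  # 0.154
```

Even with a fairly accurate test, the posterior is only about 15% because the prior is so low: most positive results come from the much larger healthy group.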
6. The Analogist approach
- Find similarities between examples
- E.g. Nearest Neighbor, SVM
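Nearest-neighbor classification is short enough to sketch directly. This is a minimal k-nearest-neighbors example, assuming Euclidean distance and majority voting; the data points and the name `knn_predict` are made up for illustration.

```python
from collections import Counter
from math import dist

def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical training set: two well-separated groups of 2-D points.
train = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((4.0, 4.2), "b"), ((4.1, 3.9), "b"), ((3.8, 4.0), "b"),
]

print(knn_predict(train, (1.1, 1.0)))  # a
print(knn_predict(train, (3.9, 4.1)))  # b
```

There is no training step at all: the stored examples are the model, and all the work happens at prediction time by comparing the query with them.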
7. Unsupervised approaches: Learning without teaching
- E.g. K-means, Principal Component Analysis, Reinforcement learning, Relational learning
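As one example of learning without labels, K-means can be sketched as follows. This is a minimal version of Lloyd's algorithm, assuming 2-D points, a fixed number of iterations, and caller-supplied starting centroids; the data and the function name `kmeans` are illustrative.

```python
from math import dist

def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm: alternate assignment and centroid-update steps."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical unlabeled data with two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, centroids=[points[0], points[3]])
print(centroids)
```

No labels are supplied anywhere; the grouping emerges purely from distances between the points. The result depends on the starting centroids, so practical implementations usually try several initialisations.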
8. Unifying the algorithms
- Metalearning: Techniques to combine the predictions of several learners. Bagging mainly reduces variance; boosting mainly reduces bias
- E.g. Bagging (bootstrap aggregating, as used in random forests), Boosting
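Bagging can be sketched with very simple base learners. The example below assumes one-dimensional data and decision stumps (one-level decision trees of the form "predict 1 if x > t"); each stump is trained on a bootstrap resample and the final answer is a majority vote. The names `fit_stump` and `bagged_predict` and the toy data are illustrative assumptions.

```python
import random
from collections import Counter

def fit_stump(sample):
    """Return the threshold t minimising the errors of 'predict 1 if x > t'."""
    xs = sorted({x for x, _ in sample})
    # Candidate thresholds: midpoints between neighbouring values, plus one
    # below and one above the data so a constant prediction is possible.
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    candidates += [xs[0] - 1, xs[-1] + 1]

    def errors(t):
        return sum((x > t) != bool(y) for x, y in sample)

    return min(candidates, key=errors)

def bagged_predict(data, x, n_models=25, seed=0):
    """Bagging: fit each stump on a bootstrap resample, then majority-vote."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_models):
        bootstrap = [rng.choice(data) for _ in data]  # sample with replacement
        votes[int(x > fit_stump(bootstrap))] += 1
    return votes.most_common(1)[0][0]

# Hypothetical data: the label is 1 exactly when x >= 5.
data = [(x, int(x >= 5)) for x in range(10)]
print(bagged_predict(data, 2), bagged_predict(data, 7))
```

Each stump sees a slightly different dataset and so picks a slightly different threshold; averaging their votes smooths out that variation. Random forests apply the same idea to full decision trees and additionally randomise the attributes considered at each split.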