This is an introductory course on machine learning that can be taken at your own pace.
It covers the basic theory, algorithms, and applications. Machine learning (see the Scientific American introduction) is a key technology in Big Data, and in many financial, medical, commercial, and scientific applications. It enables computational systems to adaptively improve their performance with experience accumulated from the observed data. Machine learning is one of the hottest fields of study today, taken up by graduate and undergraduate students from 15 different majors at Caltech.
The course balances theory and practice, and covers the mathematical as well as the heuristic aspects. The lectures follow each other in a story-like fashion: What is learning? Can we learn? How to do it? How to do it well? What are the take-home lessons? The technical terms that go with that include linear models, the VC dimension, neural networks, regularization and validation, support vector machines, Occam’s razor, and data snooping.
The focus of the course is understanding the fundamentals of machine learning. If you have the discipline to follow the carefully designed lectures, do the homework, and discuss the material with others on the forum, you will graduate with a thorough understanding of machine learning, and will be ready to apply it correctly in any domain. Welcome aboard!
The 18 lectures use incremental viewgraphs to simulate the pace of blackboard teaching. This lecture was recorded on April 3, 2012, in Hameetman Auditorium at Caltech, Pasadena, CA, USA.
http://work.caltech.edu/telecourse.html
Lecture 1: The Learning Problem: Introduction; supervised, unsupervised, and reinforcement learning. Components of the learning problem.
Lecture 2: Is Learning Feasible?: Can we generalize from a limited sample to the entire space? Relationship between in-sample and out-of-sample.
Lecture 3: The Linear Model I: Linear classification and linear regression. Extending linear models through nonlinear transforms.
Lecture 4: Error and Noise: The principled choice of error measures. What happens when the target we want to learn is noisy.
Lecture 5: Training versus Testing: The difference between training and testing in mathematical terms. What makes a learning model able to generalize?
Lecture 6: Theory of Generalization: How an infinite model can learn from a finite sample. The most important theoretical result in machine learning.
Lecture 7: The VC Dimension: A measure of what it takes a model to learn. Relationship to the number of parameters and degrees of freedom.
Lecture 8: Bias-Variance Tradeoff: Breaking down the learning performance into competing quantities. The learning curves.
Lecture 9: The Linear Model II: More about linear models. Logistic regression, maximum likelihood, and gradient descent.
Lecture 10: Neural Networks: A biologically inspired model. The efficient backpropagation learning algorithm. Hidden layers.
Lecture 11: Overfitting: Fitting the data too well; fitting the noise. Deterministic noise versus stochastic noise.
Lecture 12: Regularization: Putting the brakes on fitting the noise. Hard and soft constraints. Augmented error and weight decay.
Lecture 13: Validation: Taking a peek out of sample. Model selection and data contamination. Cross validation.
Lecture 14: Support Vector Machines: One of the most successful learning algorithms; getting a complex model at the price of a simple one.
Lecture 15: Kernel Methods: Extending SVM to infinite-dimensional spaces using the kernel trick, and to non-separable data using soft margins.
Lecture 16: Radial Basis Functions: An important learning model that connects several machine learning models and techniques.
Lecture 17: Three Learning Principles: Major pitfalls for machine learning practitioners: Occam’s razor, sampling bias, and data snooping.
Lecture 18: Epilogue: The map of machine learning. Brief views of Bayesian learning and aggregation methods.
by caltech