This course provides a broad and rigorous introduction to machine learning, probabilistic reasoning, and decision making in uncertain environments. It should be of interest to undergraduate students in computer science, applied mathematics, the sciences, and engineering, as well as to lower-level graduate students, who are looking for an introduction to the tools of machine learning and probabilistic reasoning, with applications to data-intensive problems in the applied, natural, and social sciences.
For students with interests in the fundamentals of artificial intelligence, this course will address three central, related questions in the design and engineering of intelligent systems. How can a system process its perceptual inputs in order to obtain a reasonable picture of the world? How can we build programs that learn from experience? And how can we design systems to deal with the inherent uncertainty in the real world?
Our approach to these questions will be both theoretical and practical. We will develop a mathematical underpinning for the methods of machine learning and probabilistic reasoning. We will look at a variety of successful algorithms and applications. We will also discuss the motivations behind the algorithms, and the properties that determine whether or not they will work well for a particular task.
There are no required texts for this class; extensive lecture notes will be handed out. Several useful books are, however, strongly recommended for extra reading.
Lecture 1: Introduction and Course Overview [ slides ]
Lecture 2: Decision Trees [ slides | lecture notes | section notes ]
Lecture 3: Overfitting, Description Length, and Cross-Validation [ slides | lecture notes ]
Lecture 4: Ensemble Learning and Boosting [ slides | lecture notes | section notes ]
Lecture 5: Neural Networks I: Perceptrons [ slides | lecture notes ]
Lecture 6: Neural Networks II: Multi-Layer Networks [ slides | lecture notes | section notes ]
Lecture 7: Neural Networks III: Model Selection and Applications [ slides | lecture notes ]
Lecture 8: Support Vector Machines [ lecture notes | section notes ]
Lecture 9: Clustering [ slides | lecture notes ]
Lecture 10: Maximum Likelihood and Expectation Maximization [ lecture notes | section notes ]
Lecture 11: Dimensionality Reduction: PCA, ICA, and Autoencoders [ slides | lecture notes ]
Lecture 12: Dimensionality Reduction: Nonlinear Methods [ slides | lecture notes | section notes ]
Lecture 13: Matrix Factorization Models [ slides ]
Lecture 14: Bayesian Networks and Graphical Models [ notes: freely available Bishop 8.1-8.3 | section notes ]
First Midterm: Covers Lectures 1-13
Lecture 15: Bayesian Networks: Inference [ notes: freely available Bishop 8.3-8.4 ]
Lecture 16: Bayesian Networks: Applications and Approximate Inference [ notes ]
Lecture 17: Hidden Markov Models [ lecture notes | section notes ]
Lecture 18: Markov Decision Processes [ lecture notes | section notes ]
Lecture 19: Value and Policy Iteration [ lecture notes ]
Lecture 20: Reinforcement Learning [ lecture notes | section notes ]
Lecture 21: Partially Observable Markov Decision Processes [ lecture notes ]
Lecture 22: Computational Learning Theory [ lecture notes | section notes ]
Lecture 23: Wrap-Up
Second Midterm: Covers Lectures 14-22