CSCI 6971/4971 Large Scale Matrix Computations and Machine Learning, Spring 2018

Randomized vs non-randomized Gauss-Siedel for solving a 250K-by-250K kernel regression problem.

Overview

Modern machine learning routinely deals with millions of points in high-dimensional spaces. Classical numerical linear algebra algorithms can be prohibitively costly in such applications, as they aim at machine precision and scale super-linearly in the size of the input data. Fortunately, machine precision is seldom necessary, and recent developments have established that randomization can be used to bring the costs of matrix algorithms closer to linear in the size of the input data; this is done by sacrificing, in a principled manner, computational accuracy for increased speed. This course surveys modern randomized numerical linear algebra and its applications to machine learning, with the goal of providing a solid foundation for developing and analyzing the performance of randomized matrix algorithms.

Topics will include fast matrix multiplication, fast linear regression, column subset selection, fast approximate low- rank decompositions, fast k-means and spectral clustering, fast approximate kernel methods, and promising research directions in the field.

Course Information

The syllabus is available as an archival pdf, and is more authoritative than this website.

Course Text: Lecture notes.

Grading Criteria:

Students are expected to have writing supplies on hand in each class to complete the in-class pop quizzes. If you are an athlete or for some other reason are not able to attend each class, make alternative arrangements with the instructor in the first two weeks of the course.

Letter grades will be computed from the semester average. Maximum lower bound cutoffs for A, B, C and D grades are 90%, 80%, 70%, and 60%, respectively. These bounds may be moved lower at the instructors discretion.

Topics and Schedule

Homeworks

All assignments must be typed (preferably in LaTeX) and are due at the start of class (defined as the first 15 minutes) via email. Late assignments will be penalized and accepted at the instructor's discretion.

Supplementary Materials