Data-driven modeling
Fall 2009
Department: Applied Mathematics, Columbia University
Instructor: Jake Hofman
Course number: E4990
Fridays, 10:30am-1:00pm, 214 Mudd
Overview:
This course introduces mathematical methods from statistical inference and machine learning, covering regression, classification, clustering, and dimensionality reduction. The emphasis is on formulating real-world modeling and prediction tasks as optimization problems and on introducing and comparing optimization methods, including gradient descent, spectral methods, variational techniques, and Monte Carlo methods. Students will gain direct experience acquiring data from online sources and will develop the scientific computing skills needed to address practical problems such as spam filtering, recommendation systems, topic discovery, and search engine ranking.
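As a rough illustration of what "formulating a modeling task as an optimization problem" can look like, the sketch below fits a line to data by minimizing mean squared error with gradient descent. It is not taken from the course materials: the function name fit_line, the toy data, the learning rate, and the step count are all assumptions chosen for the example.

```python
def fit_line(xs, ys, lr=0.05, steps=2000):
    """Fit y ~ w*x + b by gradient descent on mean squared error.

    Hypothetical helper for illustration; lr and steps are assumed
    values that happen to converge on the toy data below.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Residuals of the current fit: prediction minus target.
        errs = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradient of (1/n) * sum(err^2) with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errs, xs))
        grad_b = 2.0 / n * sum(errs)
        # Step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated exactly from y = 2x + 1 (made up for this example).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))
```

On noise-free data like this the recovered slope and intercept should approach 2 and 1. With a learning rate that is too large for the scale of the data, the same loop diverges, which is one reason comparing optimization methods matters in practice.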
Prerequisites:
Linear algebra (APMA E3101 or equivalent), Probability & Statistics (SIEO W4150 or equivalent). Previous exposure to a high-level programming language such as Python, MATLAB (e.g. COMS W1005), Ruby, Perl, R or similar is recommended.
URL in course directory: http://bit.ly/E4990