<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Data-driven modeling</title>
	<atom:link href="http://jakehofman.com/ddm/feed/" rel="self" type="application/rss+xml" />
	<link>http://jakehofman.com/ddm</link>
	<description>Spring 2012, Department of Applied Mathematics, Columbia University</description>
	<lastBuildDate>Mon, 30 Jan 2012 15:46:01 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Lecture 01</title>
		<link>http://jakehofman.com/ddm/2012/01/lecture-01-2/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=lecture-01-2</link>
		<comments>http://jakehofman.com/ddm/2012/01/lecture-01-2/#comments</comments>
		<pubDate>Mon, 30 Jan 2012 15:45:17 +0000</pubDate>
		<dc:creator>Jake</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://jakehofman.com/ddm/?p=188</guid>
		<description><![CDATA[Our first lecture was last Monday. We began with a high-level overview and introduction to the course. Data-driven modeling: Lecture 01 A few main points here. First, for many tasks, such as spam classification and image recognition, we (people) are able to learn relatively quickly and easily, generalizing well after seeing only a few loosely structured examples. [...]]]></description>
			<content:encoded><![CDATA[<p>
Our first lecture was last Monday. We began with a <a href="http://www.slideshare.net/jakehofman/datadriven-modeling-lecture-01">high-level overview and introduction</a> to the course.
</p>
<div style="width:425px" id="__ss_11335458"><strong style="display:block;margin:12px 0 4px"><a href="http://www.slideshare.net/jakehofman/datadriven-modeling-lecture-01" title="Data-driven modeling: Lecture 01">Data-driven modeling: Lecture 01</a></strong><object id="__sse11335458" width="425" height="355"><param name="movie" value="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=ddmlecture01-120130091235-phpapp01&#038;stripped_title=datadriven-modeling-lecture-01&#038;userName=jakehofman" /><param name="allowFullScreen" value="true"/><param name="allowScriptAccess" value="always"/><param name="wmode" value="transparent"/><embed name="__sse11335458" src="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=ddmlecture01-120130091235-phpapp01&#038;stripped_title=datadriven-modeling-lecture-01&#038;userName=jakehofman" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" wmode="transparent" width="425" height="355"></embed></object></div>
<p>
A few main points here. First, for many tasks, such as spam classification and image recognition, we (people) are able to learn relatively quickly and easily, generalizing well after seeing only a few loosely structured examples. Second, while we often learn well from examples, we have difficulty explicitly formalizing how we do so, which makes it hard to write programs that mimic our behavior. The hope, however, is that we can devise methods that enable machines to systematically learn and generalize from observed data. The good news is that empirical results in many areas, from face recognition to recommendation systems, are encouraging. The course will focus on a simple but effective subset of these techniques, with an emphasis on the practical aspects of obtaining, dealing with, and learning from real-world data.
</p>
<p>
In the second half of class (slides 32 onward) we looked at two recent research projects that touch on many of the topics to be covered in the class. Specifically, we looked at the <a href="http://www.pnas.org/content/107/41/17486.full">predictive power of search activity</a> and a study of <a href="http://messymatters.com/webdemo">demographic diversity on the Web</a>, with an application to <a href="http://bit.ly/surfpreds">predicting user demographics from browsing activity</a>. A few themes emerged, several of which will recur throughout the rest of the course:</p>
<ul>
<li>Regardless of scale, it&#8217;s difficult to find the right questions to ask of the data</li>
<li>Cleaning and normalizing data is a substantial part of the work</li>
<li>Simple methods (e.g., linear models) work surprisingly well, especially with lots of data</li>
</ul>
<p>
We&#8217;ll start in on some math and hacking along these lines next week.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakehofman.com/ddm/2012/01/lecture-01-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

