jakehofman.com

introduction

i'm currently a research scientist in the social dynamics group at yahoo! research new york. i received my ph.d. from columbia university's physics department, studying applications of machine learning to text, image, and network data. please see my resume for more project and background information.

this site serves several purposes, from presenting and organizing my current research and teaching efforts to publishing code and tips that i hope others will find useful.

i'm on twitter as @jakehofman.

news

2010.05.23: slides and code for my icwsm 2010 tutorial, large-scale social media analysis with hadoop
2009.12.10: our recent paper, "what can search predict?", is posted (blog post, slate article)
2009.10.02: slides for my hadoopworld nyc talk: social network analysis with hadoop
2009.09.11: the course website is up for data-driven modeling (columbia applied math, fall 2009)
2009.05.13: our kdd paper has been covered by mit, slashdot, lifehacker, the chicago tribune, and wired
2009.04.22: our centmail paper was presented and demonstrated at the www 2009 developers track
2008.09.25: we have posted the call for papers for the NIPS 2008 workshop on analyzing graphs
2008.07.04: received "best student presentation award" at mlg 2008 (mining and learning with graphs)
2008.06.24: "a bayesian approach to network modularity" available at physical review letters online
2008.03.18: a scientific american article on my ph.d. advisor, chris wiggins; also a summary from ams
2008.03.14: coverage of analysis of the 2008 aps march meeting co-authorship network from nature blogs

latest tips

my latest geek tips, also available on twitter, tumblr, or as plain text:

applescript awk bash c consumer css cu cvs debian emacs excel firefox flickr gcc gentoo gmail google grammar graphviz hadoop html idvd imagemagick iphone iphoto ipod irc itunes java javascript keynote latex lifehack linux mac macosx matlab mobile mysql network networking perl php python quicktime razr rss rstats rsync ruby safari sed sge shell sms sql ssh svn test trac treo unix video windows word wordpress x11

2010.08.23 python: hidden features of python http://bit.ly/d4H2yw (via nyc python meetup)
2010.08.06 sql: view column names/types in sqlite if .schema fails: SELECT * FROM sqlite_master WHERE type='table' http://bit.ly/cmiPPy
2010.07.27 rstats: use xvfb to run R on a server without X11 http://bit.ly/dBmrEb
2010.07.26 shell: use the ssh escape character (default=~) to add port fwding to an existing connection http://bit.ly/bcDoCF (via @dggoldst)
2010.07.22 rstats: use opts(axis.text.x=theme_text(angle=-90)) to rotate x axis labels in ggplot2 http://bit.ly/bvRXIF
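
a minimal sketch of the sqlite tip above, assuming python's standard sqlite3 module. the helper name `list_tables` and the example table `tips` are made up for illustration; the query itself is the one from the tip, with the string literal quoted.

```python
import sqlite3


def list_tables(conn):
    """Return (name, sql) pairs for every table recorded in sqlite_master."""
    cur = conn.execute(
        "SELECT name, sql FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return cur.fetchall()


# hypothetical in-memory database with a single example table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tips (posted TEXT, tag TEXT, body TEXT)")

# the sql column holds the CREATE statement, which lists column names and types
for name, sql in list_tables(conn):
    print(name)
    print(sql)
```

the `sql` column is what makes this useful when .schema is unavailable: it stores the text of each table's CREATE statement, so column names and declared types can be read straight from it.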

del.icio.us links

my most recent bookmarks at del.icio.us/jhofman

Introducing R
historious - your personalized search world
Icons of the Web
l1_logreg: A large-scale solver for l1-regularized logistic regression problems
Howfuckedismydatabase.com