What’s Happening and What Happened: Searching the Social Web

Abstract

Every day, millions of users share links and post comments on social networks. At scale, this behavior can be used to build a new type of search engine that exploits relevant links and their associated metadata in a temporal fashion. Our goal is to find links that are relevant on social networks as a mechanism to discover what people are talking about at a given point in time, and to make that information searchable and persistent. The result is a continually updated archive of relevant content that is currently being shared, beyond the obvious trending news of the day. Our techniques surface new and interesting content by mining social network posts that contain links, constructing diffusion trees from those links, and extracting related entities and other associated metadata. By combining the size and structure of these trees with the conversation around each link and its related topics, we designed and implemented a search engine that provides relevant, fresh content and features a “wayback machine”. We demonstrate the effectiveness of our approach by processing a dataset comprising millions of English-language tweets generated over a one-year period. Finally, we perform an offline evaluation of our techniques and conduct a use case study using an available dataset of fake and real news links.