Luke McDowell's talk at the DB Seminar

 

Title: Investigating Markov Logic Networks for Collective Classification
(paper to be presented at ICAART 2012)

Abstract
Collective Classification (CC) is the process of simultaneously inferring the class labels of a set of inter-linked nodes, such as the topics of publications in a citation graph. Recently, Markov Logic Networks (MLNs) have attracted significant attention because of their ability to combine first-order logic with probabilistic reasoning. A few authors have used this ability of MLNs to perform CC over linked data, but the relative advantages of MLNs vs. other CC techniques remain unknown. In response, this work compares a wide range of MLN learning and inference algorithms to the best previously studied CC algorithms. We find that MLN accuracy is highly dependent on the type of learning and the input rules that are used, which is not unusual given MLNs’ flexibility. More surprisingly, we find that even the best MLN performance generally lags behind that of the best previously studied CC algorithms. However, MLNs do excel on the one dataset that exhibited the most complex linking patterns. Ultimately, we find that MLNs may be worthwhile for CC tasks involving data with complex relationships, but that MLN learning for such data remains a challenge.

Bio
Luke McDowell is an associate professor of Computer Science at the U.S. Naval Academy in Annapolis, Maryland, U.S.A. For 2011-2012, he is a visiting professor at EPFL, working in Prof. Karl Aberer's LSIR lab. He graduated from Princeton University in 1997 with a B.S.E. degree in Electrical Engineering, and from the University of Washington in Seattle with a Ph.D. in Computer Science in 2004. His dissertation explored how to make the Semantic Web practical and useful for casual computer users. Later work examined how to use the web to automatically extract useful content for the Semantic Web. More recently, his work has focused on how to apply machine learning techniques to make sense of the vast amount of "linked" data that surrounds us in the form of webpages, social networks, biological networks, etc. Much of this work has involved demonstrating how link-aware algorithms can significantly improve accuracy, compared to traditional algorithms, for a wide range of real-world data.