A Bayesian Model for Clustering Networked Documents

Derek Owens-Oas, Duke University, Department of Statistical Science
We introduce a novel Bayesian statistical model for simultaneously discovering topics and clustering documents that have a network structure. In much of the existing literature on network topic models, links occur at the document-to-document or node-to-node level. Here, we model links at the document-to-node level, because that is how they occur in our data. Specifically, we verify the model on political blog posts from 2012. Inference uses Gibbs sampling to draw from the posterior distributions of topic assignments and block memberships. Top words from selected topics are displayed, discovered communities are discussed, and results are compared with strong baselines from the joint network topic modeling literature.
February 12, 2018 | 12:45 pm - 2:00 pm | Gross Hall 230E
