13 - 18 OCTOBER 2013, ATLANTA, GEORGIA, USA

Serendip: Turning Topics Back to the Text

Contributors:
Eric Alexander, Joe Kohlmann, Robin Valenza, Michael Gleicher

Description
Statistical topic modeling is an increasingly popular approach to text analysis. Many existing visualization tools focus on analyzing the model itself, in isolation from the documents on which it was trained. In contrast, we seek to treat the model as a lens through which to view the original documents. This approach enables the reader to observe trends and build hypotheses at multiple scales--ranging from across a corpus to within a single text--and to find both algorithmic data and textual examples to defend these hypotheses. Supporting this workflow requires a multi-tiered framework that affords comparisons at three levels: the entire corpus, small sets of documents, and a single document. This framework is embodied in Serendip, a web application that combines view-coordinated reorderable matrices, small multiples displays, and tagged text to let readers develop insight at and across multiple levels.