IEEE VIS 2024 Content: Visualizing Spatial Semantics of Dimensionally Reduced Text Embeddings

Visualizing Spatial Semantics of Dimensionally Reduced Text Embeddings

Wei Liu - Computer Science, Virginia Tech, Blacksburg, United States

Chris North - Virginia Tech, Blacksburg, United States

Rebecca Faust - Tulane University, New Orleans, United States

Room: Bayshore II

2024-10-14T16:00:00Z
Exemplar figure: Document projection of COVID-19 open research articles with gradient-based word explanations. (Top) A projection from a BERT model fine-tuned on the data domain, featuring a spatial word cloud that captures the spatial semantics by showing the key words that impact the projection. (Bottom) A heatmap of word impacts in a selected document, highlighting the word "smoking", which reflects the domain context.
Abstract

Dimension reduction (DR) can transform high-dimensional text embeddings into a 2D visual projection that facilitates the exploration of document similarities. However, the projection often lacks a clear connection to the text semantics, because text embeddings are opaque and the dimension reductions are non-linear. To address these problems, we propose a gradient-based method for visualizing the spatial semantics of dimensionally reduced text embeddings. This method employs gradients to assess the sensitivity of the projected documents with respect to the underlying words. The method can be applied to existing DR algorithms and text embedding models. Using these gradients, we designed a visualization system that incorporates spatial word clouds into the document projection space to illustrate the impactful text features. We further present three usage scenarios that demonstrate how our system facilitates the discovery and interpretation of underlying semantics in text projections.
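
To make the idea of gradient-based word sensitivity concrete, the following is a minimal sketch under loud assumptions and is not the authors' implementation. It swaps in a toy document embedding (the mean of tanh-transformed word vectors) and a fixed linear map to 2D in place of a real text embedding model and DR algorithm, and it uses a hand-derived Jacobian rather than automatic differentiation. The names `project`, `word_jacobian`, and `word_impact`, the word list, and the random vectors are all hypothetical.

```python
import math
import random

random.seed(0)

# Hypothetical toy setup (not the paper's models or data).
words = ["smoking", "virus", "the", "lung"]
DIM = 6
# One random vector per word stands in for learned word embeddings.
E = [[random.gauss(0.0, 1.0) for _ in range(DIM)] for _ in words]
# A fixed 2 x DIM matrix stands in for a differentiable 2D projection.
P = [[random.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(2)]


def project(word_vecs):
    """Embed a document as mean(tanh(word vectors)), then map it to 2D."""
    n = len(word_vecs)
    doc = [sum(math.tanh(v[j]) for v in word_vecs) / n for j in range(DIM)]
    return [sum(P[k][j] * doc[j] for j in range(DIM)) for k in range(2)]


def word_jacobian(word_vecs, i):
    """Analytic 2 x DIM Jacobian of the 2D position w.r.t. word i's vector."""
    n = len(word_vecs)
    return [
        [P[k][j] * (1.0 - math.tanh(word_vecs[i][j]) ** 2) / n
         for j in range(DIM)]
        for k in range(2)
    ]


def word_impact(word_vecs, i):
    """Score word i by the Frobenius norm of its Jacobian."""
    J = word_jacobian(word_vecs, i)
    return math.sqrt(sum(x * x for row in J for x in row))


# Higher scores mean the projected point moves more when that word changes.
impacts = [word_impact(E, i) for i in range(len(words))]
ranking = sorted(zip(words, impacts), key=lambda pair: -pair[1])
```

In this sketch, `ranking` orders the words of one toy document by how strongly a small change in each word vector would move the document's 2D position. With a real embedding model and DR algorithm, the Jacobian would typically come from an automatic differentiation framework instead of a closed-form expression.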