Most existing multidimensional visualization techniques do not work well for
high-dimensional categorical datasets. The major challenges include
preserving the discrete nature of the data and visually exploring the
high-dimensional space. In this poster, we propose a new visual analytics
approach for high-dimensional categorical data. Our method converts a
categorical dataset into a document corpus and then applies advanced document
analysis and visualization techniques to the corpus. Two prominent knowledge
discovery tasks, namely cluster analysis and multivariate analysis, are
supported. For cluster analysis, the Latent Dirichlet Allocation (LDA) topic
model is employed to discover subspace clusters in a categorical dataset. The
clusters are then presented in a semantically rich visualization for
interactive visual analysis. For multivariate analysis, LDA is used for
dimension reduction, and optimal rule mining is used to discover rules
describing multivariate relationships in the reduced subspaces. Case studies
on real datasets illustrate the effectiveness of this approach.
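
The conversion of a categorical dataset into a document corpus can be sketched as follows. This is only an illustrative sketch, not the encoding used in the poster: it assumes that each record becomes one document and that each attribute-value pair becomes one token of the form `attribute=value`. The function names and the toy records are hypothetical.

```python
from collections import Counter


def records_to_corpus(records):
    """Turn each categorical record into a document of 'attribute=value' tokens.

    Assumption: one record = one document, one attribute-value pair = one token.
    """
    return [[f"{attr}={val}" for attr, val in sorted(rec.items())]
            for rec in records]


def build_vocabulary(corpus):
    """Map every distinct token to a column index, in sorted order."""
    tokens = sorted({tok for doc in corpus for tok in doc})
    return {tok: idx for idx, tok in enumerate(tokens)}


def to_count_matrix(corpus, vocab):
    """Build a document-term count matrix (rows: documents, columns: tokens)."""
    matrix = []
    for doc in corpus:
        counts = Counter(doc)
        # Iterating the dict yields tokens in insertion (sorted) order.
        matrix.append([counts.get(tok, 0) for tok in vocab])
    return matrix


# Hypothetical toy dataset with three categorical dimensions.
records = [
    {"color": "red", "shape": "round", "size": "small"},
    {"color": "red", "shape": "square", "size": "small"},
    {"color": "blue", "shape": "round", "size": "large"},
]

corpus = records_to_corpus(records)
vocab = build_vocabulary(corpus)
matrix = to_count_matrix(corpus, vocab)
```

Because every token stays a discrete symbol, the categorical nature of the data is preserved. A count matrix of this shape is the usual input to topic model implementations (for example, scikit-learn's `LatentDirichletAllocation` accepts a document-term count matrix), so the topics learned from it could then be read as groups of co-occurring attribute values.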