Abstract:
Multivariate visualization techniques have attracted great interest as the
dimensionality of data sets grows. One premise of such techniques is that
simultaneous visual representation of multiple variables will enable the data
analyst to detect patterns amongst multiple variables. Such insights could
lead to the development of new techniques for rigorous (numerical) analysis of
complex relationships hidden within the data. Two natural questions arise
from this premise: Which multivariate visualization techniques are the most
effective for high-dimensional data sets? How does the analysis task change
this effectiveness ranking? We present a user study with a new task to answer the
first question. We provide some insights into the second question based on the
results of our study and results available in the literature. Our task led to
significant differences in error, response time, and subjective workload
ratings amongst four visualization techniques. We implemented three
integrated techniques (Data-driven Spots, Oriented Slivers, and Attribute
Blocks), as well as a baseline case of separate grayscale images. The
baseline case fared poorly on all three measures, whereas Data-driven Spots
yielded the best accuracy and was among the best in response time. These
results differ from comparisons of similar techniques with other tasks, and
we review all the techniques, tasks, and results (from our work and previous
work) to understand the reasons for this discrepancy.