13 - 18 OCTOBER 2013, ATLANTA, GEORGIA, USA

ScagExplorer: Using Scagnostics to Cluster Huge Datasets

Contributors: 
Tuan Nhon Dang, Leland Wilkinson
Description
We introduce a method for guiding interactive exploration of high-dimensional data. The method is based on nine characterizations of the 2D distributions of orthogonal pairwise projections on a set of points in multidimensional Euclidean space. These characterizations include measures such as density, skewness, shape, outliers, and texture. Using these measures, we can quickly generate a comprehensive summary of the 2D relations of variables in a large dataset with more than a hundred dimensions.
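
To make the idea of summarizing every pairwise 2D projection concrete, here is a minimal sketch in Python. It is not the authors' implementation: the two measures below (squared Pearson correlation and a distance-based outlier fraction) are simple stand-ins chosen for illustration, not the nine scagnostics measures, and the names `pearson_squared`, `outlier_fraction`, and `summarize_pairs` are hypothetical. The sketch only shows the overall shape of the computation: a dataset with p variables yields p(p-1)/2 axis-aligned scatterplots, and each one is reduced to a short vector of numbers.

```python
from itertools import combinations
from math import sqrt


def pearson_squared(xs, ys):
    """Squared Pearson correlation; 0.0 when either variable is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


def outlier_fraction(xs, ys, k=2.0):
    """Fraction of points whose distance from the centroid exceeds
    the mean distance plus k standard deviations (illustrative only)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    dists = [sqrt((x - mx) ** 2 + (y - my) ** 2) for x, y in zip(xs, ys)]
    mean_d = sum(dists) / n
    std_d = sqrt(sum((d - mean_d) ** 2 for d in dists) / n)
    return sum(1 for d in dists if d > mean_d + k * std_d) / n


def summarize_pairs(rows):
    """Map each variable pair (i, j), i < j, to a tuple of measures
    computed on the 2D projection onto those two variables."""
    columns = list(zip(*rows))  # transpose rows into per-variable columns
    summary = {}
    for i, j in combinations(range(len(columns)), 2):
        xs, ys = columns[i], columns[j]
        summary[(i, j)] = (pearson_squared(xs, ys), outlier_fraction(xs, ys))
    return summary


# Toy data: 4 points in 3 dimensions; variable 1 is exactly 2x variable 0.
rows = [(1, 2, 5), (2, 4, 3), (3, 6, 8), (4, 8, 1)]
summary = summarize_pairs(rows)
for pair, measures in sorted(summary.items()):
    print(pair, measures)
```

With 100 variables this loop would visit 4,950 pairs, and the resulting measure vectors are the kind of compact per-scatterplot summary that could then be grouped or clustered.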