14 - 19 OCTOBER, 2012. SEATTLE, WASHINGTON, USA

Visualizing data in R and ggobi

Organizers:
Di Cook
Heike Hofmann
Hadley Wickham
Description

R is an open-source statistical programming environment. It is widely used by academic, industry and government statisticians and is becoming increasingly popular in many applied domains. ggobi is an open-source interactive graphics package for visualizing high-dimensional data. 
In this full-day tutorial, you'll learn about:

  • Extracting knowledge from data by making plots using the ggplot2 package in R.
  • Approaches to visualization from a different tradition, one that embraces the study of variation and variability.
  • The use of a command line interface that provides vast flexibility but requires that users are comfortable with high-level programming.
  • Connecting to R to take advantage of cutting-edge statistical and machine learning models, and linking these with data plots using the rggobi package, to become a data explorer.

The course is split into two parts:

  • In the morning, we will introduce R and the ggplot2 package.
Why is learning R worth your time?
We will showcase examples of problems that have been addressed using R and ggobi, and the methods used to solve them.
Following this, we will delve deeper into using a command line system to create graphics, getting beyond the defaults.
The ggplot2 package makes it easy to gain insight into complex relationships in the data and to produce graphics that help uncover the unknown.
  • In the afternoon, we will discuss interactive graphics with ggobi and its link to R, which opens up all of ggobi's interactive capabilities to an R developer's specifications.
We will highlight how to use modern algorithmic techniques in an interactive setting that allows us to inspect their output more closely and lift the veil from the "black box" approach that hampers many of these techniques.
We will finish with a discussion on determining whether structure seen in plots is "real" or consistent with randomness.