ggplot2 and lattice
Ignore if you don't need this bit of support.
This is one in a series of tutorials in which we explore basic data import, exploration and much more using data from the Gapminder project. Now is the time to make sure you are working in the appropriate directory on your computer, perhaps through the use of an RStudio project. To ensure a clean slate, you may wish to clean out your workspace and restart R (both available from the RStudio Session menu, among other methods). Confirm that the new R process has the desired working directory, for example, with the getwd() command or by glancing at the top of RStudio's Console pane.
Open a new R script (in RStudio, File > New > R Script). Develop and run your code from there (recommended) or periodically copy "good" commands from the history. In due course, save this script with a name ending in .r or .R, containing no spaces or other funny stuff, and evoking "ggplot2", "lattice" and "comparison".
We use a lightly modified version of the usual Gapminder data. The rows have been reordered: sorted first on year, then on population. This ensures that big countries don't cover up little ones in our single-year bubble charts. Also, the country colors are present as a character variable, because that is useful in lattice workflows involving custom panel functions. You are encouraged to save the data file locally (vs. reading solely from the web), in case these webpages change. We drop Oceania here, as usual.
## data import from URL
gdURL <- "http://www.stat.ubc.ca/~jenny/notOcto/STAT545A/examples/gapminder/data/gapminderWithColorsAndSorted.txt"
kDat <- read.delim(file = gdURL, as.is = 7) # protect color
## alternative command if you save the data file locally
#kDat <- read.delim("gapminderWithColorsAndSorted.txt", as.is = 7)
str(kDat <- droplevels(subset(kDat, continent != "Oceania")))
## 'data.frame': 1680 obs. of 7 variables:
## $ country : Factor w/ 140 levels "Afghanistan",..: 24 58 133 66 59 47 1..
## $ continent: Factor w/ 4 levels "Africa","Americas",..: 3 3 2 3 3 4 2 4 ..
## $ year : int 1952 1952 1952 1952 1952 1952 1952 1952 1952 1952 ...
## $ pop : num 5.56e+08 3.72e+08 1.58e+08 8.65e+07 8.21e+07 ...
## $ lifeExp : num 44 37.4 68.4 63 37.5 ...
## $ gdpPercap: num 400 547 13990 3217 750 ...
## $ color : chr "#40004B" "#460552" "#A50026" "#611A6D" ...
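The row ordering described above can be illustrated on a toy data frame. This is only a sketch: the countries and population values are made up, and the descending population order is inferred from the str() output (the largest 1952 populations come first), not from the code that actually produced the file.

```r
## Toy data (hypothetical values, not the real Gapminder figures)
toy <- data.frame(
  country = c("Small", "Big", "Medium", "Small", "Big", "Medium"),
  year    = c(2007, 2007, 2007, 1952, 1952, 1952),
  pop     = c(1e6, 1e9, 5e7, 5e5, 5e8, 2e7)
)

## Sort on year (ascending), then on population (descending).
## Within a year, the most populous countries now come first, so they
## are drawn first and smaller bubbles end up on top of them.
toySorted <- toy[order(toy$year, -toy$pop), ]
toySorted
```

Because points are drawn in row order, sorting the data once up front avoids having to reorder inside every plotting call.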
We load the custom country color scheme. Remind yourself of what it looks like at this PDF. Again, you are encouraged to save a copy of this file locally, if you want to revisit this in future.
## get the country color scheme
gdURL <- "http://www.stat.ubc.ca/~jenny/notOcto/STAT545A/examples/gapminder/data/gapminderCountryColors.txt"
countryColors <- read.delim(file = gdURL, as.is = 3) # protect color
str(countryColors)
## 'data.frame': 142 obs. of 3 variables:
## $ continent: Factor w/ 5 levels "Africa","Americas",..: 1 1 1 1 1 1 1 1 ..
## $ country : Factor w/ 142 levels "Afghanistan",..: 95 39 43 28 118 121 ..
## $ color : chr "#7F3B08" "#833D07" "#873F07" "#8B4107" ...
head(countryColors)
## continent country color
## 1 Africa Nigeria #7F3B08
## 2 Africa Egypt #833D07
## 3 Africa Ethiopia #873F07
## 4 Africa Congo, Dem. Rep. #8B4107
## 5 Africa South Africa #8F4407
## 6 Africa Sudan #934607
Load the graphics packages:
library(ggplot2)
library(lattice)
## needed for both
jYear <- 2007 # this can obviously be changed
jPch <- 21
jDarkGray <- 'grey20'
jXlim <- c(150, 115000)
jYlim <- c(16, 96)
## needed for ggplot2 scale_fill_manual()
jColors <- countryColors$color
names(jColors) <- countryColors$country
## needed for lattice cex
jCexDivisor <- 1500 # arbitrary scaling constant
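To see why the colors are stored as a named vector, and how a scaling constant like jCexDivisor might be used, here is a small sketch on made-up values. The names toyColors, toyFill, toyPop and toyCex are hypothetical, and the sqrt(pop / pi) formula is an assumption about how bubble size is derived from population; it is one common choice, because it makes bubble area rather than radius proportional to population.

```r
## Toy color table (hypothetical countries; colors borrowed from the output above)
toyColors <- data.frame(
  country = c("A", "B", "C"),
  color   = c("#7F3B08", "#40004B", "#A50026"),
  stringsAsFactors = FALSE
)

## A named vector lets scale_fill_manual(values = ...) match colors to
## factor levels by name, regardless of the order of the levels
toyFill <- toyColors$color
names(toyFill) <- toyColors$country
toyFill[c("C", "A")]   # lookup is by name, not by position

## Assumed sizing rule: radius proportional to sqrt(pop / pi),
## then divided by an arbitrary constant to get a usable cex
toyPop <- c(A = 1e6, B = 4e6)
toyCex <- sqrt(toyPop / pi) / 1500
toyCex["B"] / toyCex["A"]   # 4x the population gives 2x the radius
```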
lattice vs ggplot2
Bubble plot: lattice on the left, ggplot2 on the right.
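The code behind the bubble plots is not shown here, so the following is only a sketch of how such a pair of figures could be built, not necessarily the code that produced them. It uses a tiny made-up data frame (toy) in place of kDat, inlines the values of jPch, jDarkGray and jCexDivisor, and assumes that bubble size is derived as sqrt(pop / pi).

```r
library(ggplot2)
library(lattice)

## Toy single-year data (hypothetical values, standing in for kDat)
toy <- data.frame(
  country   = c("A", "B", "C", "D"),
  continent = c("Asia", "Asia", "Europe", "Europe"),
  pop       = c(1.3e9, 5e7, 8e7, 5e6),
  lifeExp   = c(73, 65, 79, 81),
  gdpPercap = c(5000, 1500, 32000, 45000),
  color     = c("#40004B", "#611A6D", "#A50026", "#D73027"),
  stringsAsFactors = FALSE
)
toy <- toy[order(-toy$pop), ]                 # big countries drawn first
toyFill <- setNames(toy$color, toy$country)   # named vector for ggplot2

## lattice: per-point size and fill are passed as full-length vectors and
## indexed by 'subscripts' inside a custom panel function
pLattice <- xyplot(lifeExp ~ gdpPercap | continent, data = toy,
  scales = list(x = list(log = 10, equispaced.log = FALSE)),
  cex = sqrt(toy$pop / pi) / 1500, fill.color = toy$color,
  panel = function(x, y, ..., cex, fill.color, subscripts) {
    panel.xyplot(x, y, pch = 21, col = "grey20",
                 cex = cex[subscripts], fill = fill.color[subscripts], ...)
  })

## ggplot2: size and fill are mapped as aesthetics; the named vector
## supplies one fill color per country
pGgplot <- ggplot(toy, aes(x = gdpPercap, y = lifeExp)) +
  scale_x_log10() +
  geom_point(aes(size = sqrt(pop / pi), fill = country),
             pch = 21, color = "grey20", show.legend = FALSE) +
  scale_size_continuous(range = c(1, 20)) +
  scale_fill_manual(values = toyFill) +
  facet_wrap(~ continent) +
  theme_bw()
```

Calling print(pLattice) or print(pGgplot) draws the corresponding figure. The main contrast is where the per-country color lives: lattice needs it as a column that the panel function can index, while ggplot2 looks it up by name through the scale.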
Spaghetti plot: lattice on the left, ggplot2 on the right.
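As with the bubble plot, the spaghetti plot code is not shown, so this is a hedged sketch on made-up data rather than the original code. The data frame toy and the color vector toyLine are hypothetical; each country contributes one line of life expectancy over time.

```r
library(ggplot2)
library(lattice)

## Toy multi-year data (hypothetical values, standing in for kDat)
toy <- data.frame(
  country   = rep(c("A", "B", "C"), each = 3),
  continent = rep(c("Asia", "Asia", "Europe"), each = 3),
  year      = rep(c(1952, 1977, 2007), times = 3),
  lifeExp   = c(44, 60, 73, 37, 54, 65, 68, 73, 79),
  stringsAsFactors = TRUE
)
toyLine <- c(A = "#40004B", B = "#611A6D", C = "#A50026")

## lattice: one line per level of 'groups'; colors are taken in the
## order of the factor levels, so the vector must be in that order
pLattice <- xyplot(lifeExp ~ year | continent, data = toy,
  groups = country, type = "l",
  par.settings = list(superpose.line = list(col = unname(toyLine), lwd = 2)))

## ggplot2: one line per country; colors are matched by name
pGgplot <- ggplot(toy, aes(x = year, y = lifeExp,
                           group = country, color = country)) +
  geom_line(show.legend = FALSE) +
  scale_color_manual(values = toyLine) +
  facet_wrap(~ continent) +
  theme_bw()
```

Note the different matching rules: in this sketch lattice assigns colors by position in the level order, whereas ggplot2 matches them by name, which is why the named vector matters there.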
Do more side-by-side comparisons?
Learning R blog: recreating all the figures in Sarkar's lattice book with both lattice and ggplot2
Lattice: Multivariate Data Visualization with R available via SpringerLink by Deepayan Sarkar, Springer (2008) | all code from the book | GoogleBooks search
ggplot2: Elegant Graphics for Data Analysis available via SpringerLink by Hadley Wickham, Springer (2009) | online docs (nice!) | author's website for the book, including all the code | author's landing page for the package