Vistat http://vis.supstat.com 2014-07-02T18:45:49-07:00 vis@supstat.com Demonstration of the Law of Large Numbers http://vis.supstat.com/2013/04/law-of-large-numbers 2013-04-18T00:00:00-07:00 Lijia Yu http://vis.supstat.com/2013/04/law-of-large-numbers The Law of Large Numbers (LLN) basically states that the average obtained from a large number of trials should be close to the expected value, and will tend to become closer as more trials are performed.

The function lln.ani() in the animation package provides a way to visualize the LLN. It plots the sample mean as the sample size grows, so we can check whether the sample mean approaches the population mean. Here we make an animation with the Chi-squared distribution as the population distribution.

library(animation)
ani.options(interval = 0.3)
lln.ani(FUN = function(n, mu) rchisq(n, df = mu), mu = 5, cex = 0.6)
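
The same idea can be sketched in base R without the animation package. This is only an illustration, not the code behind lln.ani(); the seed and the number of trials n are assumptions chosen for the example.

```r
# Running mean of Chi-squared draws (base R only; not the lln.ani() internals)
set.seed(1)                       # hypothetical seed, for reproducibility
mu <- 5                           # population mean of Chi-squared(df = 5)
n  <- 5000                        # hypothetical number of trials
x  <- rchisq(n, df = mu)
running_mean <- cumsum(x) / seq_len(n)

plot(running_mean, type = "l", xlab = "sample size", ylab = "sample mean")
abline(h = mu, lty = 2)           # dashed line at the population mean

# Absolute error of the sample mean at a few sample sizes
abs(running_mean[c(10, 100, 1000, 5000)] - mu)
```

The dashed line marks the population mean; the running mean should wander early on and settle near it as the sample size grows.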
]]>
Buffon's needle http://vis.supstat.com/2013/04/buffons-needle 2013-04-16T00:00:00-07:00 Lijia Yu http://vis.supstat.com/2013/04/buffons-needle Given a needle of length $l$ dropped on a plane ruled with parallel lines $d$ units apart, what is the probability that the needle will cross a line? This question was first posed in the 18th century by Georges-Louis Leclerc, Comte de Buffon. For $l \le d$, the answer is $\frac{2l}{\pi d}$, where $d$ is the distance between two adjacent lines, and $l$ is the length of the needle.

The solution, in the case where the needle length is not greater than the width of the strips, can be used to design a Monte Carlo method for approximating the number $\pi$.

In the animation package, the function buffon.needle() can be used to simulate Buffon’s needle. Three graphs are drawn at each step: the top one is a simulation of the scenario, the bottom-left one helps us understand the connection between dropping needles and the mathematical method for estimating $\pi$, and the bottom-right one is the simulation result for each drop.

library(animation)
ani.options(nmax = 100, interval = 0.5)
par(mar = c(3, 2.5, 0.5, 0.2), pch = 20, mgp = c(1.5, 0.5, 0))
buffon.needle(mat = matrix(c(1, 2, 1, 3), 2))

You can use a larger nmax value in the code to drop the needle more times.
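
For readers who want the Monte Carlo estimate without the animation, here is a minimal base R sketch. It assumes a needle length l = 0.8 and line spacing d = 1 (illustrative values), and it describes each drop by the distance from the needle's centre to the nearest line and the acute angle between the needle and the lines. It is not the code used inside buffon.needle().

```r
# Monte Carlo estimate of pi from Buffon's needle (a sketch, not buffon.needle())
set.seed(42)                      # hypothetical seed
l <- 0.8                          # needle length (assumed value)
d <- 1                            # distance between lines (assumed value), l <= d
n <- 100000                       # hypothetical number of drops

y     <- runif(n, 0, d / 2)       # distance from needle centre to nearest line
theta <- runif(n, 0, pi / 2)      # acute angle between needle and the lines
crossed <- y <= (l / 2) * sin(theta)

p_hat  <- mean(crossed)           # estimated crossing probability, about 2l / (pi d)
pi_hat <- 2 * l / (d * p_hat)     # solve p = 2l / (pi d) for pi
c(p_hat = p_hat, pi_hat = pi_hat)
```

Note that drawing the angle uniformly already uses the constant pi, so this sketch illustrates the formula rather than providing an independent way to compute $\pi$.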

]]>
Demonstration of the Central Limit Theorem http://vis.supstat.com/2013/04/central-limit-theorem 2013-04-15T00:00:00-07:00 Lijia Yu http://vis.supstat.com/2013/04/central-limit-theorem In Probability Theory, the Central Limit Theorem (CLT) states that, given certain conditions, the mean of a sufficiently large number of independent random variables, each with a well-defined mean and well-defined variance, will be approximately normally distributed.

As shown in the Bean Machine article, the CLT has a number of variants. This article shows that, as long as the conditions of the CLT are satisfied, the distribution of the sample mean is approximately Normal when the sample size n is large enough, whatever the original distribution.

In the animation package, there is a function named clt.ani(). It shows how the distribution of the sample mean changes as the sample size grows. The Shapiro-Wilk test, shapiro.test(), is used as a measure of normality.

Classical Central Limit Theorem

With the FUN argument of clt.ani() you can select the distribution to sample from in the animation. Here is an example with the Poisson distribution.

library(animation)
ani.options(interval = 0.5)
par(mar = c(3, 3, 1, 0.5), mgp = c(1.5, 0.5, 0), tcl = -0.3)
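lambda = 4
f = function(n) rpois(n, lambda)
# mean and sd are the population mean and standard deviation;
# for Poisson(lambda) these are lambda and sqrt(lambda)
clt.ani(FUN = f, mean = lambda, sd = sqrt(lambda))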
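
To see the same effect numerically, the sketch below repeats the sampling in base R and standardizes the sample means. The sample size n, the number of repetitions reps and the seed are assumptions made for illustration; this is not how clt.ani() is implemented.

```r
# Sample means of Poisson(4) draws, standardized (base R sketch)
set.seed(2013)                    # hypothetical seed
lambda <- 4
n    <- 50                        # hypothetical sample size
reps <- 2000                      # hypothetical number of repeated samples

xbar <- replicate(reps, mean(rpois(n, lambda)))

# Poisson(lambda) has mean lambda and standard deviation sqrt(lambda),
# so the sample mean has standard deviation sqrt(lambda / n)
z <- (xbar - lambda) / sqrt(lambda / n)
c(mean = mean(z), sd = sd(z))     # should be close to 0 and 1

hist(z, freq = FALSE, breaks = 30, main = "")
curve(dnorm(x), add = TRUE)       # standard Normal density for comparison
shapiro.test(z)$p.value           # Shapiro-Wilk p-value, as a rough check
```

Because the Poisson distribution is discrete, the sample means contain many ties, so the Shapiro-Wilk p-value should be read with some caution here.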

When CLT does not work

The Cauchy distribution has no defined mean, variance or higher moments, so the CLT does not apply to it.

ani.options(interval = 0.5)
par(mar = c(3, 3, 1, 0.5), mgp = c(1.5, 0.5, 0), tcl = -0.3)
f = function(n) rcauchy(n, location = 0, scale = 2)
clt.ani(FUN = f, mean = NA, sd = NA)
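
A small base R comparison makes the failure concrete. The mean of n independent Cauchy draws has the same Cauchy distribution as a single draw, so its spread does not shrink with n, whereas for a Normal population it shrinks like $1/\sqrt{n}$. The helper name spread(), the seed and the sample sizes below are assumptions made for this sketch.

```r
# Spread (interquartile range) of sample means: Cauchy versus Normal
set.seed(7)                                   # hypothetical seed

# Hypothetical helper: IQR of `reps` sample means, each from a sample of size n
spread <- function(rfun, n, reps = 2000) {
  IQR(replicate(reps, mean(rfun(n))))
}

cauchy <- function(n) rcauchy(n, location = 0, scale = 2)
normal <- function(n) rnorm(n, mean = 0, sd = 2)

cauchy_small <- spread(cauchy, 10)
cauchy_large <- spread(cauchy, 1000)
normal_small <- spread(normal, 10)
normal_large <- spread(normal, 1000)

rbind(cauchy = c(n10 = cauchy_small, n1000 = cauchy_large),
      normal = c(n10 = normal_small, n1000 = normal_large))
```

The Cauchy row should stay near 4 (twice the scale parameter) for both sample sizes, while the Normal row should drop by roughly a factor of ten.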
]]>
The Bean Machine and the Central Limit Theorem http://vis.supstat.com/2013/04/bean-machine 2013-04-13T00:00:00-07:00 Lijia Yu http://vis.supstat.com/2013/04/bean-machine The bean machine, also known as the quincunx or Galton box, is a device invented by Sir Francis Galton to demonstrate the Central Limit Theorem, in particular that the Binomial distribution is approximated by the Normal distribution, or properly speaking, the de Moivre–Laplace theorem.

The function quincunx() in the animation package shows you how balls bounce left and right as they hit the pins. You can see that the heights of the ball columns in the bins approximate a bell curve.

library(animation)
ani.options(interval = 0.03, nmax = 213)
quincunx()
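
The link to the Binomial distribution can be checked with a short base R sketch: each ball makes one left-or-right bounce per row of pins, and its bin is the number of right bounces. The number of rows (layers), the number of balls and the seed are assumptions for illustration, not the defaults of quincunx().

```r
# Bean machine as a sum of left/right bounces (base R sketch)
set.seed(11)                      # hypothetical seed
layers <- 15                      # hypothetical number of pin rows
balls  <- 2000                    # hypothetical number of balls

# One row per ball, one column per pin row: 1 = bounce right, 0 = bounce left
bounces <- matrix(sample(0:1, balls * layers, replace = TRUE), nrow = balls)
bins <- rowSums(bounces)          # final bin = number of right bounces

observed <- as.numeric(table(factor(bins, levels = 0:layers))) / balls
# Normal approximation with the Binomial mean n/2 and sd sqrt(n)/2
normal_approx <- dnorm(0:layers, mean = layers / 2, sd = sqrt(layers) / 2)

round(cbind(bin = 0:layers, observed = observed, normal = normal_approx), 3)
```

The observed bin frequencies should track the Normal density closely, which is what the de Moivre–Laplace theorem predicts.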
]]>
Cheat Sheets for Plotting Symbols and Color Palettes http://vis.supstat.com/2013/04/plotting-symbols-and-color-palettes 2013-04-08T00:00:00-07:00 R Core Team and Yihui Xie and Lijia Yu http://vis.supstat.com/2013/04/plotting-symbols-and-color-palettes This cheat sheet shows a few commonly used plotting symbols and color palettes in R. You no longer need to memorize facts like “19 is the big solid dot and 21 is an open circle that can have a different background color” – just bookmark this cheat sheet. In fact it was motivated by Dr Rafael Irizarry, who printed the color palettes on a piece of paper and pinned it to the wall in his office.

Plotting symbols (pch)

There are many plotting symbols in the graphics package. Use the pch parameter to choose one. See ?points for more information.

Below is a figure containing the plot symbols from pch = 0 to 25 along with some character-based plot symbols. We can use, for example, plot(x, y, pch = 3) for plus signs (+) in a scatterplot.

plot of chunk pch
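
A chart of the numeric symbols can be drawn with a few lines of base graphics. This is a sketch under assumed layout choices (seven symbols per row, yellow fill), not the code that produced the original figure.

```r
# Draw plotting symbols pch = 0 to 25 on a grid (layout is an assumption)
par(mar = rep(0.5, 4))
pch_values <- 0:25
col_id <- pch_values %% 7         # column position, 7 symbols per row
row_id <- pch_values %/% 7        # row position

plot(col_id, -row_id, pch = pch_values, cex = 2, bg = "yellow",
     xlim = c(-0.5, 6.5), ylim = c(-3.8, 0.5),
     axes = FALSE, xlab = "", ylab = "")
text(col_id, -row_id - 0.4, labels = pch_values)  # label each symbol with its number
```

The bg argument only affects symbols 21 to 25, which is why those are the ones that show the fill color.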

Color palettes

The default color palette in R:

(pal = palette())
## [1] "black"   "red"     "green3"  "blue"    "cyan"    "magenta"
## [7] "yellow"  "gray"
par(mar = rep(0, 4))
pie(rep(1, length(pal)), labels = sprintf("%d (%s)", seq_along(pal), 
  pal), col = pal)

plot of chunk default-pal

Below are the color palettes in RColorBrewer (if you do not want to use ggplot2, you should consider using this package to generate colors for your plots; just do not use the meaningless rainbow() palette):

library(RColorBrewer)
par(mar = c(0, 4, 0, 0))
display.brewer.all()

plot of chunk brewer-pal

# generate 8 colors from the Set2 palette
brewer.pal(8, "Set2")
## [1] "#66C2A5" "#FC8D62" "#8DA0CB" "#E78AC3" "#A6D854" "#FFD92F"
## [7] "#E5C494" "#B3B3B3"

In R, the function colors() returns a vector of 657 color names. When you really need to specify a color by its name, check out the color chart created by Earl F. Glynn.
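
If you only need to look up a name, base R can search the vector directly. The sketch below uses "orange" as an arbitrary search term and converts one name to its RGB and hex values.

```r
# Search the built-in color names and convert a name to RGB / hex
all_names <- colors()
length(all_names)                           # number of built-in color names

oranges <- grep("orange", all_names, value = TRUE)
oranges                                     # all names containing "orange"

rgb_values <- col2rgb("orange1")            # 3 x 1 matrix: red, green, blue
hex <- rgb(t(rgb_values), maxColorValue = 255)
hex
```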

]]>
Mathematical Annotation in R http://vis.supstat.com/2013/04/mathematical-annotation-in-r 2013-04-08T00:00:00-07:00 R Core Team and Lijia Yu and Karl Broman and Kevin Ushey http://vis.supstat.com/2013/04/mathematical-annotation-in-r Want to write mathematical symbols and expressions in R graphics? You can use an R expression() instead of normal text, e.g. plot(1:10, main = expression(alpha + beta)). Below is a demo that shows you everything about plotting math in R (it was written by the R Core Team; see ?plotmath for details):

demo(plotmath)

plot of chunk plotmath

Combining expressions and text

If you want to combine multiple mathematical expressions with text, use paste() inside expression(), as in the following.

par(mar = c(4, 4, 2, 0.1))
plot(rnorm(100), rnorm(100),
  xlab = expression(hat(mu)[0]), ylab = expression(alpha^beta),
  main = expression(paste("Plot of ", alpha^beta, " versus ", hat(mu)[0])))

plot of chunk math-text

Finally, if we want to include variables from an R session in mathematical expressions, and substitute in their actual values, we can use substitute().

par(mar = c(4, 4, 2, 0.1))
x_mean <- 1.5
x_sd <- 1.2
hist(rnorm(100, x_mean, x_sd),
  main = substitute(
    paste(X[i], " ~ N(", mu, "=", m, ", ", sigma^2, "=", s2, ")"),
    list(m = x_mean, s2 = x_sd^2)
  )
)

plot of chunk math-text-sub
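
An alternative that some readers find easier to read is bquote(), where .( ) marks the parts to be replaced by their values. The sketch below builds a similar title using plotmath operators (== and ~) in place of paste(); it is a variation on the example above, not a replacement for it.

```r
# Same idea with bquote(): .() inserts the value of an R variable
par(mar = c(4, 4, 2, 0.1))
x_mean <- 1.5
x_sd <- 1.2

# "~" in quotes is drawn as a tilde; the bare ~ operators only add spacing
title_expr <- bquote(X[i] ~ "~" ~ N(mu == .(x_mean), sigma^2 == .(x_sd^2)))

hist(rnorm(100, x_mean, x_sd), main = title_expr)
```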

]]>
Draw Easter Eggs http://vis.supstat.com/2013/03/draw-easter-eggs 2013-03-31T00:00:00-07:00 Lijia Yu http://vis.supstat.com/2013/03/draw-easter-eggs This article shows you how to draw an egg with R. I wrote it after one week of learning the animation package, so I will show you the simplest way to create an animated figure. The only function used from this package is ani.pause().

Let’s get started!

A simple egg

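The equations that define an egg, for $t \in [-\pi, \pi]$ and a size parameter $H$, are:

$$x = -H\cos(t), \qquad y = 0.78\,H\cos(t/4)\sin(t)$$

In the figure below, we swap the x axis and y axis to draw the egg in the vertical direction.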

t = seq(-pi, pi, by = 0.01)
H = 1
x = H * 0.78 * cos(t/4) * sin(t)
y = -H * cos(t)
par(mar = rep(0, 4))
plot(x, y, type = "l", xlim = c(-1, 1), ylim = c(-1, 1), asp = 1, 
  col = "orange1", lwd = 5, axes = FALSE)

plot of chunk draw-egg

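We need a rotation matrix to draw the rotating egg. A point $(x, y)$ rotated by an angle $\theta$ becomes:

$$\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$$

The ani.pause() function is called to pause for a time interval (by default specified in ani.options('interval')) and flush the current device. We draw an egg at 30 different angles, one image per angle, and you will see the egg rotating below: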

library(animation)
egg_rotation = function(H = 1, angle = seq(0, pi * 2, length = 30), 
  pos = c(0, 0)) {
  t = seq(-pi, pi, by = 0.01)
  for (i in 1:length(angle)) {
    x = H * 0.78 * cos(t/4) * sin(t)
    y = -H * cos(t)
    # Rotation matrix
    x1 = cos(angle[i]) * x - sin(angle[i]) * y + pos[1]
    y1 = sin(angle[i]) * x + cos(angle[i]) * y + pos[2]
    cols = colors()
    flag = sample(1:length(cols), 1)
    plot(x1, y1, type = "l", xlim = c(-1, 1), ylim = c(-1, 1), 
      asp = 1, col = cols[flag], lwd = 8, axes = FALSE)
    ani.pause(0.1)
  }
}
par(mar = rep(0, 4))
egg_rotation()
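
The two rotation lines inside egg_rotation() can also be written as a single matrix product. The sketch below uses a hypothetical helper rotate() and an arbitrary angle of pi/3, and checks that the matrix form agrees with the component-wise formulas.

```r
# Rotation written as a matrix product (rotate() is a hypothetical helper)
t <- seq(-pi, pi, by = 0.01)
H <- 1
egg <- rbind(x = H * 0.78 * cos(t / 4) * sin(t),   # 2 x n matrix of points
             y = -H * cos(t))

rotate <- function(points, angle) {
  # matrix() fills column by column, so this is
  # | cos  -sin |
  # | sin   cos |
  R <- matrix(c(cos(angle), sin(angle), -sin(angle), cos(angle)), nrow = 2)
  R %*% points
}

angle <- pi / 3                                    # arbitrary angle for the check
rotated <- rotate(egg, angle)

# Component-wise version, as in egg_rotation()
x1 <- cos(angle) * egg["x", ] - sin(angle) * egg["y", ]
y1 <- sin(angle) * egg["x", ] + cos(angle) * egg["y", ]
all.equal(as.numeric(rotated[1, ]), as.numeric(x1))
all.equal(as.numeric(rotated[2, ]), as.numeric(y1))
```

A rotation does not change distances from the origin, which is why the egg keeps its shape as it turns.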

Another example:

library(animation)
egg = function(h = rnorm(1), angle = rnorm(1), pos = rnorm(2)) {
  t = seq(-pi, pi, by = 0.01)
  for (i in 1:10) {
    H = h - h/10 * i
    x = H * 0.78 * cos(t/4) * sin(t)
    y = -H * cos(t)
    # Rotation matrix
    x1 = cos(angle) * x - sin(angle) * y + pos[1]
    y1 = sin(angle) * x + cos(angle) * y + pos[2]
    cols = colors()
    flag = sample(1:length(cols), 1)
    plot(x1, y1, type = "l", xlim = c(-1, 1), ylim = c(-1, 1), 
      asp = 1, col = cols[flag], lwd = 8, axes = FALSE)
    ani.pause(0.1)
  }
}
par(mar = rep(0, 4))
set.seed(123)
for (j in 1:10) {
  egg()
  ani.pause(1)
}

3D eggs

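If you want to draw a 3D egg, the rgl package can help. The 3D egg is just a perturbation of a sphere. In spherical coordinates, with $\theta \in [0, 2\pi]$ and $\phi \in [0, \pi]$, the surface is:

$$x = (1 + c\phi)\sin\phi\cos\theta, \qquad y = (1 + c\phi)\sin\phi\sin\theta, \qquad z = b\cos\phi$$

Here we set $c=0.2, b=1.7$.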

meshgrid <- function(a, b) {
  list(x = outer(b * 0, a, FUN = "+"), y = outer(b, a * 0, FUN = "+"))
}
library(rgl)
c = 0.2
b = 1.7
theta = seq(0, 2 * pi, length = 40 * 4)
phi = seq(0, pi, length = 40 * 4)
theta1 = meshgrid(theta, phi)$x
phi2 = meshgrid(theta, phi)$y
x = (1 + c * phi2) * sin(phi2) * cos(theta1)
y = (1 + c * phi2) * sin(phi2) * sin(theta1)
z <- b * cos(phi2)
surface3d(x, y, z, color = rainbow(10))
par3d(zoom = 0.7)

plot of chunk draw-3d-egg

]]>
Simulation of Coin Flipping http://vis.supstat.com/2013/03/simulation-of-coin-flipping 2013-03-27T00:00:00-07:00 Lijia Yu http://vis.supstat.com/2013/03/simulation-of-coin-flipping The function flip.coin() in the animation package simulates the process of flipping coins and computes the frequencies of heads and tails. Coin flipping is a well-known Bernoulli trial. When you flip a coin, there are two possible outcomes: heads or tails. A fair coin has probability 0.5 of landing heads by definition.

Head or tail

We toss a fair coin 100 times below.

library(animation)
ani.options(nmax = 100, interval = 0.3)
par(mar = c(2, 4, 2, 2))
flip.coin(bg = "yellow")

Note that the outcome is random, so if you run the code above again, you are likely to see different results. The expected counts are 50 heads and 50 tails, and the observed frequencies approach 0.5 as the number of tosses grows.
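
How much the count of heads varies from one run of 100 tosses to the next can be checked with rbinom(). The number of repeated experiments and the seed below are assumptions for illustration.

```r
# Number of heads in 100 tosses of a fair coin, repeated many times
set.seed(2013)                    # hypothetical seed
experiments <- 10000              # hypothetical number of repeated experiments
heads <- rbinom(experiments, size = 100, prob = 0.5)

c(mean = mean(heads), sd = sd(heads))   # theory: mean 50, sd sqrt(100 * 0.5 * 0.5) = 5
mean(heads == 50)                       # share of experiments with exactly 50 heads
range(heads)
```

So a single run of 100 tosses will usually not give exactly 50 heads; counts a few heads away from 50 are entirely ordinary.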

Generalization

The coin here does not have to mean a coin literally. We can generalize it to an object that can produce $n$ possible outcomes. For example, three outcomes Head, Stand (a coin may stand on the table) and Tail with probabilities 0.45, 0.1 and 0.45 respectively:

ani.options(nmax = 100, interval = 0.3)
par(mar = c(2, 4, 2, 2))
flip.coin(faces = c("Head", "Stand", "Tail"), type = "n", prob = c(0.45, 
  0.1, 0.45), col = c(1, 2, 4))
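
The same three-outcome experiment can be tabulated without the animation using sample() with a prob argument. This is a sketch with an assumed seed and number of tosses, not the internals of flip.coin().

```r
# Three-outcome "coin" simulated with sample()
set.seed(3)                       # hypothetical seed
faces <- c("Head", "Stand", "Tail")
prob  <- c(0.45, 0.1, 0.45)
tosses <- sample(faces, size = 10000, replace = TRUE, prob = prob)

# factor() with fixed levels keeps the faces in the same order as prob
freq <- as.numeric(table(factor(tosses, levels = faces))) / length(tosses)
round(rbind(observed = freq, expected = prob), 3)
```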
]]>
Making Visual Illusions in R http://vis.supstat.com/2013/03/make-visual-illusions-in-r 2013-03-26T00:00:00-07:00 Lijia Yu http://vis.supstat.com/2013/03/make-visual-illusions-in-r This article demonstrates how to make visual illusions with R. A visual illusion is a distortion of form, size or color in the visual field. In the animation package, functions vi.grid.illusion() and vi.lilac.chaser() can be used to produce visual illusions.

Scintillating grid illusion

The vi.grid.illusion() function draws the scintillating grid illusion and the Hermann grid illusion, the two most common types of grid illusions.

You can see dark dots appear and disappear rapidly at random intersections.

library(animation)
vi.grid.illusion()

plot of chunk scintillating-grid-illusion

Hermann grid illusion

In the Hermann grid illusion, grey blobs appear at the intersections and disappear when you look directly at one.

vi.grid.illusion(type = "h", lwd = 22, nrow = 5, ncol = 5, col = "white")

plot of chunk hermann-grid-illusion

Lilac Chaser

We can draw a Lilac chaser with the function vi.lilac.chaser().

Stare at the center cross for a few (say 30) seconds to experience the illusion. You should see:

  • A gap running around the circle of lilac discs;
  • A green disc running around the circle of lilac discs in place of the gap;
  • The green disc running around on the grey background, with the lilac discs having disappeared in sequence.

ani.options(nmax = 20)
par(mar = c(1, 1, 1, 1))
vi.lilac.chaser()

Note

Don’t worry if you can’t see all the phenomena described. For many illusions, there is a percentage of people with perfectly normal vision who just don’t see it, often for reasons currently unknown.

Further reading

You can see more illusions created by Kohske in R, which also illustrate the power of the grid package.

]]>
Demonstration of the Gradient Descent Algorithm http://vis.supstat.com/2013/03/gradient-descent-algorithm-with-r 2013-03-24T00:00:00-07:00 Lijia Yu http://vis.supstat.com/2013/03/gradient-descent-algorithm-with-r In the animation package, there is a function named grad.desc(). It provides a visual illustration for the process of minimizing a real-valued function through the Gradient Descent Algorithm. The two examples below show you how to use the grad.desc() function.

A simple function

The default objective function in grad.desc() is $f(x, y) = x^2 + 2y^2$. The arrows will take you to the minimum step by step:

library(animation)
par(mar = c(4, 4, 2, 0.1))
grad.desc()
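
The iteration behind the arrows is short enough to write out. The sketch below minimizes $x^2 + 2y^2$ with a fixed step length; the function name gradient_descent(), the starting point, and the stopping rule (stop when the objective changes by less than tol) are assumptions made for illustration and may differ from what grad.desc() does internally.

```r
# Plain gradient descent on f(x, y) = x^2 + 2 * y^2 (a sketch, not grad.desc())
f <- function(p) p[1]^2 + 2 * p[2]^2
grad_f <- function(p) c(2 * p[1], 4 * p[2])       # analytic gradient of f

gradient_descent <- function(init, gamma = 0.05, tol = 1e-3, max_iter = 1000) {
  p <- init
  for (i in seq_len(max_iter)) {
    p_new <- p - gamma * grad_f(p)                # step against the gradient
    if (abs(f(p_new) - f(p)) < tol) {             # assumed stopping rule
      return(list(par = p_new, value = f(p_new), iter = i))
    }
    p <- p_new
  }
  list(par = p, value = f(p), iter = max_iter)
}

res <- gradient_descent(init = c(-3, 3))
res
```

Each step multiplies x by 1 - 2 * gamma and y by 1 - 4 * gamma, so with gamma = 0.05 both coordinates shrink steadily towards the minimum at (0, 0).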

When the algorithm fails

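This example shows how the gradient descent algorithm can fail when the step length is too large.

We try to find a local minimum of the bivariate objective function $f(x, y) = \sin\left(\frac{1}{2}x^2 - \frac{1}{4}y^2 + 3\right)\cos\left(2x + 1 - e^y\right)$: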

ani.options(nmax = 70)
par(mar = c(4, 4, 2, 0.1))
f2 = function(x, y) sin(1/2 * x^2 - 1/4 * y^2 + 3) * cos(2 * x + 1 - 
  exp(y))
grad.desc(f2, c(-2, -2, 2, 2), c(-1, 0.5), gamma = 0.3, tol = 1e-04)
## Warning: Maximum number of iterations reached!

Apparently the arrows get lost eventually. You can replace gamma = 0.3 with a smaller value and rerun the code.
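
Why the step length matters is easiest to see on a simpler function than the one above. The sketch below uses the quadratic $x^2 + 2y^2$ (an assumption made so the behaviour can be worked out by hand) and runs a fixed number of steps with a small and a large gamma; run_steps() is a hypothetical helper.

```r
# Effect of the step length gamma on gradient descent for x^2 + 2 * y^2
f <- function(p) p[1]^2 + 2 * p[2]^2
grad_f <- function(p) c(2 * p[1], 4 * p[2])

# Hypothetical helper: run a fixed number of steps and return the final value of f
run_steps <- function(gamma, steps = 20, init = c(-3, 3)) {
  p <- init
  for (i in seq_len(steps)) p <- p - gamma * grad_f(p)
  f(p)
}

f_start <- f(c(-3, 3))
f_small <- run_steps(gamma = 0.05)   # y is multiplied by 1 - 4 * 0.05 = 0.8 each step
f_large <- run_steps(gamma = 0.6)    # y is multiplied by 1 - 4 * 0.6 = -1.4 each step
c(start = f_start, small_gamma = f_small, large_gamma = f_large)
```

With the large step, every update overshoots the minimum in the y direction by more than it corrects, so the objective grows instead of shrinking. On a less regular surface such as f2 the symptom is different (the arrows wander rather than blow up), but the cause is the same.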

]]>