Sampling Distribution Simulation

This Java applet illustrates several concepts relating to sampling distributions. It is one of several simulations and demonstrations being developed as part of the Rice Virtual Lab in Statistics. Please send comments, reports of problems with this applet, and suggestions for improvement to David Lane. The applet is a work in progress, so your feedback is much appreciated.

To run the applet, indicate your monitor size and click the "Begin" button.

 

Exercises

Understanding the concept of a sampling distribution
1. Click the "Animated sample" button. Five scores from a normal distribution will be sampled and plotted in a histogram. The mean of the sample will be computed and plotted in a second histogram. Repeat this 3 or 4 times or until you understand how the "Distribution of Means" is created. The red line extends from the mean one standard deviation in each direction. The colored vertical bars on the X-axis correspond to the statistic of the same color.

2. Click the "5 samples" button to draw 5 samples of 5 scores each. The five means will be plotted. Click the "500 samples" and/or "2000 samples" buttons until the distribution of means has stabilized. The sampling distribution of the mean is the distribution that is approached as the number of samples approaches infinity. With 5,000 to 10,000 samples you get a pretty good approximation.

3. The distribution plotted in (2) above is the sampling distribution of the mean for a sample size of 5. Approximate the sampling distribution of the mean for other sample sizes.

4. Any statistic you can compute in a sample has a sampling distribution. Approximate the sampling distribution of other statistics. The statistics available to compute are:

Mean
Median
Standard deviation (sd) (Using N in the denominator)
Variance (Using N in the denominator)
Mean absolute deviation from the mean (MAD)
Range
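
The procedure in these exercises can also be tried outside the applet. What follows is a minimal Python sketch, not the applet's own code: the function names, the seed, and the population (normal with mean 0 and standard deviation 1) are assumptions chosen for illustration. It draws many samples of 5 scores, computes each statistic in every sample, and summarizes the resulting distribution of that statistic.

```python
import random
import statistics


def mad(xs):
    """Mean absolute deviation from the mean."""
    m = statistics.fmean(xs)
    return statistics.fmean([abs(x - m) for x in xs])


def value_range(xs):
    """Largest score minus smallest score."""
    return max(xs) - min(xs)


# The statistics listed above; sd and variance use N in the denominator.
STATISTICS = {
    "mean": statistics.fmean,
    "median": statistics.median,
    "sd": statistics.pstdev,
    "variance": statistics.pvariance,
    "mad": mad,
    "range": value_range,
}


def draw_normal(rng):
    # Assumed population: normal, mean 0, standard deviation 1.
    return rng.gauss(0.0, 1.0)


def sampling_distribution(statistic, n, num_samples, rng, draw):
    """Compute `statistic` in each of `num_samples` samples of size `n`."""
    return [
        statistic([draw(rng) for _ in range(n)])
        for _ in range(num_samples)
    ]


rng = random.Random(1)  # fixed seed so reruns give the same numbers
for name, stat in STATISTICS.items():
    values = sampling_distribution(stat, 5, 10_000, rng, draw_normal)
    print(f"{name:9s} mean={statistics.fmean(values):7.3f} "
          f"sd={statistics.pstdev(values):6.3f}")
```

Changing the second argument of `sampling_distribution` gives the approximation for other sample sizes, and a histogram of `values` plays the role of the applet's lower plot.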

Standard error
1. The standard error is the standard deviation of the sampling distribution. Approximate the sampling distribution of the mean for N=5. The standard deviation of the distribution is the standard error of the mean. Find the standard error of the mean and the standard error of the range for N=10 using the normal distribution.

2. Determine how the standard error is affected by sample size. Plot the standard error of the mean as a function of sample size for different standard deviations. Can you discover a formula relating the standard error of the mean to the sample size and the standard deviation? If so, see if it holds for distributions other than the normal distribution.

3. Redo #2 above for the median.
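
If you would like to tabulate the standard error rather than read it off the applet, the following Python sketch may help. It is only an illustration under stated assumptions: the function name `standard_error`, the grid of sample sizes and standard deviations, the number of samples, and the seed are all hypothetical choices, and the population is assumed normal with mean 0. The sketch approximates the standard error as the standard deviation of the simulated statistic.

```python
import random
import statistics


def standard_error(statistic, n, sigma, num_samples=5000, seed=0):
    """Approximate the standard error of `statistic` for samples of size n
    from a normal population with mean 0 and standard deviation sigma."""
    rng = random.Random(seed)
    values = [
        statistic([rng.gauss(0.0, sigma) for _ in range(n)])
        for _ in range(num_samples)
    ]
    # The standard error is the standard deviation of the sampling distribution.
    return statistics.pstdev(values)


# Hypothetical grid of population standard deviations and sample sizes.
for sigma in (1.0, 2.0, 5.0):
    for n in (5, 10, 20, 40):
        se = standard_error(statistics.fmean, n, sigma)
        print(f"sigma={sigma:4.1f}  N={n:3d}  SE(mean)~{se:.3f}")
```

Passing `statistics.median` in place of `statistics.fmean` produces the corresponding table for the median.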

Bias
1. A statistic is unbiased if the mean of its sampling distribution equals the parameter it estimates. Test to see if the sample mean is an unbiased estimate of the population mean. Try out different sample sizes and distributions.

2. Find a distribution/sample size combination for which the sample median is a biased estimate of the population median.

3. Is the sample variance an unbiased estimate of the population variance? If not, see if you can find a correction based on sample size. Does the correction hold for distributions other than the normal distribution?

4. For what statistic is the mean of the sampling distribution dependent on sample size?
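
A bias check of this kind can be sketched in Python as well. The code is an illustration, not the applet's method: the function names, the seed, the number of samples, and the population (normal with mean 0 and standard deviation 2, so variance 4) are assumptions. For each sample size it prints the mean of the simulated sampling distribution next to the population parameter, so you can see whether the two agree.

```python
import random
import statistics


def variance_n(xs):
    """Sample variance with N in the denominator."""
    m = statistics.fmean(xs)
    return statistics.fmean([(x - m) ** 2 for x in xs])


def draw_normal(rng):
    # Assumed population: normal, mean 0, standard deviation 2.
    return rng.gauss(0.0, 2.0)


POP_MEAN, POP_VARIANCE = 0.0, 4.0  # parameters of the assumed population


def mean_of_sampling_distribution(statistic, n, draw,
                                  num_samples=20000, seed=0):
    """Average `statistic` over many simulated samples of size n."""
    rng = random.Random(seed)
    values = [
        statistic([draw(rng) for _ in range(n)])
        for _ in range(num_samples)
    ]
    return statistics.fmean(values)


for n in (2, 5, 10, 25):
    m = mean_of_sampling_distribution(statistics.fmean, n, draw_normal)
    v = mean_of_sampling_distribution(variance_n, n, draw_normal)
    print(f"N={n:3d}  mean of means~{m:6.3f} (parameter {POP_MEAN})  "
          f"mean of variances~{v:5.3f} (parameter {POP_VARIANCE})")
```

Replacing `draw_normal` with a function that draws from another distribution, together with that distribution's parameters, repeats the check for non-normal populations.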

Efficiency
1. For a normal distribution, compare the size of the standard error of the median and the standard error of the mean. Find a relationship that holds (approximately) across sample sizes.

2. Does this relationship hold for a uniform distribution?

3. Find a distribution for which the standard error of the median is smaller than the standard error of the mean. (You may find this difficult, but don't give up.)

4. Compare the standard error of the standard deviation and the standard error of the mean absolute deviation from the mean (MAD). Does the relationship depend on the distribution?
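
The comparisons in this section can be approximated with a short Python sketch. It rests on assumptions that are not part of the applet: the helper names, the seed, the sample sizes, and the two populations (standard normal and uniform on 0 to 1) are illustrative choices only. It prints the simulated standard errors of the mean and the median side by side, along with their ratio.

```python
import random
import statistics


def draw_normal(rng):
    # Assumed population: normal, mean 0, standard deviation 1.
    return rng.gauss(0.0, 1.0)


def draw_uniform(rng):
    # Assumed population: uniform between 0 and 1.
    return rng.uniform(0.0, 1.0)


def standard_error(statistic, n, draw, num_samples=5000, seed=0):
    """Standard deviation of `statistic` over simulated samples of size n."""
    rng = random.Random(seed)
    values = [
        statistic([draw(rng) for _ in range(n)])
        for _ in range(num_samples)
    ]
    return statistics.pstdev(values)


for name, draw in (("normal", draw_normal), ("uniform", draw_uniform)):
    for n in (5, 15, 45):
        # The shared seed means both statistics see the same samples.
        se_mean = standard_error(statistics.fmean, n, draw)
        se_median = standard_error(statistics.median, n, draw)
        print(f"{name:8s} N={n:3d}  SE(mean)~{se_mean:.3f}  "
              f"SE(median)~{se_median:.3f}  ratio~{se_median / se_mean:.2f}")
```

Any function that takes a list of scores and returns a number can be passed as `statistic`, so the same helper serves for comparing the standard deviation with the MAD.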

Central Limit Theorem
1. The central limit theorem states that the sampling distribution of the mean approaches a normal distribution as the sample size increases. Sample from the uniform distribution and determine how large a sample size is needed for the distribution to be a very close approximation of the normal distribution.

2. Do the same thing sampling from the skewed distribution.

3. Determine whether the sampling distribution of the median approaches a normal distribution as sample size increases.
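
One rough way to watch this convergence numerically is to track the skewness of the simulated distribution of means as the sample size grows. The Python sketch below is an illustration under assumptions: an exponential distribution stands in for the applet's skewed distribution (which is not specified here), and the function names, seed, and sample sizes are hypothetical. Skewness near zero is consistent with a normal shape but does not by itself establish it, so a histogram is still worth inspecting.

```python
import random
import statistics


def skewness(xs):
    """Average cubed standardized score; 0 for a symmetric distribution."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return statistics.fmean([((x - m) / s) ** 3 for x in xs])


def draw_skewed(rng):
    # Assumed stand-in for a skewed population: exponential with rate 1.
    return rng.expovariate(1.0)


def sample_means(n, draw, num_samples=10000, seed=0):
    """Means of `num_samples` simulated samples of size n."""
    rng = random.Random(seed)
    return [
        statistics.fmean([draw(rng) for _ in range(n)])
        for _ in range(num_samples)
    ]


for n in (2, 5, 10, 25, 50):
    means = sample_means(n, draw_skewed)
    print(f"N={n:3d}  skewness of the distribution of means~"
          f"{skewness(means):.2f}")
```

Swapping `statistics.fmean` for `statistics.median` inside `sample_means` gives a starting point for the question about the median.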