It is also of interest to know how closely packed about its mean value a distribution is. The most important measure of concentration is the variance, denoted by Var(X) and defined by Var(X) = E{[X − E(X)]²}. By linearity of expectations, one has equivalently Var(X) = E(X²) − {E(X)}². The standard deviation of X is the square root of its variance. It has a more direct interpretation than the variance because it is in the same units as X. The variance of a constant random variable is 0. Also, if c is a constant, Var(cX) = c²Var(X).
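As an illustration, the two expressions for the variance and the scaling rule can be checked numerically. The sketch below assumes a fair six-sided die as the distribution of X; the die, the constant c = 3, and the helper names `expectation` and `variance` are choices made for this example only.

```python
from fractions import Fraction

# Assumed example distribution: a fair six-sided die, as {value: probability}.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expectation(pmf, g=lambda x: x):
    """E[g(X)] for a discrete distribution given as {value: probability}."""
    return sum(g(x) * p for x, p in pmf.items())

def variance(pmf):
    """Var(X) computed directly from the definition E{[X - E(X)]^2}."""
    mean = expectation(pmf)
    return expectation(pmf, lambda x: (x - mean) ** 2)

mean = expectation(pmf)
var_def = variance(pmf)                                   # definition
var_alt = expectation(pmf, lambda x: x * x) - mean ** 2   # E(X^2) - {E(X)}^2

# Scaling rule: the distribution of cX has variance c^2 Var(X).
c = 3
pmf_scaled = {c * x: p for x, p in pmf.items()}
var_scaled = variance(pmf_scaled)

print(var_def, var_alt, var_scaled)
```

Exact fractions are used so that the two formulas agree exactly and not merely up to rounding error.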
There is no general formula for the expectation of a product of random variables. If the random variables X and Y are independent, E(XY) = E(X)E(Y). This can be used to show that, if X1,…, Xn are independent random variables, the variance of the sum X1 +⋯+ Xn is just the sum of the individual variances, Var(X1) +⋯+ Var(Xn). If the Xs have the same distribution and are independent, the variance of the average (X1 +⋯+ Xn)/n is Var(X1)/n. Equivalently, the standard deviation of (X1 +⋯+ Xn)/n is the standard deviation of X1 divided by √n. This quantifies the intuitive notion that the average of repeated observations is less variable than the individual observations. More precisely, it says that the variability of the average is inversely proportional to the square root of the number of observations. This result is tremendously important in problems of statistical inference. (See the section The law of large numbers, the central limit theorem, and the Poisson approximation.)
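The 1/n rule for the variance of an average can be verified exactly for small n by enumerating all outcomes. The following sketch again assumes independent rolls of a fair die; the function name `variance_of_average` is hypothetical.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)  # assumed example: a fair six-sided die

def variance_of_average(n):
    """Exact variance of the mean of n independent fair-die rolls."""
    outcomes = list(product(faces, repeat=n))   # all 6**n equally likely outcomes
    prob = Fraction(1, len(outcomes))
    averages = [Fraction(sum(o), n) for o in outcomes]
    mean = sum(a * prob for a in averages)
    return sum((a - mean) ** 2 * prob for a in averages)

single = variance_of_average(1)  # Var(X1)
for n in (1, 2, 3, 4):
    # The two printed values should coincide: Var of the average and Var(X1)/n.
    print(n, variance_of_average(n), single / n)
```

Enumeration grows as 6ⁿ, so this check is practical only for small n; it is meant to illustrate the identity, not to replace the general argument.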
Consider again the binomial distribution given by equation (3). As in the calculation of the mean value, one can use the definition combined with some algebraic manipulation to show that, if R has the binomial distribution, then Var(R) = npq. From the representation R = 1[A1] +⋯+ 1[An] defined above, and the observation that the events Ak are independent and have the same probability, it follows that
Var(R) = Var(1[A1]) +⋯+ Var(1[An]) = nVar(1[A1]).
Moreover, since the square of an indicator variable equals the indicator itself,
Var(1[A1]) = E(1[A1]²) − {E(1[A1])}² = p − p² = pq,
so Var(R) = npq.
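The result Var(R) = npq can also be confirmed by computing the variance directly from the binomial probabilities. This is a minimal sketch; the values n = 10 and p = 1/3 and the function name `binomial_variance` are assumptions chosen for illustration.

```python
from fractions import Fraction
from math import comb

def binomial_variance(n, p):
    """Variance of a binomial(n, p) variable, computed from its distribution."""
    q = 1 - p
    # P{R = k} = C(n, k) p^k q^(n-k) for k = 0, ..., n
    pmf = [comb(n, k) * p ** k * q ** (n - k) for k in range(n + 1)]
    mean = sum(k * pk for k, pk in enumerate(pmf))
    return sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))

n, p = 10, Fraction(1, 3)   # assumed example values
var_R = binomial_variance(n, p)
print(var_R, n * p * (1 - p))   # the two values should be equal
```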
The conditional distribution of Y given X = xi is defined by:
P{Y = yj|X = xi} = P{X = xi, Y = yj}/P{X = xi}
(compare equation (4)), and the conditional expectation of Y given X = xi is
E(Y|X = xi) = Σj yj P{Y = yj|X = xi}. (9)
One can regard E(Y|X) as a function of X; since X is a random variable, this function of X must itself be a random variable. The conditional expectation E(Y|X) considered as a random variable has its own (unconditional) expectation E{E(Y|X)}, which is calculated by multiplying equation (9) by f(xi) and summing over i to obtain the important formula
E{E(Y|X)} = E(Y). (10)
Properly interpreted, equation (10) is a generalization of the law of total probability.
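The identity E{E(Y|X)} = E(Y) can be seen at work on a small joint distribution. The table of probabilities below is invented for this example, and f is taken to be the marginal distribution of X, f(xi) = P{X = xi}; the name `cond_expectation` is hypothetical.

```python
from fractions import Fraction

# Assumed joint distribution P{X = x, Y = y}, keyed by (x, y).
joint = {
    (0, 1): Fraction(1, 8), (0, 2): Fraction(3, 8),
    (1, 1): Fraction(1, 4), (1, 2): Fraction(1, 4),
}

# Marginal distribution of X: f(x) = sum over y of P{X = x, Y = y}.
f = {}
for (x, y), p in joint.items():
    f[x] = f.get(x, 0) + p

def cond_expectation(x):
    """E(Y | X = x) = sum over y of y * P{X = x, Y = y} / f(x)."""
    return sum(y * p for (xi, y), p in joint.items() if xi == x) / f[x]

# Multiply E(Y | X = x) by f(x) and sum over x ...
lhs = sum(cond_expectation(x) * fx for x, fx in f.items())
# ... and compare with E(Y) computed directly from the joint distribution.
rhs = sum(y * p for (_, y), p in joint.items())
print(lhs, rhs)
```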
For a simple example of the use of equation (10), recall the problem of the gambler’s ruin and let e(x) denote the expected duration of the game if Peter’s fortune is initially equal to x. The reasoning leading to equation (5) in conjunction with equation (10) shows that e(x) satisfies the equations e(x) = 1 + pe(x + 1) + qe(x − 1) for x = 1, 2,…, m − 1 with the boundary conditions e(0) = e(m) = 0. The solution for p ≠ 1/2 is rather complicated; for p = 1/2, e(x) = x(m − x).
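The difference equation for e(x) is a tridiagonal linear system and can be solved numerically for any p. The sketch below uses standard tridiagonal elimination with exact fractions; it assumes m ≥ 2 and 0 < p < 1, and the function name `expected_duration` and the value m = 10 are illustrative choices.

```python
from fractions import Fraction

def expected_duration(m, p):
    """Solve e(x) = 1 + p e(x+1) + q e(x-1) with e(0) = e(m) = 0.

    Returns the list [e(0), e(1), ..., e(m)]. Assumes m >= 2 and 0 < p < 1.
    """
    q = 1 - p
    n = m - 1  # unknowns e(1), ..., e(m-1)
    # Each row reads: -q e(x-1) + e(x) - p e(x+1) = 1.
    c_prime = [Fraction(0)] * n  # eliminated super-diagonal
    d_prime = [Fraction(0)] * n  # eliminated right-hand side
    c_prime[0] = Fraction(-p)
    d_prime[0] = Fraction(1)
    for i in range(1, n):        # forward elimination
        denom = 1 + q * c_prime[i - 1]
        c_prime[i] = -p / denom
        d_prime[i] = (1 + q * d_prime[i - 1]) / denom
    e = [Fraction(0)] * (m + 1)  # e[0] and e[m] stay at the boundary value 0
    for i in range(n - 1, -1, -1):   # back substitution
        e[i + 1] = d_prime[i] - c_prime[i] * e[i + 2]
    return e

m = 10
e_fair = expected_duration(m, Fraction(1, 2))
print(e_fair == [x * (m - x) for x in range(m + 1)])  # compare with x(m - x)
```

For p ≠ 1/2 the same function returns the exact solution of the recurrence, which offers a way to inspect the "rather complicated" case without writing out its closed form.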