Expected value

Given a random variable X with distribution f, the expected value of X, denoted E(X), is defined by E(X) = ∑ᵢxᵢf(xᵢ). In words, the expected value of X is the sum of each of the possible values of X multiplied by the probability of obtaining that value. The expected value of X is also called the mean of the distribution f. The basic property of E is that of linearity: if X and Y are random variables and if a and b are constants, then E(aX + bY) = aE(X) + bE(Y). To see why this is true, note that aX + bY is itself a random variable, which assumes the values axᵢ + byⱼ with the probabilities h(xᵢ, yⱼ), where h denotes the joint distribution of X and Y. Hence,

E(aX + bY) = ∑ᵢ∑ⱼ(axᵢ + byⱼ)h(xᵢ, yⱼ) = a∑ᵢ∑ⱼxᵢh(xᵢ, yⱼ) + b∑ᵢ∑ⱼyⱼh(xᵢ, yⱼ).
If the first sum on the right-hand side is summed over j while holding i fixed, then by equation (8), the marginal relation ∑ⱼh(xᵢ, yⱼ) = f(xᵢ), it reduces to

a∑ᵢxᵢf(xᵢ),

and ∑ᵢxᵢf(xᵢ) is by definition E(X). Similarly, the second sum equals bE(Y).
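A minimal numerical check of this linearity property may help; the joint distribution h and the constants a and b below are illustrative assumptions, not taken from the text.

```python
# Check E(aX + bY) = aE(X) + bE(Y) on a small hypothetical joint distribution.

# Joint distribution h(x_i, y_j): keys are (x, y) pairs, values are probabilities.
h = {
    (0, 0): 0.1, (0, 1): 0.2,
    (1, 0): 0.3, (1, 1): 0.4,
}
a, b = 2.0, -3.0

# E(aX + bY), computed directly from the joint distribution.
lhs = sum((a * x + b * y) * p for (x, y), p in h.items())

# E(X) and E(Y), each obtained by summing out the other variable
# (the marginal relation used in the argument above).
ex = sum(x * p for (x, _), p in h.items())
ey = sum(y * p for (_, y), p in h.items())
rhs = a * ex + b * ey

print(lhs, rhs)  # both equal 2.0 * 0.7 - 3.0 * 0.6 = -0.4
```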

If 1[A] denotes the “indicator variable” of A—i.e., a random variable equal to 1 if A occurs and equal to 0 otherwise—then E{1[A]} = 1 × P(A) + 0 × P(Aᶜ) = P(A). This shows that the concept of expectation includes that of probability as a special case.
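A short simulation of E{1[A]} = P(A); the event A here is the illustrative assumption “a fair die shows 5 or 6,” so P(A) = 1/3, and is not taken from the text.

```python
import random

random.seed(0)
trials = 100_000

# The indicator 1[A] is 1 when A occurs and 0 otherwise, so its average over
# many independent trials estimates E{1[A]}.
indicator_mean = sum(1 for _ in range(trials) if random.randint(1, 6) >= 5) / trials

print(indicator_mean)  # close to P(A) = 1/3 ≈ 0.333
```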
As an illustration, consider the number R of red balls in n draws with replacement from an urn containing a proportion p of red balls. From the definition and the binomial distribution of R,

E(R) = ∑ₖ₌₀ⁿ k(n!/[k!(n − k)!])pᵏ(1 − p)ⁿ⁻ᵏ,
which can be evaluated by algebraic manipulation and found to equal np. It is easier to use the representation R = 1[A₁] +⋯+ 1[Aₙ], where Aₖ denotes the event “the kth draw results in a red ball.” Since E{1[Aₖ]} = p for all k, by linearity E(R) = E{1[A₁]} +⋯+ E{1[Aₙ]} = np. This argument illustrates the principle that one can often compute the expected value of a random variable without first computing its distribution. For another example, suppose n balls are dropped at random into n boxes. The number of empty boxes, Y, has the representation Y = 1[B₁] +⋯+ 1[Bₙ], where Bₖ is the event that “the kth box is empty.” Since the kth box is empty if and only if each of the n balls went into one of the other n − 1 boxes, P(Bₖ) = [(n − 1)/n]ⁿ for all k, and consequently E(Y) = n(1 − 1/n)ⁿ. The exact distribution of Y is very complicated, especially if n is large.
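Both computations can be sketched by simulation; the parameters n, p, and the number of trials below are illustrative assumptions, not values from the text.

```python
import random

random.seed(0)
trials = 20_000
n, p = 10, 0.3

# Red balls: R = 1[A_1] + ... + 1[A_n], each draw red with probability p.
mean_R = sum(
    sum(1 for _ in range(n) if random.random() < p) for _ in range(trials)
) / trials
print(mean_R, n * p)  # simulated mean vs. the exact value np = 3.0

# Empty boxes: drop n balls at random into n boxes and count the empty ones.
def empty_boxes(n):
    occupied = {random.randrange(n) for _ in range(n)}
    return n - len(occupied)

mean_Y = sum(empty_boxes(n) for _ in range(trials)) / trials
print(mean_Y, n * (1 - 1 / n) ** n)  # simulated mean vs. exact value ≈ 3.487
```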
Many probability distributions have small values of f(xᵢ) associated with extreme (large or small) values of xᵢ and larger values of f(xᵢ) for intermediate xᵢ. For example, both marginal distributions in the table are symmetrical about a midpoint that has relatively high probability, and the probability of other values decreases as one moves away from the midpoint. Insofar as a distribution f(xᵢ) follows this kind of pattern, one can interpret the mean of f as a rough measure of location of the bulk of the probability distribution, because in the defining sum the values xᵢ associated with large values of f(xᵢ) more or less define the centre of the distribution. In the extreme case, the expected value of a constant random variable is just that constant.
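A tiny example of the mean as a measure of location; the particular symmetric distribution below is a hypothetical choice, not from the text.

```python
# For a distribution symmetric about a midpoint, the mean falls at that midpoint.
f = {1: 0.1, 2: 0.2, 3: 0.4, 4: 0.2, 5: 0.1}  # symmetric about 3
print(sum(x * p for x, p in f.items()))  # 3.0, the centre of symmetry

# The extreme case: a constant random variable's expected value is the constant.
g = {7: 1.0}
print(sum(x * p for x, p in g.items()))  # 7.0
```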
