Suppose a random variable has a lower and an upper bound, $[0,1]$. How can the variance of such a variable be calculated?
Answers:
You can prove Popoviciu's inequality as follows. Use the notation $m = \inf X$ and $M = \sup X$. Define a function $g$ by
$$g(t) = \mathbb{E}\left[(X - t)^2\right].$$
Computing the derivative $g'$ and solving
$$g'(t) = -2\mathbb{E}[X] + 2t = 0,$$
we find that $g$ attains its minimum at $t = \mathbb{E}[X]$ (note that $g'' > 0$).
Now consider the value of the function $g$ at the special point $t = \frac{M + m}{2}$. It must be the case that
$$\operatorname{Var}[X] = g(\mathbb{E}[X]) \leq g\left(\frac{M + m}{2}\right).$$
But
$$g\left(\frac{M + m}{2}\right) = \mathbb{E}\left[\left(X - \frac{M + m}{2}\right)^2\right] = \frac{1}{4}\mathbb{E}\left[\big((X - m) + (X - M)\big)^2\right].$$
Since $X - m \geq 0$ and $X - M \leq 0$, we have
$$\big((X - m) + (X - M)\big)^2 \leq \big((X - m) - (X - M)\big)^2 = (M - m)^2,$$
which implies that
$$\frac{1}{4}\mathbb{E}\left[\big((X - m) + (X - M)\big)^2\right] \leq \frac{1}{4}\mathbb{E}\left[\big((X - m) - (X - M))^2\right] = \frac{(M - m)^2}{4}.$$
We have therefore proved Popoviciu's inequality,
$$\operatorname{Var}[X] \leq \frac{(M - m)^2}{4}.$$
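The bound is easy to check numerically. Below is a minimal Python sketch (not part of the original answer; the Beta(2, 5) distribution is an arbitrary choice of a variable supported on $[0,1]$):

```python
import numpy as np

rng = np.random.default_rng(0)

# X supported on [0, 1]; Beta(2, 5) is an arbitrary illustrative choice.
x = rng.beta(2.0, 5.0, size=100_000)

m, M = 0.0, 1.0                  # m = inf X, M = sup X
bound = (M - m) ** 2 / 4         # Popoviciu bound: (M - m)^2 / 4 = 0.25

print(x.var())   # sample variance, roughly 0.026 for Beta(2, 5)
print(bound)     # 0.25 -- always at least as large as Var[X]
```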
Let $F$ be a distribution on $[0,1]$. We will show that if the variance of $F$ is maximal, then $F$ can have no support in the interior, from which it follows that $F$ is Bernoulli, and the rest is trivial.
As a matter of notation, let $\mu_k = \int_0^1 x^k \, dF(x)$ be the $k$-th raw moment of $F$ (and, as usual, we write $\mu = \mu_1$ and $\sigma^2 = \mu_2 - \mu^2$ for the variance).
We know $F$ does not have all its support at one point (the variance is minimal in that case). Among other things, this implies that $\mu$ lies strictly between $0$ and $1$. To argue by contradiction, suppose there is some measurable subset $I$ of the interior $(0,1)$ for which $F(I) > 0$. Without any loss of generality we may assume (changing $X$ to $1 - X$ if need be) that $F(J = I \cap (0, \mu]) > 0$: in other words, $J$ is obtained by cutting off any part of $I$ above the mean, and $J$ has positive probability.
Let us alter $F$ into $F'$ by taking all the probability out of $J$ and placing it at $0$. In so doing, $\mu_k$ changes to
$$\mu_k' = \mu_k - \int_J x^k \, dF(x).$$
As a matter of notation, let us write $[g(x)] = \int_J g(x) \, dF(x)$ for such integrals, whence
$$\mu_2' = \mu_2 - [x^2], \quad \mu' = \mu - [x].$$
Calculate
$$\sigma'^2 = \mu_2' - \mu'^2 = \mu_2 - [x^2] - (\mu - [x])^2 = \sigma^2 + \big((\mu[x] - [x^2]) + (\mu[x] - [x]^2)\big).$$
The first term on the right, $\mu[x] - [x^2]$, is non-negative because it can be rewritten as $[(\mu - x)(x)]$, and this integrand is non-negative from the assumptions $\mu \geq x$ on $J$ and $0 \leq x \leq 1$. The second term on the right, $\mu[x] - [x]^2 = [x](\mu - [x])$, is strictly positive: $[x] > 0$ because $x > 0$ on $J$ and $F(J) > 0$, while $[x] \leq \mu[1]$ with $[1] = F(J) < 1$ (because we assumed $F$ is not concentrated at a point) and $\mu > 0$, so that $[x] < \mu$. It follows that $\sigma'^2 - \sigma^2 > 0$.
We have just shown that under our assumptions, changing $F$ to $F'$ strictly increases its variance. The only way this cannot happen, then, is when all the probability of $F$ is concentrated at the endpoints $0$ and $1$, with (say) values $1 - p$ and $p$, respectively. Its variance is easily calculated to equal $p(1 - p)$, which is maximal when $p = 1/2$, where it equals $1/4$.
Now when $F$ is a distribution on $[a,b]$, we recenter and rescale it to a distribution on $[0,1]$. The recentering does not change the variance, whereas the rescaling divides it by $(b - a)^2$. Thus an $F$ with maximal variance on $[a,b]$ corresponds to the distribution with maximal variance on $[0,1]$: it therefore is a Bernoulli$(1/2)$ distribution rescaled and translated to $[a,b]$, having variance $(b - a)^2/4$, QED.
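To illustrate the conclusion, here is a short Python check (mine, not part of the original answer) that a Bernoulli$(1/2)$ distribution rescaled to an arbitrary interval $[a,b]$ attains the bound $(b - a)^2/4$:

```python
import numpy as np

rng = np.random.default_rng(1)
a, b = -3.0, 5.0    # arbitrary interval [a, b]

# Bernoulli(1/2) on {0, 1}, rescaled and translated to take values in {a, b}.
y = a + (b - a) * rng.integers(0, 2, size=1_000_000)

print(y.var())           # sample variance, close to 16
print((b - a) ** 2 / 4)  # 16.0 -- the maximal variance on [a, b]
```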
If the random variable is restricted to $[a,b]$ and we know the mean $\mu = \mathbb{E}[X]$, the variance is bounded by $(b - \mu)(\mu - a)$.
Let us first consider the case $a = 0$, $b = 1$. Note that for all $x \in [0,1]$, $x^2 \leq x$, whence also $\mathbb{E}[X^2] \leq \mathbb{E}[X]$. Using this result,
$$\sigma^2 = \mathbb{E}[X^2] - (\mathbb{E}[X])^2 = \mathbb{E}[X^2] - \mu^2 \leq \mu - \mu^2 = \mu(1 - \mu).$$
To generalize to intervals $[a,b]$ with $b > a$, consider $Y$ restricted to $[a,b]$. Define $X = \frac{Y - a}{b - a}$, which is restricted to $[0,1]$. Equivalently, $Y = (b - a)X + a$, and thus
$$\operatorname{Var}[Y] = (b - a)^2 \operatorname{Var}[X] \leq (b - a)^2 \mu_X(1 - \mu_X).$$
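As a quick sanity check (not from the original answer), the sketch below compares a sample variance with the mean-dependent bound $(b - \mu)(\mu - a)$ and with the looser $(b - a)^2/4$; the rescaled Beta(2, 8) distribution is an arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(2)
a, b = 2.0, 10.0    # arbitrary interval [a, b]

# Y restricted to [a, b]: a rescaled Beta(2, 8) sample (arbitrary choice).
y = a + (b - a) * rng.beta(2.0, 8.0, size=100_000)

mu = y.mean()
print(y.var())              # sample variance, roughly 0.93
print((b - mu) * (mu - a))  # mean-dependent bound, roughly 10.2
print((b - a) ** 2 / 4)     # Popoviciu bound, 16.0 (looser here)
```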
At @user603's request....
A useful upper bound on the variance $\sigma^2$ of a random variable that takes on values in $[a,b]$ with probability $1$ is $\sigma^2 \leq \frac{(b-a)^2}{4}$. A proof for the special case $a = 0$, $b = 1$ (which is what the OP asked about) can be found here on math.SE, and it is easily adapted to the more general case. As noted in my comment above and also in the answer referenced herein, a discrete random variable that takes on values $a$ and $b$ with equal probability $\frac{1}{2}$ has variance $\frac{(b-a)^2}{4}$, and thus no tighter general bound can be found.
Another point to keep in mind is that a bounded random variable has finite variance, whereas for an unbounded random variable, the variance might not be finite, and in some cases might not even be definable. For example, the mean cannot be defined for Cauchy random variables, and so one cannot define the variance (as the expectation of the squared deviation from the mean).
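The contrast is easy to see in simulation; here is a small Python sketch (my addition) in which the sample variance of a bounded variable stabilizes as the sample grows, while that of a Cauchy sample does not:

```python
import numpy as np

rng = np.random.default_rng(3)

# Sample variance of a bounded variable settles down as n grows;
# for the heavy-tailed Cauchy it keeps fluctuating wildly instead.
for n in (10**3, 10**4, 10**5, 10**6):
    bounded = rng.uniform(0.0, 1.0, size=n)
    cauchy = rng.standard_cauchy(size=n)
    print(n, bounded.var(), cauchy.var())
```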
Are you sure that this is true in general, for continuous as well as discrete distributions? Can you provide a link to the other pages? For a general distribution on $[a,b]$ it is trivial to show that
$$\operatorname{Var}(X) = \mathbb{E}\left[(X - \mathbb{E}[X])^2\right] \leq \mathbb{E}\left[(b - a)^2\right] = (b - a)^2.$$
On the other hand, one can find it with the factor $1/4$ under the name Popoviciu's inequality on Wikipedia.
This article looks better than the Wikipedia article ...
For a uniform distribution it holds that $\operatorname{Var}(X) = \frac{(b-a)^2}{12}$.
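For completeness, a one-off numerical confirmation (my addition) that the uniform distribution on $[a,b]$ has variance $(b-a)^2/12$, well below the $(b-a)^2/4$ bound:

```python
import numpy as np

rng = np.random.default_rng(4)
a, b = 0.0, 6.0    # arbitrary interval [a, b]

u = rng.uniform(a, b, size=1_000_000)
print(u.var())            # sample variance, close to 3
print((b - a) ** 2 / 12)  # exact uniform variance: 3.0
print((b - a) ** 2 / 4)   # Popoviciu bound: 9.0 (not attained here)
```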