In statistics, the fundamental Neyman-Pearson lemma is a result that describes the optimal criterion for distinguishing between two simple hypotheses $H_0 : \theta = \theta_0$ and $H_1 : \theta = \theta_1$. The lemma is named after its two creators, Jerzy Neyman and Egon Pearson.
Let $X_1, X_2, \dots, X_n$ be a random sample from a population with density function $f(x;\theta)$, where $\theta \in \Theta = \{\theta_0, \theta_1\}$, and let $0 < \alpha < 1$, $k \in \mathbb{R}^{+}$ and $\mathcal{C}$ be such that

$$\operatorname{P}[\mathbf{X} \in \mathcal{C} \mid H_0] = \alpha,$$

$$\lambda = \frac{\mathcal{L}(\theta_0)}{\mathcal{L}(\theta_1)} = \frac{\prod_{i=1}^{n} f(x_i;\theta_0)}{\prod_{i=1}^{n} f(x_i;\theta_1)} \leq k \quad \text{if } \mathbf{x} \in \mathcal{C},$$

$$\lambda > k \quad \text{if } \mathbf{x} \in \mathcal{C}^{c}.$$

Then the test associated with $\mathcal{C}$ is a most powerful test of $H_0 : \theta = \theta_0$ against $H_1 : \theta = \theta_1$; that is, $\mathcal{C}$ is the best critical region.
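To make the decision rule concrete, the following Python sketch (an addition to the article, not part of the original) evaluates $\lambda$ and rejects $H_0$ when $\lambda \leq k$; the function names and the use of scipy.stats densities are illustrative choices.

```python
# A minimal sketch of the decision rule in the lemma: reject H0 when
# lambda = L(theta0)/L(theta1) <= k.  Function names and the normal
# densities below are illustrative assumptions, not from the article.
import numpy as np
from scipy import stats

def likelihood_ratio(x, f0, f1):
    """lambda = prod_i f(x_i; theta_0) / prod_i f(x_i; theta_1),
    computed on the log scale for numerical stability."""
    return np.exp(np.sum(np.log(f0(x))) - np.sum(np.log(f1(x))))

def neyman_pearson_test(x, f0, f1, k):
    """Reject H0 (return True) iff the sample falls in C, i.e. lambda <= k."""
    return likelihood_ratio(x, f0, f1) <= k

# Example: two simple hypotheses N(0, 1) versus N(1, 1).
rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, scale=1.0, size=20)           # data generated under H1
f0 = lambda v: stats.norm.pdf(v, loc=0.0, scale=1.0)  # density under H0
f1 = lambda v: stats.norm.pdf(v, loc=1.0, scale=1.0)  # density under H1
print(neyman_pearson_test(x, f0, f1, k=1.0))
```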
Let $X_1, X_2, \dots, X_n$ be a random sample from a population with a $N(\mu, \sigma_0^2)$ distribution, where $\sigma_0^2$ is known. Consider testing, at significance level $\alpha$,

$$\begin{aligned} H_0 &: \mu = \mu_0 \\ H_1 &: \mu = \mu_1 \end{aligned}$$

with $\mu_0 < \mu_1$.
In this case the likelihood function is

$$\begin{aligned} \mathcal{L}(x_1,\dots,x_n;\mu,\sigma_0^2) &= \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma_0^2}} \exp\left(-\frac{(x_i-\mu)^2}{2\sigma_0^2}\right) \\ &= \left(\frac{1}{\sqrt{2\pi\sigma_0^2}}\right)^{n} \exp\left(-\frac{1}{2\sigma_0^2}\sum_{i=1}^{n}(x_i-\mu)^2\right) \end{aligned}$$
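As a quick sanity check of the two equivalent forms above, the following snippet (an illustrative addition, with arbitrary sample values) verifies numerically that the product of normal densities equals the summed-exponent expression.

```python
# A numerical sanity check (an addition, not from the article) that the
# product form and the summed-exponent form of the normal likelihood agree.
import numpy as np

rng = np.random.default_rng(1)
x, mu, sigma2 = rng.normal(size=10), 0.5, 2.0

product_form = np.prod(np.exp(-(x - mu) ** 2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2))
summed_form = (2 * np.pi * sigma2) ** (-len(x) / 2) * np.exp(-np.sum((x - mu) ** 2) / (2 * sigma2))
assert np.isclose(product_form, summed_form)
```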
By the Neyman-Pearson lemma,

$$\begin{aligned} \frac{\mathcal{L}_0}{\mathcal{L}_1} &= \frac{\left(\frac{1}{\sqrt{2\pi\sigma_0^2}}\right)^{n} \exp\left(-\frac{1}{2\sigma_0^2}\sum_{i=1}^{n}(x_i-\mu_0)^2\right)}{\left(\frac{1}{\sqrt{2\pi\sigma_0^2}}\right)^{n} \exp\left(-\frac{1}{2\sigma_0^2}\sum_{i=1}^{n}(x_i-\mu_1)^2\right)} \\ &= \exp\left(-\frac{1}{2\sigma_0^2}\sum_{i=1}^{n}(x_i-\mu_0)^2 + \frac{1}{2\sigma_0^2}\sum_{i=1}^{n}(x_i-\mu_1)^2\right) \end{aligned}$$
but

$$\begin{aligned} \sum_{i=1}^{n}(x_i-\mu)^2 &= \sum_{i=1}^{n}(x_i^2 - 2\mu x_i + \mu^2) \\ &= \sum_{i=1}^{n} x_i^2 - 2\mu \sum_{i=1}^{n} x_i + n\mu^2 \\ &= \sum_{i=1}^{n} x_i^2 - 2\mu n\bar{x} + n\mu^2 \end{aligned}$$
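The expansion above can also be verified symbolically; the following sympy snippet (an addition for illustration, with $n = 5$ fixed) checks the identity.

```python
# A symbolic check (an illustrative addition) of the identity
# sum (x_i - mu)^2 = sum x_i^2 - 2 mu n xbar + n mu^2, here with n = 5.
import sympy as sp

n = 5
mu = sp.Symbol('mu')
xs = sp.symbols(f'x0:{n}')
xbar = sum(xs) / n

lhs = sum((xi - mu) ** 2 for xi in xs)
rhs = sum(xi ** 2 for xi in xs) - 2 * mu * n * xbar + n * mu ** 2
assert sp.expand(lhs - rhs) == 0
```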
so that

$$\begin{aligned} \frac{\mathcal{L}_0}{\mathcal{L}_1} &= \exp\left[-\frac{1}{2\sigma_0^2}\left(\sum_{i=1}^{n} x_i^2 - 2\mu_0 n\bar{x} + n\mu_0^2 - \sum_{i=1}^{n} x_i^2 + 2\mu_1 n\bar{x} - n\mu_1^2\right)\right] \\ &= \exp\left[-\frac{1}{2\sigma_0^2}\left(2n\bar{x}(\mu_1-\mu_0) + n(\mu_0^2-\mu_1^2)\right)\right] \\ &= \exp\left[\frac{n\bar{x}(\mu_0-\mu_1)}{\sigma_0^2} - \frac{n(\mu_0^2-\mu_1^2)}{2\sigma_0^2}\right] \leq k_1 \end{aligned}$$
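The simplification can be spot-checked numerically; the snippet below (an illustrative addition, with arbitrary parameter values) compares the ratio of the two likelihoods with the closed form just derived.

```python
# A numerical spot-check (an addition, not from the article) that the
# simplified exponential form equals the ratio of the two likelihoods.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
mu0, mu1, sigma0, n = 0.0, 1.0, 1.5, 8   # illustrative values
x = rng.normal(mu1, sigma0, size=n)
xbar = x.mean()

ratio = np.prod(stats.norm.pdf(x, mu0, sigma0)) / np.prod(stats.norm.pdf(x, mu1, sigma0))
closed = np.exp(n * xbar * (mu0 - mu1) / sigma0**2 - n * (mu0**2 - mu1**2) / (2 * sigma0**2))
assert np.isclose(ratio, closed)
```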
Taking logarithms, the above implies

$$\begin{aligned} &\frac{n\bar{x}(\mu_0-\mu_1)}{\sigma_0^2} - \frac{n(\mu_0^2-\mu_1^2)}{2\sigma_0^2} \leq k_2 = \ln(k_1) \\ &\frac{n\bar{x}(\mu_0-\mu_1)}{\sigma_0^2} \leq k_3 = k_2 + \frac{n(\mu_0^2-\mu_1^2)}{2\sigma_0^2} \end{aligned}$$
Since $\mu_1 > \mu_0$, we have $\mu_0 - \mu_1 < 0$, so dividing both sides by the negative quantity $n(\mu_0-\mu_1)/\sigma_0^2$ reverses the inequality, giving

$$\bar{x} \geq k = \frac{k_3 \sigma_0^2}{n(\mu_0-\mu_1)}$$
Therefore $H_0$ is rejected if $\bar{x} \geq k$; that is, the rejection region $\mathcal{C}$ is described as

$$\mathcal{C} = \{(X_1, X_2, \dots, X_n) : \bar{X} \geq k\}$$

In practice $k$ is fixed by the significance level: under $H_0$ we have $\bar{X} \sim N(\mu_0, \sigma_0^2/n)$, so $\operatorname{P}[\bar{X} \geq k \mid H_0] = \alpha$ yields $k = \mu_0 + z_{1-\alpha}\,\sigma_0/\sqrt{n}$, where $z_{1-\alpha}$ is the $(1-\alpha)$ quantile of the standard normal distribution.
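The following Python sketch (an addition, not from the original article, with illustrative values for $\mu_0$, $\mu_1$, $\sigma_0$, $n$ and $\alpha$) computes the critical value $k$ from the significance level and checks the size and power of the test by simulation.

```python
# A sketch (an addition, with illustrative parameter values) that computes
# the critical value k of C = {xbar >= k} and estimates the size and power
# of the test by simulating xbar under H0 and H1.
import numpy as np
from scipy import stats

mu0, mu1, sigma0, n, alpha = 0.0, 1.0, 2.0, 25, 0.05

# Under H0, Xbar ~ N(mu0, sigma0^2 / n), so P[Xbar >= k | H0] = alpha gives:
k = mu0 + stats.norm.ppf(1 - alpha) * sigma0 / np.sqrt(n)

rng = np.random.default_rng(42)
xbar_h0 = rng.normal(mu0, sigma0 / np.sqrt(n), size=100_000)
xbar_h1 = rng.normal(mu1, sigma0 / np.sqrt(n), size=100_000)

print(f"k = {k:.4f}")
print(f"estimated size  = {np.mean(xbar_h0 >= k):.4f}")   # close to alpha
print(f"estimated power = {np.mean(xbar_h1 >= k):.4f}")
```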
Applications in sequential statistics