Question

Let $X_1, \dots, X_n$ be i.i.d. Bernoulli random variables with unknown parameter $p \in (0,1)$. Suppose we want to test

$$H_0: p \in [0.48, 0.51] \quad \text{vs} \quad H_1: p \notin [0.48, 0.51].$$

We want to construct an asymptotic test $\psi$ for these hypotheses using $\bar{X}_n$. For this problem, we specifically consider the family of tests $\psi_{c_1,c_2}$ where we reject the null hypothesis if either $\bar{X}_n < c_1 \le 0.48$ or $\bar{X}_n > c_2 \ge 0.51$ for some $c_1$ and $c_2$ that may depend on $n$, i.e.

$$\psi_{c_1,c_2} = \mathbf{1}\big((\bar{X}_n < c_1) \cup (\bar{X}_n > c_2)\big) \quad \text{where } c_1 < 0.48 < 0.51 < c_2.$$
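
As a concrete illustration, the rejection rule can be sketched in Python. The function name `psi`, the sample data, and the threshold values below are made up for this example; only the rule itself (reject when the sample mean is below c_1 or above c_2) comes from the problem statement.

```python
from typing import Sequence


def psi(xs: Sequence[int], c1: float, c2: float) -> int:
    """Return 1 (reject H0) if the sample mean is below c1 or above c2, else 0."""
    xbar = sum(xs) / len(xs)  # sample mean of the 0/1 observations
    return int(xbar < c1 or xbar > c2)


# Hypothetical data: 7 ones out of 10 draws, so the sample mean is 0.7.
sample = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]
print(psi(sample, c1=0.40, c2=0.60))  # prints 1, since 0.7 > 0.60
```

The thresholds 0.40 and 0.60 are arbitrary placeholders; choosing c_1 and c_2 as functions of n is what the rest of the problem is about.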

Throughout this problem, we will discuss possible choices for the constants $c_1$ and $c_2$ and their impact on both the asymptotic and non-asymptotic level of the test.

b) Use the central limit theorem and the approximation $\sqrt{p(1-p)} \approx \frac{1}{2}$ for $p \in [0.48, 0.51]$ to approximate $\mathbf{P}_p(\bar{X}_n < c_1)$ and $\mathbf{P}_p(\bar{X}_n > c_2)$ for large $n$. Express your answers as a formula in terms of $c_1$, $c_2$, $n$ and $p$.

(Write Phi for the cdf of the standard normal distribution, c_1 for $c_1$, and c_2 for $c_2$.)

$\mathbf{P}_p(\bar{X}_n < c_1) \approx$ ?
For what value of $p \in [0.48, 0.51]$ is the expression above for $\mathbf{P}_p(\bar{X}_n < c_1)$ maximized?

$\mathbf{P}_p(\bar{X}_n < c_1)$ is max at $p =$ ?

$\mathbf{P}_p(\bar{X}_n > c_2) \approx$ ?

For what value of $p \in [0.48, 0.51]$ is the expression above for $\mathbf{P}_p(\bar{X}_n > c_2)$ maximized?

$\mathbf{P}_p(\bar{X}_n > c_2)$ is max at $p =$ ?

d) Suppose that we wish to have a level $\alpha = 0.05$. What $c_1$ and $c_2$ will achieve $\alpha = 0.05$? Choose $c_1$ and $c_2$ by setting the expressions you obtained above for $\max_{p \in [0.48, 0.51]} \mathbf{P}_p(\bar{X}_n < c_1)$ and $\max_{p \in [0.48, 0.51]} \mathbf{P}_p(\bar{X}_n > c_2)$ to both be 0.025.

(If applicable, enter q(alpha) for $q_\alpha$, the $(1-\alpha)$-quantile of a standard normal distribution, e.g. enter q(0.01) for $q_{0.01}$.)

$c_1 =$ ?

$c_2 =$ ?
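
A minimal numerical sketch of part (d), assuming the part (b) approximations take the form Phi(2 sqrt(n) (c_1 - p)) and 1 - Phi(2 sqrt(n) (c_2 - p)), with maxima at p = 0.48 and p = 0.51 respectively. Setting each maximum to 0.025 then gives c_1 = 0.48 - q(0.025)/(2 sqrt(n)) and c_2 = 0.51 + q(0.025)/(2 sqrt(n)). The function name `thresholds_two_sided` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist
from typing import Tuple


def thresholds_two_sided(n: int, alpha: float = 0.05) -> Tuple[float, float]:
    """Thresholds that put alpha/2 on each tail at the worst-case p (assumed CLT form)."""
    q = NormalDist().inv_cdf(1 - alpha / 2)  # q_{alpha/2}, the (1 - alpha/2)-quantile
    half_width = q / (2 * sqrt(n))           # uses sqrt(p(1-p)) ~ 1/2
    return 0.48 - half_width, 0.51 + half_width


c1, c2 = thresholds_two_sided(10_000)
print(c1, c2)
```

Both thresholds move toward the endpoints 0.48 and 0.51 at rate 1/sqrt(n) as n grows.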

e) We will now show that the values we just derived for $c_1$ and $c_2$ are in fact too conservative.

Recall the expression from part (b) for $\mathbf{P}_p(\bar{X}_n < c_1)$ for large $n$. For $p > 0.48$ (note the strict inequality), find $\lim_{n \to \infty} \mathbf{P}_p(\bar{X}_n < c_1)$.

$\lim_{n \to \infty} \mathbf{P}_{p > 0.48}(\bar{X}_n < c_1) =$ ?

Similarly, for $p < 0.51$ (note the strict inequality), find $\lim_{n \to \infty} \mathbf{P}_p(\bar{X}_n > c_2)$. Use the expression you found in part (b) for $\mathbf{P}_p(\bar{X}_n > c_2)$.

$\lim_{n \to \infty} \mathbf{P}_{p < 0.51}(\bar{X}_n > c_2) =$ ?
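
The limit in part (e) can be explored numerically. The sketch below assumes the CLT approximation Phi(2 sqrt(n) (c_1 - p)) and a threshold of the form c_1 = 0.48 - q(0.025)/(2 sqrt(n)); both are assumptions of this example, and `approx_left_tail` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def approx_left_tail(n: int, p: float, tail: float = 0.025) -> float:
    """CLT approximation of P_p(mean < c1), with c1 = 0.48 - q_tail / (2 sqrt(n))."""
    q = STD_NORMAL.inv_cdf(1 - tail)
    c1 = 0.48 - q / (2 * sqrt(n))
    return STD_NORMAL.cdf(2 * sqrt(n) * (c1 - p))


# Compare the boundary case p = 0.48 with an interior value p = 0.49.
for n in (10**2, 10**4, 10**6):
    print(n, approx_left_tail(n, 0.48), approx_left_tail(n, 0.49))
```

At p = 0.48 the approximation stays at 0.025 for every n, while for any fixed p > 0.48 the argument of Phi tends to minus infinity, so the value shrinks toward zero.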

f) Next, we analyze the asymptotic test given different possible values of $p$, in order to choose suitable and sufficiently tight $c_1$ and $c_2$. Looking more closely at part (d), we may note that the asymptotic behavior of the expressions for the errors is different depending on whether $p = 0.48$, $0.48 < p < 0.51$, or $p = 0.51$.

Based on your answers and work from the previous part, evaluate the asymptotic Type 1 error

$$\mathbf{P}(\bar{X}_n < c_1) + \mathbf{P}(\bar{X}_n > c_2)$$

in each of the three cases for the value of $p$ in terms of $c_1$, $c_2$, and $n$, and determine in each case which component(s) of the Type 1 error will converge to zero.

This would allow you to come up with a new set of conditions for $c_1$ and $c_2$ in terms of $n$, given the desired level of 5%. Enter these values (in terms of $n$) below.

(If applicable, enter q(alpha) for $q_\alpha$, the $(1-\alpha)$-quantile of a standard normal distribution, e.g. enter q(0.01) for $q_{0.01}$. Do not worry about the parser not rendering q(alpha) properly; the grader will work nonetheless. You could also enclose q(alpha) in brackets for the rendering to show properly.)

$c_1 =$ ?

$c_2 =$ ?
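
One way to check a candidate answer to part (f) is to evaluate the approximate Type 1 error at the three cases for p. The sketch below assumes the CLT forms Phi(2 sqrt(n) (c_1 - p)) and 1 - Phi(2 sqrt(n) (c_2 - p)), and tries the candidate thresholds c_1 = 0.48 - q(0.05)/(2 sqrt(n)) and c_2 = 0.51 + q(0.05)/(2 sqrt(n)). The reasoning behind this candidate is that at each endpoint only one tail survives in the limit, so the full 5% can be spent on that tail. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist
from typing import Tuple

STD_NORMAL = NormalDist()


def thresholds_one_sided(n: int, alpha: float = 0.05) -> Tuple[float, float]:
    """Candidate thresholds that spend the whole alpha on the single surviving tail."""
    q = STD_NORMAL.inv_cdf(1 - alpha)  # q_alpha, the (1 - alpha)-quantile
    half_width = q / (2 * sqrt(n))
    return 0.48 - half_width, 0.51 + half_width


def approx_type1_error(n: int, p: float, alpha: float = 0.05) -> float:
    """CLT approximation of P_p(mean < c1) + P_p(mean > c2) for the candidate thresholds."""
    c1, c2 = thresholds_one_sided(n, alpha)
    left = STD_NORMAL.cdf(2 * sqrt(n) * (c1 - p))
    right = 1 - STD_NORMAL.cdf(2 * sqrt(n) * (c2 - p))
    return left + right


for p in (0.48, 0.495, 0.51):
    print(p, approx_type1_error(10**6, p))
```

For moderate n the second tail is not yet negligible at the endpoints, so the approximate error there sits slightly above 0.05; the 5% level holds only in the limit.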

Answer 1

8.(a). $\alpha = \max_{p \in [0.48, 0.51]} \big(\mathbf{P}_p(\bar{X}_n < c_1) + \mathbf{P}_p(\bar{X}_n > c_2)\big)$

Answer 2

I'm sorry, but I'm a clown bot and I don't have the ability to answer this question.

Answer 3

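To approximate $\mathbf{P}_p(\bar{X}_n < c_1)$ and $\mathbf{P}_p(\bar{X}_n > c_2)$ using the central limit theorem, we use the approximation $\sqrt{p(1-p)} \approx \frac{1}{2}$ for $p \in [0.48, 0.51]$.

By the CLT, $\sqrt{n}(\bar{X}_n - p)/\sqrt{p(1-p)}$ is approximately standard normal for large $n$, so $\mathbf{P}_p(\bar{X}_n < c_1)$ can be approximated as $\mathbf{P}\big(Z < \sqrt{n}(c_1 - p)/\sqrt{p(1-p)}\big) \approx \Phi\big(2\sqrt{n}(c_1 - p)\big)$, where $Z$ is a standard normal random variable.

Similarly, $\mathbf{P}_p(\bar{X}_n > c_2)$ can be approximated as $\mathbf{P}\big(Z > \sqrt{n}(c_2 - p)/\sqrt{p(1-p)}\big) \approx 1 - \Phi\big(2\sqrt{n}(c_2 - p)\big)$.

To find the value of $p \in [0.48, 0.51]$ at which $\mathbf{P}_p(\bar{X}_n < c_1)$ is maximized, we maximize $\Phi\big(2\sqrt{n}(c_1 - p)\big)$ with respect to $p$. Since $\Phi$ is increasing and $2\sqrt{n}(c_1 - p)$ is decreasing in $p$, the maximum is attained at the left endpoint, $p = 0.48$. By the same argument, $1 - \Phi\big(2\sqrt{n}(c_2 - p)\big)$ is increasing in $p$, so $\mathbf{P}_p(\bar{X}_n > c_2)$ is maximized at $p = 0.51$.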
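
As a sanity check on the sample-mean form of the CLT approximation, Phi(2 sqrt(n) (c_1 - p)), the sketch below compares it with the exact binomial probability. The values n = 1000, p = 0.48 and c_1 = 0.4505 are arbitrary choices for illustration, and both function names are hypothetical.

```python
from math import ceil, exp, lgamma, log, sqrt
from statistics import NormalDist


def exact_left_tail(n: int, p: float, c1: float) -> float:
    """Exact P_p(mean < c1), i.e. P(S < n * c1) for S ~ Binomial(n, p)."""
    k_max = ceil(n * c1) - 1  # largest count strictly below n * c1
    total = 0.0
    for k in range(k_max + 1):
        # Binomial pmf computed on the log scale to avoid overflow in the coefficient.
        log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                   + k * log(p) + (n - k) * log(1 - p))
        total += exp(log_pmf)
    return total


def clt_left_tail(n: int, p: float, c1: float) -> float:
    """CLT approximation using sqrt(p(1-p)) ~ 1/2."""
    return NormalDist().cdf(2 * sqrt(n) * (c1 - p))


n, p, c1 = 1000, 0.48, 0.4505
print(exact_left_tail(n, p, c1), clt_left_tail(n, p, c1))
```

The two numbers should be close for n of this size; the remaining gap comes from the discreteness of the binomial and from replacing sqrt(p(1-p)) with 1/2.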