
December 14, 2019

Unit 4 - Hypothesis testing

Notes on T-tests, Wald's test, likelihood ratio tests, and goodness of fit tests from MITx 18.6501x Fundamentals of Statistics

By flpvvvv · Statistics

This unit presents more tests based on the CLT and, sometimes, Slutsky's theorem: the Student's T test, for Gaussian data when $\sigma^2$ is unknown and Slutsky does not apply; Wald's test, which uses the asymptotic normality of the MLE; tests of implicit hypotheses about multivariate parameters; and goodness-of-fit tests, which answer questions like "does my data follow a Gaussian distribution?".

Asymptotic test - Clinical trials example

Let $X_1, \ldots, X_n$ be i.i.d. test group samples distributed according to $\mathcal{N}(\Delta_d, \sigma_d^2)$ and let $Y_1, \ldots, Y_m$ be i.i.d. control group samples distributed according to $\mathcal{N}(\Delta_c, \sigma_c^2)$. Assume that the two groups are independent.

Hypotheses:

$$H_0: \Delta_d = \Delta_c \quad \text{vs.} \quad H_1: \Delta_d > \Delta_c$$

From the Gaussian model (we don't even need the CLT to get this):

$$\bar{X}_n \sim \mathcal{N}\!\left(\Delta_d, \frac{\sigma_d^2}{n}\right), \qquad \bar{Y}_m \sim \mathcal{N}\!\left(\Delta_c, \frac{\sigma_c^2}{m}\right)$$

We can get:

$$\frac{\bar{X}_n - \bar{Y}_m - (\Delta_d - \Delta_c)}{\sqrt{\dfrac{\sigma_d^2}{n} + \dfrac{\sigma_c^2}{m}}} \sim \mathcal{N}(0, 1)$$

Assume that $n \to \infty$ and $m \to \infty$. Using Slutsky's lemma, we can replace the variances $\sigma_d^2, \sigma_c^2$ by the sample variances $\hat{S}_d^2, \hat{S}_c^2$, which under $H_0$ gives the test statistic

$$T = \frac{\bar{X}_n - \bar{Y}_m}{\sqrt{\dfrac{\hat{S}_d^2}{n} + \dfrac{\hat{S}_c^2}{m}}} \xrightarrow[n, m \to \infty]{(d)} \mathcal{N}(0, 1)$$

where:

$$\hat{S}_d^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left(X_i - \bar{X}_n\right)^2, \qquad \hat{S}_c^2 = \frac{1}{m-1} \sum_{i=1}^{m} \left(Y_i - \bar{Y}_m\right)^2$$

This is a one-sided, two-sample test: reject $H_0$ at asymptotic level $\alpha$ when $T > q_\alpha$, the $(1-\alpha)$-quantile of $\mathcal{N}(0, 1)$. Here we divide by $n-1$ instead of $n$ because this gives an unbiased variance estimator.

However, when the sample size is small, we cannot realistically apply Slutsky's lemma, so we cannot replace the variance $\sigma^2$ by the sample variance $\hat{S}^2$. Slutsky's theorem only gives a good approximation when the sample size is very large.
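As an illustration, here is a minimal Python sketch of this asymptotic two-sample test; the group sizes, means, and variances below are made up for the example:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
# Hypothetical test and control groups (made-up effect sizes).
x = rng.normal(loc=1.2, scale=2.0, size=500)   # test group
y = rng.normal(loc=1.0, scale=2.5, size=450)   # control group

n, m = len(x), len(y)
# Unbiased sample variances (ddof=1 divides by n - 1).
s2_x, s2_y = x.var(ddof=1), y.var(ddof=1)

# Test statistic: approximately N(0,1) under H0 (Delta_d = Delta_c),
# once Slutsky lets us plug in sample variances for large n, m.
T = (x.mean() - y.mean()) / np.sqrt(s2_x / n + s2_y / m)

alpha = 0.05
q = norm.ppf(1 - alpha)   # (1 - alpha)-quantile of N(0,1), one-sided
print(f"T = {T:.3f}, reject H0: {T > q}")
```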

The $\chi^2$ distribution

For a positive integer $d$, the $\chi^2_d$ (pronounced "kai-squared") distribution with $d$ degrees of freedom is the law of the random variable $Z_1^2 + Z_2^2 + \cdots + Z_d^2$, where $Z_1, \ldots, Z_d \overset{iid}{\sim} \mathcal{N}(0, 1)$.

If $V \sim \chi^2_k$, then $\mathbb{E}[V] = k$.

And $\operatorname{Var}[V] = 2k$.

If $Z \sim \mathcal{N}_d(0, I_d)$, then $\|Z\|_2^2 = \sum_{i=1}^{d} Z_i^2 \sim \chi^2_d$.
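A quick simulation checking the definition and the two moments above (a sketch, assuming numpy; $d$ and the number of draws are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
# chi^2_d as the law of the squared norm of a standard Gaussian in R^d.
V = (rng.standard_normal((100_000, d)) ** 2).sum(axis=1)

print(V.mean())  # ~ d   (E[V] = d)
print(V.var())   # ~ 2d  (Var[V] = 2d)
```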

The sample variance

Cochran's theorem states that if $X_1, \ldots, X_n \overset{iid}{\sim} \mathcal{N}(\mu, \sigma^2)$, then the sample variance:

$$S_n = \frac{1}{n} \sum_{i=1}^{n} \left(X_i - \bar{X}_n\right)^2$$

satisfies:

  • $\bar{X}_n$ is independent of $S_n$;
  • $\dfrac{n S_n}{\sigma^2} \sim \chi^2_{n-1}$.

Here there are only $n - 1$ degrees of freedom because the centered variables satisfy one linear constraint: $\sum_{i=1}^{n} \left(X_i - \bar{X}_n\right) = 0$.

We often prefer the unbiased estimator of $\sigma^2$:

$$\tilde{S}_n = \frac{1}{n-1} \sum_{i=1}^{n} \left(X_i - \bar{X}_n\right)^2 = \frac{n}{n-1} S_n$$

Then its expectation:

$$\mathbb{E}\left[\tilde{S}_n\right] = \frac{\sigma^2}{n-1} \, \mathbb{E}\!\left[\frac{n S_n}{\sigma^2}\right] = \frac{\sigma^2}{n-1} \, (n-1) = \sigma^2$$
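A small simulation contrasting the biased estimator $S_n$ with the unbiased $\tilde{S}_n$ (the parameters and sample sizes are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 3.0, 2.0
n, reps = 10, 100_000
X = rng.normal(mu, sigma, size=(reps, n))

S_biased = X.var(axis=1, ddof=0)    # S_n: divides by n
S_unbiased = X.var(axis=1, ddof=1)  # S~_n: divides by n - 1

print(S_biased.mean())    # ~ sigma^2 * (n-1)/n  (biased low)
print(S_unbiased.mean())  # ~ sigma^2            (unbiased)
```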

Student’s T distribution

For a positive integer $d$, the Student's T distribution with $d$ degrees of freedom (denoted by $t_d$) is the law of the random variable $\dfrac{Z}{\sqrt{V/d}}$, where $Z \sim \mathcal{N}(0, 1)$, $V \sim \chi^2_d$, and $Z$ is independent of $V$.

Student’s T test

One-Sample, Two-Sided

The test statistic:

$$T_n = \sqrt{n} \, \frac{\bar{X}_n - \mu_0}{\sqrt{\tilde{S}_n}}$$

where $\bar{X}_n$ is the sample mean of $n$ i.i.d. Gaussian observations with mean $\mu$ and variance $\sigma^2$, and $\tilde{S}_n$ is the unbiased sample variance.

Since $\sqrt{n} \, \dfrac{\bar{X}_n - \mu_0}{\sigma} \sim \mathcal{N}(0, 1)$ under $H_0: \mu = \mu_0$, and $\dfrac{(n-1)\tilde{S}_n}{\sigma^2} \sim \chi^2_{n-1}$ independently of $\bar{X}_n$ (by Cochran's theorem), we get $T_n \sim t_{n-1}$, which is Student's T distribution with $n - 1$ degrees of freedom. So the distribution of $T_n$ is pivotal, and its quantiles can be found in tables.

The Student's T test of level $\alpha$ is specified by:

$$\psi_\alpha = \mathbf{1}\left\{|T_n| > q_{\alpha/2}\right\}$$

where $q_{\alpha/2}$ is the $(1 - \alpha/2)$-quantile of $t_{n-1}$.

Be careful: the Student's T test requires the data to be Gaussian. This test is non-asymptotic; that is, for any fixed $n$, we can compute the exact level of our test rather than the asymptotic level.
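A minimal sketch of the one-sample T test, both by hand and via scipy's built-in `ttest_1samp` (the data below is made up):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(loc=0.3, scale=1.0, size=20)   # small Gaussian sample

mu0 = 0.0
n = len(x)
# T_n = sqrt(n) * (Xbar - mu0) / sqrt(S~_n); t_{n-1} under H0.
T = np.sqrt(n) * (x.mean() - mu0) / x.std(ddof=1)

alpha = 0.05
q = stats.t.ppf(1 - alpha / 2, df=n - 1)  # (1 - alpha/2)-quantile of t_{n-1}
print(f"|T| = {abs(T):.3f}, reject: {abs(T) > q}")

# Same test via scipy:
print(stats.ttest_1samp(x, popmean=mu0))
```

For the one-sided version below, compare $T_n$ to `stats.t.ppf(1 - alpha, n - 1)` instead, or pass `alternative='greater'` to `ttest_1samp` (available in recent SciPy versions).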

One-Sample, One-Sided

The Student's T test of level $\alpha$ (for $H_1: \mu > \mu_0$) is specified by:

$$\psi_\alpha = \mathbf{1}\left\{T_n > q_\alpha\right\}$$

where $q_\alpha$ is the $(1 - \alpha)$-quantile of $t_{n-1}$.

Two-Sample

Back to the clinical trials example, we have:

$$T = \frac{\bar{X}_n - \bar{Y}_m}{\sqrt{\dfrac{\hat{S}_d^2}{n} + \dfrac{\hat{S}_c^2}{m}}}$$

When the sample size is small, we cannot use Slutsky's lemma anymore, which means that we cannot simply replace the variances by the sample variances.

But we have, approximately:

$$T \approx t_N$$

where the number of degrees of freedom $N$ is given by the Welch-Satterthwaite formula:

$$N = \frac{\left(\dfrac{\hat{S}_d^2}{n} + \dfrac{\hat{S}_c^2}{m}\right)^2}{\dfrac{\hat{S}_d^4}{n^2(n-1)} + \dfrac{\hat{S}_c^4}{m^2(m-1)}} \geq \min(n, m)$$
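A sketch of the Welch two-sample T test with small made-up samples; scipy's `ttest_ind(..., equal_var=False)` implements this procedure and should match the hand computation:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(1.5, 2.0, size=12)   # small test group (made up)
y = rng.normal(1.0, 3.0, size=9)    # small control group (made up)

n, m = len(x), len(y)
vx, vy = x.var(ddof=1), y.var(ddof=1)
T = (x.mean() - y.mean()) / np.sqrt(vx / n + vy / m)

# Welch-Satterthwaite degrees of freedom
N = (vx / n + vy / m) ** 2 / ((vx / n) ** 2 / (n - 1) + (vy / m) ** 2 / (m - 1))
print(T, N)

# scipy's Welch test (equal_var=False) uses the same statistic and df:
print(stats.ttest_ind(x, y, equal_var=False))
```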

Wald’s test

According to the asymptotic normality of the MLE:

$$\sqrt{n}\left(\hat{\theta}_n^{MLE} - \theta^*\right) \xrightarrow[n \to \infty]{(d)} \mathcal{N}_d\!\left(0, I(\theta^*)^{-1}\right)$$

where $\theta^* \in \Theta \subseteq \mathbb{R}^d$ is the true parameter, and $I(\theta)$ denotes the Fisher information matrix.

Standardizing the statement of asymptotic normality above:

$$\sqrt{n} \, I\!\left(\hat{\theta}_n^{MLE}\right)^{1/2}\left(\hat{\theta}_n^{MLE} - \theta^*\right) \xrightarrow[n \to \infty]{(d)} \mathcal{N}_d(0, I_d)$$

The Wald's test of $H_0: \theta = \theta_0$ vs. $H_1: \theta \neq \theta_0$, at asymptotic level $\alpha$, is $\psi_\alpha = \mathbf{1}\{T_n > q_\alpha\}$ with

$$T_n = n\left(\hat{\theta}_n^{MLE} - \theta_0\right)^T I\!\left(\hat{\theta}_n^{MLE}\right)\left(\hat{\theta}_n^{MLE} - \theta_0\right) \xrightarrow[n \to \infty]{(d)} \chi^2_d \quad \text{under } H_0$$

where $q_\alpha$ is the $(1-\alpha)$-quantile of $\chi^2_d$; which is also:

$$T_n = \left\|\sqrt{n} \, I\!\left(\hat{\theta}_n^{MLE}\right)^{1/2}\left(\hat{\theta}_n^{MLE} - \theta_0\right)\right\|_2^2$$

Wald’s Test in 1 Dimension

In 1 dimension, Wald's test coincides with the two-sided test based on the asymptotic normality of the MLE.

Given the hypotheses:

$$H_0: \theta = \theta_0 \quad \text{vs.} \quad H_1: \theta \neq \theta_0$$

a two-sided test of level $\alpha$, based on the asymptotic normality of the MLE, is

$$\psi_\alpha = \mathbf{1}\left\{\sqrt{n \, I\!\left(\hat{\theta}_n^{MLE}\right)} \, \left|\hat{\theta}_n^{MLE} - \theta_0\right| > q_{\alpha/2}\right\}$$

where $q_{\alpha/2}$ is the $(1-\alpha/2)$-quantile of $\mathcal{N}(0, 1)$, and $1/I(\theta_0)$ is the asymptotic variance of $\sqrt{n}\left(\hat{\theta}_n^{MLE} - \theta_0\right)$ under the null hypothesis.

On the other hand, a Wald's test of level $\alpha$ is

$$\psi_\alpha' = \mathbf{1}\left\{n \, I\!\left(\hat{\theta}_n^{MLE}\right)\left(\hat{\theta}_n^{MLE} - \theta_0\right)^2 > q_\alpha\right\}$$

where $q_\alpha$ is the $(1-\alpha)$-quantile of $\chi^2_1$.

Since a $\chi^2_1$ variable is the square of a standard Gaussian, $q_\alpha\left(\chi^2_1\right) = q_{\alpha/2}\left(\mathcal{N}(0, 1)\right)^2$, so the two-sided test of level $\alpha$ is the same as Wald's test of level $\alpha$.

Example: Performing Wald’s Test on a Gaussian Data Set

Suppose $X_1, \ldots, X_n \overset{iid}{\sim} \mathcal{N}(\mu^*, 1)$. The goal is to hypothesis test between:

$$H_0: \mu^* = 0 \quad \text{vs.} \quad H_1: \mu^* \neq 0$$

The Wald's test of level $\alpha$ is:

$$\psi_\alpha = \mathbf{1}\left\{T_n > q_\alpha\right\}$$

where:

$$T_n = n \, I(\hat{\mu}) \, (\hat{\mu} - 0)^2 = n \bar{X}_n^2 \qquad \left(\hat{\mu} = \bar{X}_n, \; I(\mu) = 1\right)$$

and $q_\alpha$ is the $(1-\alpha)$-quantile of $\chi^2_1$.
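In this example $I(\mu) = 1$, so Wald's test reduces to comparing $n\bar{X}_n^2$ to a $\chi^2_1$ quantile. A minimal sketch with made-up data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.normal(loc=0.2, scale=1.0, size=200)  # N(mu*, 1), mu* made up

n = len(x)
mu_hat = x.mean()     # MLE of mu; Fisher information I(mu) = 1
T = n * mu_hat ** 2   # Wald statistic, ~ chi^2_1 under H0: mu* = 0

alpha = 0.05
q = stats.chi2.ppf(1 - alpha, df=1)
print(f"T = {T:.3f}, reject H0: {T > q}")
```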

Likelihood Ratio Test

Basic Form

Given the hypotheses:

$$H_0: \theta \in \Theta_0 \quad \text{vs.} \quad H_1: \theta \in \Theta_1$$

The likelihood ratio test in this set-up is of the form:

$$\psi_C = \mathbf{1}\left\{\frac{\sup_{\theta \in \Theta_1} L_n(X_1, \ldots, X_n; \theta)}{\sup_{\theta \in \Theta_0} L_n(X_1, \ldots, X_n; \theta)} > C\right\}$$

where $C$ is a threshold to be specified.

Likelihood Ratio Test (based on log-likelihood)

Consider an i.i.d. sample $X_1, \ldots, X_n$ with statistical model $\left(E, (\mathbb{P}_\theta)_{\theta \in \Theta}\right)$, where $\Theta \subseteq \mathbb{R}^d$.

Suppose the null hypothesis has the form:

$$H_0: (\theta_{r+1}, \ldots, \theta_d) = \left(\theta_{r+1}^{(0)}, \ldots, \theta_d^{(0)}\right)$$

for some fixed and given numbers $\theta_{r+1}^{(0)}, \ldots, \theta_d^{(0)}$.

Thus $\Theta_0$, the region defined by the null hypothesis, is

$$\Theta_0 = \left\{\theta \in \Theta : (\theta_{r+1}, \ldots, \theta_d) = \left(\theta_{r+1}^{(0)}, \ldots, \theta_d^{(0)}\right)\right\}$$

where $\left(\theta_{r+1}^{(0)}, \ldots, \theta_d^{(0)}\right)$ consists of known values; only the first $r$ coordinates are free.

The likelihood ratio test involves the test statistic:

$$T_n = 2\left(\ell_n(\hat{\theta}_n) - \ell_n(\hat{\theta}_n^c)\right)$$

where $\ell_n$ is the log-likelihood and $\hat{\theta}_n$ is the (unconstrained) MLE. The estimator $\hat{\theta}_n^c$ is the constrained MLE, and it is defined to be:

$$\hat{\theta}_n^c = \operatorname*{argmax}_{\theta \in \Theta_0} \ell_n(\theta)$$

Wilks’ Theorem

Assume $H_0$ is true and the MLE technical conditions are satisfied. Then

$$T_n \xrightarrow[n \to \infty]{(d)} \chi^2_{d-r}$$

so $T_n$ is asymptotically pivotal: it converges to a distribution that does not depend on unknown parameters. The likelihood ratio test of asymptotic level $\alpha$ rejects when $T_n > q_\alpha$, where $q_\alpha$ is the $(1-\alpha)$-quantile of $\chi^2_{d-r}$.
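As an illustration of Wilks' theorem, here is a minimal Python sketch of a likelihood ratio test in a Poisson model (the rates are made up; here $d = 1$ and $r = 0$, so the limit is $\chi^2_1$):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
x = rng.poisson(lam=2.3, size=100)   # hypothetical Poisson counts

def loglik(lam):
    # Poisson log-likelihood up to an additive constant
    # (the log x_i! terms cancel in the likelihood ratio).
    return np.sum(x * np.log(lam) - lam)

lam_hat = x.mean()   # unconstrained MLE
lam0 = 2.0           # H0: lambda = lam0 (the constrained MLE is lam0 itself)

T = 2 * (loglik(lam_hat) - loglik(lam0))   # Wilks: ~ chi^2_1 under H0

alpha = 0.05
q = stats.chi2.ppf(1 - alpha, df=1)        # d - r = 1 constrained coordinate
print(f"T = {T:.3f}, reject H0: {T > q}")
```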

Goodness of Fit Tests

Goodness of fit (GoF) tests: we want to know whether the hypothesized distribution is a good fit for the data, in order to answer questions like:

  • Does $X$ have distribution $\mathcal{N}(0, 1)$?
  • Does $X$ have a Gaussian distribution?
  • Does $X$ have distribution $\mathrm{Unif}([0, 1])$?

Key characteristic of GoF tests: no parametric modeling.

Suppose you observe i.i.d. samples $X_1, \ldots, X_n \sim \mathbb{P}$ from some unknown distribution $\mathbb{P}$. Let $\mathcal{F}$ denote a parametric family of probability distributions (for example, $\mathcal{F}$ could be the family of normal distributions $\left\{\mathcal{N}(\mu, \sigma^2)\right\}_{\mu \in \mathbb{R}, \sigma^2 > 0}$).

In the topic of goodness of fit testing, our goal is to answer the question "Does $\mathbb{P}$ belong to the family $\mathcal{F}$, or is $\mathbb{P}$ any distribution outside of $\mathcal{F}$?"

Parametric hypothesis testing is a particular case of goodness of fit testing. However, in the context of parametric hypothesis testing, we assume that the data distribution comes from some parametric statistical model $\left\{\mathbb{P}_\theta\right\}_{\theta \in \Theta}$, and we ask if the distribution belongs to a submodel $\left\{\mathbb{P}_\theta\right\}_{\theta \in \Theta_0}$ or its complement $\left\{\mathbb{P}_\theta\right\}_{\theta \in \Theta_1}$. In parametric hypothesis testing, we allow only a small set of alternatives $\Theta_1$, whereas in goodness of fit testing, we allow the alternative to be anything.

GoF for Discrete Distributions

The probability simplex in $\mathbb{R}^K$, denoted by $\Delta_K$, is the set of all vectors $\mathbf{p} = (p_1, \ldots, p_K)^T$ such that:

$$\mathbf{p} \geq 0 \quad \text{and} \quad \mathbf{p}^T \mathbf{1} = 1$$

where $\mathbf{1}$ denotes the vector $(1, 1, \ldots, 1)^T \in \mathbb{R}^K$. Equivalently, in more familiar notation,

$$\Delta_K = \left\{\mathbf{p} = (p_1, \ldots, p_K) : p_j \geq 0 \text{ for all } j, \; \sum_{j=1}^{K} p_j = 1\right\}$$

We want to test:

$$H_0: \mathbf{p} = \mathbf{p}^0 \quad \text{vs.} \quad H_1: \mathbf{p} \neq \mathbf{p}^0$$

where $\mathbf{p}^0$ is a fixed PMF.

The categorical likelihood of observing a sequence of $n$ i.i.d. outcomes can be written using the numbers of occurrences $N_j = \#\{i : X_i = a_j\}$, $j = 1, \ldots, K$, of the outcomes as:

$$L_n(X_1, \ldots, X_n; \mathbf{p}) = p_1^{N_1} p_2^{N_2} \cdots p_K^{N_K}$$

The categorical likelihood of the random variable $X$, when written as a random function, is

$$L_1(X; \mathbf{p}) = p_1^{\mathbf{1}(X = a_1)} p_2^{\mathbf{1}(X = a_2)} \cdots p_K^{\mathbf{1}(X = a_K)}$$

(the sample space of a categorical random variable is $\{a_1, \ldots, a_K\}$).

Let $\hat{\mathbf{p}}$ be the MLE:

$$\hat{p}_j = \frac{N_j}{n}, \quad j = 1, \ldots, K$$

$\chi^2$ test: if $H_0$ is true, then $\sqrt{n}\left(\hat{\mathbf{p}} - \mathbf{p}^0\right)$ is asymptotically normal and:

$$T_n = n \sum_{j=1}^{K} \frac{\left(\hat{p}_j - p_j^0\right)^2}{p_j^0} \xrightarrow[n \to \infty]{(d)} \chi^2_{K-1}$$

The test of asymptotic level $\alpha$ is $\psi_\alpha = \mathbf{1}\{T_n > q_\alpha\}$, where $q_\alpha$ is the $(1-\alpha)$-quantile of $\chi^2_{K-1}$.
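A sketch of this $\chi^2$ goodness-of-fit test, using made-up die-roll counts; scipy's `chisquare` computes the same statistic in its equivalent count form $\sum_j (N_j - n p_j^0)^2 / (n p_j^0)$:

```python
import numpy as np
from scipy import stats

# Hypothetical die-roll counts over K = 6 categories.
N = np.array([18, 22, 16, 25, 21, 18])
n = N.sum()
p0 = np.full(6, 1 / 6)   # H0: fair die

p_hat = N / n
T = n * np.sum((p_hat - p0) ** 2 / p0)   # ~ chi^2_{K-1} under H0

alpha = 0.05
q = stats.chi2.ppf(1 - alpha, df=len(p0) - 1)
print(f"T = {T:.3f}, reject H0: {T > q}")

# scipy computes the same statistic from the counts:
print(stats.chisquare(N, f_exp=n * p0))
```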

GoF for Continuous Distributions

Let $X_1, \ldots, X_n$ be i.i.d. real random variables. The cdf of $X_1$ is defined as:

$$F(t) = \mathbb{P}[X_1 \leq t], \quad \forall t \in \mathbb{R}$$

which completely characterizes the distribution of $X_1$.

The empirical cdf (a.k.a. sample cdf) of the sample $X_1, \ldots, X_n$ is defined as:

$$F_n(t) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\{X_i \leq t\}, \quad \forall t \in \mathbb{R}$$

By the LLN, for all $t \in \mathbb{R}$,

$$F_n(t) \xrightarrow[n \to \infty]{a.s.} F(t)$$

By the Glivenko-Cantelli theorem (the fundamental theorem of statistics), the convergence is even uniform:

$$\sup_{t \in \mathbb{R}} |F_n(t) - F(t)| \xrightarrow[n \to \infty]{a.s.} 0$$

By the CLT, for all $t \in \mathbb{R}$,

$$\sqrt{n}\left(F_n(t) - F(t)\right) \xrightarrow[n \to \infty]{(d)} \mathcal{N}\big(0, F(t)(1 - F(t))\big)$$

(since $n F_n(t) \sim \mathrm{Bin}(n, F(t))$ and the variance of a $\mathrm{Ber}(p)$ distribution is $p(1-p)$).

Donsker's theorem states that if $F$ is continuous, then

$$\sqrt{n} \, \sup_{t \in \mathbb{R}} |F_n(t) - F(t)| \xrightarrow[n \to \infty]{(d)} \sup_{0 \leq x \leq 1} |\mathbb{B}(x)|$$

where $\mathbb{B}$ is a random curve on $[0, 1]$ called a Brownian bridge.
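A small numpy sketch illustrating Glivenko-Cantelli: the worst-case gap between the empirical cdf of standard Gaussian samples and the true cdf shrinks as $n$ grows (the grid and sample sizes are arbitrary):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

def ecdf(sample, t):
    # F_n(t) = (1/n) * #{i : X_i <= t}, evaluated on a grid of t values.
    return np.mean(sample[:, None] <= t, axis=0)

t = np.linspace(-3, 3, 201)
for n in (50, 500, 5000):
    x = rng.standard_normal(n)
    sup_gap = np.max(np.abs(ecdf(x, t) - stats.norm.cdf(t)))
    print(n, sup_gap)   # shrinks as n grows
```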

We want to test:

$$H_0: F = F^0 \quad \text{vs.} \quad H_1: F \neq F^0$$

where $F^0$ is a continuous cdf. Let $F_n$ be the empirical cdf of the sample $X_1, \ldots, X_n$. If $H_0$ is true ($F = F^0$), then $F_n(t) \approx F^0(t)$ for all $t \in \mathbb{R}$.

Kolmogorov-Smirnov test

The Kolmogorov-Smirnov test statistic is defined as:

$$T_n = \sqrt{n} \, \sup_{t \in \mathbb{R}} \left|F_n(t) - F^0(t)\right|$$

and the Kolmogorov-Smirnov test is

$$\delta_\alpha^{KS} = \mathbf{1}\left\{T_n > q_\alpha\right\}$$

Here, $q_\alpha$ is the $(1-\alpha)$-quantile of the supremum of the Brownian bridge, $\sup_{0 \leq x \leq 1} |\mathbb{B}(x)|$, as in Donsker's theorem.

$T_n$ is called a pivotal statistic: if $H_0$ is true, the distribution of $T_n$ does not depend on the distribution of the $X_i$'s, and it is easy to reproduce it in simulations. In practice, the quantile values can be found in K-S tables.

Even though the K-S test statistic is defined as a supremum over the entire real line, it can be computed explicitly as follows:

$$T_n = \sqrt{n} \, \max_{1 \leq i \leq n} \max\left(\left|\frac{i-1}{n} - F^0\!\left(X_{(i)}\right)\right|, \; \left|\frac{i}{n} - F^0\!\left(X_{(i)}\right)\right|\right)$$

where $X_{(i)}$ is the $i$-th order statistic, i.e., the $i$-th smallest value of the sample. For example, $X_{(1)}$ is the smallest and $X_{(n)}$ is the greatest value of a sample of size $n$.
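The explicit formula is easy to implement; as a sanity check, scipy's `kstest` reports the same supremum (scipy returns the distance $D$ without the $\sqrt{n}$ scaling). A sketch with made-up data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
x = rng.standard_normal(100)   # H0: F^0 = standard normal cdf

n = len(x)
xs = np.sort(x)                # order statistics X_(1) <= ... <= X_(n)
F0 = stats.norm.cdf(xs)
i = np.arange(1, n + 1)
# The supremum over R reduces to a max over the order statistics:
D = np.max(np.maximum(np.abs((i - 1) / n - F0), np.abs(i / n - F0)))
T = np.sqrt(n) * D

print(D, stats.kstest(x, "norm").statistic)   # the two D values agree
```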

Kolmogorov-Lilliefors Test

What if I want to test "Does $X$ have a Gaussian distribution?" but I don't know the parameters? A simple idea is to use plug-in estimates:

$$T_n = \sqrt{n} \, \sup_{t \in \mathbb{R}} \left|F_n(t) - \Phi_{\hat{\mu}, \hat{\sigma}^2}(t)\right|$$

where $\hat{\mu} = \bar{X}_n$, $\hat{\sigma}^2 = \tilde{S}_n$, and $\Phi_{\hat{\mu}, \hat{\sigma}^2}$ is the cdf of $\mathcal{N}(\hat{\mu}, \hat{\sigma}^2)$.

In this case Donsker's theorem is no longer valid.

Instead, we compute the quantiles for this test statistic $T_n$ directly, under the null hypothesis of Gaussianity. They do not depend on the unknown parameters! This is the Kolmogorov-Lilliefors test.
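statsmodels ships a Lilliefors implementation with tabulated quantiles; a minimal usage sketch, assuming statsmodels is installed (data and parameters made up):

```python
import numpy as np
from statsmodels.stats.diagnostic import lilliefors

rng = np.random.default_rng(6)
x = rng.normal(loc=5.0, scale=2.0, size=80)   # Gaussian, unknown params

# Kolmogorov-Lilliefors: KS distance to N(mu_hat, sigma_hat^2), with
# quantiles corrected for the plug-in estimation of the parameters.
stat, pval = lilliefors(x, dist="norm")
print(stat, pval)
```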

Example: Testing the Mean for a Sample with Unknown Distribution

Suppose that you observe a sample $X_1, \ldots, X_n$ from some distribution $\mathbb{P}$ with continuous cdf and unknown mean $\mu$. Your goal is to decide between the null and alternative hypotheses:

$$H_0: \mu = \mu_0 \quad \text{vs.} \quad H_1: \mu \neq \mu_0$$

Looking at a histogram, you suspect that $X_1, \ldots, X_n$ have a Gaussian distribution. We would like to first test this suspicion. Formally, we would like to decide between the following null and alternative hypotheses:

$$H_0': \mathbb{P} \in \left\{\mathcal{N}(m, \sigma^2)\right\}_{m \in \mathbb{R}, \sigma^2 > 0} \quad \text{vs.} \quad H_1': \mathbb{P} \text{ is not Gaussian}$$

We can use the Kolmogorov-Lilliefors test to decide between $H_0'$ and $H_1'$.

Suppose that the test we used in the previous part, for $H_0'$ versus $H_1'$, fails to reject.

Then we can use Student's T test to decide between the original hypotheses $H_0$ and $H_1$.

In practice, many of the methods for statistical inference, such as Student's T test, rely on the assumption that the data is Gaussian. Hence, before performing such a test, we need to evaluate whether or not the data is Gaussian. This problem gives an example of such a procedure: first we tested for the Gaussianity of our data, and since the Kolmogorov-Lilliefors test failed to reject, assuming that there was no error, we could apply Student's T test to answer our original hypothesis testing question.
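Putting the two-step procedure together, a minimal sketch (the data, $\mu_0 = 0$, and the 0.05 threshold are made up; `lilliefors` from statsmodels as above):

```python
import numpy as np
from scipy import stats
from statsmodels.stats.diagnostic import lilliefors

rng = np.random.default_rng(7)
x = rng.normal(loc=0.4, scale=1.0, size=30)   # sample with unknown cdf

# Step 1: test Gaussianity (H0') with the Kolmogorov-Lilliefors test.
_, p_gauss = lilliefors(x, dist="norm")

if p_gauss > 0.05:
    # Step 2: Gaussianity not rejected, so a Student's T test is
    # reasonable for the original hypotheses about the mean (H0: mu = 0).
    print(stats.ttest_1samp(x, popmean=0.0))
else:
    print("Gaussianity rejected; the T test assumption fails.")
```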