Hypothesis Testing


A hypothesis test is a statistical procedure for deciding whether a sample of data provides enough evidence to infer that a certain condition holds for the entire population.

A hypothesis test examines two opposing hypotheses about a population: the null hypothesis and the alternative hypothesis. The null hypothesis is the statement being tested. Usually the null hypothesis is a statement of "no effect" or "no difference". The alternative hypothesis is the statement you want to be able to conclude is true.

Example Problem Statement

You toss a coin 30 times and see 22 heads. Is it a fair coin?
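
Writing p for the probability of getting heads, the two hypotheses for this problem could be stated as below. Treating the alternative as one-sided (the coin favors heads) is an assumption made here, chosen because the P-value for this example counts outcomes of 22 or more heads.

$$ H_0: p = 0.5 \qquad H_1: p > 0.5 $$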

Classical Approach

We can answer the above question using hypothesis testing, by following these steps:

  • Assume the status quo is correct, i.e. the coin is fair and we are seeing 22 heads just by chance.

  • Compute the probability of getting 22 or more heads in 30 coin tosses, i.e. a result at least as extreme as the one observed.

    • Coin tosses follow a Binomial Distribution.
    • The formula for the Binomial Distribution is - $$ P(k \ heads \ in \ N \ tosses) = \binom{N}{k} \cdot p^kq^{N-k} $$ where,
      N = Number of coin tosses = 30
      k = Number of heads (here 22, 23, ..., 30)
      p = Probability of getting heads = 0.5
      q = Probability of getting tails = 0.5
    • Summing this formula over k = 22 to 30 gives a value of 0.008. (For exactly 22 heads alone, the formula gives about 0.005.)
    • This probability is called the P-value.
  • Compare the P-value with the preset significance level for the test. Here we use a 5% significance level.
  • If P-value < Significance Level => Reject Null Hypothesis, else Fail to Reject Null Hypothesis. In this test our P-value of 0.008 is less than our significance level of 0.05. Therefore, we reject the null hypothesis that the coin is fair.
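
The steps above can be checked directly in a few lines of Python. This is a minimal sketch, assuming Python 3.8 or later for `math.comb`; the function name `binomial_tail` and the variable names are illustrative and not part of the original write-up.

```python
from math import comb

def binomial_tail(n_tosses, min_heads, p_heads=0.5):
    """Probability of seeing at least min_heads heads in n_tosses tosses."""
    q_tails = 1.0 - p_heads
    # Sum the binomial formula over every outcome from min_heads up to n_tosses
    return sum(
        comb(n_tosses, k) * p_heads**k * q_tails**(n_tosses - k)
        for k in range(min_heads, n_tosses + 1)
    )

p_value = binomial_tail(30, 22)
significance_level = 0.05

print("P-value = {:.4f}".format(p_value))
print("Reject null hypothesis:", p_value < significance_level)
```

The computed P-value is roughly 0.008, which is below 0.05, so the sketch reaches the same decision as the steps above.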

This approach is intuitive and easy to follow for statisticians, but it might not be best suited for coders and hackers. For them, the problem can be solved by running a simple simulation.

Simulation Approach

# Methodology -
# Run the 30 coin tosses experiment for 10,000 iterations; compute p-value based on the results of these iterations.

# Importing the required libraries
from numpy.random import randint
import matplotlib.pyplot as plt

# Initialize count variable to store the counts of experiment where number of heads >= 22
count = 0

# Initialize an empty list to store number of heads in each experiment
results = []

for i in range(10000):
    # Generate 30 coin tosses in list, where head is represented by 1 and tail is represented by 0
    trials = randint(2,size=30)
    # Count total number of heads for 30 coin tosses
    numberOfHeads = trials.sum()
    # Add the number of heads to the results list
    results.append(numberOfHeads)
    # If number of heads >= 22, increment the count variable by 1
    if numberOfHeads >= 22:
        count += 1

# Plotting the Histogram of Number of heads per 30 coin tosses
# One bin per possible outcome (0 to 30 heads)
plt.hist(results, bins=range(0, 32))
plt.title("Plot 1 - Histogram of Number of heads per 30 coin tosses \n")
plt.xlabel("Number of heads per 30 coin tosses")
plt.show()

# Computing P-Value
# P - value is the probability of having 22 or more heads in 30 coin tosses.
# Therefore, P - value = number of experiments having 22 or more heads / total number of experiments

p = count / 10000.0

print("P - value computed by the simulation = {}".format(p))
P - value computed by the simulation = 0.0088

Simulation Conclusion

  • The distribution of the number of heads per 30 coin tosses is centered at 15, as expected.
  • The P-value we get by simulation (0.0088) is very close to the exact P-value from the classical approach (0.008).
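
To put a number on "very close", one can estimate how much a simulated P-value is expected to wobble from run to run. The sketch below assumes each of the 10,000 experiments is an independent yes/no trial whose success probability is the exact P-value, so the estimate has the usual standard error of a proportion. The variable names are illustrative, and `math.comb` assumes Python 3.8 or later.

```python
from math import comb, sqrt

n_tosses, observed_heads, n_experiments = 30, 22, 10000

# Exact P-value: probability of observed_heads or more heads with a fair coin
exact_p = sum(comb(n_tosses, k) for k in range(observed_heads, n_tosses + 1)) / 2**n_tosses

# Standard error of a proportion estimated from n_experiments independent runs
standard_error = sqrt(exact_p * (1 - exact_p) / n_experiments)

simulated_p = 0.0088  # value reported by the simulation above
z = (simulated_p - exact_p) / standard_error

print("exact P-value  = {:.4f}".format(exact_p))
print("standard error = {:.4f}".format(standard_error))
print("difference in standard errors = {:.2f}".format(z))
```

Under these assumptions the simulated value lies within about one standard error of the exact one, so the small gap is consistent with ordinary random variation. Running more experiments would shrink the standard error.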


Jake Vanderplas's PyCon 2016 Talk