
Statistics LibreTexts

9.1: Null and Alternative Hypotheses


The actual test begins by considering two hypotheses. They are called the null hypothesis and the alternative hypothesis. These hypotheses contain opposing viewpoints.

\(H_0\): The null hypothesis: It is a statement of no difference between the variables; they are not related. This can often be considered the status quo, and as a result, if you cannot accept the null it requires some action.

\(H_a\): The alternative hypothesis: It is a claim about the population that is contradictory to \(H_0\) and what we conclude when we reject \(H_0\). This is usually what the researcher is trying to prove.

Since the null and alternative hypotheses are contradictory, you must examine evidence to decide if you have enough evidence to reject the null hypothesis or not. The evidence is in the form of sample data.

After you have determined which hypothesis the sample supports, you make a decision. There are two options for a decision. They are "reject \(H_0\)" if the sample information favors the alternative hypothesis or "do not reject \(H_0\)" or "decline to reject \(H_0\)" if the sample information is insufficient to reject the null hypothesis.

\(H_{0}\) always has a symbol with an equal in it. \(H_{a}\) never has a symbol with an equal in it. The choice of symbol depends on the wording of the hypothesis test. However, be aware that many researchers (including one of the co-authors in research work) use = in the null hypothesis, even with > or < as the symbol in the alternative hypothesis. This practice is acceptable because we only make the decision to reject or not reject the null hypothesis.

Example \(\PageIndex{1}\)

  • \(H_{0}\): No more than 30% of the registered voters in Santa Clara County voted in the primary election. \(p \leq 0.30\)
  • \(H_{a}\): More than 30% of the registered voters in Santa Clara County voted in the primary election. \(p > 0.30\)

Exercise \(\PageIndex{1}\)

A medical trial is conducted to test whether or not a new medicine reduces cholesterol by 25%. State the null and alternative hypotheses.

Answer:

  • \(H_{0}\): The drug reduces cholesterol by 25%. \(p = 0.25\)
  • \(H_{a}\): The drug does not reduce cholesterol by 25%. \(p \neq 0.25\)

Example \(\PageIndex{2}\)

We want to test whether the mean GPA of students in American colleges is different from 2.0 (out of 4.0). The null and alternative hypotheses are:

  • \(H_{0}: \mu = 2.0\)
  • \(H_{a}: \mu \neq 2.0\)

Exercise \(\PageIndex{2}\)

We want to test whether the mean height of eighth graders is 66 inches. State the null and alternative hypotheses. Fill in the correct symbol \((=, \neq, \geq, <, \leq, >)\) for the null and alternative hypotheses.

  • \(H_{0}: \mu \_ 66\)
  • \(H_{a}: \mu \_ 66\)

Answer:

  • \(H_{0}: \mu = 66\)
  • \(H_{a}: \mu \neq 66\)

Example \(\PageIndex{3}\)

We want to test if college students take less than five years to graduate from college, on the average. The null and alternative hypotheses are:

  • \(H_{0}: \mu \geq 5\)
  • \(H_{a}: \mu < 5\)

Exercise \(\PageIndex{3}\)

We want to test if it takes fewer than 45 minutes to teach a lesson plan. State the null and alternative hypotheses. Fill in the correct symbol ( =, ≠, ≥, <, ≤, >) for the null and alternative hypotheses.

  • \(H_{0}: \mu \_ 45\)
  • \(H_{a}: \mu \_ 45\)

Answer:

  • \(H_{0}: \mu \geq 45\)
  • \(H_{a}: \mu < 45\)

Example \(\PageIndex{4}\)

In an issue of U.S. News and World Report, an article on school standards stated that about half of all students in France, Germany, and Israel take advanced placement exams and a third pass. The same article stated that 6.6% of U.S. students take advanced placement exams and 4.4% pass. Test if the percentage of U.S. students who take advanced placement exams is more than 6.6%. State the null and alternative hypotheses.

  • \(H_{0}: p \leq 0.066\)
  • \(H_{a}: p > 0.066\)

Exercise \(\PageIndex{4}\)

On a state driver’s test, about 40% pass the test on the first try. We want to test if more than 40% pass on the first try. Fill in the correct symbol (\(=, \neq, \geq, <, \leq, >\)) for the null and alternative hypotheses.

  • \(H_{0}: p \_ 0.40\)
  • \(H_{a}: p \_ 0.40\)

Answer:

  • \(H_{0}: p = 0.40\)
  • \(H_{a}: p > 0.40\)

COLLABORATIVE EXERCISE

Bring to class a newspaper, some news magazines, and some Internet articles. In groups, find articles from which your group can write null and alternative hypotheses. Discuss your hypotheses with the rest of the class.

In a hypothesis test, sample data is evaluated in order to arrive at a decision about some type of claim. If certain conditions about the sample are satisfied, then the claim can be evaluated for a population. In a hypothesis test, we:

  • Evaluate the null hypothesis, typically denoted with \(H_{0}\). The null is not rejected unless the hypothesis test shows otherwise. The null statement must always contain some form of equality \((=, \leq, \text{ or } \geq)\).
  • Always write the alternative hypothesis, typically denoted with \(H_{a}\) or \(H_{1}\), using less than, greater than, or not equals symbols, i.e., \((\neq, >, \text{ or } <)\).
  • If we reject the null hypothesis, then we can assume there is enough evidence to support the alternative hypothesis.
  • Never state that a claim is proven true or false. Keep in mind the underlying fact that hypothesis testing is based on probability laws; therefore, we can talk only in terms of non-absolute certainties.

Formula Review

\(H_{0}\) and \(H_{a}\) are contradictory.

  • If \(\alpha \leq p\)-value, then do not reject \(H_{0}\).
  • If \(\alpha > p\)-value, then reject \(H_{0}\).
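This decision rule can be sketched in a few lines of Python. Note that the α and p-value below are hypothetical placeholders, not values from any example in this section:

```python
# Decision rule for a hypothesis test: compare the p-value with alpha.
# Both numbers below are hypothetical, chosen only for illustration.
alpha = 0.05     # preconceived significance level, set before the test
p_value = 0.03   # computed from the sample data

if alpha > p_value:
    decision = "reject H0"
else:
    decision = "do not reject H0"

print(decision)  # reject H0
```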

\(\alpha\) is preconceived. Its value is set before the hypothesis test starts. The \(p\)-value is calculated from the data.



Data analysis: hypothesis testing


4.2 Two-tailed tests

Hypotheses that are stated with an equal (=) or not equal (≠) sign are called non-directional hypotheses. In non-directional hypotheses, the researcher is interested in whether there is a statistically significant difference or relationship between two or more variables, but does not have any specific expectation about which group or variable will be higher or lower. For example, a non-directional hypothesis might be: ‘There is a difference in the preference for brand X between male and female consumers.’ Here the researcher is interested in whether there is a statistically significant difference in the preference for brand X between male and female consumers, but does not have a specific prediction about which gender will have a higher preference. The researcher may conduct a survey or experiment to collect data on the brand preference of male and female consumers and then use statistical analysis to determine whether there is a significant difference between the two groups.

Non-directional hypotheses are also known as two-tailed hypotheses. The term ‘two-tailed’ comes from the fact that the statistical test used to evaluate the hypothesis is based on the assumption that the difference or relationship could occur in either direction, resulting in two ‘tails’ in the probability distribution. Using the coffee foam example (from Activity 1), you have the following set of hypotheses:

\(H_{0}: \mu = 1\) cm foam

\(H_{a}: \mu \neq 1\) cm foam

In this case, the researcher can reject the null hypothesis when the sample mean is either ‘much higher’ or ‘much lower’ than 1 cm foam. This is called a two-tailed test because the rejection region includes outcomes from both the upper and lower tails of the sample distribution when determining a decision rule. To give an illustration, if you set the alpha level (α) equal to 0.05, that would give you a 95% confidence level. Then, you would reject the null hypothesis for obtained values of z > +1.96 or z < -1.96 (you will look at how to calculate z-scores later in the course).

This can be plotted on a graph as shown in Figure 7.

Figure 7: A two-tailed test shown on a symmetrical, bell-shaped curve. The x-axis is labelled ‘z-score’ and runs from -2 to 2 in increments of 1; the y-axis is labelled ‘probability density’. The top of the curve is labelled ‘Foam height = 1cm’. The rejection regions of the null hypothesis are circled on both sides of the curve: the area under the curve below z = -1.96 and the area above z = +1.96 are shaded orange, each with α = 0.025.

In a two-tailed hypothesis test, the null hypothesis assumes that there is no significant difference or relationship between the two groups or variables, and the alternative hypothesis suggests that there is a significant difference or relationship, but does not specify the direction of the difference or relationship.

When performing a two-tailed test, you need to determine the level of significance, which is denoted by alpha (α). The value of alpha, in this case, is 0.05. To perform a two-tailed test at a significance level of 0.05, you need to divide alpha by 2, giving a significance level of 0.025 for each distribution tail (0.05/2 = 0.025). This is done because the two-tailed test is looking for significance in either tail of the distribution. If the calculated test statistic falls in the rejection region of either tail of the distribution, then the null hypothesis is rejected and the alternative hypothesis is accepted. In this case, the researcher can conclude that there is a significant difference or relationship between the two groups or variables.

Assuming that the population follows a normal distribution, the tail located below the critical value of z = –1.96 (in a later section, you will discuss how this value was determined) and the tail above the critical value of z = +1.96 each represent a proportion of 0.025. These tails are referred to as the lower and upper tails, respectively, and they correspond to the extreme values of the distribution that are far from the central part of the bell curve. These critical values are used in a two-tailed hypothesis test to determine whether to reject or fail to reject the null hypothesis. The null hypothesis represents the default assumption that there is no significant difference between the observed data and what would be expected under a specific condition.

If the calculated test statistic falls within the critical values, then the null hypothesis cannot be rejected at the 0.05 level of significance. However, if the calculated test statistic falls outside the critical values (orange-coloured areas in Figure 7), then the null hypothesis can be rejected in favour of the alternative hypothesis, suggesting that there is evidence of a significant difference between the observed data and what would be expected under the specified condition.
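As a rough illustration of the critical-value rule described above, here is a standard-library-only Python sketch; the observed z value is hypothetical:

```python
from statistics import NormalDist

alpha = 0.05
# Two-tailed test: alpha is split across the tails, 0.025 in each.
z_upper = NormalDist().inv_cdf(1 - alpha / 2)   # about +1.96
z_lower = -z_upper                              # about -1.96

z_obs = 2.3   # hypothetical calculated test statistic

# Reject H0 only if the statistic falls in either tail rejection region.
reject_h0 = (z_obs < z_lower) or (z_obs > z_upper)
print(round(z_upper, 2), reject_h0)  # 1.96 True
```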



S.3.2 Hypothesis Testing (P-Value Approach)

The P-value approach involves determining "likely" or "unlikely" by determining the probability, assuming the null hypothesis were true, of observing a more extreme test statistic in the direction of the alternative hypothesis than the one observed. If the P-value is small, say less than (or equal to) \(\alpha\), then it is "unlikely." And, if the P-value is large, say more than \(\alpha\), then it is "likely."

If the P-value is less than (or equal to) \(\alpha\), then the null hypothesis is rejected in favor of the alternative hypothesis. And, if the P-value is greater than \(\alpha\), then the null hypothesis is not rejected.

Specifically, the four steps involved in using the P-value approach to conducting any hypothesis test are:

  • Specify the null and alternative hypotheses.
  • Using the sample data and assuming the null hypothesis is true, calculate the value of the test statistic. Again, to conduct the hypothesis test for the population mean μ, we use the t-statistic \(t^*=\frac{\bar{x}-\mu}{s/\sqrt{n}}\), which follows a t-distribution with n - 1 degrees of freedom.
  • Using the known distribution of the test statistic, calculate the P-value: "If the null hypothesis is true, what is the probability that we'd observe a more extreme test statistic in the direction of the alternative hypothesis than we did?" (Note how this question is equivalent to the question answered in criminal trials: "If the defendant is innocent, what is the chance that we'd observe such extreme criminal evidence?")
  • Set the significance level, \(\alpha\), the probability of making a Type I error, to be small (0.01, 0.05, or 0.10). Compare the P-value to \(\alpha\). If the P-value is less than (or equal to) \(\alpha\), reject the null hypothesis in favor of the alternative hypothesis. If the P-value is greater than \(\alpha\), do not reject the null hypothesis.
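Steps 1, 2, and 4 can be sketched for the mean GPA example that follows. The sample mean and standard deviation here are invented so that the arithmetic reproduces a test statistic of t* = 2.5; only n = 15 and the hypothesised mean come from the text:

```python
import math

# Hypothetical sample summary (xbar and s are invented for illustration;
# only n = 15 and H0: mu = 3 come from the example in the text).
mu0 = 3.0        # hypothesised population mean
xbar = 3.2       # sample mean (invented)
s = 0.31         # sample standard deviation (invented)
n = 15           # sample size

# Step 2: t-statistic, which follows a t distribution with n - 1 df.
t_star = (xbar - mu0) / (s / math.sqrt(n))
df = n - 1

print(round(t_star, 2), df)  # 2.5 14
```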

Example S.3.2.1: Mean GPA

In our example concerning the mean grade point average, suppose that our random sample of n = 15 students majoring in mathematics yields a test statistic t* equaling 2.5. Since n = 15, our test statistic t* has n - 1 = 14 degrees of freedom. Also, suppose we set our significance level α at 0.05 so that we have only a 5% chance of making a Type I error.

Right Tailed

The P-value for conducting the right-tailed test H 0 : μ = 3 versus H A : μ > 3 is the probability that we would observe a test statistic greater than t* = 2.5 if the population mean \(\mu\) really were 3. Recall that probability equals the area under the probability curve. The P-value is therefore the area under a t curve with n - 1 = 14 degrees of freedom, to the right of the test statistic t* = 2.5. It can be shown using statistical software that the P-value is 0.0127. The graph depicts this visually.

t-distribution graph showing the right tail beyond a t value of 2.5

The P-value, 0.0127, tells us it is "unlikely" that we would observe such an extreme test statistic t* in the direction of H A if the null hypothesis were true. Therefore, our initial assumption that the null hypothesis is true must be incorrect. That is, since the P-value, 0.0127, is less than \(\alpha\) = 0.05, we reject the null hypothesis H 0 : μ = 3 in favor of the alternative hypothesis H A : μ > 3.

Note that we would not reject H 0 : μ = 3 in favor of H A : μ > 3 if we lowered our willingness to make a Type I error to \(\alpha\) = 0.01 instead, as the P-value, 0.0127, is then greater than \(\alpha\) = 0.01.
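The 0.0127 figure comes from statistical software, but it can be approximated with nothing beyond the Python standard library by numerically integrating the t density. This is a rough sketch; real packages use the incomplete beta function instead:

```python
import math

def t_upper_tail(t, df, steps=200_000, cutoff=50.0):
    """Approximate P(T > t) for a t distribution with df degrees of
    freedom via midpoint-rule integration of the density over [t, t + cutoff]."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    h = cutoff / steps
    total = 0.0
    for i in range(steps):
        x = t + (i + 0.5) * h
        total += c * (1.0 + x * x / df) ** (-(df + 1) / 2)
    return total * h

# Right-tailed P-value for t* = 2.5 with 14 degrees of freedom.
p_value = t_upper_tail(2.5, 14)
print(round(p_value, 4))  # approximately 0.0127
```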

Left Tailed

In our example concerning the mean grade point average, suppose that our random sample of n = 15 students majoring in mathematics instead yields a test statistic t* equaling -2.5. The P-value for conducting the left-tailed test H 0 : μ = 3 versus H A : μ < 3 is the probability that we would observe a test statistic less than t* = -2.5 if the population mean μ really were 3. The P-value is therefore the area under a t curve with n - 1 = 14 degrees of freedom, to the left of the test statistic t* = -2.5. It can be shown using statistical software that the P-value is 0.0127. The graph depicts this visually.

t distribution graph showing left tail below t value of -2.5

The P-value, 0.0127, tells us it is "unlikely" that we would observe such an extreme test statistic t* in the direction of H A if the null hypothesis were true. Therefore, our initial assumption that the null hypothesis is true must be incorrect. That is, since the P-value, 0.0127, is less than α = 0.05, we reject the null hypothesis H 0 : μ = 3 in favor of the alternative hypothesis H A : μ < 3.

Note that we would not reject H 0 : μ = 3 in favor of H A : μ < 3 if we lowered our willingness to make a Type I error to α = 0.01 instead, as the P-value, 0.0127, is then greater than α = 0.01.

Two Tailed

In our example concerning the mean grade point average, suppose again that our random sample of n = 15 students majoring in mathematics yields a test statistic t* equaling -2.5. The P-value for conducting the two-tailed test H 0 : μ = 3 versus H A : μ ≠ 3 is the probability that we would observe a test statistic less than -2.5 or greater than 2.5 if the population mean μ really were 3. That is, the two-tailed test requires taking into account the possibility that the test statistic could fall into either tail (hence the name "two-tailed" test). The P-value is therefore the area under a t curve with n - 1 = 14 degrees of freedom, to the left of -2.5 and to the right of 2.5. It can be shown using statistical software that the P-value is 0.0127 + 0.0127, or 0.0254. The graph depicts this visually.

t-distribution graph of two tailed probability for t values of -2.5 and 2.5

Note that the P-value for a two-tailed test is always two times the P-value for either of the one-tailed tests. The P-value, 0.0254, tells us it is "unlikely" that we would observe such an extreme test statistic t* in the direction of H A if the null hypothesis were true. Therefore, our initial assumption that the null hypothesis is true must be incorrect. That is, since the P-value, 0.0254, is less than α = 0.05, we reject the null hypothesis H 0 : μ = 3 in favor of the alternative hypothesis H A : μ ≠ 3.

Note that we would not reject H 0 : μ = 3 in favor of H A : μ ≠ 3 if we lowered our willingness to make a Type I error to α = 0.01 instead, as the P-value, 0.0254, is then greater than α = 0.01.
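The doubling relationship for the two-tailed P-value, and the two decisions above, can be captured in a couple of lines (values taken from this example):

```python
# One-tailed P-value from the example above; by symmetry of the t
# distribution, the two-tailed P-value is exactly twice as large.
p_one_tailed = 0.0127
p_two_tailed = 2 * p_one_tailed

print(round(p_two_tailed, 4))        # 0.0254
print(p_two_tailed <= 0.05)          # True: reject H0 at alpha = 0.05
print(p_two_tailed <= 0.01)          # False: do not reject at alpha = 0.01
```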

Now that we have reviewed the critical value and P-value approach procedures for each of the three possible hypotheses, let's look at three new examples: one of a right-tailed test, one of a left-tailed test, and one of a two-tailed test.

The good news is that, whenever possible, we will take advantage of the test statistics and P-values reported in statistical software, such as Minitab, to conduct our hypothesis tests in this course.



Want to cite, share, or modify this book? This book uses the Creative Commons Attribution License and you must attribute Texas Education Agency (TEA). The original material is available at: https://www.texasgateway.org/book/tea-statistics . Changes were made to the original material, including updates to art, structure, and other content updates.

Access for free at https://openstax.org/books/statistics/pages/1-introduction
  • Authors: Barbara Illowsky, Susan Dean
  • Publisher/website: OpenStax
  • Book title: Statistics
  • Publication date: Mar 27, 2020
  • Location: Houston, Texas
  • Book URL: https://openstax.org/books/statistics/pages/1-introduction
  • Section URL: https://openstax.org/books/statistics/pages/9-1-null-and-alternative-hypotheses

© Jan 23, 2024 Texas Education Agency (TEA). The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the Creative Commons license and may not be reproduced without the prior and express written consent of Rice University.


Hypothesis Testing for Means & Proportions



Hypothesis Testing: Upper-, Lower-, and Two-Tailed Tests


The procedure for hypothesis testing is based on the ideas described above. Specifically, we set up competing hypotheses, select a random sample from the population of interest and compute summary statistics. We then determine whether the sample data supports the null or alternative hypotheses. The procedure can be broken down into the following five steps.  

  • Step 1. Set up hypotheses and select the level of significance α.

H 0 : Null hypothesis (no change, no difference);  

H 1 : Research hypothesis (investigator's belief); α =0.05

  • Step 2. Select the appropriate test statistic.  

The test statistic is a single number that summarizes the sample information. An example of a test statistic is the Z statistic computed as follows:

\(Z=\frac{\bar{x}-\mu_{0}}{s / \sqrt{n}}\)

where \(\bar{x}\) is the sample mean, \(\mu_{0}\) is the mean specified in H 0 , s is the sample standard deviation, and n is the sample size.

When the sample size is small, we will use t statistics (just as we did when constructing confidence intervals for small samples). As we present each scenario, alternative test statistics are provided along with conditions for their appropriate use.

  • Step 3.  Set up decision rule.  

The decision rule is a statement that tells under what circumstances to reject the null hypothesis. The decision rule is based on specific values of the test statistic (e.g., reject H 0 if Z > 1.645). The decision rule for a specific test depends on 3 factors: the research or alternative hypothesis, the test statistic and the level of significance. Each is discussed below.

  • The decision rule depends on whether an upper-tailed, lower-tailed, or two-tailed test is proposed. In an upper-tailed test the decision rule has investigators reject H 0 if the test statistic is larger than the critical value. In a lower-tailed test the decision rule has investigators reject H 0 if the test statistic is smaller than the critical value.  In a two-tailed test the decision rule has investigators reject H 0 if the test statistic is extreme, either larger than an upper critical value or smaller than a lower critical value.
  • The exact form of the test statistic is also important in determining the decision rule. If the test statistic follows the standard normal distribution (Z), then the decision rule will be based on the standard normal distribution. If the test statistic follows the t distribution, then the decision rule will be based on the t distribution. The appropriate critical value will be selected from the t distribution again depending on the specific alternative hypothesis and the level of significance.  
  • The third factor is the level of significance. The level of significance which is selected in Step 1 (e.g., α =0.05) dictates the critical value.   For example, in an upper tailed Z test, if α =0.05 then the critical value is Z=1.645.  

The following figures illustrate the rejection regions defined by the decision rule for upper-, lower- and two-tailed Z tests with α=0.05. Notice that the rejection regions are in the upper, lower and both tails of the curves, respectively. The decision rules are written below each figure.

Standard normal distribution with lower tail at -1.645 and alpha=0.05

Rejection Region for Lower-Tailed Z Test (H 1 : μ < μ 0 ) with α =0.05

The decision rule is: Reject H 0 if Z < -1.645.

Standard normal distribution with two tails

Rejection Region for Two-Tailed Z Test (H 1 : μ ≠ μ 0 ) with α =0.05

The decision rule is: Reject H 0 if Z < -1.960 or if Z > 1.960.

The complete table of critical values of Z for upper, lower and two-tailed tests can be found in the table of Z values to the right in "Other Resources."

Critical values of t for upper, lower and two-tailed tests can be found in the table of t values in "Other Resources."
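The Z critical values in those tables can be regenerated from the standard normal quantile function; here is a standard-library-only sketch for α = 0.05:

```python
from statistics import NormalDist

alpha = 0.05
z = NormalDist()  # standard normal distribution

upper_crit = z.inv_cdf(1 - alpha)           # upper-tailed: reject H0 if Z > 1.645
lower_crit = z.inv_cdf(alpha)               # lower-tailed: reject H0 if Z < -1.645
two_tailed_crit = z.inv_cdf(1 - alpha / 2)  # two-tailed: reject H0 if |Z| > 1.960

print(round(upper_crit, 3), round(lower_crit, 3), round(two_tailed_crit, 2))
# 1.645 -1.645 1.96
```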

  • Step 4. Compute the test statistic.  

Here we compute the test statistic by substituting the observed sample data into the test statistic identified in Step 2.

  • Step 5. Conclusion.  

The final conclusion is made by comparing the test statistic (which is a summary of the information observed in the sample) to the decision rule. The final conclusion will be either to reject the null hypothesis (because the sample data are very unlikely if the null hypothesis is true) or not to reject the null hypothesis (because the sample data are not very unlikely).  

If the null hypothesis is rejected, then an exact significance level is computed to describe the likelihood of observing the sample data assuming that the null hypothesis is true. The exact level of significance is called the p-value and it will be less than the chosen level of significance if we reject H 0 .

Statistical computing packages provide exact p-values as part of their standard output for hypothesis tests. In fact, when using a statistical computing package, the steps outlined above can be abbreviated. The hypotheses (step 1) should always be set up in advance of any analysis and the significance criterion should also be determined (e.g., α = 0.05). Statistical computing packages will produce the test statistic (usually reporting the test statistic as t) and a p-value. The investigator can then determine statistical significance using the following: If p < α then reject H 0 .

  • Step 1. Set up hypotheses and determine level of significance

H 0 : μ = 191
H 1 : μ > 191
α = 0.05

The research hypothesis is that weights have increased, and therefore an upper tailed test is used.

  • Step 2. Select the appropriate test statistic.

Because the sample size is large (n > 30), the appropriate test statistic is

\(Z=\frac{\bar{x}-\mu_{0}}{s / \sqrt{n}}\)

  • Step 3. Set up decision rule.  

In this example, we are performing an upper tailed test (H 1 : μ> 191), with a Z test statistic and selected α =0.05.   Reject H 0 if Z > 1.645.

We now substitute the sample data into the formula for the test statistic identified in Step 2.  

We reject H 0 because 2.38 > 1.645. We have statistically significant evidence, at α = 0.05, to show that the mean weight of men in 2006 is more than 191 pounds.

Because we rejected the null hypothesis, we now approximate the p-value, which is the likelihood of observing the sample data if the null hypothesis is true. An alternative definition of the p-value is the smallest level of significance at which we can still reject H 0 . In this example, we observed Z = 2.38 and for α = 0.05 the critical value was 1.645. Because 2.38 exceeded 1.645 we rejected H 0 , and in our conclusion we reported a statistically significant increase in mean weight at a 5% level of significance.

Using the table of critical values for upper-tailed tests, we can approximate the p-value. If we select α = 0.025, the critical value is 1.960, and we still reject H 0 because 2.38 > 1.960. If we select α = 0.010, the critical value is 2.326, and we still reject H 0 because 2.38 > 2.326. However, if we select α = 0.005, the critical value is 2.576, and we cannot reject H 0 because 2.38 < 2.576. Therefore, the smallest α at which we still reject H 0 is 0.010; this is the p-value. A statistical computing package would produce a more precise p-value, which would be between 0.005 and 0.010. Here we are approximating the p-value and would report p < 0.010.
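The bracketing argument above can be checked directly: the exact upper-tail probability for Z = 2.38 does fall between 0.005 and 0.010 (standard-library-only sketch):

```python
from statistics import NormalDist

z_obs = 2.38
# Upper-tailed p-value: P(Z > 2.38) under the standard normal distribution.
p_value = 1 - NormalDist().cdf(z_obs)

print(round(p_value, 4))            # 0.0087
print(0.005 < p_value < 0.010)      # True, consistent with reporting p < 0.010
```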

Type I and Type II Errors

In all tests of hypothesis, there are two types of errors that can be committed. The first is called a Type I error and refers to the situation where we incorrectly reject H 0 when in fact it is true. This is also called a false positive result (as we incorrectly conclude that the research hypothesis is true when in fact it is not). When we run a test of hypothesis and decide to reject H 0 (e.g., because the test statistic exceeds the critical value in an upper tailed test) then either we make a correct decision because the research hypothesis is true or we commit a Type I error. The different conclusions are summarized in the table below. Note that we will never know whether the null hypothesis is really true or false (i.e., we will never know which row of the following table reflects reality).

Table - Conclusions in Test of Hypothesis

                      Do Not Reject H₀      Reject H₀
H₀ is True            Correct decision      Type I error
H₀ is False           Type II error         Correct decision

In the first step of the hypothesis test, we select a level of significance, α, where α = P(Type I error). Because we purposely select a small value for α, we control the probability of committing a Type I error. For example, if we select α = 0.05, then when the null hypothesis is in fact true there is only a 5% probability that the test will (incorrectly) tell us to reject it. Most investigators are very comfortable with this and are confident when rejecting H₀ that the research hypothesis is true (as it is the more likely scenario when we reject H₀).
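This interpretation of α can be illustrated by simulation: when H₀ is true, an upper-tailed z test at α = 0.05 should reject in about 5% of repeated samples. A sketch with hypothetical values (samples of size 30 from a standard normal population, σ known):

```python
import math
import random
import statistics

random.seed(1)
reps = 20_000
rejections = 0
for _ in range(reps):
    # H0 is true here: the population really has mean 0 (sd 1, known)
    sample = [random.gauss(0, 1) for _ in range(30)]
    z = statistics.mean(sample) / (1 / math.sqrt(30))
    if z > 1.645:              # upper-tailed critical value at alpha = 0.05
        rejections += 1

type1_rate = rejections / reps  # close to .05, as the choice of alpha dictates
```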

When we run a test of hypothesis and decide not to reject H₀ (e.g., because the test statistic is below the critical value in an upper-tailed test), then either we make a correct decision because the null hypothesis is true or we commit a Type II error. Beta (β) represents the probability of a Type II error and is defined as follows: β = P(Type II error) = P(Do not reject H₀ | H₀ is false). Unfortunately, we cannot choose β to be small (e.g., 0.05) to control the probability of committing a Type II error, because β depends on several factors, including the sample size, α, and the research hypothesis. When we do not reject H₀, it may be very likely that we are committing a Type II error (i.e., failing to reject H₀ when in fact it is false). Therefore, when tests are run and the null hypothesis is not rejected, we often make a weak concluding statement allowing for the possibility that we might be committing a Type II error. If we do not reject H₀, we conclude that we do not have significant evidence to show that H₁ is true. We do not conclude that H₀ is true.
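Although β cannot simply be fixed in advance, it can be computed once the sample size, α, and a specific alternative are specified. A sketch for an upper-tailed z test using hypothetical values (μ₀ = 100, true mean 105, σ = 15, n = 36):

```python
import math
from statistics import NormalDist

mu0, mu1, sigma, n, alpha = 100, 105, 15, 36, 0.05  # hypothetical values
se = sigma / math.sqrt(n)
z_crit = NormalDist().inv_cdf(1 - alpha)    # 1.645 for an upper-tailed test

# beta = P(do not reject H0 | the true mean is mu1)
beta = NormalDist().cdf(z_crit - (mu1 - mu0) / se)
power = 1 - beta
```

Here β is about .36, so even with a real 5-point shift the test misses it roughly a third of the time; increasing n shrinks β, which is consistent with the note below that small samples are the most common source of Type II errors.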


 The most common reason for a Type II error is a small sample size.


Content ©2017. All Rights Reserved. Date last modified: November 6, 2017. Wayne W. LaMorte, MD, PhD, MPH


Chapter 13: Inferential Statistics

Some Basic Null Hypothesis Tests

Learning Objectives

  • Conduct and interpret one-sample, dependent-samples, and independent-samples  t  tests.
  • Interpret the results of one-way, repeated measures, and factorial ANOVAs.
  • Conduct and interpret null hypothesis tests of Pearson’s  r .

In this section, we look at several common null hypothesis testing procedures. The emphasis here is on providing enough information to allow you to conduct and interpret the most basic versions. In most cases, the online statistical analysis tools mentioned in  Chapter 12 will handle the computations—as will programs such as Microsoft Excel and SPSS.

The  t  Test

As we have seen throughout this book, many studies in psychology focus on the difference between two means. The most common null hypothesis test for this type of statistical relationship is the  t test . In this section, we look at three types of  t  tests that are used for slightly different research designs: the one-sample  t test, the dependent-samples  t  test, and the independent-samples  t  test.

One-Sample  t  Test

The  one-sample  t test  is used to compare a sample mean ( M ) with a hypothetical population mean (μ0) that provides some interesting standard of comparison. The null hypothesis is that the mean for the population (µ) is equal to the hypothetical population mean: μ = μ0. The alternative hypothesis is that the mean for the population is different from the hypothetical population mean: μ ≠ μ0. To decide between these two hypotheses, we need to find the probability of obtaining the sample mean (or one more extreme) if the null hypothesis were true. But finding this  p  value requires first computing a test statistic called  t . (A test statistic  is a statistic that is computed only to help find the  p  value.) The formula for  t  is as follows:

\[t=\dfrac{M-\mu_0}{\left(\dfrac{SD}{\sqrt{N}}\right)}\]

Again,  M  is the sample mean and µ 0  is the hypothetical population mean of interest.  SD  is the sample standard deviation and  N  is the sample size.

The reason the  t  statistic (or any test statistic) is useful is that we know how it is distributed when the null hypothesis is true. As shown in Figure 13.1, this distribution is unimodal and symmetrical, and it has a mean of 0. Its precise shape depends on a statistical concept called the degrees of freedom, which for a one-sample  t  test is  N  − 1. (There are 24 degrees of freedom for the distribution shown in Figure 13.1.) The important point is that knowing this distribution makes it possible to find the  p value for any  t  score. Consider, for example, a  t  score of +1.50 based on a sample of 25. The probability of a  t  score at least this extreme is given by the proportion of  t  scores in the distribution that are at least this extreme. For now, let us define  extreme  as being far from zero in either direction. Thus the  p  value is the proportion of  t  scores that are +1.50 or above  or  that are −1.50 or below—a value that turns out to be .14.
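Although tables or software normally supply this p value, it can also be approximated by simulation: draw many samples of 25 from a population for which the null hypothesis is true, compute t for each, and count how often |t| ≥ 1.50. A sketch in Python (standard library only; the population values are arbitrary standard-normal draws):

```python
import math
import random
import statistics

random.seed(0)

def t_score(sample, mu0):
    """One-sample t statistic for a hypothetical population mean mu0."""
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return (statistics.mean(sample) - mu0) / se

reps = 20_000
extreme = sum(
    abs(t_score([random.gauss(0, 1) for _ in range(25)], 0)) >= 1.50
    for _ in range(reps)
)
p_approx = extreme / reps  # two-tailed p value, roughly .14 to .15
```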

Figure 13.1. Distribution of t scores (24 degrees of freedom) when the null hypothesis is true. Green vertical lines mark the one-tailed critical values (±1.711); red vertical lines mark the two-tailed critical values (±2.064).

Fortunately, we do not have to deal directly with the distribution of  t  scores. If we were to enter our sample data and hypothetical mean of interest into one of the online statistical tools in  Chapter 12 or into a program like SPSS (Excel does not have a one-sample  t  test function), the output would include both the  t  score and the  p  value. At this point, the rest of the procedure is simple. If  p  is less than .05, we reject the null hypothesis and conclude that the population mean differs from the hypothetical mean of interest. If  p  is greater than .05, we retain the null hypothesis and conclude that there is not enough evidence to say that the population mean differs from the hypothetical mean of interest. (Again, technically, we conclude only that we do not have enough evidence to conclude that it  does  differ.)

If we were to compute the  t  score by hand, we could use a table like Table 13.2 to make the decision. This table does not provide actual  p  values. Instead, it provides the  critical values  of  t  for different degrees of freedom ( df)  when α is .05. For now, let us focus on the two-tailed critical values in the last column of the table. Each of these values should be interpreted as a pair of values: one positive and one negative. For example, the two-tailed critical values when there are 24 degrees of freedom are +2.064 and −2.064. These are represented by the red vertical lines in Figure 13.1. The idea is that any  t  score below the lower critical value (the left-hand red line in Figure 13.1) is in the lowest 2.5% of the distribution, while any  t  score above the upper critical value (the right-hand red line) is in the highest 2.5% of the distribution. Therefore any  t  score beyond the critical value in  either  direction is in the most extreme 5% of  t  scores when the null hypothesis is true and has a  p  value less than .05. Thus if the  t  score we compute is beyond the critical value in either direction, then we reject the null hypothesis. If the  t  score we compute is between the upper and lower critical values, then we retain the null hypothesis.

Thus far, we have considered what is called a  two-tailed test , where we reject the null hypothesis if the  t  score for the sample is extreme in either direction. This test makes sense when we believe that the sample mean might differ from the hypothetical population mean but we do not have good reason to expect the difference to go in a particular direction. But it is also possible to do a  one-tailed test , where we reject the null hypothesis only if the  t  score for the sample is extreme in one direction that we specify before collecting the data. This test makes sense when we have good reason to expect the sample mean will differ from the hypothetical population mean in a particular direction.

Here is how it works. Each one-tailed critical value in Table 13.2 can again be interpreted as a pair of values: one positive and one negative. A  t  score below the lower critical value is in the lowest 5% of the distribution, and a  t  score above the upper critical value is in the highest 5% of the distribution. For 24 degrees of freedom, these values are −1.711 and +1.711. (These are represented by the green vertical lines in Figure 13.1.) However, for a one-tailed test, we must decide before collecting data whether we expect the sample mean to be lower than the hypothetical population mean, in which case we would use only the lower critical value, or we expect the sample mean to be greater than the hypothetical population mean, in which case we would use only the upper critical value. Notice that we still reject the null hypothesis when the  t  score for our sample is in the most extreme 5% of the t scores we would expect if the null hypothesis were true—so α remains at .05. We have simply redefined  extreme  to refer only to one tail of the distribution. The advantage of the one-tailed test is that critical values are less extreme. If the sample mean differs from the hypothetical population mean in the expected direction, then we have a better chance of rejecting the null hypothesis. The disadvantage is that if the sample mean differs from the hypothetical population mean in the unexpected direction, then there is no chance at all of rejecting the null hypothesis.

Example One-Sample t Test

Imagine that a health psychologist is interested in the accuracy of university students’ estimates of the number of calories in a chocolate chip cookie. He shows the cookie to a sample of 10 students and asks each one to estimate the number of calories in it. Because the actual number of calories in the cookie is 250, this is the hypothetical population mean of interest (µ 0 ). The null hypothesis is that the mean estimate for the population (μ) is 250. Because he has no real sense of whether the students will underestimate or overestimate the number of calories, he decides to do a two-tailed test. Now imagine further that the participants’ actual estimates are as follows:

250, 280, 200, 150, 175, 200, 200, 220, 180, 250

The mean estimate for the sample ( M ) is 210.50 calories and the standard deviation ( SD ) is 39.75. The health psychologist can now compute the  t  score for his sample:

\[t=\dfrac{210.50-250}{\left(\dfrac{39.75}{\sqrt{10}}\right)}=-3.14\]

If he enters the data into one of the online analysis tools or uses SPSS, it would also tell him that the two-tailed  p  value for this  t  score (with 10 − 1 = 9 degrees of freedom) is .012. Because this is less than .05, the health psychologist would reject the null hypothesis and conclude that university students tend to underestimate the number of calories in a chocolate chip cookie. If he computes the  t  score by hand, he could look at Table 13.2 and see that the critical value of  t  for a two-tailed test with 9 degrees of freedom is ±2.262. The fact that his  t  score was more extreme than this critical value would tell him that his  p  value is less than .05 and that he should reject the null hypothesis.
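A minimal Python sketch of this test, using only the standard library and comparing |t| with the tabled critical value. The statistics are recomputed directly from the ten listed estimates, so small discrepancies with rounded values in the text reflect that recomputation:

```python
import math
import statistics

estimates = [250, 280, 200, 150, 175, 200, 200, 220, 180, 250]
mu0 = 250  # actual calories in the cookie

m = statistics.mean(estimates)
sd = statistics.stdev(estimates)
t = (m - mu0) / (sd / math.sqrt(len(estimates)))  # about -3.1

# Two-tailed critical value for 9 degrees of freedom at alpha = .05
reject = abs(t) > 2.262  # True: reject the null hypothesis
```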

Finally, if this researcher had gone into this study with good reason to expect that university students underestimate the number of calories, then he could have done a one-tailed test instead of a two-tailed test. The only thing this decision would change is the critical value, which would be −1.833. This slightly less extreme value would make it a bit easier to reject the null hypothesis. However, if it turned out that university students overestimate the number of calories—no matter how much they overestimate it—the researcher would not have been able to reject the null hypothesis.

The Dependent-Samples t Test

The  dependent-samples t test  (sometimes called the paired-samples  t  test) is used to compare two means for the same sample tested at two different times or under two different conditions. This comparison is appropriate for pretest-posttest designs or within-subjects experiments. The null hypothesis is that the means at the two times or under the two conditions are the same in the population. The alternative hypothesis is that they are not the same. This test can also be one-tailed if the researcher has good reason to expect the difference goes in a particular direction.

It helps to think of the dependent-samples  t  test as a special case of the one-sample  t  test. However, the first step in the dependent-samples  t  test is to reduce the two scores for each participant to a single  difference score  by taking the difference between them. At this point, the dependent-samples  t  test becomes a one-sample  t  test on the difference scores. The hypothetical population mean (µ 0 ) of interest is 0 because this is what the mean difference score would be if there were no difference on average between the two times or two conditions. We can now think of the null hypothesis as being that the mean difference score in the population is 0 (µ = 0) and the alternative hypothesis as being that the mean difference score in the population is not 0 (µ ≠ 0).

Example Dependent-Samples t Test

Imagine that the health psychologist now knows that people tend to underestimate the number of calories in junk food and has developed a short training program to improve their estimates. To test the effectiveness of this program, he conducts a pretest-posttest study in which 10 participants estimate the number of calories in a chocolate chip cookie before the training program and then again afterward. Because he expects the program to increase the participants’ estimates, he decides to do a one-tailed test. Now imagine further that the pretest estimates are

230, 250, 280, 175, 150, 200, 180, 210, 220, 190

and that the posttest estimates (for the same participants in the same order) are

250, 260, 250, 200, 160, 200, 200, 180, 230, 240

The difference scores, then, are as follows:

+20, +10, −30, +25, +10, 0, +20, −30, +10, +50

Note that it does not matter whether the first set of scores is subtracted from the second or the second from the first as long as it is done the same way for all participants. In this example, it makes sense to subtract the pretest estimates from the posttest estimates so that positive difference scores mean that the estimates went up after the training and negative difference scores mean the estimates went down.

The mean of the difference scores is 8.50 with a standard deviation of 24.27. The health psychologist can now compute the  t  score for his sample as follows:

\[t=\dfrac{8.5-0}{\left(\dfrac{24.27}{\sqrt{10}}\right)}=1.11\]

If he enters the data into one of the online analysis tools or uses Excel or SPSS, it would tell him that the one-tailed  p  value for this  t  score (again with 10 − 1 = 9 degrees of freedom) is .148. Because this is greater than .05, he would retain the null hypothesis and conclude that the training program does not increase people’s calorie estimates. If he were to compute the  t  score by hand, he could look at Table 13.2 and see that the critical value of  t for a one-tailed test with 9 degrees of freedom is +1.833. (It is positive this time because he was expecting a positive mean difference score.) The fact that his  t score was less extreme than this critical value would tell him that his  p  value is greater than .05 and that he should fail to reject the null hypothesis.
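A standard-library sketch of the same test, treating it as a one-sample t test on the difference scores, as described above:

```python
import math
import statistics

pretest = [230, 250, 280, 175, 150, 200, 180, 210, 220, 190]
posttest = [250, 260, 250, 200, 160, 200, 200, 180, 230, 240]

# Reduce each participant's two scores to a single difference score
diffs = [post - pre for pre, post in zip(pretest, posttest)]

m = statistics.mean(diffs)                  # 8.50
sd = statistics.stdev(diffs)
t = (m - 0) / (sd / math.sqrt(len(diffs)))  # about 1.11

# One-tailed critical value for 9 degrees of freedom at alpha = .05
reject = t > 1.833  # False: retain the null hypothesis
```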

The Independent-Samples  t  Test

The  independent-samples  t test  is used to compare the means of two separate samples ( M 1  and  M 2 ). The two samples might have been tested under different conditions in a between-subjects experiment, or they could be preexisting groups in a correlational design (e.g., women and men, extraverts and introverts). The null hypothesis is that the means of the two populations are the same: µ 1  = µ 2 . The alternative hypothesis is that they are not the same: µ 1  ≠ µ 2 . Again, the test can be one-tailed if the researcher has good reason to expect the difference goes in a particular direction.

The  t  statistic here is a bit more complicated because it must take into account two sample means, two standard deviations, and two sample sizes. The formula is as follows:

\[t=\dfrac{M_1-M_2}{\sqrt{\dfrac{{SD_1}^2}{n_1}+\dfrac{{SD_2}^2}{n_2}}}\]

Notice that this formula includes squared standard deviations (the variances) that appear inside the square root symbol. Also, lowercase  n 1  and  n 2  refer to the sample sizes in the two groups or conditions (as opposed to capital  N , which generally refers to the total sample size). The only additional thing to know here is that there are  N  − 2 degrees of freedom for the independent-samples  t  test.

Example Independent-Samples t Test

Now the health psychologist wants to compare the calorie estimates of people who regularly eat junk food with the estimates of people who rarely eat junk food. He believes the difference could come out in either direction so he decides to conduct a two-tailed test. He collects data from a sample of eight participants who eat junk food regularly and seven participants who rarely eat junk food. The data are as follows:

Junk food eaters: 180, 220, 150, 85, 200, 170, 150, 190

Non–junk food eaters: 200, 240, 190, 175, 200, 300, 240

The mean for the junk food eaters is 168.12 with a standard deviation of 41.23. The mean for the non–junk food eaters is 220.71 with a standard deviation of 42.66. He can now compute his  t  score as follows:

\[t=\dfrac{220.71-168.12}{\sqrt{\dfrac{41.23^2}{8}+\dfrac{42.66^2}{7}}}=2.42\]

If he enters the data into one of the online analysis tools or uses Excel or SPSS, it would tell him that the two-tailed  p  value for this  t  score (with 15 − 2 = 13 degrees of freedom) is .031. Because this p value is less than .05, the health psychologist would reject the null hypothesis and conclude that people who eat junk food regularly make lower calorie estimates than people who eat it rarely. If he were to compute the  t  score by hand, he could look at Table 13.2 and see that the critical value of  t  for a two-tailed test with 13 degrees of freedom is ±2.160. The fact that his  t  score was more extreme than this critical value would tell him that his  p  value is less than .05 and that he should reject the null hypothesis.
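The same t score can be sketched with Python's standard library, applying the formula above directly to the listed data (group means and variances recomputed from the data):

```python
import math
import statistics

junk = [180, 220, 150, 85, 200, 170, 150, 190]
non_junk = [200, 240, 190, 175, 200, 300, 240]

m1, m2 = statistics.mean(junk), statistics.mean(non_junk)      # 168.12, 220.71
v1, v2 = statistics.variance(junk), statistics.variance(non_junk)

t = (m2 - m1) / math.sqrt(v1 / len(junk) + v2 / len(non_junk))  # about 2.42

# Two-tailed critical value for N - 2 = 13 degrees of freedom at alpha = .05
reject = abs(t) > 2.160  # True: reject the null hypothesis
```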

The Analysis of Variance

When there are more than two groups or condition means to be compared, the most common null hypothesis test is the  analysis of variance  (ANOVA) . In this section, we look primarily at the  one-way ANOVA , which is used for between-subjects designs with a single independent variable. We then briefly consider some other versions of the ANOVA that are used for within-subjects and factorial research designs.

One-Way ANOVA

The one-way ANOVA is used to compare the means of more than two samples ( M 1 ,  M 2 … M G ) in a between-subjects design. The null hypothesis is that all the means are equal in the population: µ 1 = µ 2  =…= µ G . The alternative hypothesis is that not all the means in the population are equal.

The test statistic for the ANOVA is called  F . It is a ratio of two estimates of the population variance based on the sample data. One estimate of the population variance is called the  mean squares between groups (MS B )  and is based on the differences among the sample means. The other is called the mean squares within groups (MS W )  and is based on the differences among the scores within each group. The  F  statistic is the ratio of the  MS B  to the  MS W and can therefore be expressed as follows:

\[F=\dfrac{MS_B}{MS_W}\]

Again, the reason that  F  is useful is that we know how it is distributed when the null hypothesis is true. As shown in Figure 13.2, this distribution is unimodal and positively skewed with values that cluster around 1. The precise shape of the distribution depends on both the number of groups and the sample size, and there is a degrees of freedom value associated with each of these. The between-groups degrees of freedom is the number of groups minus one:  df B  = ( G  − 1). The within-groups degrees of freedom is the total sample size minus the number of groups:  df W  =  N  −  G . Again, knowing the distribution of  F when the null hypothesis is true allows us to find the  p  value.

Figure 13.2. Distribution of the F ratio when the null hypothesis is true: a unimodal, positively skewed curve peaking near 1, with the critical value (approximately 2.8) marked.

The online tools in  Chapter 12 and statistical software such as Excel and SPSS will compute  F  and find the  p  value. If  p  is less than .05, then we reject the null hypothesis and conclude that there are differences among the group means in the population. If  p  is greater than .05, then we retain the null hypothesis and conclude that there is not enough evidence to say that there are differences. In the unlikely event that we would compute  F  by hand, we can use a table of critical values like Table 13.3 “Table of Critical Values of  F ” to make the decision. The idea is that any  F  ratio greater than the critical value has a  p value of less than .05. Thus if the  F  ratio we compute is beyond the critical value, then we reject the null hypothesis. If the F ratio we compute is less than the critical value, then we retain the null hypothesis.

Example One-Way ANOVA

Imagine that the health psychologist wants to compare the calorie estimates of psychology majors, nutrition majors, and professional dieticians. He collects the following data:

Psych majors: 200, 180, 220, 160, 150, 200, 190, 200

Nutrition majors: 190, 220, 200, 230, 160, 150, 200, 210

Dieticians: 220, 250, 240, 275, 250, 230, 200, 240

The means are 187.50 ( SD  = 23.14), 195.00 ( SD  = 27.77), and 238.13 ( SD  = 22.35), respectively. So it appears that dieticians made substantially more accurate estimates on average. The researcher would almost certainly enter these data into a program such as Excel or SPSS, which would compute  F  for him and find the  p  value. Table 13.4 shows the output of the one-way ANOVA function in Excel for these data. This table is referred to as an ANOVA table. It shows that  MS B  is 5,971.88,  MS W  is 602.23, and their ratio,  F , is 9.92. The  p  value is .0009. Because this value is below .05, the researcher would reject the null hypothesis and conclude that the mean calorie estimates for the three groups are not the same in the population. Notice that the ANOVA table also includes the “sum of squares” ( SS ) for between groups and for within groups. These values are computed on the way to finding  MS B  and MS W  but are not typically reported by the researcher. Finally, if the researcher were to compute the  F  ratio by hand, he could look at Table 13.3 and see that the critical value of  F  with 2 and 21 degrees of freedom is 3.467 (the same value in Table 13.4 under  F crit ). The fact that his  F  score was more extreme than this critical value would tell him that his  p  value is less than .05 and that he should reject the null hypothesis.
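The computations behind an ANOVA table like Table 13.4 can be sketched with Python's standard library. The nutrition list below contains the eight values consistent with the reported M = 195.00 and SD = 27.77:

```python
import statistics

psych = [200, 180, 220, 160, 150, 200, 190, 200]
nutrition = [190, 220, 200, 230, 160, 150, 200, 210]
dieticians = [220, 250, 240, 275, 250, 230, 200, 240]
groups = [psych, nutrition, dieticians]

n_total = sum(len(g) for g in groups)
grand_mean = sum(sum(g) for g in groups) / n_total

# Between groups: how far each group mean sits from the grand mean
ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
# Within groups: how far scores sit from their own group mean
ss_within = sum(sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups)

df_between = len(groups) - 1       # G - 1 = 2
df_within = n_total - len(groups)  # N - G = 21

ms_between = ss_between / df_between  # about 5971.88
ms_within = ss_within / df_within     # about 602.23
F = ms_between / ms_within            # about 9.92, beyond the critical 3.467
```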

ANOVA Elaborations

Post hoc comparisons.

When we reject the null hypothesis in a one-way ANOVA, we conclude that the group means are not all the same in the population. But this can indicate different things. With three groups, it can indicate that all three means are significantly different from each other. Or it can indicate that one of the means is significantly different from the other two, but the other two are not significantly different from each other. It could be, for example, that the mean calorie estimates of psychology majors, nutrition majors, and dieticians are all significantly different from each other. Or it could be that the mean for dieticians is significantly different from the means for psychology and nutrition majors, but the means for psychology and nutrition majors are not significantly different from each other. For this reason, statistically significant one-way ANOVA results are typically followed up with a series of  post hoc comparisons  of selected pairs of group means to determine which are different from which others.

One approach to post hoc comparisons would be to conduct a series of independent-samples  t  tests comparing each group mean to each of the other group means. But there is a problem with this approach. In general, if we conduct a  t  test when the null hypothesis is true, we have a 5% chance of mistakenly rejecting the null hypothesis (see Section 13.3 “Additional Considerations” for more on such Type I errors). If we conduct several  t  tests when the null hypothesis is true, the chance of mistakenly rejecting  at least one null hypothesis increases with each test we conduct. Thus researchers do not usually make post hoc comparisons using standard  t  tests because there is too great a chance that they will mistakenly reject at least one null hypothesis. Instead, they use one of several modified  t  test procedures—among them the Bonferroni procedure, Fisher’s least significant difference (LSD) test, and Tukey’s honestly significant difference (HSD) test. The details of these approaches are beyond the scope of this book, but it is important to understand their purpose. It is to keep the risk of mistakenly rejecting a true null hypothesis to an acceptable level (close to 5%).
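The growing chance of at least one false rejection is easy to quantify: with k independent tests each at α = .05, it is 1 − (1 − α)^k, and the Bonferroni procedure compensates by testing each comparison at α/k. A quick sketch:

```python
alpha = 0.05

for k in (1, 3, 10):
    familywise = 1 - (1 - alpha) ** k  # P(at least one Type I error)
    bonferroni_alpha = alpha / k       # per-comparison level under Bonferroni
    print(k, round(familywise, 3), bonferroni_alpha)
```

With three comparisons the familywise rate is already about .14, and with ten it exceeds .40; Bonferroni pulls it back near .05 at the cost of a stricter criterion for each individual comparison.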

Repeated-Measures ANOVA

Recall that the one-way ANOVA is appropriate for between-subjects designs in which the means being compared come from separate groups of participants. It is not appropriate for within-subjects designs in which the means being compared come from the same participants tested under different conditions or at different times. This requires a slightly different approach, called the repeated-measures ANOVA . The basics of the repeated-measures ANOVA are the same as for the one-way ANOVA. The main difference is that measuring the dependent variable multiple times for each participant allows for a more refined measure of  MS W . Imagine, for example, that the dependent variable in a study is a measure of reaction time. Some participants will be faster or slower than others because of stable individual differences in their nervous systems, muscles, and other factors. In a between-subjects design, these stable individual differences would simply add to the variability within the groups and increase the value of  MS W . In a within-subjects design, however, these stable individual differences can be measured and subtracted from the value of  MS W . This lower value of  MS W  means a higher value of  F  and a more sensitive test.

Factorial ANOVA

When more than one independent variable is included in a factorial design, the appropriate approach is the  factorial ANOVA . Again, the basics of the factorial ANOVA are the same as for the one-way and repeated-measures ANOVAs. The main difference is that it produces an  F  ratio and  p  value for each main effect and for each interaction. Returning to our calorie estimation example, imagine that the health psychologist tests the effect of participant major (psychology vs. nutrition) and food type (cookie vs. hamburger) in a factorial design. A factorial ANOVA would produce separate  F  ratios and  p values for the main effect of major, the main effect of food type, and the interaction between major and food. Appropriate modifications must be made depending on whether the design is between subjects, within subjects, or mixed.

Testing Pearson’s  r

For relationships between quantitative variables, where Pearson’s  r  is used to describe the strength of those relationships, the appropriate null hypothesis test is a test of Pearson’s  r . The basic logic is exactly the same as for other null hypothesis tests. In this case, the null hypothesis is that there is no relationship in the population. We can use the Greek lowercase rho (ρ) to represent the relevant parameter: ρ = 0. The alternative hypothesis is that there is a relationship in the population: ρ ≠ 0. As with the  t  test, this test can be two-tailed if the researcher has no expectation about the direction of the relationship or one-tailed if the researcher expects the relationship to go in a particular direction.

It is possible to use Pearson’s  r  for the sample to compute a  t  score with  N  − 2 degrees of freedom and then to proceed as for a  t  test. However, because of the way it is computed, Pearson’s  r  can also be treated as its own test statistic. The online statistical tools and statistical software such as Excel and SPSS generally compute Pearson’s  r  and provide the  p  value associated with that value of Pearson’s  r . As always, if the  p  value is less than .05, we reject the null hypothesis and conclude that there is a relationship between the variables in the population. If the  p  value is greater than .05, we retain the null hypothesis and conclude that there is not enough evidence to say there is a relationship in the population. If we compute Pearson’s  r  by hand, we can use a table like Table 13.5, which shows the critical values of  r  for various samples sizes when α is .05. A sample value of Pearson’s  r  that is more extreme than the critical value is statistically significant.

Example Test of Pearson’s  r

Imagine that the health psychologist is interested in the correlation between people’s calorie estimates and their weight. He has no expectation about the direction of the relationship, so he decides to conduct a two-tailed test. He computes the correlation for a sample of 22 university students and finds that Pearson’s  r  is −.21. The statistical software he uses tells him that the  p  value is .348. It is greater than .05, so he retains the null hypothesis and concludes that there is no relationship between people’s calorie estimates and their weight. If he were to compute Pearson’s  r  by hand, he could look at Table 13.5 and see that the critical value for 22 − 2 = 20 degrees of freedom is .444. The fact that Pearson’s  r  for the sample is less extreme than this critical value tells him that the  p  value is greater than .05 and that he should retain the null hypothesis.
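The r-to-t conversion mentioned above gives a way to check this result by hand. A sketch with the example's values (r = −.21, N = 22); the value 2.086 is the two-tailed critical value of t for 20 degrees of freedom at α = .05:

```python
import math

r, n = -0.21, 22
t = r * math.sqrt((n - 2) / (1 - r ** 2))  # about -0.96

# Two-tailed critical value of t for N - 2 = 20 degrees of freedom at alpha = .05
significant = abs(t) > 2.086  # False: retain the null hypothesis
```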

Key Takeaways

  • To compare two means, the most common null hypothesis test is the  t  test. The one-sample  t  test is used for comparing one sample mean with a hypothetical population mean of interest, the dependent-samples  t  test is used to compare two means in a within-subjects design, and the independent-samples  t  test is used to compare two means in a between-subjects design.
  • To compare more than two means, the most common null hypothesis test is the analysis of variance (ANOVA). The one-way ANOVA is used for between-subjects designs with one independent variable, the repeated-measures ANOVA is used for within-subjects designs, and the factorial ANOVA is used for factorial designs.
  • A null hypothesis test of Pearson’s  r  is used to compare a sample value of Pearson’s  r  with a hypothetical population value of 0.
  • Practice: Use one of the online tools, Excel, or SPSS to reproduce the one-sample  t  test, dependent-samples  t  test, independent-samples  t  test, and one-way ANOVA for the four sets of calorie estimation data presented in this section.
  • Practice: A sample of 25 university students rated their friendliness on a scale of 1 ( Much Lower Than Average ) to 7 ( Much Higher Than Average ). Their mean rating was 5.30 with a standard deviation of 1.50. Conduct a one-sample  t test comparing their mean rating with a hypothetical mean rating of 4 ( Average ). The question is whether university students have a tendency to rate themselves as friendlier than average.
  • Practice: Decide whether each of the following correlations is statistically significant at α = .05 (two-tailed):
      ◦ The correlation between height and IQ is +.13 in a sample of 35.
      ◦ For a sample of 88 university students, the correlation between how disgusted they felt and the harshness of their moral judgments was +.23.
      ◦ The correlation between the number of daily hassles and positive mood is −.43 for a sample of 30 middle-aged adults.

t test: A common null hypothesis test examining the difference between two means.

One-sample t test: Compares a sample mean with a hypothetical population mean that provides some interesting standard of comparison.

Test statistic: A statistic that is computed only to help find the p value.

Critical values: Points on the test distribution that are compared to the test statistic to determine whether to reject the null hypothesis.

Two-tailed test: The null hypothesis is rejected if the t score for the sample is extreme in either direction.

One-tailed test: The null hypothesis is rejected only if the t score for the sample is extreme in one direction that we specify before collecting the data.

Dependent-samples t test: Statistical test used to compare two means for the same sample tested at two different times or under two different conditions.

Difference score: Variable formed by subtracting one variable from another.

Independent-samples t test: Statistical test used to compare the means of two separate samples.

Analysis of variance (ANOVA): Most common null hypothesis test when there are more than two groups or condition means to be compared.

One-way ANOVA: A null hypothesis test that is used for between-subjects designs with a single independent variable.

Mean squares between groups (MSB): An estimate of population variance based on the differences among the sample means.

Mean squares within groups (MSW): An estimate of population variance based on the differences among the scores within each group.

Post hoc comparisons: Analysis of selected pairs of group means to determine which are different from which others.

Repeated-measures ANOVA: ANOVA for within-subjects designs, in which the dependent variable is measured multiple times for each participant, allowing a more refined measure of MSW.

Factorial ANOVA: A null hypothesis test that is used when more than one independent variable is included in a factorial design.

Research Methods in Psychology - 2nd Canadian Edition Copyright © 2015 by Paul C. Price, Rajiv Jhangiani, & I-Chant A. Chiang is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.


Two Sample t-test: Definition, Formula, and Example

A two sample t-test is used to determine whether or not two population means are equal.

This tutorial explains the following:

  • The motivation for performing a two sample t-test.
  • The formula to perform a two sample t-test.
  • The assumptions that should be met to perform a two sample t-test.
  • An example of how to perform a two sample t-test.

Two Sample t-test: Motivation

Suppose we want to know whether or not the mean weight between two different species of turtles is equal. Since there are thousands of turtles in each population, it would be too time-consuming and costly to go around and weigh each individual turtle.

Instead, we might take a simple random sample of 15 turtles from each population and use the mean weight in each sample to determine if the mean weight is equal between the two populations:

Two sample t-test example

However, it’s virtually guaranteed that the mean weight between the two samples will be at least a little different. The question is whether or not this difference is statistically significant . Fortunately, a two sample t-test allows us to answer this question.

Two Sample t-test: Formula

A two-sample t-test always uses the following null hypothesis:

  • H0: μ1 = μ2 (the two population means are equal)

The alternative hypothesis can be either two-tailed, left-tailed, or right-tailed:

  • H1 (two-tailed): μ1 ≠ μ2 (the two population means are not equal)
  • H1 (left-tailed): μ1 < μ2 (population 1 mean is less than population 2 mean)
  • H1 (right-tailed): μ1 > μ2 (population 1 mean is greater than population 2 mean)

We use the following formula to calculate the test statistic t:

Test statistic: t = (x̄1 − x̄2) / (sp √(1/n1 + 1/n2))

where x̄1 and x̄2 are the sample means, n1 and n2 are the sample sizes, and the pooled standard deviation sp is calculated as:

sp = √[ ((n1 − 1)s1² + (n2 − 1)s2²) / (n1 + n2 − 2) ]

where s1² and s2² are the sample variances.

If the p-value that corresponds to the test statistic t with (n1 + n2 − 2) degrees of freedom is less than your chosen significance level (common choices are 0.10, 0.05, and 0.01), then you can reject the null hypothesis.
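The formula above can be sketched as a small Python helper, assuming only the standard library. This computes the test statistic and its degrees of freedom; getting the p-value itself still requires a t table or a statistics library.

```python
import math

def two_sample_t(x1, s1, n1, x2, s2, n2):
    """Pooled two-sample t statistic and degrees of freedom (equal variances assumed)."""
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    t = (x1 - x2) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

# Sanity check: two samples with identical summary statistics give t = 0
t, df = two_sample_t(10.0, 2.0, 30, 10.0, 2.0, 30)
```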

Two Sample t-test: Assumptions

For the results of a two sample t-test to be valid, the following assumptions should be met:

  • The observations in one sample should be independent of the observations in the other sample.
  • The data should be approximately normally distributed.
  • The two samples should have approximately the same variance. If this assumption is not met, you should instead perform Welch’s t-test .
  • The data in both samples was obtained using a random sampling method .

Two Sample t-test : Example

Suppose we want to know whether or not the mean weight between two different species of turtles is equal. To test this, we will perform a two sample t-test at significance level α = 0.05 using the following steps:

Step 1: Gather the sample data.

Suppose we collect a random sample of turtles from each population with the following information:

  • Sample size n 1 = 40
  • Sample mean weight  x 1  = 300
  • Sample standard deviation s 1 = 18.5
  • Sample size n 2 = 38
  • Sample mean weight  x 2  = 305
  • Sample standard deviation s 2 = 16.7

Step 2: Define the hypotheses.

We will perform the two sample t-test with the following hypotheses:

  • H0: μ1 = μ2 (the two population means are equal)
  • H1: μ1 ≠ μ2 (the two population means are not equal)

Step 3: Calculate the test statistic  t .

First, we will calculate the pooled standard deviation s p :

sp = √[ ((40 − 1)(18.5)² + (38 − 1)(16.7)²) / (40 + 38 − 2) ] = 17.647

Next, we will calculate the test statistic  t :

t = (x̄1 − x̄2) / (sp √(1/n1 + 1/n2)) = (300 − 305) / (17.647 × √(1/40 + 1/38)) = −1.2508

Step 4: Calculate the p-value of the test statistic  t .

According to the T Score to P Value Calculator , the p-value associated with t = -1.2508 and degrees of freedom = n 1 +n 2 -2 = 40+38-2 = 76 is  0.21484 .

Step 5: Draw a conclusion.

Since this p-value is not less than our significance level α = 0.05, we fail to reject the null hypothesis. We do not have sufficient evidence to say that the mean weight of turtles between these two populations is different.
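Steps 3 and 4 above can be cross-checked with a short Python sketch (standard library only; the p-value itself still comes from a t distribution table or calculator, as in the text):

```python
import math

# Summary statistics from the turtle example
n1, x1, s1 = 40, 300, 18.5
n2, x2, s2 = 38, 305, 16.7

sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))  # ≈ 17.647
t = (x1 - x2) / (sp * math.sqrt(1 / n1 + 1 / n2))                      # ≈ -1.25
df = n1 + n2 - 2                                                       # 76
```

With |t| ≈ 1.25 and 76 degrees of freedom, the two-tailed p-value is about 0.215, matching the conclusion above.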

Note:  You can also perform this entire two sample t-test by simply using the Two Sample t-test Calculator .

Additional Resources

The following tutorials explain how to perform a two-sample t-test using different statistical programs:

  • How to Perform a Two Sample t-test in Excel
  • How to Perform a Two Sample t-test in SPSS
  • How to Perform a Two Sample t-test in Stata
  • How to Perform a Two Sample t-test in R
  • How to Perform a Two Sample t-test in Python
  • How to Perform a Two Sample t-test on a TI-84 Calculator


One- and Two-Tailed Tests

In the previous example, you tested a research hypothesis that predicted not only that the sample mean would be different from the population mean but that it would be different in a specific direction—it would be lower. This test is called a directional or one‐tailed test because the region of rejection is entirely within one tail of the distribution.

Some hypotheses predict only that one value will be different from another, without additionally predicting which will be higher. The test of such a hypothesis is nondirectional or two‐tailed because an extreme test statistic in either tail of the distribution (positive or negative) will lead to the rejection of the null hypothesis of no difference.

Suppose that you suspect that a particular class's performance on a proficiency test is not representative of those people who have taken the test. The national mean score on the test is 74.

The research hypothesis is:

The mean score of the class on the test is not 74.

Or in notation: H a : μ ≠ 74

The null hypothesis is:

The mean score of the class on the test is 74.

In notation: H 0 : μ = 74

As in the last example, you decide to use a 5 percent probability level for the test. Both tests have a region of rejection, then, of 5 percent, or 0.05. In this example, however, the rejection region must be split between both tails of the distribution—0.025 in the upper tail and 0.025 in the lower tail—because your hypothesis specifies only a difference, not a direction, as shown in Figure 1(a). You will reject the null hypothesis of no difference if the class sample mean is either much higher or much lower than the population mean of 74. In the previous example, only a sample mean much lower than the population mean would have led to the rejection of the null hypothesis.

Figure 1. Comparison of (a) a two-tailed test and (b) a one-tailed test, at the same probability level (95 percent).


The decision of whether to use a one‐ or a two‐tailed test is important because a test statistic that falls in the region of rejection in a one‐tailed test may not do so in a two‐tailed test, even though both tests use the same probability level. Suppose the class sample mean in your example was 77, and its corresponding z ‐score was computed to be 1.80. Table 2 in "Statistics Tables" shows the critical z ‐scores for a probability of 0.025 in either tail to be –1.96 and 1.96. In order to reject the null hypothesis, the test statistic must be either smaller than –1.96 or greater than 1.96. It is not, so you cannot reject the null hypothesis. Refer to Figure 1(a).

Suppose, however, you had a reason to expect that the class would perform better on the proficiency test than the population, and you did a one‐tailed test instead. For this test, the rejection region of 0.05 would be entirely within the upper tail. The critical z ‐value for a probability of 0.05 in the upper tail is 1.65. (Remember that Table 2 in "Statistics Tables" gives areas of the curve below z ; so you look up the z ‐value for a probability of 0.95.) Your computed test statistic of z = 1.80 exceeds the critical value and falls in the region of rejection, so you reject the null hypothesis and say that your suspicion that the class was better than the population was supported. See Figure 1(b).
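The contrast between the two decisions can be reproduced with Python's standard library, using `statistics.NormalDist` for the normal curve areas (a sketch of the comparison, not part of the original text):

```python
from statistics import NormalDist

z = 1.80  # test statistic from the example (class mean 77 vs. population mean 74)

# Two-tailed p-value: area in both tails beyond |z|
p_two = 2 * (1 - NormalDist().cdf(abs(z)))  # ≈ 0.072 > 0.05, so do not reject
# One-tailed (upper-tail) p-value: area to the right of z
p_one = 1 - NormalDist().cdf(z)             # ≈ 0.036 < 0.05, so reject
```

The same z of 1.80 fails the two-tailed test (critical values ±1.96) but passes the one-tailed test (critical value 1.65), exactly as described above.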

In practice, you should use a one‐tailed test only when you have good reason to expect that the difference will be in a particular direction. A two‐tailed test is more conservative than a one‐tailed test because a two‐tailed test takes a more extreme test statistic to reject the null hypothesis.


The p-value and rejecting the null (for one- and two-tail tests)


What is the p-value?

The p-value (or the observed level of significance) is the smallest level of significance at which you can reject the null hypothesis, assuming the null hypothesis is true.

You can also think about the p-value as the total area of the region of rejection. Remember that in a one-tailed test, the region of rejection is consolidated into one tail, whereas in a two-tailed test, the rejection region is split between two tails.


So, as you might expect, calculating the p-value as the area of the rejection region will be slightly different depending on whether we’re using a two-tailed test or a one-tailed test, and whether the one-tailed test is an upper-tail test or a lower-tail test.

Calculating the p-value

For a one-tailed, lower-tail test

For a one-tailed test, first calculate your z test statistic. For a lower-tail test, z will be negative. Look up the z-value in a z-table; the value you find in the body of the table is the area under the probability distribution curve to the left of your negative z-value.

For instance, assume you found z = −1.46. In a z-table, you find

[z-table excerpt: the entry for z = −1.46 is 0.0721]

So 0.0721 is the area under the curve to the left of z = −1.46, and this is the p-value as well. So p = 0.0721.

For a one-tailed, upper-tail test

For a one-tailed test, first calculate your z test statistic. For an upper-tail test, z will be positive. Look up the z-value in a z-table; the value you find in the body of the table is the area under the probability distribution curve to the left of your positive z-value.

For instance, assume you found z = 1.46. In a z-table, you find

[z-table excerpt: the entry for z = 1.46 is 0.9279]

But in an upper-tail test, you’re interested in the area to the right of the z-value, not the area to the left. To find the area to the right, you need to subtract the value in the z-table from 1:

1 − 0.9279 = 0.0721

So 0.0721 is the area under the curve to the right of z = 1.46, and this is the p-value as well. So p = 0.0721.

For a two-tailed test

For a two-tailed test, first calculate your z test statistic; z could be either positive or negative. Look up the z-value in a z-table; the value you find in the body of the table is the area under the probability distribution curve to the left of your z-value.

For instance, assume you found z = 1.23. In a z-table, you find

[z-table excerpt: the entry for z = 1.23 is 0.8907]

But for a positive z-value, you’re interested in the area to the right of the z-value, not the area to the left. To find the area to the right, you need to subtract the value in the z-table from 1:

1 − 0.8907 = 0.1093

So 0.1093 is the area under the curve to the right of z = 1.23. Because this is a two-tailed test, the region of rejection is not only the 10.93% of area under the upper tail, but also the symmetrical 10.93% of area under the lower tail. So we double 0.1093 to get 2(0.1093) = 0.2186, and this is the p-value. So p = 0.2186.
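All three table lookups above can be replicated with Python's built-in `statistics.NormalDist` (a sketch; slight differences from the table values are just rounding):

```python
from statistics import NormalDist

Phi = NormalDist().cdf  # standard normal CDF: area to the left of z

p_lower = Phi(-1.46)         # lower-tail test: area to the left  ≈ 0.0721
p_upper = 1 - Phi(1.46)      # upper-tail test: area to the right ≈ 0.0721
p_two = 2 * (1 - Phi(1.23))  # two-tailed test: both tails        ≈ 0.2187 (0.2186 by table)
```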


How to reject the null hypothesis

The reason we’ve gone through all this work to understand the p-value is that it gives a really quick way to decide whether or not to reject the null hypothesis.

Whether or not you should reject H0 is determined by the relationship between the α level and the p-value:

If p ≤ α, reject the null hypothesis.

If p > α, do not reject the null hypothesis.

In our earlier examples, we found

p = 0.0721 for the lower-tail one-tailed test

p = 0.0721 for the upper-tail one-tailed test

p = 0.2186 for the two-tailed test

With these in mind, let’s say for instance you set the confidence level of your hypothesis test at 90%, which is the same as setting the α level at α = 0.10. In that case,

p = 0.0721 ≤ α = 0.10

p = 0.2186 > α = 0.10

So we would have rejected the null hypothesis for both one-tailed tests, but we would have failed to reject the null in the two-tailed test. If, however, we’d picked a more rigorous α = 0.05 or α = 0.01, we would have failed to reject the null hypothesis every time.
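The decision rule can be sketched as a tiny Python function applied to the three example p-values (an illustration, not part of the original text):

```python
def decide(p, alpha):
    """Apply the decision rule: reject H0 when p <= alpha."""
    return "reject H0" if p <= alpha else "fail to reject H0"

d_lower = decide(0.0721, 0.10)   # one-tailed p clears the alpha = 0.10 bar
d_two = decide(0.2186, 0.10)     # the two-tailed p does not
d_strict = decide(0.0721, 0.05)  # at the stricter alpha = 0.05, it fails too
```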

Significance

The significance (or statistical significance) of a test is the probability of obtaining your result by chance alone. The less likely it is that we obtained a result by chance, the more significant our result.

Hopefully it’s not too surprising by now that all of these are equivalent statements:

The finding is significant at the 0.01 level

The confidence level is 99%

The Type I error rate is 0.01

The alpha level is 0.01, α = 0.01

The area of the rejection region is 0.01

The p-value is 0.01, p = 0.01

There’s a 1 in 100 chance of getting a result as extreme as, or more extreme than, this one

The smaller the p-value, the alpha level, the Type I error rate, and the region of rejection, the higher the confidence level, and the less likely it is that you got your result by chance.

In other words, an alpha level of 0.10 (or a p-value of 0.10, or a confidence level of 90%) is a low bar to clear. At that significance level, there’s a 1 in 10 chance that the result we got arose just by chance, and therefore a 1 in 10 chance that we’ll reject the null hypothesis when we really shouldn’t have, thinking that we found support for the alternative hypothesis when we shouldn’t have.

But a stricter alpha level of 0.01 (or a p-value of 0.01, or a confidence level of 99%) is a higher bar to clear. At that significance level, there’s only a 1 in 100 chance that the result we got arose just by chance, and therefore only a 1 in 100 chance that we’ll reject the null hypothesis when we really shouldn’t have.

If we find a result that clears the bar we’ve set for ourselves, then we reject the null hypothesis and say that the finding is significant at the p-value that we find. Otherwise, we fail to reject the null.


One and Two Tailed Tests

Suppose we have a null hypothesis H 0 and an alternative hypothesis H 1 . We consider the distribution given by the null hypothesis and perform a test to determine whether or not the null hypothesis should be rejected in favour of the alternative hypothesis.

There are two different types of tests that can be performed. A one-tailed test looks for an increase or a decrease in the parameter in a specified direction, whereas a two-tailed test looks for any change in the parameter (either an increase or a decrease).

We can perform the test at any level (usually 1%, 5% or 10%). For example, performing the test at a 5% level means that there is a 5% chance of wrongly rejecting H 0 .

If we perform the test at the 5% level and decide to reject the null hypothesis, we say "there is significant evidence at the 5% level to suggest the hypothesis is false".

One-Tailed Test

We choose a critical region. In a one-tailed test, the critical region will have just one part (the red area below). If our sample value lies in this region, we reject the null hypothesis in favour of the alternative.

Suppose we are looking for a definite decrease. Then the critical region will be to the left. Note, however, that in the one-tailed test the value of the parameter can be as high as you like.

Suppose we are given that X has a Poisson distribution and we want to carry out a hypothesis test on the mean, λ, based upon a sample observation of 3.

Suppose the hypotheses are: H0: λ = 9, H1: λ < 9

We want to test if it is "reasonable" for the observed value of 3 to have come from a Poisson distribution with parameter 9. So what is the probability that a value as low as 3 has come from a Po(9)?

P(X ≤ 3) = 0.0212 (this has come from a Poisson table)

The probability is less than 0.05, so there is less than a 5% chance that a value this low would have come from a Po(9) distribution. We therefore reject the null hypothesis in favour of the alternative at the 5% level.

However, the probability is greater than 0.01, so we would not reject the null hypothesis in favour of the alternative at the 1% level.
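The Poisson table value can be checked directly by summing the probability mass function, as in this Python sketch (standard library only):

```python
import math

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), summing the probability mass directly."""
    return sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k + 1))

p = poisson_cdf(3, 9)  # ≈ 0.0212: significant at the 5% level but not at the 1% level
```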

Two-Tailed Test

In a two-tailed test, we are looking for either an increase or a decrease. So, for example, H 0 might be that the mean is equal to 9 (as before). This time, however, H 1 would be that the mean is not equal to 9. In this case, therefore, the critical region has two parts:

Let's test the parameter p of a Binomial distribution at the 10% level.

Suppose a coin is tossed 10 times and we get 7 heads. We want to test whether or not the coin is fair. If the coin is fair, p = 0.5. Put this as the null hypothesis:

H0: p = 0.5    H1: p ≠ 0.5

Now, because the test is 2-tailed, the critical region has two parts. Half of the critical region is to the right and half is to the left. So the critical region contains both the top 5% of the distribution and the bottom 5% of the distribution (since we are testing at the 10% level).

If H 0 is true, X ~ Bin(10, 0.5).

If the null hypothesis is true, what is the probability that X is 7 or above? P(X ≥ 7) = 1 − P(X ≤ 6) = 1 − 0.8281 = 0.1719

Is this in the critical region? No, because the probability that X is at least 7 is not less than 0.05 (5%), which is what we need it to be.

So there is not significant evidence at the 10% level to reject the null hypothesis.
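The upper-tail probability can be checked by summing binomial probabilities directly, as in this Python sketch (standard library only):

```python
from math import comb

# P(X >= 7) for X ~ Binomial(10, 0.5): sum the upper-tail probabilities
p_upper = sum(comb(10, k) for k in range(7, 11)) / 2**10  # 176/1024 = 0.171875
```

Since 0.1719 is well above the 0.05 cutoff for the upper half of the critical region, the null hypothesis of a fair coin stands.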


IMAGES

  1. Edu Write 2 Tailed Hypothesis

    write a typical null hypothesis for a two tail test

  2. Significance Level and Power of a Hypothesis Test Tutorial

    write a typical null hypothesis for a two tail test

  3. Hypothesis Testing: Upper, Lower, and Two Tailed Tests

    write a typical null hypothesis for a two tail test

  4. PPT

    write a typical null hypothesis for a two tail test

  5. The p-value and rejecting the null (for one- and two-tail tests

    write a typical null hypothesis for a two tail test

  6. What Is a Two-Tailed Test? Definition and Example

    write a typical null hypothesis for a two tail test

VIDEO

  1. Part 1: Hypothesis testing (Null & Alternative hypothesis)

  2. Adv Business Statistics lecture 1 (3 of 5) -- Hypothesis testing involving one population mean

  3. Establishment of null and alternative hypothesis ch 13 lec 2

  4. Part 3 : Hypothesis testing ( one tail and two tail hypotesting)

  5. Part 2 : Hypothesis testing (one tail and two tail hypotesting)| Full concept in Hindi

  6. Hypotheses: Introduction

COMMENTS

  1. Two-Tailed Hypothesis Tests: 3 Example Problems

    H0 (Null Hypothesis): μ = 20 grams. HA (Alternative Hypothesis): μ ≠ 20 grams. This is an example of a two-tailed hypothesis test because the alternative hypothesis contains the not equal "≠" sign. The engineer believes that the new method will influence widget weight, but doesn't specify whether it will cause average weight to ...

  2. Hypothesis Testing

    z-value = (105-100)÷(15÷√7.5) = 2.89. This value 2.89 is called the test statistic. This takes us to our last step. 5. Draw a conclusion. So, if you look at the curve, the value of 2.89 will definitely lie on the red area towards the right of the curve because the critical value of 1.96 is less than 2.89.

  3. One-Tailed and Two-Tailed Hypothesis Tests Explained

    One-tailed hypothesis tests are also known as directional and one-sided tests because you can test for effects in only one direction. When you perform a one-tailed test, the entire significance level percentage goes into the extreme end of one tail of the distribution. In the examples below, I use an alpha of 5%.

  4. How to Write a Null Hypothesis (5 Examples)

    Whenever we perform a hypothesis test, we always write a null hypothesis and an alternative hypothesis, which take the following forms: H0 (Null Hypothesis): Population parameter =, ≤, ≥ some value. HA (Alternative Hypothesis): Population parameter <, >, ≠ some value. Note that the null hypothesis always contains the equal sign.

  5. 11.4: One- and Two-Tailed Tests

    This is a very high probability and the null hypothesis would not be rejected. The null hypothesis for the two-tailed test is \(\pi =0.5\). By contrast, the null hypothesis for the one-tailed test is \(\pi \leq 0.5\). Accordingly, we reject the two-tailed hypothesis if the sample proportion deviates greatly from \(0.5\) in either direction.

  6. 9.1: Null and Alternative Hypotheses

    The actual test begins by considering two hypotheses.They are called the null hypothesis and the alternative hypothesis.These hypotheses contain opposing viewpoints. \(H_0\): The null hypothesis: It is a statement of no difference between the variables—they are not related. This can often be considered the status quo and as a result if you cannot accept the null it requires some action.

  7. Hypothesis testing: One-tailed and two-tailed tests

    At this point, you might use a statistical test, like unpaired or 2-sample t-test, to see if there's a significant difference between the two groups' means. Typically, an unpaired t-test starts with two hypotheses. The first hypothesis is called the null hypothesis, and it basically says there's no difference in the means of the two groups.

  8. Hypothesis Testing

    There are 5 main steps in hypothesis testing: State your research hypothesis as a null hypothesis and alternate hypothesis (H o) and (H a or H 1 ). Collect data in a way designed to test the hypothesis. Perform an appropriate statistical test. Decide whether to reject or fail to reject your null hypothesis. Present the findings in your results ...

  9. Data analysis: hypothesis testing: 4.2 Two-tailed tests

    To perform a two-tailed test at a significance level of 0.05, you need to divide alpha by 2, giving a significance level of 0.025 for each distribution tail (0.05/2 = 0.025). This is done because the two-tailed test is looking for significance in either tail of the distribution. If the calculated test statistic falls in the rejection region of ...

  10. One-tailed and two-tailed tests (video)

    A one tailed test does not leave more room to conclude that the alternative hypothesis is true. The benefit (increased certainty) of a one tailed test doesn't come free, as the analyst must know "something more", which is the direction of the effect, compared to a two tailed test. ( 3 votes)

  11. S.3.2 Hypothesis Testing (P-Value Approach)

    Two-Tailed. In our example concerning the mean grade point average, suppose again that our random sample of n = 15 students majoring in mathematics yields a test statistic t* instead of equaling -2.5.The P-value for conducting the two-tailed test H 0: μ = 3 versus H A: μ ≠ 3 is the probability that we would observe a test statistic less than -2.5 or greater than 2.5 if the population mean ...

  12. 9.1 Null and Alternative Hypotheses

    The actual test begins by considering two hypotheses.They are called the null hypothesis and the alternative hypothesis.These hypotheses contain opposing viewpoints. H 0, the —null hypothesis: a statement of no difference between sample means or proportions or no difference between a sample mean or proportion and a population mean or proportion. In other words, the difference equals 0.

  13. Hypothesis Testing: Upper-, Lower, and Two Tailed Tests

    In a two-tailed test the decision rule has investigators reject H₀ if the test statistic is extreme: either larger than an upper critical value or smaller than a lower critical value. If we reject the null hypothesis, we then approximate the p-value, which is the likelihood of observing the sample data if the null hypothesis were true.

  14. Some Basic Null Hypothesis Tests

    The most common null hypothesis test for this type of statistical relationship is the t test. In this section, we look at three types of t tests that are used for slightly different research designs: the one-sample t test, the dependent-samples t test, and the independent-samples t test. The one-sample t test is used to compare a sample mean with a hypothesized population mean.

  15. Null & Alternative Hypotheses

    Statistical test: two-sample t test, or one-way ANOVA with two groups. Null hypothesis (H₀): the mean of the dependent variable does not differ between group 1 (µ₁) and group 2 (µ₂) in the population; µ₁ = µ₂. Alternative hypothesis (Hₐ): the mean of the dependent variable differs between group 1 (µ₁) and group 2 (µ₂) in the population; µ₁ ≠ µ₂.

  16. Examples of null and alternative hypotheses

    It is the opposite of your research hypothesis. The alternative hypothesis, that is, the research hypothesis, is the idea, phenomenon, or observation that you want to prove. If you suspect that girls take longer to get ready for school than boys, then: Alternative: girls' time > boys' time. Null: girls' time ≤ boys' time.

  17. Two Sample t-test: Definition, Formula, and Example

    A two-sample t-test always uses the following null hypothesis: H₀: μ₁ = μ₂ (the two population means are equal). The alternative hypothesis can be two-tailed, left-tailed, or right-tailed. If the p-value is less than your chosen significance level (common choices are 0.10, 0.05, and 0.01), then you can reject the null hypothesis.
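A minimal stdlib-only sketch of the two-sample t statistic in its equal-variance (pooled) form. The two samples are invented for the example, and the critical value 2.120 (two-tailed, df = 16, α = 0.05) is taken from a standard t table, so treat both as assumptions of this illustration rather than part of the source.

```python
from statistics import mean, variance
from math import sqrt

# Made-up samples for illustration.
group1 = [20.1, 22.3, 19.8, 21.5, 20.9, 23.0, 21.2, 20.4, 22.8]
group2 = [18.2, 19.5, 17.9, 20.1, 18.8, 19.9, 18.4, 19.0, 18.6]

n1, n2 = len(group1), len(group2)

# Pooled variance (assumes equal population variances in both groups).
sp2 = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)

# H0: mu1 = mu2  vs  Ha: mu1 != mu2 (two-tailed).
t = (mean(group1) - mean(group2)) / sqrt(sp2 * (1 / n1 + 1 / n2))

# Two-tailed critical value for df = n1 + n2 - 2 = 16 at alpha = 0.05,
# taken from a t table (an assumption of this sketch).
t_crit = 2.120
print(f"t = {t:.3f}; reject H0: {abs(t) > t_crit}")
```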

  18. T-test and Hypothesis Testing (Explained Simply)

    The possible outcomes of hypothesis testing are to reject the null hypothesis or to fail to reject it. Because the effect (the mean difference) can be positive or negative, it is often better to use a two-tailed t-test, which can detect the effect from both directions.

  19. One- and Two-Tailed Tests

    In practice, you should use a one-tailed test only when you have good reason to expect that the difference will be in a particular direction. A two-tailed test is more conservative than a one-tailed test because a two-tailed test requires a more extreme test statistic to reject the null hypothesis.

  20. The p-value and rejecting the null (for one- and two-tail tests)

    The p-value (or the observed level of significance) is the smallest level of significance at which you can reject the null hypothesis, assuming the null hypothesis is true. You can also think of the p-value as the total area of the region of rejection. Remember that in a one-tailed test, the region of rejection lies entirely in one tail of the distribution.

  21. An Introduction to t Tests

    When to use a t test: a t test can only be used when comparing the means of two groups (a pairwise comparison). If you want to compare more than two groups, or to do multiple pairwise comparisons, use an ANOVA or a post-hoc test. The t test is a parametric test of difference, meaning that it makes the same assumptions about your data as other parametric tests.

  22. One and Two Tailed Tests

    For example, performing the test at a 5% level means that there is a 5% chance of wrongly rejecting H₀. If we perform the test at the 5% level and decide to reject the null hypothesis, we say "there is significant evidence at the 5% level to suggest the hypothesis is false". One-tailed test: we choose a critical region. In a one-tailed test, the critical region lies entirely in one tail of the sampling distribution.
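The difference between the one-tailed and two-tailed critical regions is easy to see numerically. This sketch (standard normal, α = 0.05, not from the source) shows why a two-tailed test needs a more extreme statistic to reject H₀.

```python
from statistics import NormalDist

alpha = 0.05
z = NormalDist()

one_tailed = z.inv_cdf(1 - alpha)      # all 5% in one tail -> about 1.645
two_tailed = z.inv_cdf(1 - alpha / 2)  # 2.5% in each tail -> about 1.960

# A statistic of 1.8 would be significant one-tailed but not two-tailed.
print(one_tailed < 1.8 < two_tailed)   # prints True
```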

  23. Solved Chapter 9 (Fundamentals of Hypothesis Testing:

    Chapter 9 (Fundamentals of Hypothesis Testing: One-Sample Tests): (a) Write a typical null hypothesis for a two-tail test. (b) Write a typical alternative hypothesis for a two-tail test.

  24. Answered: The general manager of an engineering…

    In what are you 95% confident? (c) Test the null hypothesis that β₂ is zero against the alternative that it is not, using a two-tail test and the α = 0.05 level of significance. What do you conclude? (d) Test the null hypothesis that β₂ is zero against the one-tail alternative that it is positive at the α = 0.05 level of significance.