What Is a Testable Hypothesis?

  • Scientific Method
  • Chemical Laws
  • Periodic Table
  • Projects & Experiments
  • Biochemistry
  • Physical Chemistry
  • Medical Chemistry
  • Chemistry In Everyday Life
  • Famous Chemists
  • Activities for Kids
  • Abbreviations & Acronyms
  • Weather & Climate
  • Ph.D., Biomedical Sciences, University of Tennessee at Knoxville
  • B.A., Physics and Mathematics, Hastings College

A hypothesis is a tentative answer to a scientific question. A testable hypothesis is a  hypothesis that can be proved or disproved as a result of testing, data collection, or experience. Only testable hypotheses can be used to conceive and perform an experiment using the scientific method .

Requirements for a Testable Hypothesis

In order to be considered testable, two criteria must be met:

  • It must be possible to prove that the hypothesis is true.
  • It must be possible to prove that the hypothesis is false.
  • It must be possible to reproduce the results of the hypothesis.

Examples of a Testable Hypothesis

All the following hypotheses are testable. It's important, however, to note that while it's possible to say that the hypothesis is correct, much more research would be required to answer the question " why is this hypothesis correct?" 

  • Students who attend class have higher grades than students who skip class.  This is testable because it is possible to compare the grades of students who do and do not skip class and then analyze the resulting data. Another person could conduct the same research and come up with the same results.
  • People exposed to high levels of ultraviolet light have a higher incidence of cancer than the norm.  This is testable because it is possible to find a group of people who have been exposed to high levels of ultraviolet light and compare their cancer rates to the average.
  • If you put people in a dark room, then they will be unable to tell when an infrared light turns on.  This hypothesis is testable because it is possible to put a group of people into a dark room, turn on an infrared light, and ask the people in the room whether or not an infrared light has been turned on.

Examples of a Hypothesis Not Written in a Testable Form

  • It doesn't matter whether or not you skip class.  This hypothesis can't be tested because it doesn't make any actual claim regarding the outcome of skipping class. "It doesn't matter" doesn't have any specific meaning, so it can't be tested.
  • Ultraviolet light could cause cancer.  The word "could" makes a hypothesis extremely difficult to test because it is very vague. There "could," for example, be UFOs watching us at every moment, even though it's impossible to prove that they are there!
  • Goldfish make better pets than guinea pigs.  This is not a hypothesis; it's a matter of opinion. There is no agreed-upon definition of what a "better" pet is, so while it is possible to argue the point, there is no way to prove it.

How to Propose a Testable Hypothesis

Now that you know what a testable hypothesis is, here are tips for proposing one.

  • Try to write the hypothesis as an if-then statement. If you take an action, then a certain outcome is expected.
  • Identify the independent and dependent variable in the hypothesis. The independent variable is what you are controlling or changing. You measure the effect this has on the dependent variable.
  • Write the hypothesis in such a way that you can prove or disprove it. For example, a person has skin cancer, you can't prove they got it from being out in the sun. However, you can demonstrate a relationship between exposure to ultraviolet light and increased risk of skin cancer.
  • Make sure you are proposing a hypothesis you can test with reproducible results. If your face breaks out, you can't prove the breakout was caused by the french fries you had for dinner last night. However, you can measure whether or not eating french fries is associated with breaking out. It's a matter of gathering enough data to be able to reproduce results and draw a conclusion.
  • What Are Examples of a Hypothesis?
  • What Are the Elements of a Good Hypothesis?
  • What Is a Hypothesis? (Science)
  • How To Design a Science Fair Experiment
  • Understanding Simple vs Controlled Experiments
  • Scientific Method Vocabulary Terms
  • Hypothesis, Model, Theory, and Law
  • Null Hypothesis Definition and Examples
  • Six Steps of the Scientific Method
  • What 'Fail to Reject' Means in a Hypothesis Test
  • Scientific Method Flow Chart
  • Null Hypothesis Examples
  • What Is an Experiment? Definition and Design
  • Scientific Hypothesis Examples
  • The Visible Spectrum: Wavelengths and Colors

Have a language expert improve your writing

Run a free plagiarism check in 10 minutes, generate accurate citations for free.

  • Knowledge Base

Methodology

  • How to Write a Strong Hypothesis | Steps & Examples

How to Write a Strong Hypothesis | Steps & Examples

Published on May 6, 2022 by Shona McCombes . Revised on November 20, 2023.

A hypothesis is a statement that can be tested by scientific research. If you want to test a relationship between two or more variables, you need to write hypotheses before you start your experiment or data collection .

Example: Hypothesis

Daily apple consumption leads to fewer doctor’s visits.

Table of contents

What is a hypothesis, developing a hypothesis (with example), hypothesis examples, other interesting articles, frequently asked questions about writing hypotheses.

A hypothesis states your predictions about what your research will find. It is a tentative answer to your research question that has not yet been tested. For some research projects, you might have to write several hypotheses that address different aspects of your research question.

A hypothesis is not just a guess – it should be based on existing theories and knowledge. It also has to be testable, which means you can support or refute it through scientific research methods (such as experiments, observations and statistical analysis of data).

Variables in hypotheses

Hypotheses propose a relationship between two or more types of variables .

  • An independent variable is something the researcher changes or controls.
  • A dependent variable is something the researcher observes and measures.

If there are any control variables , extraneous variables , or confounding variables , be sure to jot those down as you go to minimize the chances that research bias  will affect your results.

In this example, the independent variable is exposure to the sun – the assumed cause . The dependent variable is the level of happiness – the assumed effect .

Prevent plagiarism. Run a free check.

Step 1. ask a question.

Writing a hypothesis begins with a research question that you want to answer. The question should be focused, specific, and researchable within the constraints of your project.

Step 2. Do some preliminary research

Your initial answer to the question should be based on what is already known about the topic. Look for theories and previous studies to help you form educated assumptions about what your research will find.

At this stage, you might construct a conceptual framework to ensure that you’re embarking on a relevant topic . This can also help you identify which variables you will study and what you think the relationships are between them. Sometimes, you’ll have to operationalize more complex constructs.

Step 3. Formulate your hypothesis

Now you should have some idea of what you expect to find. Write your initial answer to the question in a clear, concise sentence.

4. Refine your hypothesis

You need to make sure your hypothesis is specific and testable. There are various ways of phrasing a hypothesis, but all the terms you use should have clear definitions, and the hypothesis should contain:

  • The relevant variables
  • The specific group being studied
  • The predicted outcome of the experiment or analysis

5. Phrase your hypothesis in three ways

To identify the variables, you can write a simple prediction in  if…then form. The first part of the sentence states the independent variable and the second part states the dependent variable.

In academic research, hypotheses are more commonly phrased in terms of correlations or effects, where you directly state the predicted relationship between variables.

If you are comparing two groups, the hypothesis can state what difference you expect to find between them.

6. Write a null hypothesis

If your research involves statistical hypothesis testing , you will also have to write a null hypothesis . The null hypothesis is the default position that there is no association between the variables. The null hypothesis is written as H 0 , while the alternative hypothesis is H 1 or H a .

  • H 0 : The number of lectures attended by first-year students has no effect on their final exam scores.
  • H 1 : The number of lectures attended by first-year students has a positive effect on their final exam scores.

If you want to know more about the research process , methodology , research bias , or statistics , make sure to check out some of our other articles with explanations and examples.

  • Sampling methods
  • Simple random sampling
  • Stratified sampling
  • Cluster sampling
  • Likert scales
  • Reproducibility

 Statistics

  • Null hypothesis
  • Statistical power
  • Probability distribution
  • Effect size
  • Poisson distribution

Research bias

  • Optimism bias
  • Cognitive bias
  • Implicit bias
  • Hawthorne effect
  • Anchoring bias
  • Explicit bias

Here's why students love Scribbr's proofreading services

Discover proofreading & editing

A hypothesis is not just a guess — it should be based on existing theories and knowledge. It also has to be testable, which means you can support or refute it through scientific research methods (such as experiments, observations and statistical analysis of data).

Null and alternative hypotheses are used in statistical hypothesis testing . The null hypothesis of a test always predicts no effect or no relationship between variables, while the alternative hypothesis states your research prediction of an effect or relationship.

Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses , by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.

Cite this Scribbr article

If you want to cite this source, you can copy and paste the citation or click the “Cite this Scribbr article” button to automatically add the citation to our free Citation Generator.

McCombes, S. (2023, November 20). How to Write a Strong Hypothesis | Steps & Examples. Scribbr. Retrieved April 11, 2024, from https://www.scribbr.com/methodology/hypothesis/

Is this article helpful?

Shona McCombes

Shona McCombes

Other students also liked, construct validity | definition, types, & examples, what is a conceptual framework | tips & examples, operationalization | a guide with examples, pros & cons, unlimited academic ai-proofreading.

✔ Document error-free in 5minutes ✔ Unlimited document corrections ✔ Specialized in correcting academic texts

U.S. flag

An official website of the United States government

The .gov means it’s official. Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

The site is secure. The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

  • Publications
  • Account settings

Preview improvements coming to the PMC website in October 2024. Learn More or Try it out now .

  • Advanced Search
  • Journal List
  • J Korean Med Sci
  • v.36(50); 2021 Dec 27

Logo of jkms

Formulating Hypotheses for Different Study Designs

Durga prasanna misra.

1 Department of Clinical Immunology and Rheumatology, Sanjay Gandhi Postgraduate Institute of Medical Sciences, Lucknow, India.

Armen Yuri Gasparyan

2 Departments of Rheumatology and Research and Development, Dudley Group NHS Foundation Trust (Teaching Trust of the University of Birmingham, UK), Russells Hall Hospital, Dudley, UK.

Olena Zimba

3 Department of Internal Medicine #2, Danylo Halytsky Lviv National Medical University, Lviv, Ukraine.

Marlen Yessirkepov

4 Department of Biology and Biochemistry, South Kazakhstan Medical Academy, Shymkent, Kazakhstan.

Vikas Agarwal

George d. kitas.

5 Centre for Epidemiology versus Arthritis, University of Manchester, Manchester, UK.

Generating a testable working hypothesis is the first step towards conducting original research. Such research may prove or disprove the proposed hypothesis. Case reports, case series, online surveys and other observational studies, clinical trials, and narrative reviews help to generate hypotheses. Observational and interventional studies help to test hypotheses. A good hypothesis is usually based on previous evidence-based reports. Hypotheses without evidence-based justification and a priori ideas are not received favourably by the scientific community. Original research to test a hypothesis should be carefully planned to ensure appropriate methodology and adequate statistical power. While hypotheses can challenge conventional thinking and may be controversial, they should not be destructive. A hypothesis should be tested by ethically sound experiments with meaningful ethical and clinical implications. The coronavirus disease 2019 pandemic has brought into sharp focus numerous hypotheses, some of which were proven (e.g. effectiveness of corticosteroids in those with hypoxia) while others were disproven (e.g. ineffectiveness of hydroxychloroquine and ivermectin).

Graphical Abstract

An external file that holds a picture, illustration, etc.
Object name is jkms-36-e338-abf001.jpg

DEFINING WORKING AND STANDALONE SCIENTIFIC HYPOTHESES

Science is the systematized description of natural truths and facts. Routine observations of existing life phenomena lead to the creative thinking and generation of ideas about mechanisms of such phenomena and related human interventions. Such ideas presented in a structured format can be viewed as hypotheses. After generating a hypothesis, it is necessary to test it to prove its validity. Thus, hypothesis can be defined as a proposed mechanism of a naturally occurring event or a proposed outcome of an intervention. 1 , 2

Hypothesis testing requires choosing the most appropriate methodology and adequately powering statistically the study to be able to “prove” or “disprove” it within predetermined and widely accepted levels of certainty. This entails sample size calculation that often takes into account previously published observations and pilot studies. 2 , 3 In the era of digitization, hypothesis generation and testing may benefit from the availability of numerous platforms for data dissemination, social networking, and expert validation. Related expert evaluations may reveal strengths and limitations of proposed ideas at early stages of post-publication promotion, preventing the implementation of unsupported controversial points. 4

Thus, hypothesis generation is an important initial step in the research workflow, reflecting accumulating evidence and experts' stance. In this article, we overview the genesis and importance of scientific hypotheses and their relevance in the era of the coronavirus disease 2019 (COVID-19) pandemic.

DO WE NEED HYPOTHESES FOR ALL STUDY DESIGNS?

Broadly, research can be categorized as primary or secondary. In the context of medicine, primary research may include real-life observations of disease presentations and outcomes. Single case descriptions, which often lead to new ideas and hypotheses, serve as important starting points or justifications for case series and cohort studies. The importance of case descriptions is particularly evident in the context of the COVID-19 pandemic when unique, educational case reports have heralded a new era in clinical medicine. 5

Case series serve similar purpose to single case reports, but are based on a slightly larger quantum of information. Observational studies, including online surveys, describe the existing phenomena at a larger scale, often involving various control groups. Observational studies include variable-scale epidemiological investigations at different time points. Interventional studies detail the results of therapeutic interventions.

Secondary research is based on already published literature and does not directly involve human or animal subjects. Review articles are generated by secondary research. These could be systematic reviews which follow methods akin to primary research but with the unit of study being published papers rather than humans or animals. Systematic reviews have a rigid structure with a mandatory search strategy encompassing multiple databases, systematic screening of search results against pre-defined inclusion and exclusion criteria, critical appraisal of study quality and an optional component of collating results across studies quantitatively to derive summary estimates (meta-analysis). 6 Narrative reviews, on the other hand, have a more flexible structure. Systematic literature searches to minimise bias in selection of articles are highly recommended but not mandatory. 7 Narrative reviews are influenced by the authors' viewpoint who may preferentially analyse selected sets of articles. 8

In relation to primary research, case studies and case series are generally not driven by a working hypothesis. Rather, they serve as a basis to generate a hypothesis. Observational or interventional studies should have a hypothesis for choosing research design and sample size. The results of observational and interventional studies further lead to the generation of new hypotheses, testing of which forms the basis of future studies. Review articles, on the other hand, may not be hypothesis-driven, but form fertile ground to generate future hypotheses for evaluation. Fig. 1 summarizes which type of studies are hypothesis-driven and which lead on to hypothesis generation.

An external file that holds a picture, illustration, etc.
Object name is jkms-36-e338-g001.jpg

STANDARDS OF WORKING AND SCIENTIFIC HYPOTHESES

A review of the published literature did not enable the identification of clearly defined standards for working and scientific hypotheses. It is essential to distinguish influential versus not influential hypotheses, evidence-based hypotheses versus a priori statements and ideas, ethical versus unethical, or potentially harmful ideas. The following points are proposed for consideration while generating working and scientific hypotheses. 1 , 2 Table 1 summarizes these points.

Evidence-based data

A scientific hypothesis should have a sound basis on previously published literature as well as the scientist's observations. Randomly generated (a priori) hypotheses are unlikely to be proven. A thorough literature search should form the basis of a hypothesis based on published evidence. 7

Unless a scientific hypothesis can be tested, it can neither be proven nor be disproven. Therefore, a scientific hypothesis should be amenable to testing with the available technologies and the present understanding of science.

Supported by pilot studies

If a hypothesis is based purely on a novel observation by the scientist in question, it should be grounded on some preliminary studies to support it. For example, if a drug that targets a specific cell population is hypothesized to be useful in a particular disease setting, then there must be some preliminary evidence that the specific cell population plays a role in driving that disease process.

Testable by ethical studies

The hypothesis should be testable by experiments that are ethically acceptable. 9 For example, a hypothesis that parachutes reduce mortality from falls from an airplane cannot be tested using a randomized controlled trial. 10 This is because it is obvious that all those jumping from a flying plane without a parachute would likely die. Similarly, the hypothesis that smoking tobacco causes lung cancer cannot be tested by a clinical trial that makes people take up smoking (since there is considerable evidence for the health hazards associated with smoking). Instead, long-term observational studies comparing outcomes in those who smoke and those who do not, as was performed in the landmark epidemiological case control study by Doll and Hill, 11 are more ethical and practical.

Balance between scientific temper and controversy

Novel findings, including novel hypotheses, particularly those that challenge established norms, are bound to face resistance for their wider acceptance. Such resistance is inevitable until the time such findings are proven with appropriate scientific rigor. However, hypotheses that generate controversy are generally unwelcome. For example, at the time the pandemic of human immunodeficiency virus (HIV) and AIDS was taking foot, there were numerous deniers that refused to believe that HIV caused AIDS. 12 , 13 Similarly, at a time when climate change is causing catastrophic changes to weather patterns worldwide, denial that climate change is occurring and consequent attempts to block climate change are certainly unwelcome. 14 The denialism and misinformation during the COVID-19 pandemic, including unfortunate examples of vaccine hesitancy, are more recent examples of controversial hypotheses not backed by science. 15 , 16 An example of a controversial hypothesis that was a revolutionary scientific breakthrough was the hypothesis put forth by Warren and Marshall that Helicobacter pylori causes peptic ulcers. Initially, the hypothesis that a microorganism could cause gastritis and gastric ulcers faced immense resistance. When the scientists that proposed the hypothesis themselves ingested H. pylori to induce gastritis in themselves, only then could they convince the wider world about their hypothesis. Such was the impact of the hypothesis was that Barry Marshall and Robin Warren were awarded the Nobel Prize in Physiology or Medicine in 2005 for this discovery. 17 , 18

DISTINGUISHING THE MOST INFLUENTIAL HYPOTHESES

Influential hypotheses are those that have stood the test of time. An archetype of an influential hypothesis is that proposed by Edward Jenner in the eighteenth century that cowpox infection protects against smallpox. While this observation had been reported for nearly a century before this time, it had not been suitably tested and publicised until Jenner conducted his experiments on a young boy by demonstrating protection against smallpox after inoculation with cowpox. 19 These experiments were the basis for widespread smallpox immunization strategies worldwide in the 20th century which resulted in the elimination of smallpox as a human disease today. 20

Other influential hypotheses are those which have been read and cited widely. An example of this is the hygiene hypothesis proposing an inverse relationship between infections in early life and allergies or autoimmunity in adulthood. An analysis reported that this hypothesis had been cited more than 3,000 times on Scopus. 1

LESSONS LEARNED FROM HYPOTHESES AMIDST THE COVID-19 PANDEMIC

The COVID-19 pandemic devastated the world like no other in recent memory. During this period, various hypotheses emerged, understandably so considering the public health emergency situation with innumerable deaths and suffering for humanity. Within weeks of the first reports of COVID-19, aberrant immune system activation was identified as a key driver of organ dysfunction and mortality in this disease. 21 Consequently, numerous drugs that suppress the immune system or abrogate the activation of the immune system were hypothesized to have a role in COVID-19. 22 One of the earliest drugs hypothesized to have a benefit was hydroxychloroquine. Hydroxychloroquine was proposed to interfere with Toll-like receptor activation and consequently ameliorate the aberrant immune system activation leading to pathology in COVID-19. 22 The drug was also hypothesized to have a prophylactic role in preventing infection or disease severity in COVID-19. It was also touted as a wonder drug for the disease by many prominent international figures. However, later studies which were well-designed randomized controlled trials failed to demonstrate any benefit of hydroxychloroquine in COVID-19. 23 , 24 , 25 , 26 Subsequently, azithromycin 27 , 28 and ivermectin 29 were hypothesized as potential therapies for COVID-19, but were not supported by evidence from randomized controlled trials. The role of vitamin D in preventing disease severity was also proposed, but has not been proven definitively until now. 30 , 31 On the other hand, randomized controlled trials identified the evidence supporting dexamethasone 32 and interleukin-6 pathway blockade with tocilizumab as effective therapies for COVID-19 in specific situations such as at the onset of hypoxia. 33 , 34 Clues towards the apparent effectiveness of various drugs against severe acute respiratory syndrome coronavirus 2 in vitro but their ineffectiveness in vivo have recently been identified. Many of these drugs are weak, lipophilic bases and some others induce phospholipidosis which results in apparent in vitro effectiveness due to non-specific off-target effects that are not replicated inside living systems. 35 , 36

Another hypothesis proposed was the association of the routine policy of vaccination with Bacillus Calmette-Guerin (BCG) with lower deaths due to COVID-19. This hypothesis emerged in the middle of 2020 when COVID-19 was still taking foot in many parts of the world. 37 , 38 Subsequently, many countries which had lower deaths at that time point went on to have higher numbers of mortality, comparable to other areas of the world. Furthermore, the hypothesis that BCG vaccination reduced COVID-19 mortality was a classic example of ecological fallacy. Associations between population level events (ecological studies; in this case, BCG vaccination and COVID-19 mortality) cannot be directly extrapolated to the individual level. Furthermore, such associations cannot per se be attributed as causal in nature, and can only serve to generate hypotheses that need to be tested at the individual level. 39

IS TRADITIONAL PEER REVIEW EFFICIENT FOR EVALUATION OF WORKING AND SCIENTIFIC HYPOTHESES?

Traditionally, publication after peer review has been considered the gold standard before any new idea finds acceptability amongst the scientific community. Getting a work (including a working or scientific hypothesis) reviewed by experts in the field before experiments are conducted to prove or disprove it helps to refine the idea further as well as improve the experiments planned to test the hypothesis. 40 A route towards this has been the emergence of journals dedicated to publishing hypotheses such as the Central Asian Journal of Medical Hypotheses and Ethics. 41 Another means of publishing hypotheses is through registered research protocols detailing the background, hypothesis, and methodology of a particular study. If such protocols are published after peer review, then the journal commits to publishing the completed study irrespective of whether the study hypothesis is proven or disproven. 42 In the post-pandemic world, online research methods such as online surveys powered via social media channels such as Twitter and Instagram might serve as critical tools to generate as well as to preliminarily test the appropriateness of hypotheses for further evaluation. 43 , 44

Some radical hypotheses might be difficult to publish after traditional peer review. These hypotheses might only be acceptable by the scientific community after they are tested in research studies. Preprints might be a way to disseminate such controversial and ground-breaking hypotheses. 45 However, scientists might prefer to keep their hypotheses confidential for the fear of plagiarism of ideas, avoiding online posting and publishing until they have tested the hypotheses.

SUGGESTIONS ON GENERATING AND PUBLISHING HYPOTHESES

Publication of hypotheses is important, however, a balance is required between scientific temper and controversy. Journal editors and reviewers might keep in mind these specific points, summarized in Table 2 and detailed hereafter, while judging the merit of hypotheses for publication. Keeping in mind the ethical principle of primum non nocere, a hypothesis should be published only if it is testable in a manner that is ethically appropriate. 46 Such hypotheses should be grounded in reality and lend themselves to further testing to either prove or disprove them. It must be considered that subsequent experiments to prove or disprove a hypothesis have an equal chance of failing or succeeding, akin to tossing a coin. A pre-conceived belief that a hypothesis is unlikely to be proven correct should not form the basis of rejection of such a hypothesis for publication. In this context, hypotheses generated after a thorough literature search to identify knowledge gaps or based on concrete clinical observations on a considerable number of patients (as opposed to random observations on a few patients) are more likely to be acceptable for publication by peer-reviewed journals. Also, hypotheses should be considered for publication or rejection based on their implications for science at large rather than whether the subsequent experiments to test them end up with results in favour of or against the original hypothesis.

Hypotheses form an important part of the scientific literature. The COVID-19 pandemic has reiterated the importance and relevance of hypotheses for dealing with public health emergencies and highlighted the need for evidence-based and ethical hypotheses. A good hypothesis is testable in a relevant study design, backed by preliminary evidence, and has positive ethical and clinical implications. General medical journals might consider publishing hypotheses as a specific article type to enable more rapid advancement of science.

Disclosure: The authors have no potential conflicts of interest to disclose.

Author Contributions:

  • Data curation: Gasparyan AY, Misra DP, Zimba O, Yessirkepov M, Agarwal V, Kitas GD.
  • Bipolar Disorder
  • Therapy Center
  • When To See a Therapist
  • Types of Therapy
  • Best Online Therapy
  • Best Couples Therapy
  • Best Family Therapy
  • Managing Stress
  • Sleep and Dreaming
  • Understanding Emotions
  • Self-Improvement
  • Healthy Relationships
  • Student Resources
  • Personality Types
  • Guided Meditations
  • Verywell Mind Insights
  • 2023 Verywell Mind 25
  • Mental Health in the Classroom
  • Editorial Process
  • Meet Our Review Board
  • Crisis Support

How to Write a Great Hypothesis

Hypothesis Format, Examples, and Tips

Kendra Cherry, MS, is a psychosocial rehabilitation specialist, psychology educator, and author of the "Everything Psychology Book."

explain how your hypothesis is testable

Amy Morin, LCSW, is a psychotherapist and international bestselling author. Her books, including "13 Things Mentally Strong People Don't Do," have been translated into more than 40 languages. Her TEDx talk,  "The Secret of Becoming Mentally Strong," is one of the most viewed talks of all time.

explain how your hypothesis is testable

Verywell / Alex Dos Diaz

  • The Scientific Method

Hypothesis Format

Falsifiability of a hypothesis, operational definitions, types of hypotheses, hypotheses examples.

  • Collecting Data

Frequently Asked Questions

A hypothesis is a tentative statement about the relationship between two or more  variables. It is a specific, testable prediction about what you expect to happen in a study.

One hypothesis example would be a study designed to look at the relationship between sleep deprivation and test performance might have a hypothesis that states: "This study is designed to assess the hypothesis that sleep-deprived people will perform worse on a test than individuals who are not sleep-deprived."

This article explores how a hypothesis is used in psychology research, how to write a good hypothesis, and the different types of hypotheses you might use.

The Hypothesis in the Scientific Method

In the scientific method , whether it involves research in psychology, biology, or some other area, a hypothesis represents what the researchers think will happen in an experiment. The scientific method involves the following steps:

  • Forming a question
  • Performing background research
  • Creating a hypothesis
  • Designing an experiment
  • Collecting data
  • Analyzing the results
  • Drawing conclusions
  • Communicating the results

The hypothesis is a prediction, but it involves more than a guess. Most of the time, the hypothesis begins with a question which is then explored through background research. It is only at this point that researchers begin to develop a testable hypothesis. Unless you are creating an exploratory study, your hypothesis should always explain what you  expect  to happen.

In a study exploring the effects of a particular drug, the hypothesis might be that researchers expect the drug to have some type of effect on the symptoms of a specific illness. In psychology, the hypothesis might focus on how a certain aspect of the environment might influence a particular behavior.

Remember, a hypothesis does not have to be correct. While the hypothesis predicts what the researchers expect to see, the goal of the research is to determine whether this guess is right or wrong. When conducting an experiment, researchers might explore a number of factors to determine which ones might contribute to the ultimate outcome.

In many cases, researchers may find that the results of an experiment  do not  support the original hypothesis. When writing up these results, the researchers might suggest other options that should be explored in future studies.

In many cases, researchers might draw a hypothesis from a specific theory or build on previous research. For example, prior research has shown that stress can impact the immune system. So a researcher might hypothesize: "People with high-stress levels will be more likely to contract a common cold after being exposed to the virus than people who have low-stress levels."

In other instances, researchers might look at commonly held beliefs or folk wisdom. "Birds of a feather flock together" is one example of folk wisdom that a psychologist might try to investigate. The researcher might pose a specific hypothesis that "People tend to select romantic partners who are similar to them in interests and educational level."

Elements of a Good Hypothesis

So how do you write a good hypothesis? When trying to come up with a hypothesis for your research or experiments, ask yourself the following questions:

  • Is your hypothesis based on your research on a topic?
  • Can your hypothesis be tested?
  • Does your hypothesis include independent and dependent variables?

Before you come up with a specific hypothesis, spend some time doing background research. Once you have completed a literature review, start thinking about potential questions you still have. Pay attention to the discussion section in the  journal articles you read . Many authors will suggest questions that still need to be explored.

To form a hypothesis, you should take these steps:

  • Collect as many observations about a topic or problem as you can.
  • Evaluate these observations and look for possible causes of the problem.
  • Create a list of possible explanations that you might want to explore.
  • After you have developed some possible hypotheses, think of ways that you could confirm or disprove each hypothesis through experimentation. This is known as falsifiability.

In the scientific method ,  falsifiability is an important part of any valid hypothesis.   In order to test a claim scientifically, it must be possible that the claim could be proven false.

Students sometimes confuse the idea of falsifiability with the idea that it means that something is false, which is not the case. What falsifiability means is that  if  something was false, then it is possible to demonstrate that it is false.

One of the hallmarks of pseudoscience is that it makes claims that cannot be refuted or proven false.

A variable is a factor or element that can be changed and manipulated in ways that are observable and measurable. However, the researcher must also define how the variable will be manipulated and measured in the study.

For example, a researcher might operationally define the variable " test anxiety " as the results of a self-report measure of anxiety experienced during an exam. A "study habits" variable might be defined by the amount of studying that actually occurs as measured by time.

These precise descriptions are important because many things can be measured in a number of different ways. One of the basic principles of any type of scientific research is that the results must be replicable.   By clearly detailing the specifics of how the variables were measured and manipulated, other researchers can better understand the results and repeat the study if needed.

Some variables are more difficult than others to define. How would you operationally define a variable such as aggression ? For obvious ethical reasons, researchers cannot create a situation in which a person behaves aggressively toward others.

In order to measure this variable, the researcher must devise a measurement that assesses aggressive behavior without harming other people. In this situation, the researcher might utilize a simulated task to measure aggressiveness.

Hypothesis Checklist

  • Does your hypothesis focus on something that you can actually test?
  • Does your hypothesis include both an independent and dependent variable?
  • Can you manipulate the variables?
  • Can your hypothesis be tested without violating ethical standards?

The hypothesis you use will depend on what you are investigating and hoping to find. Some of the main types of hypotheses that you might use include:

  • Simple hypothesis : This type of hypothesis suggests that there is a relationship between one independent variable and one dependent variable.
  • Complex hypothesis : This type of hypothesis suggests a relationship between three or more variables, such as two independent variables and a dependent variable.
  • Null hypothesis : This hypothesis suggests no relationship exists between two or more variables.
  • Alternative hypothesis : This hypothesis states the opposite of the null hypothesis.
  • Statistical hypothesis : This hypothesis uses statistical analysis to evaluate a representative sample of the population and then generalizes the findings to the larger group.
  • Logical hypothesis : This hypothesis assumes a relationship between variables without collecting data or evidence.

A hypothesis often follows a basic format of "If {this happens} then {this will happen}." One way to structure your hypothesis is to describe what will happen to the  dependent variable  if you change the  independent variable .

The basic format might be: "If {these changes are made to a certain independent variable}, then we will observe {a change in a specific dependent variable}."

A few examples of simple hypotheses:

  • "Students who eat breakfast will perform better on a math exam than students who do not eat breakfast."
  • Complex hypothesis: "Students who experience test anxiety before an English exam will get lower scores than students who do not experience test anxiety."​
  • "Motorists who talk on the phone while driving will be more likely to make errors on a driving course than those who do not talk on the phone."

Examples of a complex hypothesis include:

  • "People with high-sugar diets and sedentary activity levels are more likely to develop depression."
  • "Younger people who are regularly exposed to green, outdoor areas have better subjective well-being than older adults who have limited exposure to green spaces."

Examples of a null hypothesis include:

  • "Children who receive a new reading intervention will have scores different than students who do not receive the intervention."
  • "There will be no difference in scores on a memory recall task between children and adults."

Examples of an alternative hypothesis:

  • "Children who receive a new reading intervention will perform better than students who did not receive the intervention."
  • "Adults will perform better on a memory task than children." 

Collecting Data on Your Hypothesis

Once a researcher has formed a testable hypothesis, the next step is to select a research design and start collecting data. The research method depends largely on exactly what they are studying. There are two basic types of research methods: descriptive research and experimental research.

Descriptive Research Methods

Descriptive research such as  case studies ,  naturalistic observations , and surveys are often used when it would be impossible or difficult to  conduct an experiment . These methods are best used to describe different aspects of a behavior or psychological phenomenon.

Once a researcher has collected data using descriptive methods, a correlational study can then be used to look at how the variables are related. This type of research method might be used to investigate a hypothesis that is difficult to test experimentally.

Experimental Research Methods

Experimental methods  are used to demonstrate causal relationships between variables. In an experiment, the researcher systematically manipulates a variable of interest (known as the independent variable) and measures the effect on another variable (known as the dependent variable).

Unlike correlational studies, which can only be used to determine if there is a relationship between two variables, experimental methods can be used to determine the actual nature of the relationship—whether changes in one variable actually  cause  another to change.

A Word From Verywell

The hypothesis is a critical part of any scientific exploration. It represents what researchers expect to find in a study or experiment. In situations where the hypothesis is unsupported by the research, the research still has value. Such research helps us better understand how different aspects of the natural world relate to one another. It also helps us develop new hypotheses that can then be tested in the future.

Some examples of how to write a hypothesis include:

  • "Staying up late will lead to worse test performance the next day."
  • "People who consume one apple each day will visit the doctor fewer times each year."
  • "Breaking study sessions up into three 20-minute sessions will lead to better test results than a single 60-minute study session."

The four parts of a hypothesis are:

  • The research question
  • The independent variable (IV)
  • The dependent variable (DV)
  • The proposed relationship between the IV and DV

Castillo M. The scientific method: a need for something better? . AJNR Am J Neuroradiol. 2013;34(9):1669-71. doi:10.3174/ajnr.A3401

Nevid J. Psychology: Concepts and Applications. Wadworth, 2013.

By Kendra Cherry, MSEd Kendra Cherry, MS, is a psychosocial rehabilitation specialist, psychology educator, and author of the "Everything Psychology Book."

Research Hypothesis In Psychology: Types, & Examples

Saul Mcleod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul Mcleod, PhD., is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.

Learn about our Editorial Process

Olivia Guy-Evans, MSc

Associate Editor for Simply Psychology

BSc (Hons) Psychology, MSc Psychology of Education

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.

On This Page:

A research hypothesis, in its plural form “hypotheses,” is a specific, testable prediction about the anticipated results of a study, established at its outset. It is a key component of the scientific method .

Hypotheses connect theory to data and guide the research process towards expanding scientific understanding

Some key points about hypotheses:

  • A hypothesis expresses an expected pattern or relationship. It connects the variables under investigation.
  • It is stated in clear, precise terms before any data collection or analysis occurs. This makes the hypothesis testable.
  • A hypothesis must be falsifiable. It should be possible, even if unlikely in practice, to collect data that disconfirms rather than supports the hypothesis.
  • Hypotheses guide research. Scientists design studies to explicitly evaluate hypotheses about how nature works.
  • For a hypothesis to be valid, it must be testable against empirical evidence. The evidence can then confirm or disprove the testable predictions.
  • Hypotheses are informed by background knowledge and observation, but go beyond what is already known to propose an explanation of how or why something occurs.
Predictions typically arise from a thorough knowledge of the research literature, curiosity about real-world problems or implications, and integrating this to advance theory. They build on existing literature while providing new insight.

Types of Research Hypotheses

Alternative hypothesis.

The research hypothesis is often called the alternative or experimental hypothesis in experimental research.

It typically suggests a potential relationship between two key variables: the independent variable, which the researcher manipulates, and the dependent variable, which is measured based on those changes.

The alternative hypothesis states a relationship exists between the two variables being studied (one variable affects the other).

A hypothesis is a testable statement or prediction about the relationship between two or more variables. It is a key component of the scientific method. Some key points about hypotheses:

  • Important hypotheses lead to predictions that can be tested empirically. The evidence can then confirm or disprove the testable predictions.

In summary, a hypothesis is a precise, testable statement of what researchers expect to happen in a study and why. Hypotheses connect theory to data and guide the research process towards expanding scientific understanding.

An experimental hypothesis predicts what change(s) will occur in the dependent variable when the independent variable is manipulated.

It states that the results are not due to chance and are significant in supporting the theory being investigated.

The alternative hypothesis can be directional, indicating a specific direction of the effect, or non-directional, suggesting a difference without specifying its nature. It’s what researchers aim to support or demonstrate through their study.

Null Hypothesis

The null hypothesis states no relationship exists between the two variables being studied (one variable does not affect the other). There will be no changes in the dependent variable due to manipulating the independent variable.

It states results are due to chance and are not significant in supporting the idea being investigated.

The null hypothesis, positing no effect or relationship, is a foundational contrast to the research hypothesis in scientific inquiry. It establishes a baseline for statistical testing, promoting objectivity by initiating research from a neutral stance.

Many statistical methods are tailored to test the null hypothesis, determining the likelihood of observed results if no true effect exists.

This dual-hypothesis approach provides clarity, ensuring that research intentions are explicit, and fosters consistency across scientific studies, enhancing the standardization and interpretability of research outcomes.

Nondirectional Hypothesis

A non-directional hypothesis, also known as a two-tailed hypothesis, predicts that there is a difference or relationship between two variables but does not specify the direction of this relationship.

It merely indicates that a change or effect will occur without predicting which group will have higher or lower values.

For example, “There is a difference in performance between Group A and Group B” is a non-directional hypothesis.

Directional Hypothesis

A directional (one-tailed) hypothesis predicts the nature of the effect of the independent variable on the dependent variable. It predicts in which direction the change will take place. (i.e., greater, smaller, less, more)

It specifies whether one variable is greater, lesser, or different from another, rather than just indicating that there’s a difference without specifying its nature.

For example, “Exercise increases weight loss” is a directional hypothesis.

hypothesis

Falsifiability

The Falsification Principle, proposed by Karl Popper , is a way of demarcating science from non-science. It suggests that for a theory or hypothesis to be considered scientific, it must be testable and irrefutable.

Falsifiability emphasizes that scientific claims shouldn’t just be confirmable but should also have the potential to be proven wrong.

It means that there should exist some potential evidence or experiment that could prove the proposition false.

However many confirming instances exist for a theory, it only takes one counter observation to falsify it. For example, the hypothesis that “all swans are white,” can be falsified by observing a black swan.

For Popper, science should attempt to disprove a theory rather than attempt to continually provide evidence to support a research hypothesis.

Can a Hypothesis be Proven?

Hypotheses make probabilistic predictions. They state the expected outcome if a particular relationship exists. However, a study result supporting a hypothesis does not definitively prove it is true.

All studies have limitations. There may be unknown confounding factors or issues that limit the certainty of conclusions. Additional studies may yield different results.

In science, hypotheses can realistically only be supported with some degree of confidence, not proven. The process of science is to incrementally accumulate evidence for and against hypothesized relationships in an ongoing pursuit of better models and explanations that best fit the empirical data. But hypotheses remain open to revision and rejection if that is where the evidence leads.
  • Disproving a hypothesis is definitive. Solid disconfirmatory evidence will falsify a hypothesis and require altering or discarding it based on the evidence.
  • However, confirming evidence is always open to revision. Other explanations may account for the same results, and additional or contradictory evidence may emerge over time.

We can never 100% prove the alternative hypothesis. Instead, we see if we can disprove, or reject the null hypothesis.

If we reject the null hypothesis, this doesn’t mean that our alternative hypothesis is correct but does support the alternative/experimental hypothesis.

Upon analysis of the results, an alternative hypothesis can be rejected or supported, but it can never be proven to be correct. We must avoid any reference to results proving a theory as this implies 100% certainty, and there is always a chance that evidence may exist which could refute a theory.

How to Write a Hypothesis

  • Identify variables . The researcher manipulates the independent variable and the dependent variable is the measured outcome.
  • Operationalized the variables being investigated . Operationalization of a hypothesis refers to the process of making the variables physically measurable or testable, e.g. if you are about to study aggression, you might count the number of punches given by participants.
  • Decide on a direction for your prediction . If there is evidence in the literature to support a specific effect of the independent variable on the dependent variable, write a directional (one-tailed) hypothesis. If there are limited or ambiguous findings in the literature regarding the effect of the independent variable on the dependent variable, write a non-directional (two-tailed) hypothesis.
  • Make it Testable : Ensure your hypothesis can be tested through experimentation or observation. It should be possible to prove it false (principle of falsifiability).
  • Clear & concise language . A strong hypothesis is concise (typically one to two sentences long), and formulated using clear and straightforward language, ensuring it’s easily understood and testable.

Consider a hypothesis many teachers might subscribe to: students work better on Monday morning than on Friday afternoon (IV=Day, DV= Standard of work).

Now, if we decide to study this by giving the same group of students a lesson on a Monday morning and a Friday afternoon and then measuring their immediate recall of the material covered in each session, we would end up with the following:

  • The alternative hypothesis states that students will recall significantly more information on a Monday morning than on a Friday afternoon.
  • The null hypothesis states that there will be no significant difference in the amount recalled on a Monday morning compared to a Friday afternoon. Any difference will be due to chance or confounding factors.

More Examples

  • Memory : Participants exposed to classical music during study sessions will recall more items from a list than those who studied in silence.
  • Social Psychology : Individuals who frequently engage in social media use will report higher levels of perceived social isolation compared to those who use it infrequently.
  • Developmental Psychology : Children who engage in regular imaginative play have better problem-solving skills than those who don’t.
  • Clinical Psychology : Cognitive-behavioral therapy will be more effective in reducing symptoms of anxiety over a 6-month period compared to traditional talk therapy.
  • Cognitive Psychology : Individuals who multitask between various electronic devices will have shorter attention spans on focused tasks than those who single-task.
  • Health Psychology : Patients who practice mindfulness meditation will experience lower levels of chronic pain compared to those who don’t meditate.
  • Organizational Psychology : Employees in open-plan offices will report higher levels of stress than those in private offices.
  • Behavioral Psychology : Rats rewarded with food after pressing a lever will press it more frequently than rats who receive no reward.

Print Friendly, PDF & Email

Sciencing_Icons_Science SCIENCE

Sciencing_icons_biology biology, sciencing_icons_cells cells, sciencing_icons_molecular molecular, sciencing_icons_microorganisms microorganisms, sciencing_icons_genetics genetics, sciencing_icons_human body human body, sciencing_icons_ecology ecology, sciencing_icons_chemistry chemistry, sciencing_icons_atomic & molecular structure atomic & molecular structure, sciencing_icons_bonds bonds, sciencing_icons_reactions reactions, sciencing_icons_stoichiometry stoichiometry, sciencing_icons_solutions solutions, sciencing_icons_acids & bases acids & bases, sciencing_icons_thermodynamics thermodynamics, sciencing_icons_organic chemistry organic chemistry, sciencing_icons_physics physics, sciencing_icons_fundamentals-physics fundamentals, sciencing_icons_electronics electronics, sciencing_icons_waves waves, sciencing_icons_energy energy, sciencing_icons_fluid fluid, sciencing_icons_astronomy astronomy, sciencing_icons_geology geology, sciencing_icons_fundamentals-geology fundamentals, sciencing_icons_minerals & rocks minerals & rocks, sciencing_icons_earth scructure earth structure, sciencing_icons_fossils fossils, sciencing_icons_natural disasters natural disasters, sciencing_icons_nature nature, sciencing_icons_ecosystems ecosystems, sciencing_icons_environment environment, sciencing_icons_insects insects, sciencing_icons_plants & mushrooms plants & mushrooms, sciencing_icons_animals animals, sciencing_icons_math math, sciencing_icons_arithmetic arithmetic, sciencing_icons_addition & subtraction addition & subtraction, sciencing_icons_multiplication & division multiplication & division, sciencing_icons_decimals decimals, sciencing_icons_fractions fractions, sciencing_icons_conversions conversions, sciencing_icons_algebra algebra, sciencing_icons_working with units working with units, sciencing_icons_equations & expressions equations & expressions, sciencing_icons_ratios & proportions ratios & proportions, sciencing_icons_inequalities inequalities, sciencing_icons_exponents & logarithms exponents & logarithms, sciencing_icons_factorization factorization, sciencing_icons_functions functions, sciencing_icons_linear equations linear equations, sciencing_icons_graphs graphs, sciencing_icons_quadratics quadratics, sciencing_icons_polynomials polynomials, sciencing_icons_geometry geometry, sciencing_icons_fundamentals-geometry fundamentals, sciencing_icons_cartesian cartesian, sciencing_icons_circles circles, sciencing_icons_solids solids, sciencing_icons_trigonometry trigonometry, sciencing_icons_probability-statistics probability & statistics, sciencing_icons_mean-median-mode mean/median/mode, sciencing_icons_independent-dependent variables independent/dependent variables, sciencing_icons_deviation deviation, sciencing_icons_correlation correlation, sciencing_icons_sampling sampling, sciencing_icons_distributions distributions, sciencing_icons_probability probability, sciencing_icons_calculus calculus, sciencing_icons_differentiation-integration differentiation/integration, sciencing_icons_application application, sciencing_icons_projects projects, sciencing_icons_news news.

  • Share Tweet Email Print
  • Home ⋅
  • Science Fair Project Ideas for Kids, Middle & High School Students ⋅

How to Write a Testable Hypothesis

Which freezes faster, hot or cold water? is a testable observation.

Science Projects on What Liquid Freezes Faster

A testable hypothesis is one that can be used as the basis for an experiment. It predicts the correlation between two variables and can be tested by varying one of the variables. If the variables cannot be measured, the hypothesis cannot be proved or disproved. If one of the variables cannot be varied, it is impossible to conduct an experiment. If more than one variable is changed, the results are inconclusive. To write a testable hypothesis, it is important to consider how it will be tested and what makes for a valid experiment.

Make an observation about the correlation between two things. This can be in the form of a question, such as, "Which freezes faster, hot water or cold water?" or in the form of a theory, such as, "Cold water probably freezes faster than hot water."

Evaluate the observation and list the variables. For the observation about water, there are two: the temperature of the water and the time it takes the water to freeze.

Evaluate the variables to determine whether they can be varied or measured. The water temperature and the time it takes to freeze are measurable, and one of the variables, the temperature of the water, can be varied. Also consider how to keep all other variables, such as the size of the ice cube trays and their placement in the freezer, constant so that only the temperature of the water changes.

Consider how you would test the variables to explore your observation. For example, you could boil water, pour it into ice cube trays and put the trays in the freezer, then note the time it goes into the freezer and how long it takes to freeze. You could then take cold water and do the same thing. In each case, you would measure the temperature of the water before putting it into the freezer. The process could be repeated several times to create enough data to determine whether there is a correlation between water temperature and how long it takes to freeze. If you cannot determine a way to test or observe the variables, you have no basis for a good hypothesis.

Write a statement that includes the variables and predicts an outcome that can be tested. For example, "The colder the water used in ice cube trays, the faster it will freeze." This is a testable hypothesis.

Related Articles

Difference between manipulative & responding variable, water evaporation science fair projects, definitions of control, constant, independent and dependent..., how to do a science project step-by-step, difference between data & conclusion of a study, why should you only test for one variable at a time..., science projects: how hot & cold water changes a balloon, what is a testable prediction, how to calculate a temperature range, how to grow a plant from a bean as a science project, what is the incubation period for duck eggs, what are dependent, independent & controlled variables, what are the independent variables for a moldy bread..., water density science experiments, why is water good to use in a calorimeter, what is the meaning of variables in research, why does adding salt to water make it colder, what happens when you put dry ice in water, cool 8th grade science experiments.

  • Science Buddies: Writing a Hypothesis for Your Science Fair Project

About the Author

Since 1998 Yvonne Garcia has been a technology and business consultant. She focuses on developing and conducting trainings on a variety of business issues and software programs, writing manuals, proposals and how-to guides as well as designing and maintaining blogs and websites. Garcia received her Bachelor of Fine Arts in creative writing from Goddard College.

Photo Credits

Photos.com/Photos.com/Getty Images

Find Your Next Great Science Fair Project! GO

We Have More Great Sciencing Articles!

Definitions of Control, Constant, Independent and Dependent Variables in a Science Experiment

Library homepage

  • school Campus Bookshelves
  • menu_book Bookshelves
  • perm_media Learning Objects
  • login Login
  • how_to_reg Request Instructor Account
  • hub Instructor Commons
  • Download Page (PDF)
  • Download Full Book (PDF)
  • Periodic Table
  • Physics Constants
  • Scientific Calculator
  • Reference & Cite
  • Tools expand_more
  • Readability

selected template will load here

This action is not available.

Statistics LibreTexts

Hypothesis Testing

  • Last updated
  • Save as PDF
  • Page ID 31289

CO-6: Apply basic concepts of probability, random variation, and commonly used statistical probability distributions.

Learning Objectives

LO 6.26: Outline the logic and process of hypothesis testing.

LO 6.27: Explain what the p-value is and how it is used to draw conclusions.

Video: Hypothesis Testing (8:43)

Introduction

We are in the middle of the part of the course that has to do with inference for one variable.

So far, we talked about point estimation and learned how interval estimation enhances it by quantifying the magnitude of the estimation error (with a certain level of confidence) in the form of the margin of error. The result is the confidence interval — an interval that, with a certain confidence, we believe captures the unknown parameter.

We are now moving to the other kind of inference, hypothesis testing . We say that hypothesis testing is “the other kind” because, unlike the inferential methods we presented so far, where the goal was estimating the unknown parameter, the idea, logic and goal of hypothesis testing are quite different.

In the first two parts of this section we will discuss the idea behind hypothesis testing, explain how it works, and introduce new terminology that emerges in this form of inference. The final two parts will be more specific and will discuss hypothesis testing for the population proportion ( p ) and the population mean ( μ, mu).

If this is your first statistics course, you will need to spend considerable time on this topic as there are many new ideas. Many students find this process and its logic difficult to understand in the beginning.

In this section, we will use the hypothesis test for a population proportion to motivate our understanding of the process. We will conduct these tests manually. For all future hypothesis test procedures, including problems involving means, we will use software to obtain the results and focus on interpreting them in the context of our scenario.

General Idea and Logic of Hypothesis Testing

The purpose of this section is to gradually build your understanding about how statistical hypothesis testing works. We start by explaining the general logic behind the process of hypothesis testing. Once we are confident that you understand this logic, we will add some more details and terminology.

To start our discussion about the idea behind statistical hypothesis testing, consider the following example:

A case of suspected cheating on an exam is brought in front of the disciplinary committee at a certain university.

There are two opposing claims in this case:

  • The student’s claim: I did not cheat on the exam.
  • The instructor’s claim: The student did cheat on the exam.

Adhering to the principle “innocent until proven guilty,” the committee asks the instructor for evidence to support his claim. The instructor explains that the exam had two versions, and shows the committee members that on three separate exam questions, the student used in his solution numbers that were given in the other version of the exam.

The committee members all agree that it would be extremely unlikely to get evidence like that if the student’s claim of not cheating had been true. In other words, the committee members all agree that the instructor brought forward strong enough evidence to reject the student’s claim, and conclude that the student did cheat on the exam.

What does this example have to do with statistics?

While it is true that this story seems unrelated to statistics, it captures all the elements of hypothesis testing and the logic behind it. Before you read on to understand why, it would be useful to read the example again. Please do so now.

Statistical hypothesis testing is defined as:

  • Assessing evidence provided by the data against the null claim (the claim which is to be assumed true unless enough evidence exists to reject it).

Here is how the process of statistical hypothesis testing works:

  • We have two claims about what is going on in the population. Let’s call them claim 1 (this will be the null claim or hypothesis) and claim 2 (this will be the alternative) . Much like the story above, where the student’s claim is challenged by the instructor’s claim, the null claim 1 is challenged by the alternative claim 2. (For us, these claims are usually about the value of population parameter(s) or about the existence or nonexistence of a relationship between two variables in the population).
  • We choose a sample, collect relevant data and summarize them (this is similar to the instructor collecting evidence from the student’s exam). For statistical tests, this step will also involve checking any conditions or assumptions.
  • We figure out how likely it is to observe data like the data we obtained, if claim 1 is true. (Note that the wording “how likely …” implies that this step requires some kind of probability calculation). In the story, the committee members assessed how likely it is to observe evidence such as the instructor provided, had the student’s claim of not cheating been true.
  • If, after assuming claim 1 is true, we find that it would be extremely unlikely to observe data as strong as ours or stronger in favor of claim 2, then we have strong evidence against claim 1, and we reject it in favor of claim 2. Later we will see this corresponds to a small p-value.
  • If, after assuming claim 1 is true, we find that observing data as strong as ours or stronger in favor of claim 2 is NOT VERY UNLIKELY , then we do not have enough evidence against claim 1, and therefore we cannot reject it in favor of claim 2. Later we will see this corresponds to a p-value which is not small.

In our story, the committee decided that it would be extremely unlikely to find the evidence that the instructor provided had the student’s claim of not cheating been true. In other words, the members felt that it is extremely unlikely that it is just a coincidence (random chance) that the student used the numbers from the other version of the exam on three separate problems. The committee members therefore decided to reject the student’s claim and concluded that the student had, indeed, cheated on the exam. (Wouldn’t you conclude the same?)

Hopefully this example helped you understand the logic behind hypothesis testing.

Interactive Applet: Reasoning of a Statistical Test

To strengthen your understanding of the process of hypothesis testing and the logic behind it, let’s look at three statistical examples.

A recent study estimated that 20% of all college students in the United States smoke. The head of Health Services at Goodheart University (GU) suspects that the proportion of smokers may be lower at GU. In hopes of confirming her claim, the head of Health Services chooses a random sample of 400 Goodheart students, and finds that 70 of them are smokers.

Let’s analyze this example using the 4 steps outlined above:

  • claim 1: The proportion of smokers at Goodheart is 0.20.
  • claim 2: The proportion of smokers at Goodheart is less than 0.20.

Claim 1 basically says “nothing special goes on at Goodheart University; the proportion of smokers there is no different from the proportion in the entire country.” This claim is challenged by the head of Health Services, who suspects that the proportion of smokers at Goodheart is lower.

  • Choosing a sample and collecting data: A sample of n = 400 was chosen, and summarizing the data revealed that the sample proportion of smokers is p -hat = 70/400 = 0.175.While it is true that 0.175 is less than 0.20, it is not clear whether this is strong enough evidence against claim 1. We must account for sampling variation.
  • Assessment of evidence: In order to assess whether the data provide strong enough evidence against claim 1, we need to ask ourselves: How surprising is it to get a sample proportion as low as p -hat = 0.175 (or lower), assuming claim 1 is true? In other words, we need to find how likely it is that in a random sample of size n = 400 taken from a population where the proportion of smokers is p = 0.20 we’ll get a sample proportion as low as p -hat = 0.175 (or lower).It turns out that the probability that we’ll get a sample proportion as low as p -hat = 0.175 (or lower) in such a sample is roughly 0.106 (do not worry about how this was calculated at this point – however, if you think about it hopefully you can see that the key is the sampling distribution of p -hat).
  • Conclusion: Well, we found that if claim 1 were true there is a probability of 0.106 of observing data like that observed or more extreme. Now you have to decide …Do you think that a probability of 0.106 makes our data rare enough (surprising enough) under claim 1 so that the fact that we did observe it is enough evidence to reject claim 1? Or do you feel that a probability of 0.106 means that data like we observed are not very likely when claim 1 is true, but they are not unlikely enough to conclude that getting such data is sufficient evidence to reject claim 1. Basically, this is your decision. However, it would be nice to have some kind of guideline about what is generally considered surprising enough.

A certain prescription allergy medicine is supposed to contain an average of 245 parts per million (ppm) of a certain chemical. If the concentration is higher than 245 ppm, the drug will likely cause unpleasant side effects, and if the concentration is below 245 ppm, the drug may be ineffective. The manufacturer wants to check whether the mean concentration in a large shipment is the required 245 ppm or not. To this end, a random sample of 64 portions from the large shipment is tested, and it is found that the sample mean concentration is 250 ppm with a sample standard deviation of 12 ppm.

  • Claim 1: The mean concentration in the shipment is the required 245 ppm.
  • Claim 2: The mean concentration in the shipment is not the required 245 ppm.

Note that again, claim 1 basically says: “There is nothing unusual about this shipment, the mean concentration is the required 245 ppm.” This claim is challenged by the manufacturer, who wants to check whether that is, indeed, the case or not.

  • Choosing a sample and collecting data: A sample of n = 64 portions is chosen and after summarizing the data it is found that the sample mean concentration is x-bar = 250 and the sample standard deviation is s = 12.Is the fact that x-bar = 250 is different from 245 strong enough evidence to reject claim 1 and conclude that the mean concentration in the whole shipment is not the required 245? In other words, do the data provide strong enough evidence to reject claim 1?
  • Assessing the evidence: In order to assess whether the data provide strong enough evidence against claim 1, we need to ask ourselves the following question: If the mean concentration in the whole shipment were really the required 245 ppm (i.e., if claim 1 were true), how surprising would it be to observe a sample of 64 portions where the sample mean concentration is off by 5 ppm or more (as we did)? It turns out that it would be extremely unlikely to get such a result if the mean concentration were really the required 245. There is only a probability of 0.0007 (i.e., 7 in 10,000) of that happening. (Do not worry about how this was calculated at this point, but again, the key will be the sampling distribution.)
  • Making conclusions: Here, it is pretty clear that a sample like the one we observed or more extreme is VERY rare (or extremely unlikely) if the mean concentration in the shipment were really the required 245 ppm. The fact that we did observe such a sample therefore provides strong evidence against claim 1, so we reject it and conclude with very little doubt that the mean concentration in the shipment is not the required 245 ppm.

Do you think that you’re getting it? Let’s make sure, and look at another example.

Is there a relationship between gender and combined scores (Math + Verbal) on the SAT exam?

Following a report on the College Board website, which showed that in 2003, males scored generally higher than females on the SAT exam, an educational researcher wanted to check whether this was also the case in her school district. The researcher chose random samples of 150 males and 150 females from her school district, collected data on their SAT performance and found the following:

Again, let’s see how the process of hypothesis testing works for this example:

  • Claim 1: Performance on the SAT is not related to gender (males and females score the same).
  • Claim 2: Performance on the SAT is related to gender – males score higher.

Note that again, claim 1 basically says: “There is nothing going on between the variables SAT and gender.” Claim 2 represents what the researcher wants to check, or suspects might actually be the case.

  • Choosing a sample and collecting data: Data were collected and summarized as given above. Is the fact that the sample mean score of males (1,025) is higher than the sample mean score of females (1,010) by 15 points strong enough information to reject claim 1 and conclude that in this researcher’s school district, males score higher on the SAT than females?
  • Assessment of evidence: In order to assess whether the data provide strong enough evidence against claim 1, we need to ask ourselves: If SAT scores are in fact not related to gender (claim 1 is true), how likely is it to get data like the data we observed, in which the difference between the males’ average and females’ average score is as high as 15 points or higher? It turns out that the probability of observing such a sample result if SAT score is not related to gender is approximately 0.29 (Again, do not worry about how this was calculated at this point).
  • Conclusion: Here, we have an example where observing a sample like the one we observed or more extreme is definitely not surprising (roughly 30% chance) if claim 1 were true (i.e., if indeed there is no difference in SAT scores between males and females). We therefore conclude that our data does not provide enough evidence for rejecting claim 1.
  • “The data provide enough evidence to reject claim 1 and accept claim 2”; or
  • “The data do not provide enough evidence to reject claim 1.”

In particular, note that in the second type of conclusion we did not say: “ I accept claim 1 ,” but only “ I don’t have enough evidence to reject claim 1 .” We will come back to this issue later, but this is a good place to make you aware of this subtle difference.

Hopefully by now, you understand the logic behind the statistical hypothesis testing process. Here is a summary:

A flow chart describing the process. First, we state Claim 1 and Claim 2. Claim 1 says "nothing special is going on" and is challenged by claim 2. Second, we collect relevant data and summarize it. Third, we assess how surprising it woudl be to observe data like that observed if Claim 1 is true. Fourth, we draw conclusions in context.

Learn by Doing: Logic of Hypothesis Testing

Did I Get This?: Logic of Hypothesis Testing

Steps in Hypothesis Testing

Video: Steps in Hypothesis Testing (16:02)

Now that we understand the general idea of how statistical hypothesis testing works, let’s go back to each of the steps and delve slightly deeper, getting more details and learning some terminology.

Hypothesis Testing Step 1: State the Hypotheses

In all three examples, our aim is to decide between two opposing points of view, Claim 1 and Claim 2. In hypothesis testing, Claim 1 is called the null hypothesis (denoted “ Ho “), and Claim 2 plays the role of the alternative hypothesis (denoted “ Ha “). As we saw in the three examples, the null hypothesis suggests nothing special is going on; in other words, there is no change from the status quo, no difference from the traditional state of affairs, no relationship. In contrast, the alternative hypothesis disagrees with this, stating that something is going on, or there is a change from the status quo, or there is a difference from the traditional state of affairs. The alternative hypothesis, Ha, usually represents what we want to check or what we suspect is really going on.

Let’s go back to our three examples and apply the new notation:

In example 1:

  • Ho: The proportion of smokers at GU is 0.20.
  • Ha: The proportion of smokers at GU is less than 0.20.

In example 2:

  • Ho: The mean concentration in the shipment is the required 245 ppm.
  • Ha: The mean concentration in the shipment is not the required 245 ppm.

In example 3:

  • Ho: Performance on the SAT is not related to gender (males and females score the same).
  • Ha: Performance on the SAT is related to gender – males score higher.

Learn by Doing: State the Hypotheses

Did I Get This?: State the Hypotheses

Hypothesis Testing Step 2: Collect Data, Check Conditions and Summarize Data

This step is pretty obvious. This is what inference is all about. You look at sampled data in order to draw conclusions about the entire population. In the case of hypothesis testing, based on the data, you draw conclusions about whether or not there is enough evidence to reject Ho.

There is, however, one detail that we would like to add here. In this step we collect data and summarize it. Go back and look at the second step in our three examples. Note that in order to summarize the data we used simple sample statistics such as the sample proportion ( p -hat), sample mean (x-bar) and the sample standard deviation (s).

In practice, you go a step further and use these sample statistics to summarize the data with what’s called a test statistic . We are not going to go into any details right now, but we will discuss test statistics when we go through the specific tests.

This step will also involve checking any conditions or assumptions required to use the test.

Hypothesis Testing Step 3: Assess the Evidence

As we saw, this is the step where we calculate how likely is it to get data like that observed (or more extreme) when Ho is true. In a sense, this is the heart of the process, since we draw our conclusions based on this probability.

  • If this probability is very small (see example 2), then that means that it would be very surprising to get data like that observed (or more extreme) if Ho were true. The fact that we did observe such data is therefore evidence against Ho, and we should reject it.
  • On the other hand, if this probability is not very small (see example 3) this means that observing data like that observed (or more extreme) is not very surprising if Ho were true. The fact that we observed such data does not provide evidence against Ho. This crucial probability, therefore, has a special name. It is called the p-value of the test.

In our three examples, the p-values were given to you (and you were reassured that you didn’t need to worry about how these were derived yet):

  • Example 1: p-value = 0.106
  • Example 2: p-value = 0.0007
  • Example 3: p-value = 0.29

Obviously, the smaller the p-value, the more surprising it is to get data like ours (or more extreme) when Ho is true, and therefore, the stronger the evidence the data provide against Ho.

Looking at the three p-values of our three examples, we see that the data that we observed in example 2 provide the strongest evidence against the null hypothesis, followed by example 1, while the data in example 3 provides the least evidence against Ho.

  • Right now we will not go into specific details about p-value calculations, but just mention that since the p-value is the probability of getting data like those observed (or more extreme) when Ho is true, it would make sense that the calculation of the p-value will be based on the data summary, which, as we mentioned, is the test statistic. Indeed, this is the case. In practice, we will mostly use software to provide the p-value for us.

Hypothesis Testing Step 4: Making Conclusions

Since our statistical conclusion is based on how small the p-value is, or in other words, how surprising our data are when Ho is true, it would be nice to have some kind of guideline or cutoff that will help determine how small the p-value must be, or how “rare” (unlikely) our data must be when Ho is true, for us to conclude that we have enough evidence to reject Ho.

This cutoff exists, and because it is so important, it has a special name. It is called the significance level of the test and is usually denoted by the Greek letter α (alpha). The most commonly used significance level is α (alpha) = 0.05 (or 5%). This means that:

  • if the p-value < α (alpha) (usually 0.05), then the data we obtained is considered to be “rare (or surprising) enough” under the assumption that Ho is true, and we say that the data provide statistically significant evidence against Ho, so we reject Ho and thus accept Ha.
  • if the p-value > α (alpha)(usually 0.05), then our data are not considered to be “surprising enough” under the assumption that Ho is true, and we say that our data do not provide enough evidence to reject Ho (or, equivalently, that the data do not provide enough evidence to accept Ha).

Now that we have a cutoff to use, here are the appropriate conclusions for each of our examples based upon the p-values we were given.

In Example 1:

  • Using our cutoff of 0.05, we fail to reject Ho.
  • Conclusion : There IS NOT enough evidence that the proportion of smokers at GU is less than 0.20
  • Still we should consider: Does the evidence seen in the data provide any practical evidence towards our alternative hypothesis?

In Example 2:

  • Using our cutoff of 0.05, we reject Ho.
  • Conclusion : There IS enough evidence that the mean concentration in the shipment is not the required 245 ppm.

In Example 3:

  • Conclusion : There IS NOT enough evidence that males score higher on average than females on the SAT.

Notice that all of the above conclusions are written in terms of the alternative hypothesis and are given in the context of the situation. In no situation have we claimed the null hypothesis is true. Be very careful of this and other issues discussed in the following comments.

  • Although the significance level provides a good guideline for drawing our conclusions, it should not be treated as an incontrovertible truth. There is a lot of room for personal interpretation. What if your p-value is 0.052? You might want to stick to the rules and say “0.052 > 0.05 and therefore I don’t have enough evidence to reject Ho”, but you might decide that 0.052 is small enough for you to believe that Ho should be rejected. It should be noted that scientific journals do consider 0.05 to be the cutoff point for which any p-value below the cutoff indicates enough evidence against Ho, and any p-value above it, or even equal to it , indicates there is not enough evidence against Ho. Although a p-value between 0.05 and 0.10 is often reported as marginally statistically significant.
  • It is important to draw your conclusions in context . It is never enough to say: “p-value = …, and therefore I have enough evidence to reject Ho at the 0.05 significance level.” You should always word your conclusion in terms of the data. Although we will use the terminology of “rejecting Ho” or “failing to reject Ho” – this is mostly due to the fact that we are instructing you in these concepts. In practice, this language is rarely used. We also suggest writing your conclusion in terms of the alternative hypothesis.Is there or is there not enough evidence that the alternative hypothesis is true?
  • Let’s go back to the issue of the nature of the two types of conclusions that I can make.
  • Either I reject Ho (when the p-value is smaller than the significance level)
  • or I cannot reject Ho (when the p-value is larger than the significance level).

As we mentioned earlier, note that the second conclusion does not imply that I accept Ho, but just that I don’t have enough evidence to reject it. Saying (by mistake) “I don’t have enough evidence to reject Ho so I accept it” indicates that the data provide evidence that Ho is true, which is not necessarily the case . Consider the following slightly artificial yet effective example:

An employer claims to subscribe to an “equal opportunity” policy, not hiring men any more often than women for managerial positions. Is this credible? You’re not sure, so you want to test the following two hypotheses:

  • Ho: The proportion of male managers hired is 0.5
  • Ha: The proportion of male managers hired is more than 0.5

Data: You choose at random three of the new managers who were hired in the last 5 years and find that all 3 are men.

Assessing Evidence: If the proportion of male managers hired is really 0.5 (Ho is true), then the probability that the random selection of three managers will yield three males is therefore 0.5 * 0.5 * 0.5 = 0.125. This is the p-value (using the multiplication rule for independent events).

Conclusion: Using 0.05 as the significance level, you conclude that since the p-value = 0.125 > 0.05, the fact that the three randomly selected managers were all males is not enough evidence to reject the employer’s claim of subscribing to an equal opportunity policy (Ho).

However, the data (all three selected are males) definitely does NOT provide evidence to accept the employer’s claim (Ho).

Learn By Doing: Using p-values

Did I Get This?: Using p-values

Comment about wording: Another common wording in scientific journals is:

  • “The results are statistically significant” – when the p-value < α (alpha).
  • “The results are not statistically significant” – when the p-value > α (alpha).

Often you will see significance levels reported with additional description to indicate the degree of statistical significance. A general guideline (although not required in our course) is:

  • If 0.01 ≤ p-value < 0.05, then the results are (statistically) significant .
  • If 0.001 ≤ p-value < 0.01, then the results are highly statistically significant .
  • If p-value < 0.001, then the results are very highly statistically significant .
  • If p-value > 0.05, then the results are not statistically significant (NS).
  • If 0.05 ≤ p-value < 0.10, then the results are marginally statistically significant .

Let’s summarize

We learned quite a lot about hypothesis testing. We learned the logic behind it, what the key elements are, and what types of conclusions we can and cannot draw in hypothesis testing. Here is a quick recap:

Video: Hypothesis Testing Overview (2:20)

Here are a few more activities if you need some additional practice.

Did I Get This?: Hypothesis Testing Overview

  • Notice that the p-value is an example of a conditional probability . We calculate the probability of obtaining results like those of our data (or more extreme) GIVEN the null hypothesis is true. We could write P(Obtaining results like ours or more extreme | Ho is True).
  • We could write P(Obtaining a test statistic as or more extreme than ours | Ho is True).
  • In this case we are asking “Assuming the null hypothesis is true, how rare is it to observe something as or more extreme than what I have found in my data?”
  • If after assuming the null hypothesis is true, what we have found in our data is extremely rare (small p-value), this provides evidence to reject our assumption that Ho is true in favor of Ha.
  • The p-value can also be thought of as the probability, assuming the null hypothesis is true, that the result we have seen is solely due to random error (or random chance). We have already seen that statistics from samples collected from a population vary. There is random error or random chance involved when we sample from populations.

In this setting, if the p-value is very small, this implies, assuming the null hypothesis is true, that it is extremely unlikely that the results we have obtained would have happened due to random error alone, and thus our assumption (Ho) is rejected in favor of the alternative hypothesis (Ha).

  • It is EXTREMELY important that you find a definition of the p-value which makes sense to you. New students often need to contemplate this idea repeatedly through a variety of examples and explanations before becoming comfortable with this idea. It is one of the two most important concepts in statistics (the other being confidence intervals).
  • We infer that the alternative hypothesis is true ONLY by rejecting the null hypothesis.
  • A statistically significant result is one that has a very low probability of occurring if the null hypothesis is true.
  • Results which are statistically significant may or may not have practical significance and vice versa.

Error and Power

LO 6.28: Define a Type I and Type II error in general and in the context of specific scenarios.

LO 6.29: Explain the concept of the power of a statistical test including the relationship between power, sample size, and effect size.

Video: Errors and Power (12:03)

Type I and Type II Errors in Hypothesis Tests

We have not yet discussed the fact that we are not guaranteed to make the correct decision by this process of hypothesis testing. Maybe you are beginning to see that there is always some level of uncertainty in statistics.

Let’s think about what we know already and define the possible errors we can make in hypothesis testing. When we conduct a hypothesis test, we choose one of two possible conclusions based upon our data.

If the p-value is smaller than your pre-specified significance level (α, alpha), you reject the null hypothesis and either

  • You have made the correct decision since the null hypothesis is false
  • You have made an error ( Type I ) and rejected Ho when in fact Ho is true (your data happened to be a RARE EVENT under Ho)

If the p-value is greater than (or equal to) your chosen significance level (α, alpha), you fail to reject the null hypothesis and either

  • You have made the correct decision since the null hypothesis is true
  • You have made an error ( Type II ) and failed to reject Ho when in fact Ho is false (the alternative hypothesis, Ha, is true)

The following summarizes the four possible results which can be obtained from a hypothesis test. Notice the rows represent the decision made in the hypothesis test and the columns represent the (usually unknown) truth in reality.

mod12-errors1

Although the truth is unknown in practice – or we would not be conducting the test – we know it must be the case that either the null hypothesis is true or the null hypothesis is false. It is also the case that either decision we make in a hypothesis test can result in an incorrect conclusion!

A TYPE I Error occurs when we Reject Ho when, in fact, Ho is True. In this case, we mistakenly reject a true null hypothesis.

  • P(TYPE I Error) = P(Reject Ho | Ho is True) = α = alpha = Significance Level

A TYPE II Error occurs when we fail to Reject Ho when, in fact, Ho is False. In this case we fail to reject a false null hypothesis.

P(TYPE II Error) = P(Fail to Reject Ho | Ho is False) = β = beta

When our significance level is 5%, we are saying that we will allow ourselves to make a Type I error less than 5% of the time. In the long run, if we repeat the process, 5% of the time we will find a p-value < 0.05 when in fact the null hypothesis was true.

In this case, our data represent a rare occurrence which is unlikely to happen but is still possible. For example, suppose we toss a coin 10 times and obtain 10 heads, this is unlikely for a fair coin but not impossible. We might conclude the coin is unfair when in fact we simply saw a very rare event for this fair coin.

Our testing procedure CONTROLS for the Type I error when we set a pre-determined value for the significance level.

Notice that these probabilities are conditional probabilities. This is one more reason why conditional probability is an important concept in statistics.

Unfortunately, calculating the probability of a Type II error requires us to know the truth about the population. In practice we can only calculate this probability using a series of “what if” calculations which depend upon the type of problem.

Comment: As you initially read through the examples below, focus on the broad concepts instead of the small details. It is not important to understand how to calculate these values yourself at this point.

  • Try to understand the pictures we present. Which pictures represent an assumed null hypothesis and which represent an alternative?
  • It may be useful to come back to this page (and the activities here) after you have reviewed the rest of the section on hypothesis testing and have worked a few problems yourself.

Interactive Applet: Statistical Significance

Here are two examples of using an older version of this applet. It looks slightly different but the same settings and options are available in the version above.

In both cases we will consider IQ scores.

Our null hypothesis is that the true mean is 100. Assume the standard deviation is 16 and we will specify a significance level of 5%.

In this example we will specify that the true mean is indeed 100 so that the null hypothesis is true. Most of the time (95%), when we generate a sample, we should fail to reject the null hypothesis since the null hypothesis is indeed true.

Here is one sample that results in a correct decision:

mod12-significance_ex1a

In the sample above, we obtain an x-bar of 105, which is drawn on the distribution which assumes μ (mu) = 100 (the null hypothesis is true). Notice the sample is shown as blue dots along the x-axis and the shaded region shows for which values of x-bar we would reject the null hypothesis. In other words, we would reject Ho whenever the x-bar falls in the shaded region.

Enter the same values and generate samples until you obtain a Type I error (you falsely reject the null hypothesis). You should see something like this:

mod12-significance_ex2

If you were to generate 100 samples, you should have around 5% where you rejected Ho. These would be samples which would result in a Type I error.

The previous example illustrates a correct decision and a Type I error when the null hypothesis is true. The next example illustrates a correct decision and Type II error when the null hypothesis is false. In this case, we must specify the true population mean.

Let’s suppose we are sampling from an honors program and that the true mean IQ for this population is 110. We do not know the probability of a Type II error without more detailed calculations.

Let’s start with a sample which results in a correct decision.

mod12-significance_ex3

In the sample above, we obtain an x-bar of 111, which is drawn on the distribution which assumes μ (mu) = 100 (the null hypothesis is true).

Enter the same values and generate samples until you obtain a Type II error (you fail to reject the null hypothesis). You should see something like this:

mod12-significance_ex4

You should notice that in this case (when Ho is false), it is easier to obtain an incorrect decision (a Type II error) than it was in the case where Ho is true. If you generate 100 samples, you can approximate the probability of a Type II error.

We can find the probability of a Type II error by visualizing both the assumed distribution and the true distribution together. The image below is adapted from an applet we will use when we discuss the power of a statistical test.

mod12-significance_ex5a

There is a 37.4% chance that, in the long run, we will make a Type II error and fail to reject the null hypothesis when in fact the true mean IQ is 110 in the population from which we sample our 10 individuals.

Can you visualize what will happen if the true population mean is really 115 or 108? When will the Type II error increase? When will it decrease? We will look at this idea again when we discuss the concept of power in hypothesis tests.

  • It is important to note that there is a trade-off between the probability of a Type I and a Type II error. If we decrease the probability of one of these errors, the probability of the other will increase! The practical result of this is that if we require stronger evidence to reject the null hypothesis (smaller significance level = probability of a Type I error), we will increase the chance that we will be unable to reject the null hypothesis when in fact Ho is false (increases the probability of a Type II error).
  • When α (alpha) = 0.05 we obtained a Type II error probability of 0.374 = β = beta

mod12-significance_ex4

  • When α (alpha) = 0.01 (smaller than before) we obtain a Type II error probability of 0.644 = β = beta (larger than before)

mod12-significance_ex6a

  • As the blue line in the picture moves farther right, the significance level (α, alpha) is decreasing and the Type II error probability is increasing.
  • As the blue line in the picture moves farther left, the significance level (α, alpha) is increasing and the Type II error probability is decreasing

Let’s return to our very first example and define these two errors in context.

  • Ho = The student’s claim: I did not cheat on the exam.
  • Ha = The instructor’s claim: The student did cheat on the exam.

Adhering to the principle “innocent until proven guilty,” the committee asks the instructor for evidence to support his claim.

There are four possible outcomes of this process. There are two possible correct decisions:

  • The student did cheat on the exam and the instructor brings enough evidence to reject Ho and conclude the student did cheat on the exam. This is a CORRECT decision!
  • The student did not cheat on the exam and the instructor fails to provide enough evidence that the student did cheat on the exam. This is a CORRECT decision!

Both the correct decisions and the possible errors are fairly easy to understand but with the errors, you must be careful to identify and define the two types correctly.

TYPE I Error: Reject Ho when Ho is True

  • The student did not cheat on the exam but the instructor brings enough evidence to reject Ho and conclude the student cheated on the exam. This is a Type I Error.

TYPE II Error: Fail to Reject Ho when Ho is False

  • The student did cheat on the exam but the instructor fails to provide enough evidence that the student cheated on the exam. This is a Type II Error.

In most situations, including this one, it is more “acceptable” to have a Type II error than a Type I error. Although allowing a student who cheats to go unpunished might be considered a very bad problem, punishing a student for something he or she did not do is usually considered to be a more severe error. This is one reason we control for our Type I error in the process of hypothesis testing.

Did I Get This?: Type I and Type II Errors (in context)

  • The probabilities of Type I and Type II errors are closely related to the concepts of sensitivity and specificity that we discussed previously. Consider the following hypotheses:

Ho: The individual does not have diabetes (status quo, nothing special happening)

Ha: The individual does have diabetes (something is going on here)

In this setting:

When someone tests positive for diabetes we would reject the null hypothesis and conclude the person has diabetes (we may or may not be correct!).

When someone tests negative for diabetes we would fail to reject the null hypothesis so that we fail to conclude the person has diabetes (we may or may not be correct!)

Let’s take it one step further:

Sensitivity = P(Test + | Have Disease) which in this setting equals P(Reject Ho | Ho is False) = 1 – P(Fail to Reject Ho | Ho is False) = 1 – β = 1 – beta

Specificity = P(Test – | No Disease) which in this setting equals P(Fail to Reject Ho | Ho is True) = 1 – P(Reject Ho | Ho is True) = 1 – α = 1 – alpha

Notice that sensitivity and specificity relate to the probability of making a correct decision whereas α (alpha) and β (beta) relate to the probability of making an incorrect decision.

Usually α (alpha) = 0.05 so that the specificity listed above is 0.95 or 95%.

Next, we will see that the sensitivity listed above is the power of the hypothesis test!

Reasons for a Type I Error in Practice

Assuming that you have obtained a quality sample:

  • The reason for a Type I error is random chance.
  • When a Type I error occurs, our observed data represented a rare event which indicated evidence in favor of the alternative hypothesis even though the null hypothesis was actually true.

Reasons for a Type II Error in Practice

Again, assuming that you have obtained a quality sample, now we have a few possibilities depending upon the true difference that exists.

  • The sample size is too small to detect an important difference. This is the worst case, you should have obtained a larger sample. In this situation, you may notice that the effect seen in the sample seems PRACTICALLY significant and yet the p-value is not small enough to reject the null hypothesis.
  • The sample size is reasonable for the important difference but the true difference (which might be somewhat meaningful or interesting) is smaller than your test was capable of detecting. This is tolerable as you were not interested in being able to detect this difference when you began your study. In this situation, you may notice that the effect seen in the sample seems to have some potential for practical significance.
  • The sample size is more than adequate, the difference that was not detected is meaningless in practice. This is not a problem at all and is in effect a “correct decision” since the difference you did not detect would have no practical meaning.
  • Note: We will discuss the idea of practical significance later in more detail.

Power of a Hypothesis Test

It is often the case that we truly wish to prove the alternative hypothesis. It is reasonable that we would be interested in the probability of correctly rejecting the null hypothesis. In other words, the probability of rejecting the null hypothesis, when in fact the null hypothesis is false. This can also be thought of as the probability of being able to detect a (pre-specified) difference of interest to the researcher.

Let’s begin with a realistic example of how power can be described in a study.

In a clinical trial to study two medications for weight loss, we have an 80% chance to detect a difference in the weight loss between the two medications of 10 pounds. In other words, the power of the hypothesis test we will conduct is 80%.

In other words, if one medication comes from a population with an average weight loss of 25 pounds and the other comes from a population with an average weight loss of 15 pounds, we will have an 80% chance to detect that difference using the sample we have in our trial.

If we were to repeat this trial many times, 80% of the time we will be able to reject the null hypothesis (that there is no difference between the medications) and 20% of the time we will fail to reject the null hypothesis (and make a Type II error!).

The difference of 10 pounds in the previous example, is often called the effect size . The measure of the effect differs depending on the particular test you are conducting but is always some measure related to the true effect in the population. In this example, it is the difference between two population means.

Recall the definition of a Type II error:

Notice that P(Reject Ho | Ho is False) = 1 – P(Fail to Reject Ho | Ho is False) = 1 – β = 1- beta.

The POWER of a hypothesis test is the probability of rejecting the null hypothesis when the null hypothesis is false . This can also be stated as the probability of correctly rejecting the null hypothesis .

POWER = P(Reject Ho | Ho is False) = 1 – β = 1 – beta

Power is the test’s ability to correctly reject the null hypothesis. A test with high power has a good chance of being able to detect the difference of interest to us, if it exists .

As we mentioned on the bottom of the previous page, this can be thought of as the sensitivity of the hypothesis test if you imagine Ho = No disease and Ha = Disease.

Factors Affecting the Power of a Hypothesis Test

The power of a hypothesis test is affected by numerous quantities (similar to the margin of error in a confidence interval).

Assume that the null hypothesis is false for a given hypothesis test. All else being equal, we have the following:

  • Larger samples result in a greater chance to reject the null hypothesis which means an increase in the power of the hypothesis test.
  • If the effect size is larger, it will become easier for us to detect. This results in a greater chance to reject the null hypothesis which means an increase in the power of the hypothesis test. The effect size varies for each test and is usually closely related to the difference between the hypothesized value and the true value of the parameter under study.
  • From the relationship between the probability of a Type I and a Type II error (as α (alpha) decreases, β (beta) increases), we can see that as α (alpha) decreases, Power = 1 – β = 1 – beta also decreases.
  • There are other mathematical ways to change the power of a hypothesis test, such as changing the population standard deviation; however, these are not quantities that we can usually control so we will not discuss them here.

In practice, we specify a significance level and a desired power to detect a difference which will have practical meaning to us and this determines the sample size required for the experiment or study.

For most grants involving statistical analysis, power calculations must be completed to illustrate that the study will have a reasonable chance to detect an important effect. Otherwise, the money spent on the study could be wasted. The goal is usually to have a power close to 80%.

For example, if there is only a 5% chance to detect an important difference between two treatments in a clinical trial, this would result in a waste of time, effort, and money on the study since, when the alternative hypothesis is true, the chance a treatment effect can be found is very small.

  • In order to calculate the power of a hypothesis test, we must specify the “truth.” As we mentioned previously when discussing Type II errors, in practice we can only calculate this probability using a series of “what if” calculations which depend upon the type of problem.

The following activity involves working with an interactive applet to study power more carefully.

Learn by Doing: Power of Hypothesis Tests

The following reading is an excellent discussion about Type I and Type II errors.

(Optional) Outside Reading: A Good Discussion of Power (≈ 2500 words)

We will not be asking you to perform power calculations manually. You may be asked to use online calculators and applets. Most statistical software packages offer some ability to complete power calculations. There are also many online calculators for power and sample size on the internet, for example, Russ Lenth’s power and sample-size page .

Proportions (Introduction & Step 1)

CO-4: Distinguish among different measurement scales, choose the appropriate descriptive and inferential statistical methods based on these distinctions, and interpret the results.

LO 4.33: In a given context, distinguish between situations involving a population proportion and a population mean and specify the correct null and alternative hypothesis for the scenario.

LO 4.34: Carry out a complete hypothesis test for a population proportion by hand.

Video: Proportions (Introduction & Step 1) (7:18)

Now that we understand the process of hypothesis testing and the logic behind it, we are ready to start learning about specific statistical tests (also known as significance tests).

The first test we are going to learn is the test about the population proportion (p).

This test is widely known as the “z-test for the population proportion (p).”

We will understand later where the “z-test” part is coming from.

This will be the only type of problem you will complete entirely “by-hand” in this course. Our goal is to use this example to give you the tools you need to understand how this process works. After working a few problems, you should review the earlier material again. You will likely need to review the terminology and concepts a few times before you fully understand the process.

In reality, you will often be conducting more complex statistical tests and allowing software to provide the p-value. In these settings it will be important to know what test to apply for a given situation and to be able to explain the results in context.

Review: Types of Variables

When we conduct a test about a population proportion, we are working with a categorical variable. Later in the course, after we have learned a variety of hypothesis tests, we will need to be able to identify which test is appropriate for which situation. Identifying the variable as categorical or quantitative is an important component of choosing an appropriate hypothesis test.

Learn by Doing: Review Types of Variables

One Sample Z-Test for a Population Proportion

In this part of our discussion on hypothesis testing, we will go into details that we did not go into before. More specifically, we will use this test to introduce the idea of a test statistic , and details about how p-values are calculated .

Let’s start by introducing the three examples, which will be the leading examples in our discussion. Each example is followed by a figure illustrating the information provided, as well as the question of interest.

A machine is known to produce 20% defective products, and is therefore sent for repair. After the machine is repaired, 400 products produced by the machine are chosen at random and 64 of them are found to be defective. Do the data provide enough evidence that the proportion of defective products produced by the machine (p) has been reduced as a result of the repair?

The following figure displays the information, as well as the question of interest:

The question of interest helps us formulate the null and alternative hypotheses in terms of p, the proportion of defective products produced by the machine following the repair:

  • Ho: p = 0.20 (No change; the repair did not help).
  • Ha: p < 0.20 (The repair was effective at reducing the proportion of defective parts).

There are rumors that students at a certain liberal arts college are more inclined to use drugs than U.S. college students in general. Suppose that in a simple random sample of 100 students from the college, 19 admitted to marijuana use. Do the data provide enough evidence to conclude that the proportion of marijuana users among the students in the college (p) is higher than the national proportion, which is 0.157? (This number is reported by the Harvard School of Public Health.)

Again, the following figure displays the information as well as the question of interest:

As before, we can formulate the null and alternative hypotheses in terms of p, the proportion of students in the college who use marijuana:

  • Ho: p = 0.157 (same as among all college students in the country).
  • Ha: p > 0.157 (higher than the national figure).

Polls on certain topics are conducted routinely in order to monitor changes in the public’s opinions over time. One such topic is the death penalty. In 2003 a poll estimated that 64% of U.S. adults support the death penalty for a person convicted of murder. In a more recent poll, 675 out of 1,000 U.S. adults chosen at random were in favor of the death penalty for convicted murderers. Do the results of this poll provide evidence that the proportion of U.S. adults who support the death penalty for convicted murderers (p) changed between 2003 and the later poll?

Here is a figure that displays the information, as well as the question of interest:

Again, we can formulate the null and alternative hypotheses in term of p, the proportion of U.S. adults who support the death penalty for convicted murderers.

  • Ho: p = 0.64 (No change from 2003).
  • Ha: p ≠ 0.64 (Some change since 2003).

Learn by Doing: Proportions (Overview)

Did I Get This?: Proportions ( Overview )

Recall that there are basically 4 steps in the process of hypothesis testing:

  • STEP 1: State the appropriate null and alternative hypotheses, Ho and Ha.
  • STEP 2: Obtain a random sample, collect relevant data, and check whether the data meet the conditions under which the test can be used . If the conditions are met, summarize the data using a test statistic.
  • STEP 3: Find the p-value of the test.
  • STEP 4: Based on the p-value, decide whether or not the results are statistically significant and draw your conclusions in context.
  • Note: In practice, we should always consider the practical significance of the results as well as the statistical significance.

We are now going to go through these steps as they apply to the hypothesis testing for the population proportion p. It should be noted that even though the details will be specific to this particular test, some of the ideas that we will add apply to hypothesis testing in general.

Step 1. Stating the Hypotheses

Here again are the three set of hypotheses that are being tested in each of our three examples:

Has the proportion of defective products been reduced as a result of the repair?

Is the proportion of marijuana users in the college higher than the national figure?

Did the proportion of U.S. adults who support the death penalty change between 2003 and a later poll?

The null hypothesis always takes the form:

  • Ho: p = some value

and the alternative hypothesis takes one of the following three forms:

  • Ha: p < that value (like in example 1) or
  • Ha: p > that value (like in example 2) or
  • Ha: p ≠ that value (like in example 3).

Note that it was quite clear from the context which form of the alternative hypothesis would be appropriate. The value that is specified in the null hypothesis is called the null value , and is generally denoted by p 0 . We can say, therefore, that in general the null hypothesis about the population proportion (p) would take the form:

  • Ho: p = p 0

We write Ho: p = p 0 to say that we are making the hypothesis that the population proportion has the value of p 0 . In other words, p is the unknown population proportion and p 0 is the number we think p might be for the given situation.

The alternative hypothesis takes one of the following three forms (depending on the context):

Ha: p < p 0 (one-sided)

Ha: p > p 0 (one-sided)

Ha: p ≠ p 0 (two-sided)

The first two possible forms of the alternatives (where the = sign in Ho is challenged by < or >) are called one-sided alternatives , and the third form of alternative (where the = sign in Ho is challenged by ≠) is called a two-sided alternative. To understand the intuition behind these names let’s go back to our examples.

Example 3 (death penalty) is a case where we have a two-sided alternative:

In this case, in order to reject Ho and accept Ha we will need to get a sample proportion of death penalty supporters which is very different from 0.64 in either direction, either much larger or much smaller than 0.64.

In example 2 (marijuana use) we have a one-sided alternative:

Here, in order to reject Ho and accept Ha we will need to get a sample proportion of marijuana users which is much higher than 0.157.

Similarly, in example 1 (defective products), where we are testing:

in order to reject Ho and accept Ha, we will need to get a sample proportion of defective products which is much smaller than 0.20.

Learn by Doing: State Hypotheses (Proportions)

Did I Get This?: State Hypotheses (Proportions)

Proportions (Step 2)

Video: Proportions (Step 2) (12:38)

Step 2. Collect Data, Check Conditions, and Summarize Data

After the hypotheses have been stated, the next step is to obtain a sample (on which the inference will be based), collect relevant data , and summarize them.

It is extremely important that our sample is representative of the population about which we want to draw conclusions. This is ensured when the sample is chosen at random. Beyond the practical issue of ensuring representativeness, choosing a random sample has theoretical importance that we will mention later.

In the case of hypothesis testing for the population proportion (p), we will collect data on the relevant categorical variable from the individuals in the sample and start by calculating the sample proportion p-hat (the natural quantity to calculate when the parameter of interest is p).

Let’s go back to our three examples and add this step to our figures.

As we mentioned earlier without going into details, when we summarize the data in hypothesis testing, we go a step beyond calculating the sample statistic and summarize the data with a test statistic . Every test has a test statistic, which to some degree captures the essence of the test. In fact, the p-value, which so far we have looked upon as “the king” (in the sense that everything is determined by it), is actually determined by (or derived from) the test statistic. We will now introduce the test statistic.

The test statistic is a measure of how far the sample proportion p-hat is from the null value p 0 , the value that the null hypothesis claims is the value of p. In other words, since p-hat is what the data estimates p to be, the test statistic can be viewed as a measure of the “distance” between what the data tells us about p and what the null hypothesis claims p to be.

Let’s use our examples to understand this:

The parameter of interest is p, the proportion of defective products following the repair.

The data estimate p to be p-hat = 0.16

The null hypothesis claims that p = 0.20

The data are therefore 0.04 (or 4 percentage points) below the null hypothesis value.

It is hard to evaluate whether this difference of 4% in defective products is enough evidence to say that the repair was effective at reducing the proportion of defective products, but clearly, the larger the difference, the more evidence it is against the null hypothesis. So if, for example, our sample proportion of defective products had been, say, 0.10 instead of 0.16, then I think you would all agree that cutting the proportion of defective products in half (from 20% to 10%) would be extremely strong evidence that the repair was effective at reducing the proportion of defective products.

The parameter of interest is p, the proportion of students in a college who use marijuana.

The data estimate p to be p-hat = 0.19

The null hypothesis claims that p = 0.157

The data are therefore 0.033 (or 3.3. percentage points) above the null hypothesis value.

The parameter of interest is p, the proportion of U.S. adults who support the death penalty for convicted murderers.

The data estimate p to be p-hat = 0.675

The null hypothesis claims that p = 0.64

There is a difference of 0.035 (or 3.5. percentage points) between the data and the null hypothesis value.

The problem with looking only at the difference between the sample proportion, p-hat, and the null value, p 0 is that we have not taken into account the variability of our estimator p-hat which, as we know from our study of sampling distributions, depends on the sample size.

For this reason, the test statistic cannot simply be the difference between p-hat and p 0 , but must be some form of that formula that accounts for the sample size. In other words, we need to somehow standardize the difference so that comparison between different situations will be possible. We are very close to revealing the test statistic, but before we construct it, let’s be reminded of the following two facts from probability:

Fact 1: When we take a random sample of size n from a population with population proportion p, then

mod9-sampp_hat2

Fact 2: The z-score of any normal value (a value that comes from a normal distribution) is calculated by finding the difference between the value and the mean and then dividing that difference by the standard deviation (of the normal distribution associated with the value). The z-score represents how many standard deviations below or above the mean the value is.

Thus, our test statistic should be a measure of how far the sample proportion p-hat is from the null value p 0 relative to the variation of p-hat (as measured by the standard error of p-hat).

Recall that the standard error is the standard deviation of the sampling distribution for a given statistic. For p-hat, we know the following:

sampdistsummaryphat

To find the p-value, we will need to determine how surprising our value is assuming the null hypothesis is true. We already have the tools needed for this process from our study of sampling distributions as represented in the table above.

If we assume the null hypothesis is true, we can specify that the center of the distribution of all possible values of p-hat from samples of size 400 would be 0.20 (our null value).

We can calculate the standard error, assuming p = 0.20 as

\(\sqrt{\dfrac{p_{0}\left(1-p_{0}\right)}{n}}=\sqrt{\dfrac{0.2(1-0.2)}{400}}=0.02\)

The following picture represents the sampling distribution of all possible values of p-hat of samples of size 400, assuming the true proportion p is 0.20 and our other requirements for the sampling distribution to be normal are met (we will review these during the next step).

A normal curve representing samping distribution of p-hat assuming that p=p_0. Marked on the horizontal axis is p_0 and a particular value of p-hat. z is the difference between p-hat and p_0 measured in standard deviations (with the sign of z indicating whether p-hat is below or above p_0)

In order to calculate probabilities for the picture above, we would need to find the z-score associated with our result.

This z-score is the test statistic ! In this example, the numerator of our z-score is the difference between p-hat (0.16) and null value (0.20) which we found earlier to be -0.04. The denominator of our z-score is the standard error calculated above (0.02) and thus quickly we find the z-score, our test statistic, to be -2.

The sample proportion based upon this data is 2 standard errors below the null value.

Hopefully you now understand more about the reasons we need probability in statistics!!

Now we will formalize the definition and look at our remaining examples before moving on to the next step, which will be to determine if a normal distribution applies and calculate the p-value.

Test Statistic for Hypothesis Tests for One Proportion is:

\(z=\dfrac{\hat{p}-p_{0}}{\sqrt{\dfrac{p_{0}\left(1-p_{0}\right)}{n}}}\)

It represents the difference between the sample proportion and the null value, measured in standard deviations (standard error of p-hat).

The picture above is a representation of the sampling distribution of p-hat assuming p = p 0 . In other words, this is a model of how p-hat behaves if we are drawing random samples from a population for which Ho is true.

Notice the center of the sampling distribution is at p 0 , which is the hypothesized proportion given in the null hypothesis (Ho: p = p 0 .) We could also mark the axis in standard error units,

\(\sqrt{\dfrac{p_{0}\left(1-p_{0}\right)}{n}}\)

For example, if our null hypothesis claims that the proportion of U.S. adults supporting the death penalty is 0.64, then the sampling distribution is drawn as if the null is true. We draw a normal distribution centered at 0.64 (p 0 ) with a standard error dependent on sample size,

\(\sqrt{\dfrac{0.64(1-0.64)}{n}}\).

Important Comment:

  • Note that under the assumption that Ho is true (and if the conditions for the sampling distribution to be normal are satisfied) the test statistic follows a N(0,1) (standard normal) distribution. Another way to say the same thing which is quite common is: “The null distribution of the test statistic is N(0,1).”

By “null distribution,” we mean the distribution under the assumption that Ho is true. As we’ll see and stress again later, the null distribution of the test statistic is what the calculation of the p-value is based on.

Let’s go back to our remaining two examples and find the test statistic in each case:

Since the null hypothesis is Ho: p = 0.157, the standardized (z) score of p-hat = 0.19 is

\(z=\dfrac{0.19-0.157}{\sqrt{\dfrac{0.157(1-0.157)}{100}}} \approx 0.91\)

This is the value of the test statistic for this example.

We interpret this to mean that, assuming that Ho is true, the sample proportion p-hat = 0.19 is 0.91 standard errors above the null value (0.157).

Since the null hypothesis is Ho: p = 0.64, the standardized (z) score of p-hat = 0.675 is

\(z=\dfrac{0.675-0.64}{\sqrt{\dfrac{0.64(1-0.64)}{1000}}} \approx 2.31\)

We interpret this to mean that, assuming that Ho is true, the sample proportion p-hat = 0.675 is 2.31 standard errors above the null value (0.64).

Learn by Doing: Proportions (Step 2)

Comments about the Test Statistic:

  • We mentioned earlier that to some degree, the test statistic captures the essence of the test. In this case, the test statistic measures the difference between p-hat and p 0 in standard errors. This is exactly what this test is about. Get data, and look at the discrepancy between what the data estimates p to be (represented by p-hat) and what Ho claims about p (represented by p 0 ).
  • You can think about this test statistic as a measure of evidence in the data against Ho. The larger the test statistic, the “further the data are from Ho” and therefore the more evidence the data provide against Ho.

Learn by Doing: Proportions (Step 2) Understanding the Test Statistic

Did I Get This?: Proportions (Step 2)

  • It should now be clear why this test is commonly known as the z-test for the population proportion . The name comes from the fact that it is based on a test statistic that is a z-score.
  • Recall fact 1 that we used for constructing the z-test statistic. Here is part of it again:

When we take a random sample of size n from a population with population proportion p 0 , the possible values of the sample proportion p-hat ( when certain conditions are met ) have approximately a normal distribution with a mean of p 0 … and a standard deviation of

stderror

This result provides the theoretical justification for constructing the test statistic the way we did, and therefore the assumptions under which this result holds (in bold, above) are the conditions that our data need to satisfy so that we can use this test. These two conditions are:

i. The sample has to be random.

ii. The conditions under which the sampling distribution of p-hat is normal are met. In other words:

sampsizprop

  • Here we will pause to say more about condition (i.) above, the need for a random sample. In the Probability Unit we discussed sampling plans based on probability (such as a simple random sample, cluster, or stratified sampling) that produce a non-biased sample, which can be safely used in order to make inferences about a population. We noted in the Probability Unit that, in practice, other (non-random) sampling techniques are sometimes used when random sampling is not feasible. It is important though, when these techniques are used, to be aware of the type of bias that they introduce, and thus the limitations of the conclusions that can be drawn from them. For our purpose here, we will focus on one such practice, the situation in which a sample is not really chosen randomly, but in the context of the categorical variable that is being studied, the sample is regarded as random. For example, say that you are interested in the proportion of students at a certain college who suffer from seasonal allergies. For that purpose, the students in a large engineering class could be considered as a random sample, since there is nothing about being in an engineering class that makes you more or less likely to suffer from seasonal allergies. Technically, the engineering class is a convenience sample, but it is treated as a random sample in the context of this categorical variable. On the other hand, if you are interested in the proportion of students in the college who have math anxiety, then the class of engineering students clearly could not possibly be viewed as a random sample, since engineering students probably have a much lower incidence of math anxiety than the college population overall.

Learn by Doing: Proportions (Step 2) Valid or Invalid Sampling?

Let’s check the conditions in our three examples.

i. The 400 products were chosen at random.

ii. n = 400, p 0 = 0.2 and therefore:

\(n p_{0}=400(0.2)=80 \geq 10\)

\(n\left(1-p_{0}\right)=400(1-0.2)=320 \geq 10\)

i. The 100 students were chosen at random.

ii. n = 100, p 0 = 0.157 and therefore:

\begin{gathered} n p_{0}=100(0.157)=15.7 \geq 10 \\ n\left(1-p_{0}\right)=100(1-0.157)=84.3 \geq 10 \end{gathered}

i. The 1000 adults were chosen at random.

ii. n = 1000, p 0 = 0.64 and therefore:

\begin{gathered} n p_{0}=1000(0.64)=640 \geq 10 \\ n\left(1-p_{0}\right)=1000(1-0.64)=360 \geq 10 \end{gathered}

Learn by Doing: Proportions (Step 2) Verify Conditions

Checking that our data satisfy the conditions under which the test can be reliably used is a very important part of the hypothesis testing process. Be sure to consider this for every hypothesis test you conduct in this course and certainly in practice.

The Four Steps in Hypothesis Testing

With respect to the z-test, the population proportion that we are currently discussing we have:

Step 1: Completed

Step 2: Completed

Step 3: This is what we will work on next.

Proportions (Step 3)

Video: Proportions (Step 3) (14:46)

Calculators and Tables

Step 3. Finding the P-value of the Test

So far we’ve talked about the p-value at the intuitive level: understanding what it is (or what it measures) and how we use it to draw conclusions about the statistical significance of our results. We will now go more deeply into how the p-value is calculated.

It should be mentioned that eventually we will rely on technology to calculate the p-value for us (as well as the test statistic), but in order to make intelligent use of the output, it is important to first understand the details, and only then let the computer do the calculations for us. Again, our goal is to use this simple example to give you the tools you need to understand the process entirely. Let’s start.

Recall that so far we have said that the p-value is the probability of obtaining data like those observed assuming that Ho is true. Like the test statistic, the p-value is, therefore, a measure of the evidence against Ho. In the case of the test statistic, the larger it is in magnitude (positive or negative), the further p-hat is from p 0 , the more evidence we have against Ho. In the case of the p-value , it is the opposite; the smaller it is, the more unlikely it is to get data like those observed when Ho is true, the more evidence it is against Ho . One can actually draw conclusions in hypothesis testing just using the test statistic, and as we’ll see the p-value is, in a sense, just another way of looking at the test statistic. The reason that we actually take the extra step in this course and derive the p-value from the test statistic is that even though in this case (the test about the population proportion) and some other tests, the value of the test statistic has a very clear and intuitive interpretation, there are some tests where its value is not as easy to interpret. On the other hand, the p-value keeps its intuitive appeal across all statistical tests.

How is the p-value calculated?

Intuitively, the p-value is the probability of observing data like those observed assuming that Ho is true. Let’s be a bit more formal:

  • Since this is a probability question about the data , it makes sense that the calculation will involve the data summary, the test statistic.
  • What do we mean by “like” those observed? By “like” we mean “as extreme or even more extreme.”

Putting it all together, we get that in general:

The p-value is the probability of observing a test statistic as extreme as that observed (or even more extreme) assuming that the null hypothesis is true.

By “extreme” we mean extreme in the direction(s) of the alternative hypothesis.

Specifically , for the z-test for the population proportion:

  • If the alternative hypothesis is Ha: p < p 0 (less than) , then “extreme” means small or less than , and the p-value is: The probability of observing a test statistic as small as that observed or smaller if the null hypothesis is true.
  • If the alternative hypothesis is Ha: p > p 0 (greater than) , then “extreme” means large or greater than , and the p-value is: The probability of observing a test statistic as large as that observed or larger if the null hypothesis is true.
  • If the alternative is Ha: p ≠ p 0 (different from) , then “extreme” means extreme in either direction either small or large (i.e., large in magnitude) or just different from , and the p-value therefore is: The probability of observing a test statistic as large in magnitude as that observed or larger if the null hypothesis is true.(Examples: If z = -2.5: p-value = probability of observing a test statistic as small as -2.5 or smaller or as large as 2.5 or larger. If z = 1.5: p-value = probability of observing a test statistic as large as 1.5 or larger, or as small as -1.5 or smaller.)

OK, hopefully that makes (some) sense. But how do we actually calculate it?

Recall the important comment from our discussion about our test statistic,

ztestprop

which said that when the null hypothesis is true (i.e., when p = p 0 ), the possible values of our test statistic follow a standard normal (N(0,1), denoted by Z) distribution. Therefore, the p-value calculations (which assume that Ho is true) are simply standard normal distribution calculations for the 3 possible alternative hypotheses.

Alternative Hypothesis is “Less Than”

The probability of observing a test statistic as small as that observed or smaller , assuming that the values of the test statistic follow a standard normal distribution. We will now represent this probability in symbols and also using the normal distribution.

Looking at the shaded region, you can see why this is often referred to as a left-tailed test. We shaded to the left of the test statistic, since less than is to the left.

Alternative Hypothesis is “Greater Than”

The probability of observing a test statistic as large as that observed or larger , assuming that the values of the test statistic follow a standard normal distribution. Again, we will represent this probability in symbols and using the normal distribution

Looking at the shaded region, you can see why this is often referred to as a right-tailed test. We shaded to the right of the test statistic, since greater than is to the right.

Alternative Hypothesis is “Not Equal To”

The probability of observing a test statistic which is as large in magnitude as that observed or larger, assuming that the values of the test statistic follow a standard normal distribution.

This is often referred to as a two-tailed test, since we shaded in both directions.

Next, we will apply this to our three examples. But first, work through the following activities, which should help your understanding.

Learn by Doing: Proportions (Step 3)

Did I Get This?: Proportions (Step 3)

The p-value in this case is:

  • The probability of observing a test statistic as small as -2 or smaller, assuming that Ho is true.

OR (recalling what the test statistic actually means in this case),

  • The probability of observing a sample proportion that is 2 standard deviations or more below the null value (p 0 = 0.20), assuming that p 0 is the true population proportion.

OR, more specifically,

  • The probability of observing a sample proportion of 0.16 or lower in a random sample of size 400, when the true population proportion is p 0 =0.20

In either case, the p-value is found as shown in the following figure:

To find P(Z ≤ -2) we can either use the calculator or table we learned to use in the probability unit for normal random variables. Eventually, after we understand the details, we will use software to run the test for us and the output will give us all the information we need. The p-value that the statistical software provides for this specific example is 0.023. The p-value tells us that it is pretty unlikely (probability of 0.023) to get data like those observed (test statistic of -2 or less) assuming that Ho is true.

  • The probability of observing a test statistic as large as 0.91 or larger, assuming that Ho is true.
  • The probability of observing a sample proportion that is 0.91 standard deviations or more above the null value (p 0 = 0.157), assuming that p 0 is the true population proportion.
  • The probability of observing a sample proportion of 0.19 or higher in a random sample of size 100, when the true population proportion is p 0 =0.157

Again, at this point we can either use the calculator or table to find that the p-value is 0.182, this is P(Z ≥ 0.91).

The p-value tells us that it is not very surprising (probability of 0.182) to get data like those observed (which yield a test statistic of 0.91 or higher) assuming that the null hypothesis is true.

  • The probability of observing a test statistic as large as 2.31 (or larger) or as small as -2.31 (or smaller), assuming that Ho is true.
  • The probability of observing a sample proportion that is 2.31 standard deviations or more away from the null value (p 0 = 0.64), assuming that p 0 is the true population proportion.
  • The probability of observing a sample proportion as different as 0.675 is from 0.64, or even more different (i.e. as high as 0.675 or higher or as low as 0.605 or lower) in a random sample of size 1,000, when the true population proportion is p 0 = 0.64

Again, at this point we can either use the calculator or table to find that the p-value is 0.021, this is P(Z ≤ -2.31) + P(Z ≥ 2.31) = 2*P(Z ≥ |2.31|)

The p-value tells us that it is pretty unlikely (probability of 0.021) to get data like those observed (test statistic as high as 2.31 or higher or as low as -2.31 or lower) assuming that Ho is true.

  • We’ve just seen that finding p-values involves probability calculations about the value of the test statistic assuming that Ho is true. In this case, when Ho is true, the values of the test statistic follow a standard normal distribution (i.e., the sampling distribution of the test statistic when the null hypothesis is true is N(0,1)). Therefore, p-values correspond to areas (probabilities) under the standard normal curve.

Similarly, in any test , p-values are found using the sampling distribution of the test statistic when the null hypothesis is true (also known as the “null distribution” of the test statistic). In this case, it was relatively easy to argue that the null distribution of our test statistic is N(0,1). As we’ll see, in other tests, other distributions come up (like the t-distribution and the F-distribution), which we will just mention briefly, and rely heavily on the output of our statistical package for obtaining the p-values.

We’ve just completed our discussion about the p-value, and how it is calculated both in general and more specifically for the z-test for the population proportion. Let’s go back to the four-step process of hypothesis testing and see what we’ve covered and what still needs to be discussed.

With respect to the z-test the population proportion:

Step 3: Completed

Step 4. This is what we will work on next.

Learn by Doing: Proportions (Step 3) Understanding P-values

Proportions (Step 4 & Summary)

Video: Proportions (Step 4 & Summary) (4:30)

Step 4. Drawing Conclusions Based on the P-Value

This last part of the four-step process of hypothesis testing is the same across all statistical tests, and actually, we’ve already said basically everything there is to say about it, but it can’t hurt to say it again.

The p-value is a measure of how much evidence the data present against Ho. The smaller the p-value, the more evidence the data present against Ho.

We already mentioned that what determines what constitutes enough evidence against Ho is the significance level (α, alpha), a cutoff point below which the p-value is considered small enough to reject Ho in favor of Ha. The most commonly used significance level is 0.05.

  • Conclusion: There IS enough evidence that Ha is True
  • Conclusion: There IS NOT enough evidence that Ha is True

Where instead of Ha is True , we write what this means in the words of the problem, in other words, in the context of the current scenario.

It is important to mention again that this step has essentially two sub-steps:

(i) Based on the p-value, determine whether or not the results are statistically significant (i.e., the data present enough evidence to reject Ho).

(ii) State your conclusions in the context of the problem.

Note: We always still must consider whether the results have any practical significance, particularly if they are statistically significant as a statistically significant result which has not practical use is essentially meaningless!

Let’s go back to our three examples and draw conclusions.

We found that the p-value for this test was 0.023.

Since 0.023 is small (in particular, 0.023 < 0.05), the data provide enough evidence to reject Ho.

Conclusion:

  • There IS enough evidence that the proportion of defective products is less than 20% after the repair .

The following figure is the complete story of this example, and includes all the steps we went through, starting from stating the hypotheses and ending with our conclusions:

We found that the p-value for this test was 0.182.

Since .182 is not small (in particular, 0.182 > 0.05), the data do not provide enough evidence to reject Ho.

  • There IS NOT enough evidence that the proportion of students at the college who use marijuana is higher than the national figure.

Here is the complete story of this example:

Learn by Doing: Learn by Doing – Proportions (Step 4)

We found that the p-value for this test was 0.021.

Since 0.021 is small (in particular, 0.021 < 0.05), the data provide enough evidence to reject Ho

  • There IS enough evidence that the proportion of adults who support the death penalty for convicted murderers has changed since 2003.

Did I Get This?: Proportions (Step 4)

Many Students Wonder: Hypothesis Testing for the Population Proportion

Many students wonder why 5% is often selected as the significance level in hypothesis testing, and why 1% is the next most typical level. This is largely due to just convenience and tradition.

When Ronald Fisher (one of the founders of modern statistics) published one of his tables, he used a mathematically convenient scale that included 5% and 1%. Later, these same 5% and 1% levels were used by other people, in part just because Fisher was so highly esteemed. But mostly these are arbitrary levels.

The idea of selecting some sort of relatively small cutoff was historically important in the development of statistics; but it’s important to remember that there is really a continuous range of increasing confidence towards the alternative hypothesis, not a single all-or-nothing value. There isn’t much meaningful difference, for instance, between a p-value of .049 or .051, and it would be foolish to declare one case definitely a “real” effect and to declare the other case definitely a “random” effect. In either case, the study results were roughly 5% likely by chance if there’s no actual effect.

Whether such a p-value is sufficient for us to reject a particular null hypothesis ultimately depends on the risk of making the wrong decision, and the extent to which the hypothesized effect might contradict our prior experience or previous studies.

Let’s Summarize!!

We have now completed going through the four steps of hypothesis testing, and in particular we learned how they are applied to the z-test for the population proportion. Here is a brief summary:

Step 1: State the hypotheses

State the null hypothesis:

State the alternative hypothesis:

where the choice of the appropriate alternative (out of the three) is usually quite clear from the context of the problem. If you feel it is not clear, it is most likely a two-sided problem. Students are usually good at recognizing the “more than” and “less than” terminology but differences can sometimes be more difficult to spot, sometimes this is because you have preconceived ideas of how you think it should be! Use only the information given in the problem.

Step 2: Obtain data, check conditions, and summarize data

Obtain data from a sample and:

(i) Check whether the data satisfy the conditions which allow you to use this test.

random sample (or at least a sample that can be considered random in context)

the conditions under which the sampling distribution of p-hat is normal are met

sampsizprop

(ii) Calculate the sample proportion p-hat, and summarize the data using the test statistic:

ztestprop

( Recall: This standardized test statistic represents how many standard deviations above or below p 0 our sample proportion p-hat is.)

Step 3: Find the p-value of the test by using the test statistic as follows

IMPORTANT FACT: In all future tests, we will rely on software to obtain the p-value.

When the alternative hypothesis is “less than” the probability of observing a test statistic as small as that observed or smaller , assuming that the values of the test statistic follow a standard normal distribution. We will now represent this probability in symbols and also using the normal distribution.

When the alternative hypothesis is “greater than” the probability of observing a test statistic as large as that observed or larger , assuming that the values of the test statistic follow a standard normal distribution. Again, we will represent this probability in symbols and using the normal distribution

When the alternative hypothesis is “not equal to” the probability of observing a test statistic which is as large in magnitude as that observed or larger, assuming that the values of the test statistic follow a standard normal distribution.

Step 4: Conclusion

Reach a conclusion first regarding the statistical significance of the results, and then determine what it means in the context of the problem.

If p-value ≤ 0.05 then WE REJECT Ho Conclusion: There IS enough evidence that Ha is True

If p-value > 0.05 then WE FAIL TO REJECT Ho Conclusion: There IS NOT enough evidence that Ha is True

Recall that: If the p-value is small (in particular, smaller than the significance level, which is usually 0.05), the results are statistically significant (in the sense that there is a statistically significant difference between what was observed in the sample and what was claimed in Ho), and so we reject Ho.

If the p-value is not small, we do not have enough statistical evidence to reject Ho, and so we continue to believe that Ho may be true. ( Remember: In hypothesis testing we never “accept” Ho ).

Finally, in practice, we should always consider the practical significance of the results as well as the statistical significance.

Learn by Doing: Z-Test for a Population Proportion

What’s next?

Before we move on to the next test, we are going to use the z-test for proportions to bring up and illustrate a few more very important issues regarding hypothesis testing. This might also be a good time to review the concepts of Type I error, Type II error, and Power before continuing on.

More about Hypothesis Testing

CO-1: Describe the roles biostatistics serves in the discipline of public health.

LO 1.11: Recognize the distinction between statistical significance and practical significance.

LO 6.30: Use a confidence interval to determine the correct conclusion to the associated two-sided hypothesis test.

Video: More about Hypothesis Testing (18:25)

The issues regarding hypothesis testing that we will discuss are:

  • The effect of sample size on hypothesis testing.
  • Statistical significance vs. practical importance.
  • Hypothesis testing and confidence intervals—how are they related?

Let’s begin.

1. The Effect of Sample Size on Hypothesis Testing

We have already seen the effect that the sample size has on inference, when we discussed point and interval estimation for the population mean (μ, mu) and population proportion (p). Intuitively …

Larger sample sizes give us more information to pin down the true nature of the population. We can therefore expect the sample mean and sample proportion obtained from a larger sample to be closer to the population mean and proportion, respectively. As a result, for the same level of confidence, we can report a smaller margin of error, and get a narrower confidence interval. What we’ve seen, then, is that larger sample size gives a boost to how much we trust our sample results.

In hypothesis testing, larger sample sizes have a similar effect. We have also discussed that the power of our test increases when the sample size increases, all else remaining the same. This means, we have a better chance to detect the difference between the true value and the null value for larger samples.

The following two examples will illustrate that a larger sample size provides more convincing evidence (the test has greater power), and how the evidence manifests itself in hypothesis testing. Let’s go back to our example 2 (marijuana use at a certain liberal arts college).

We do not have enough evidence to conclude that the proportion of students at the college who use marijuana is higher than the national figure.

Now, let’s increase the sample size.

There are rumors that students in a certain liberal arts college are more inclined to use drugs than U.S. college students in general. Suppose that in a simple random sample of 400 students from the college, 76 admitted to marijuana use . Do the data provide enough evidence to conclude that the proportion of marijuana users among the students in the college (p) is higher than the national proportion, which is 0.157? (Reported by the Harvard School of Public Health).

Our results here are statistically significant . In other words, in example 2* the data provide enough evidence to reject Ho.

  • Conclusion: There is enough evidence that the proportion of marijuana users at the college is higher than among all U.S. students.

What do we learn from this?

We see that sample results that are based on a larger sample carry more weight (have greater power).

In example 2, we saw that a sample proportion of 0.19 based on a sample of size of 100 was not enough evidence that the proportion of marijuana users in the college is higher than 0.157. Recall, from our general overview of hypothesis testing, that this conclusion (not having enough evidence to reject the null hypothesis) doesn’t mean the null hypothesis is necessarily true (so, we never “accept” the null); it only means that the particular study didn’t yield sufficient evidence to reject the null. It might be that the sample size was simply too small to detect a statistically significant difference.

However, in example 2*, we saw that when the sample proportion of 0.19 is obtained from a sample of size 400, it carries much more weight, and in particular, provides enough evidence that the proportion of marijuana users in the college is higher than 0.157 (the national figure). In this case, the sample size of 400 was large enough to detect a statistically significant difference.

The following activity will allow you to practice the ideas and terminology used in hypothesis testing when a result is not statistically significant.

Learn by Doing: Interpreting Non-significant Results

2. Statistical significance vs. practical importance.

Now, we will address the issue of statistical significance versus practical importance (which also involves issues of sample size).

The following activity will let you explore the effect of the sample size on the statistical significance of the results yourself, and more importantly will discuss issue 2: Statistical significance vs. practical importance.

Important Fact: In general, with a sufficiently large sample size you can make any result that has very little practical importance statistically significant! A large sample size alone does NOT make a “good” study!!

This suggests that when interpreting the results of a test, you should always think not only about the statistical significance of the results but also about their practical importance.

Learn by Doing: Statistical vs. Practical Significance

3. Hypothesis Testing and Confidence Intervals

The last topic we want to discuss is the relationship between hypothesis testing and confidence intervals. Even though the flavor of these two forms of inference is different (confidence intervals estimate a parameter, and hypothesis testing assesses the evidence in the data against one claim and in favor of another), there is a strong link between them.

We will explain this link (using the z-test and confidence interval for the population proportion), and then explain how confidence intervals can be used after a test has been carried out.

Recall that a confidence interval gives us a set of plausible values for the unknown population parameter. We may therefore examine a confidence interval to informally decide if a proposed value of population proportion seems plausible.

For example, if a 95% confidence interval for p, the proportion of all U.S. adults already familiar with Viagra in May 1998, was (0.61, 0.67), then it seems clear that we should be able to reject a claim that only 50% of all U.S. adults were familiar with the drug, since based on the confidence interval, 0.50 is not one of the plausible values for p.

In fact, the information provided by a confidence interval can be formally related to the information provided by a hypothesis test. ( Comment: The relationship is more straightforward for two-sided alternatives, and so we will not present results for the one-sided cases.)

Suppose we want to carry out the two-sided test:

  • Ha: p ≠ p 0

using a significance level of 0.05.

An alternative way to perform this test is to find a 95% confidence interval for p and check:

  • If p 0 falls outside the confidence interval, reject Ho.
  • If p 0 falls inside the confidence interval, do not reject Ho.

In other words,

  • If p 0 is not one of the plausible values for p, we reject Ho.
  • If p 0 is a plausible value for p, we cannot reject Ho.

( Comment: Similarly, the results of a test using a significance level of 0.01 can be related to the 99% confidence interval.)

Let’s look at an example:

Recall example 3, where we wanted to know whether the proportion of U.S. adults who support the death penalty for convicted murderers has changed since 2003, when it was 0.64.

We are testing:

and as the figure reminds us, we took a sample of 1,000 U.S. adults, and the data told us that 675 supported the death penalty for convicted murderers (p-hat = 0.675).

A 95% confidence interval for p, the proportion of all U.S. adults who support the death penalty, is:

\(0.675 \pm 1.96 \sqrt{\dfrac{0.675(1-0.675)}{1000}} \approx 0.675 \pm 0.029=(0.646,0.704)\)

Since the 95% confidence interval for p does not include 0.64 as a plausible value for p, we can reject Ho and conclude (as we did before) that there is enough evidence that the proportion of U.S. adults who support the death penalty for convicted murderers has changed since 2003.

You and your roommate are arguing about whose turn it is to clean the apartment. Your roommate suggests that you settle this by tossing a coin and takes one out of a locked box he has on the shelf. Suspecting that the coin might not be fair, you decide to test it first. You toss the coin 80 times, thinking to yourself that if, indeed, the coin is fair, you should get around 40 heads. Instead you get 48 heads. You are puzzled. You are not sure whether getting 48 heads out of 80 is enough evidence to conclude that the coin is unbalanced, or whether this a result that could have happened just by chance when the coin is fair.

Statistics can help you answer this question.

Let p be the true proportion (probability) of heads. We want to test whether the coin is fair or not.

  • Ho: p = 0.5 (the coin is fair).
  • Ha: p ≠ 0.5 (the coin is not fair).

The data we have are that out of n = 80 tosses, we got 48 heads, or that the sample proportion of heads is p-hat = 48/80 = 0.6.

A 95% confidence interval for p, the true proportion of heads for this coin, is:

\(0.6 \pm 1.96 \sqrt{\dfrac{0.6(1-0.6)}{80}} \approx 0.6 \pm 0.11=(0.49,0.71)\)

Since in this case 0.5 is one of the plausible values for p, we cannot reject Ho. In other words, the data do not provide enough evidence to conclude that the coin is not fair.

The context of the last example is a good opportunity to bring up an important point that was discussed earlier.

Even though we use 0.05 as a cutoff to guide our decision about whether the results are statistically significant, we should not treat it as inviolable and we should always add our own judgment. Let’s look at the last example again.

It turns out that the p-value of this test is 0.0734. In other words, it is maybe not extremely unlikely, but it is quite unlikely (probability of 0.0734) that when you toss a fair coin 80 times you’ll get a sample proportion of heads of 48/80 = 0.6 (or even more extreme). It is true that using the 0.05 significance level (cutoff), 0.0734 is not considered small enough to conclude that the coin is not fair. However, if you really don’t want to clean the apartment, the p-value might be small enough for you to ask your roommate to use a different coin, or to provide one yourself!

Did I Get This?: Connection between Confidence Intervals and Hypothesis Tests

Did I Get This?: Hypothesis Tests for Proportions (Extra Practice)

Here is our final point on this subject:

When the data provide enough evidence to reject Ho, we can conclude (depending on the alternative hypothesis) that the population proportion is either less than, greater than, or not equal to the null value p 0 . However, we do not get a more informative statement about its actual value. It might be of interest, then, to follow the test with a 95% confidence interval that will give us more insight into the actual value of p.

In our example 3,

we concluded that the proportion of U.S. adults who support the death penalty for convicted murderers has changed since 2003, when it was 0.64. It is probably of interest not only to know that the proportion has changed, but also to estimate what it has changed to. We’ve calculated the 95% confidence interval for p on the previous page and found that it is (0.646, 0.704).

We can combine our conclusions from the test and the confidence interval and say:

Data provide evidence that the proportion of U.S. adults who support the death penalty for convicted murderers has changed since 2003, and we are 95% confident that it is now between 0.646 and 0.704. (i.e. between 64.6% and 70.4%).

Let’s look at our example 1 to see how a confidence interval following a test might be insightful in a different way.

Here is a summary of example 1:

We conclude that as a result of the repair, the proportion of defective products has been reduced to below 0.20 (which was the proportion prior to the repair). It is probably of great interest to the company not only to know that the proportion of defective has been reduced, but also estimate what it has been reduced to, to get a better sense of how effective the repair was. A 95% confidence interval for p in this case is:

\(0.16 \pm 1.96 \sqrt{\dfrac{0.16(1-0.16)}{400}} \approx 0.16 \pm 0.036=(0.124,0.196)\)

We can therefore say that the data provide evidence that the proportion of defective products has been reduced, and we are 95% confident that it has been reduced to somewhere between 12.4% and 19.6%. This is very useful information, since it tells us that even though the results were significant (i.e., the repair reduced the number of defective products), the repair might not have been effective enough, if it managed to reduce the number of defective products only to the range provided by the confidence interval. This, of course, ties back in to the idea of statistical significance vs. practical importance that we discussed earlier. Even though the results are statistically significant (Ho was rejected), practically speaking, the repair might still be considered ineffective.

Learn by Doing: Hypothesis Tests and Confidence Intervals

Even though this portion of the current section is about the z-test for population proportion, it is loaded with very important ideas that apply to hypothesis testing in general. We’ve already summarized the details that are specific to the z-test for proportions, so the purpose of this summary is to highlight the general ideas.

The process of hypothesis testing has four steps :

I. Stating the null and alternative hypotheses (Ho and Ha).

II. Obtaining a random sample (or at least one that can be considered random) and collecting data. Using the data:

Check that the conditions under which the test can be reliably used are met.

Summarize the data using a test statistic.

  • The test statistic is a measure of the evidence in the data against Ho. The larger the test statistic is in magnitude, the more evidence the data present against Ho.

III. Finding the p-value of the test. The p-value is the probability of getting data like those observed (or even more extreme) assuming that the null hypothesis is true, and is calculated using the null distribution of the test statistic. The p-value is a measure of the evidence against Ho. The smaller the p-value, the more evidence the data present against Ho.

IV. Making conclusions.

Conclusions about the statistical significance of the results:

If the p-value is small, the data present enough evidence to reject Ho (and accept Ha).

If the p-value is not small, the data do not provide enough evidence to reject Ho.

To help guide our decision, we use the significance level as a cutoff for what is considered a small p-value. The significance cutoff is usually set at 0.05.

Conclusions should then be provided in the context of the problem.

Additional Important Ideas about Hypothesis Testing

  • Results that are based on a larger sample carry more weight, and therefore as the sample size increases, results become more statistically significant.
  • Even a very small and practically unimportant effect becomes statistically significant with a large enough sample size. The distinction between statistical significance and practical importance should therefore always be considered.
  • Confidence intervals can be used in order to carry out two-sided tests (95% confidence for the 0.05 significance level). If the null value is not included in the confidence interval (i.e., is not one of the plausible values for the parameter), we have enough evidence to reject Ho. Otherwise, we cannot reject Ho.
  • If the results are statistically significant, it might be of interest to follow up the tests with a confidence interval in order to get insight into the actual value of the parameter of interest.
  • It is important to be aware that there are two types of errors in hypothesis testing ( Type I and Type II ) and that the power of a statistical test is an important measure of how likely we are to be able to detect a difference of interest to us in a particular problem.

Means (All Steps)

NOTE: Beginning on this page, the Learn By Doing and Did I Get This activities are presented as interactive PDF files. The interactivity may not work on mobile devices or with certain PDF viewers. Use an official ADOBE product such as ADOBE READER .

If you have any issues with the Learn By Doing or Did I Get This interactive PDF files, you can view all of the questions and answers presented on this page in this document:

  • QUESTION/Answer (SPOILER ALERT!)

Tests About μ (mu) When σ (sigma) is Unknown – The t-test for a Population Mean

The t-distribution.

Video: Means (All Steps) (13:11)

So far we have talked about the logic behind hypothesis testing and then illustrated how this process proceeds in practice, using the z-test for the population proportion (p).

We are now moving on to discuss testing for the population mean (μ, mu), which is the parameter of interest when the variable of interest is quantitative.

A few comments about the structure of this section:

  • The basic groundwork for carrying out hypothesis tests has already been laid in our general discussion and in our presentation of tests about proportions.

Therefore we can easily modify the four steps to carry out tests about means instead, without going into all of the details again.

We will use this approach for all future tests so be sure to go back to the discussion in general and for proportions to review the concepts in more detail.

  • In our discussion about confidence intervals for the population mean, we made the distinction between whether the population standard deviation, σ (sigma) was known or if we needed to estimate this value using the sample standard deviation, s .

In this section, we will only discuss the second case as in most realistic settings we do not know the population standard deviation .

In this case we need to use the t- distribution instead of the standard normal distribution for the probability aspects of confidence intervals (choosing table values) and hypothesis tests (finding p-values).

  • Although we will discuss some theoretical or conceptual details for some of the analyses we will learn, from this point on we will rely on software to conduct tests and calculate confidence intervals for us , while we focus on understanding which methods are used for which situations and what the results say in context.

If you are interested in more information about the z-test, where we assume the population standard deviation σ (sigma) is known, you can review the Carnegie Mellon Open Learning Statistics Course (you will need to click “ENTER COURSE”).

Like any other tests, the t- test for the population mean follows the four-step process:

  • STEP 1: Stating the hypotheses H o and H a .
  • STEP 2: Collecting relevant data, checking that the data satisfy the conditions which allow us to use this test, and summarizing the data using a test statistic.
  • STEP 3: Finding the p-value of the test, the probability of obtaining data as extreme as those collected (or even more extreme, in the direction of the alternative hypothesis), assuming that the null hypothesis is true. In other words, how likely is it that the only reason for getting data like those observed is sampling variability (and not because H o is not true)?
  • STEP 4: Drawing conclusions, assessing the statistical significance of the results based on the p-value, and stating our conclusions in context. (Do we or don’t we have evidence to reject H o and accept H a ?)
  • Note: In practice, we should also always consider the practical significance of the results as well as the statistical significance.

We will now go through the four steps specifically for the t- test for the population mean and apply them to our two examples.

Only in a few cases is it reasonable to assume that the population standard deviation, σ (sigma), is known and so we will not cover hypothesis tests in this case. We discussed both cases for confidence intervals so that we could still calculate some confidence intervals by hand.

For this and all future tests we will rely on software to obtain our summary statistics, test statistics, and p-values for us.

The case where σ (sigma) is unknown is much more common in practice. What can we use to replace σ (sigma)? If you don’t know the population standard deviation, the best you can do is find the sample standard deviation, s, and use it instead of σ (sigma). (Note that this is exactly what we did when we discussed confidence intervals).

Is that it? Can we just use s instead of σ (sigma), and the rest is the same as the previous case? Unfortunately, it’s not that simple, but not very complicated either.

Here, when we use the sample standard deviation, s, as our estimate of σ (sigma) we can no longer use a normal distribution to find the cutoff for confidence intervals or the p-values for hypothesis tests.

Instead we must use the t- distribution (with n-1 degrees of freedom) to obtain the p-value for this test.

We discussed this issue for confidence intervals. We will talk more about the t- distribution after we discuss the details of this test for those who are interested in learning more.

It isn’t really necessary for us to understand this distribution but it is important that we use the correct distributions in practice via our software.

We will wait until UNIT 4B to look at how to accomplish this test in the software. For now focus on understanding the process and drawing the correct conclusions from the p-values given.

Now let’s go through the four steps in conducting the t- test for the population mean.

The null and alternative hypotheses for the t- test for the population mean (μ, mu) have exactly the same structure as the hypotheses for z-test for the population proportion (p):

The null hypothesis has the form:

  • Ho: μ = μ 0 (mu = mu_zero)

(where μ 0 (mu_zero) is often called the null value)

  • Ha: μ < μ 0 (mu < mu_zero) (one-sided)
  • Ha: μ > μ 0 (mu > mu_zero) (one-sided)
  • Ha: μ ≠ μ 0 (mu ≠ mu_zero) (two-sided)

where the choice of the appropriate alternative (out of the three) is usually quite clear from the context of the problem.

If you feel it is not clear, it is most likely a two-sided problem. Students are usually good at recognizing the “more than” and “less than” terminology but differences can sometimes be more difficult to spot, sometimes this is because you have preconceived ideas of how you think it should be! You also cannot use the information from the sample to help you determine the hypothesis. We would not know our data when we originally asked the question.

Now try it yourself. Here are a few exercises on stating the hypotheses for tests for a population mean.

Learn by Doing: State the Hypotheses for a test for a population mean

Here are a few more activities for practice.

Did I Get This?: State the Hypotheses for a test for a population mean

When setting up hypotheses, be sure to use only the information in the research question. We cannot use our sample data to help us set up our hypotheses.

For this test, it is still important to correctly choose the alternative hypothesis as “less than”, “greater than”, or “different” although generally in practice two-sample tests are used.

Obtain data from a sample:

  • In this step we would obtain data from a sample. This is not something we do much of in courses but it is done very often in practice!

Check the conditions:

  • Then we check the conditions under which this test (the t- test for one population mean) can be safely carried out – which are:
  • The sample is random (or at least can be considered random in context).
  • We are in one of the three situations marked with a green check mark in the following table (which ensure that x-bar is at least approximately normal and the test statistic using the sample standard deviation, s, is therefore a t- distribution with n-1 degrees of freedom – proving this is beyond the scope of this course):
  • For large samples, we don’t need to check for normality in the population . We can rely on the sample size as the basis for the validity of using this test.
  • For small samples , we need to have data from a normal population in order for the p-values and confidence intervals to be valid.

In practice, for small samples, it can be very difficult to determine if the population is normal. Here is a simulation to give you a better understanding of the difficulties.

Video: Simulations – Are Samples from a Normal Population? (4:58)

Now try it yourself with a few activities.

Learn by Doing: Checking Conditions for Hypothesis Testing for the Population Mean

  • It is always a good idea to look at the data and get a sense of their pattern regardless of whether you actually need to do it in order to assess whether the conditions are met.
  • This idea of looking at the data is relevant to all tests in general. In the next module—inference for relationships—conducting exploratory data analysis before inference will be an integral part of the process.

Here are a few more problems for extra practice.

Did I Get This?: Checking Conditions for Hypothesis Testing for the Population Mean

When setting up hypotheses, be sure to use only the information in the res

Calculate Test Statistic

Assuming that the conditions are met, we calculate the sample mean x-bar and the sample standard deviation, s (which estimates σ (sigma)), and summarize the data with a test statistic.

The test statistic for the t -test for the population mean is:

\(t=\dfrac{\bar{x} - \mu_0}{s/ \sqrt{n}}\)

Recall that such a standardized test statistic represents how many standard deviations above or below μ 0 (mu_zero) our sample mean x-bar is.

Therefore our test statistic is a measure of how different our data are from what is claimed in the null hypothesis. This is an idea that we mentioned in the previous test as well.

Again we will rely on the p-value to determine how unusual our data would be if the null hypothesis is true.

As we mentioned, the test statistic in the t -test for a population mean does not follow a standard normal distribution. Rather, it follows another bell-shaped distribution called the t- distribution.

We will present the details of this distribution at the end for those interested but for now we will work on the process of the test.

Here are a few important facts.

  • In statistical language we say that the null distribution of our test statistic is the t- distribution with (n-1) degrees of freedom. In other words, when Ho is true (i.e., when μ = μ 0 (mu = mu_zero)), our test statistic has a t- distribution with (n-1) d.f., and this is the distribution under which we find p-values.
  • For a large sample size (n), the null distribution of the test statistic is approximately Z, so whether we use t (n – 1) or Z to calculate the p-values does not make a big difference. However, software will use the t -distribution regardless of the sample size and so will we.

Although we will not calculate p-values by hand for this test, we can still easily calculate the test statistic.

Try it yourself:

Learn by Doing: Calculate the Test Statistic for a Test for a Population Mean

From this point in this course and certainly in practice we will allow the software to calculate our test statistics and we will use the p-values provided to draw our conclusions.

We will use software to obtain the p-value for this (and all future) tests but here are the images illustrating how the p-value is calculated in each of the three cases corresponding to the three choices for our alternative hypothesis.

Note that due to the symmetry of the t distribution, for a given value of the test statistic t, the p-value for the two-sided test is twice as large as the p-value of either of the one-sided tests. The same thing happens when p-values are calculated under the t distribution as when they are calculated under the Z distribution.

We will show some examples of p-values obtained from software in our examples. For now let’s continue our summary of the steps.

As usual, based on the p-value (and some significance level of choice) we assess the statistical significance of results, and draw our conclusions in context.

To review what we have said before:

If p-value ≤ 0.05 then WE REJECT Ho

If p-value > 0.05 then WE FAIL TO REJECT Ho

This step has essentially two sub-steps:

We are now ready to look at two examples.

A certain prescription medicine is supposed to contain an average of 250 parts per million (ppm) of a certain chemical. If the concentration is higher than this, the drug may cause harmful side effects; if it is lower, the drug may be ineffective.

The manufacturer runs a check to see if the mean concentration in a large shipment conforms to the target level of 250 ppm or not.

A simple random sample of 100 portions is tested, and the sample mean concentration is found to be 247 ppm with a sample standard deviation of 12 ppm.

Here is a figure that represents this example:

A large circle represents the population, which is the shipment. μ represents the concentration of the chemical. The question we want to answer is "is the mean concentration the required 250ppm or not? (Assume: SD = 12)." Selected from the population is a sample of size n=100, represented by a smaller circle. x-bar for this sample is 247.

1. The hypotheses being tested are:

  • Ha: μ ≠ μ 0 (mu ≠ mu_zero)
  • Where μ = population mean part per million of the chemical in the entire shipment

2. The conditions that allow us to use the t-test are met since:

  • The sample is random
  • The sample size is large enough for the Central Limit Theorem to apply and ensure the normality of x-bar. We do not need normality of the population in order to be able to conduct this test for the population mean. We are in the 2 nd column in the table below.
  • The test statistic is:

\(t=\dfrac{\bar{x}-\mu_{0}}{s / \sqrt{n}}=\dfrac{247-250}{12 / \sqrt{100}}=-2.5\)

  • The data (represented by the sample mean) are 2.5 standard errors below the null value.

3. Finding the p-value.

  • To find the p-value we use statistical software, and we calculate a p-value of 0.014.

4. Conclusions:

  • The p-value is small (.014) indicating that at the 5% significance level, the results are significant.
  • We reject the null hypothesis.
  • There is enough evidence to conclude that the mean concentration in entire shipment is not the required 250 ppm.
  • It is difficult to comment on the practical significance of this result without more understanding of the practical considerations of this problem.

Here is a summary:

  • The 95% confidence interval for μ (mu) can be used here in the same way as for proportions to conduct the two-sided test (checking whether the null value falls inside or outside the confidence interval) or following a t- test where Ho was rejected to get insight into the value of μ (mu).
  • We find the 95% confidence interval to be (244.619, 249.381) . Since 250 is not in the interval we know we would reject our null hypothesis that μ (mu) = 250. The confidence interval gives additional information. By accounting for estimation error, it estimates that the population mean is likely to be between 244.62 and 249.38. This is lower than the target concentration and that information might help determine the seriousness and appropriate course of action in this situation.

In most situations in practice we use TWO-SIDED HYPOTHESIS TESTS, followed by confidence intervals to gain more insight.

For completeness in covering one sample t-tests for a population mean, we still cover all three possible alternative hypotheses here HOWEVER, this will be the last test for which we will do so.

A research study measured the pulse rates of 57 college men and found a mean pulse rate of 70 beats per minute with a standard deviation of 9.85 beats per minute.

Researchers want to know if the mean pulse rate for all college men is different from the current standard of 72 beats per minute.

  • The hypotheses being tested are:
  • Ho: μ = 72
  • Ha: μ ≠ 72
  • Where μ = population mean heart rate among college men
  • The conditions that allow us to use the t- test are met since:
  • The sample is random.
  • The sample size is large (n = 57) so we do not need normality of the population in order to be able to conduct this test for the population mean. We are in the 2 nd column in the table below.

\(t=\dfrac{\bar{x}-\mu}{s / \sqrt{n}}=\dfrac{70-72}{9.85 / \sqrt{57}}=-1.53\)

  • The data (represented by the sample mean) are 1.53 estimated standard errors below the null value.
  • Recall that in general the p-value is calculated under the null distribution of the test statistic, which, in the t- test case, is t (n-1). In our case, in which n = 57, the p-value is calculated under the t (56) distribution. Using statistical software, we find that the p-value is 0.132 .
  • Here is how we calculated the p-value. http://homepage.stat.uiowa.edu/~mbognar/applets/t.html .

A t(56) curve, for which the horizontal axis has been labeled with t-scores of -2.5 and 2.5 . The area under the curve and to the left of -1.53 and to the right of 1.53 is the p-value.

4. Making conclusions.

  • The p-value (0.132) is not small, indicating that the results are not significant.
  • We fail to reject the null hypothesis.
  • There is not enough evidence to conclude that the mean pulse rate for all college men is different from the current standard of 72 beats per minute.
  • The results from this sample do not appear to have any practical significance either with a mean pulse rate of 70, this is very similar to the hypothesized value, relative to the variation expected in pulse rates.

Now try a few yourself.

Learn by Doing: Hypothesis Testing for the Population Mean

From this point in this course and certainly in practice we will allow the software to calculate our test statistic and p-value and we will use the p-values provided to draw our conclusions.

That concludes our discussion of hypothesis tests in Unit 4A.

In the next unit we will continue to use both confidence intervals and hypothesis test to investigate the relationship between two variables in the cases we covered in Unit 1 on exploratory data analysis – we will look at Case CQ, Case CC, and Case QQ.

Before moving on, we will discuss the details about the t- distribution as a general object.

We have seen that variables can be visually modeled by many different sorts of shapes, and we call these shapes distributions. Several distributions arise so frequently that they have been given special names, and they have been studied mathematically.

So far in the course, the only one we’ve named, for continuous quantitative variables, is the normal distribution, but there are others. One of them is called the t- distribution.

The t- distribution is another bell-shaped (unimodal and symmetric) distribution, like the normal distribution; and the center of the t- distribution is standardized at zero, like the center of the standard normal distribution.

Like all distributions that are used as probability models, the normal and the t- distribution are both scaled, so the total area under each of them is 1.

So how is the t-distribution fundamentally different from the normal distribution?

  • The spread .

The following picture illustrates the fundamental difference between the normal distribution and the t-distribution:

Here we have an image which illustrates the fundamental difference between the normal distribution and the t- distribution:

You can see in the picture that the t- distribution has slightly less area near the expected central value than the normal distribution does, and you can see that the t distribution has correspondingly more area in the “tails” than the normal distribution does. (It’s often said that the t- distribution has “fatter tails” or “heavier tails” than the normal distribution.)

This reflects the fact that the t- distribution has a larger spread than the normal distribution. The same total area of 1 is spread out over a slightly wider range on the t- distribution, making it a bit lower near the center compared to the normal distribution, and giving the t- distribution slightly more probability in the ‘tails’ compared to the normal distribution.

Therefore, the t- distribution ends up being the appropriate model in certain cases where there is more variability than would be predicted by the normal distribution. One of these cases is stock values, which have more variability (or “volatility,” to use the economic term) than would be predicted by the normal distribution.

There’s actually an entire family of t- distributions. They all have similar formulas (but the math is beyond the scope of this introductory course in statistics), and they all have slightly “fatter tails” than the normal distribution. But some are closer to normal than others.

The t- distributions that have higher “degrees of freedom” are closer to normal (degrees of freedom is a mathematical concept that we won’t study in this course, beyond merely mentioning it here). So, there’s a t- distribution “with one degree of freedom,” another t- distribution “with 2 degrees of freedom” which is slightly closer to normal, another t- distribution “with 3 degrees of freedom” which is a bit closer to normal than the previous ones, and so on.

The following picture illustrates this idea with just a couple of t- distributions (note that “degrees of freedom” is abbreviated “d.f.” on the picture):

The test statistic for our t-test for one population mean is a t -score which follows a t- distribution with (n – 1) degrees of freedom. Recall that each t- distribution is indexed according to “degrees of freedom.” Notice that, in the context of a test for a mean, the degrees of freedom depend on the sample size in the study.

Remember that we said that higher degrees of freedom indicate that the t- distribution is closer to normal. So in the context of a test for the mean, the larger the sample size , the higher the degrees of freedom, and the closer the t- distribution is to a normal z distribution .

As a result, in the context of a test for a mean, the effect of the t- distribution is most important for a study with a relatively small sample size .

We are now done introducing the t-distribution. What are implications of all of this?

  • The null distribution of our t-test statistic is the t-distribution with (n-1) d.f. In other words, when Ho is true (i.e., when μ = μ 0 (mu = mu_zero)), our test statistic has a t-distribution with (n-1) d.f., and this is the distribution under which we find p-values.
  • For a large sample size (n), the null distribution of the test statistic is approximately Z, so whether we use t(n – 1) or Z to calculate the p-values does not make a big difference.

Library homepage

  • school Campus Bookshelves
  • menu_book Bookshelves
  • perm_media Learning Objects
  • login Login
  • how_to_reg Request Instructor Account
  • hub Instructor Commons
  • Download Page (PDF)
  • Download Full Book (PDF)
  • Periodic Table
  • Physics Constants
  • Scientific Calculator
  • Reference & Cite
  • Tools expand_more
  • Readability

selected template will load here

This action is not available.

Biology LibreTexts

4.14: Experiments and Hypotheses

  • Last updated
  • Save as PDF
  • Page ID 43806

Now we’ll focus on the methods of scientific inquiry. Science often involves making observations and developing hypotheses. Experiments and further observations are often used to test the hypotheses.

A scientific experiment is a carefully organized procedure in which the scientist intervenes in a system to change something, then observes the result of the change. Scientific inquiry often involves doing experiments, though not always. For example, a scientist studying the mating behaviors of ladybugs might begin with detailed observations of ladybugs mating in their natural habitats. While this research may not be experimental, it is scientific: it involves careful and verifiable observation of the natural world. The same scientist might then treat some of the ladybugs with a hormone hypothesized to trigger mating and observe whether these ladybugs mated sooner or more often than untreated ones. This would qualify as an experiment because the scientist is now making a change in the system and observing the effects.

Forming a Hypothesis

When conducting scientific experiments, researchers develop hypotheses to guide experimental design. A hypothesis is a suggested explanation that is both testable and falsifiable. You must be able to test your hypothesis, and it must be possible to prove your hypothesis true or false.

For example, Michael observes that maple trees lose their leaves in the fall. He might then propose a possible explanation for this observation: “cold weather causes maple trees to lose their leaves in the fall.” This statement is testable. He could grow maple trees in a warm enclosed environment such as a greenhouse and see if their leaves still dropped in the fall. The hypothesis is also falsifiable. If the leaves still dropped in the warm environment, then clearly temperature was not the main factor in causing maple leaves to drop in autumn.

In the Try It below, you can practice recognizing scientific hypotheses. As you consider each statement, try to think as a scientist would: can I test this hypothesis with observations or experiments? Is the statement falsifiable? If the answer to either of these questions is “no,” the statement is not a valid scientific hypothesis.

Practice Questions

Determine whether each following statement is a scientific hypothesis.

  • No. This statement is not testable or falsifiable.
  • No. This statement is not testable.
  • No. This statement is not falsifiable.
  • Yes. This statement is testable and falsifiable.

[reveal-answer q=”429550″] Show Answers [/reveal-answer] [hidden-answer a=”429550″]

  • d: Yes. This statement is testable and falsifiable. This could be tested with a number of different kinds of observations and experiments, and it is possible to gather evidence that indicates that air pollution is not linked with asthma.
  • a: No. This statement is not testable or falsifiable. “Bad thoughts and behaviors” are excessively vague and subjective variables that would be impossible to measure or agree upon in a reliable way. The statement might be “falsifiable” if you came up with a counterexample: a “wicked” place that was not punished by a natural disaster. But some would question whether the people in that place were really wicked, and others would continue to predict that a natural disaster was bound to strike that place at some point. There is no reason to suspect that people’s immoral behavior affects the weather unless you bring up the intervention of a supernatural being, making this idea even harder to test.

[/hidden-answer]

Testing a Vaccine

Let’s examine the scientific process by discussing an actual scientific experiment conducted by researchers at the University of Washington. These researchers investigated whether a vaccine may reduce the incidence of the human papillomavirus (HPV). The experimental process and results were published in an article titled, “ A controlled trial of a human papillomavirus type 16 vaccine .”

Preliminary observations made by the researchers who conducted the HPV experiment are listed below:

  • Human papillomavirus (HPV) is the most common sexually transmitted virus in the United States.
  • There are about 40 different types of HPV. A significant number of people that have HPV are unaware of it because many of these viruses cause no symptoms.
  • Some types of HPV can cause cervical cancer.
  • About 4,000 women a year die of cervical cancer in the United States.

Practice Question

Researchers have developed a potential vaccine against HPV and want to test it. What is the first testable hypothesis that the researchers should study?

  • HPV causes cervical cancer.
  • People should not have unprotected sex with many partners.
  • People who get the vaccine will not get HPV.
  • The HPV vaccine will protect people against cancer.

[reveal-answer q=”20917″] Show Answer [/reveal-answer] [hidden-answer a=”20917″]Hypothesis A is not the best choice because this information is already known from previous studies. Hypothesis B is not testable because scientific hypotheses are not value statements; they do not include judgments like “should,” “better than,” etc. Scientific evidence certainly might support this value judgment, but a hypothesis would take a different form: “Having unprotected sex with many partners increases a person’s risk for cervical cancer.” Before the researchers can test if the vaccine protects against cancer (hypothesis D), they want to test if it protects against the virus. This statement will make an excellent hypothesis for the next study. The researchers should first test hypothesis C—whether or not the new vaccine can prevent HPV.[/hidden-answer]

Experimental Design

You’ve successfully identified a hypothesis for the University of Washington’s study on HPV: People who get the HPV vaccine will not get HPV.

The next step is to design an experiment that will test this hypothesis. There are several important factors to consider when designing a scientific experiment. First, scientific experiments must have an experimental group. This is the group that receives the experimental treatment necessary to address the hypothesis.

The experimental group receives the vaccine, but how can we know if the vaccine made a difference? Many things may change HPV infection rates in a group of people over time. To clearly show that the vaccine was effective in helping the experimental group, we need to include in our study an otherwise similar control group that does not get the treatment. We can then compare the two groups and determine if the vaccine made a difference. The control group shows us what happens in the absence of the factor under study.

However, the control group cannot get “nothing.” Instead, the control group often receives a placebo. A placebo is a procedure that has no expected therapeutic effect—such as giving a person a sugar pill or a shot containing only plain saline solution with no drug. Scientific studies have shown that the “placebo effect” can alter experimental results because when individuals are told that they are or are not being treated, this knowledge can alter their actions or their emotions, which can then alter the results of the experiment.

Moreover, if the doctor knows which group a patient is in, this can also influence the results of the experiment. Without saying so directly, the doctor may show—through body language or other subtle cues—his or her views about whether the patient is likely to get well. These errors can then alter the patient’s experience and change the results of the experiment. Therefore, many clinical studies are “double blind.” In these studies, neither the doctor nor the patient knows which group the patient is in until all experimental results have been collected.

Both placebo treatments and double-blind procedures are designed to prevent bias. Bias is any systematic error that makes a particular experimental outcome more or less likely. Errors can happen in any experiment: people make mistakes in measurement, instruments fail, computer glitches can alter data. But most such errors are random and don’t favor one outcome over another. Patients’ belief in a treatment can make it more likely to appear to “work.” Placebos and double-blind procedures are used to level the playing field so that both groups of study subjects are treated equally and share similar beliefs about their treatment.

The scientists who are researching the effectiveness of the HPV vaccine will test their hypothesis by separating 2,392 young women into two groups: the control group and the experimental group. Answer the following questions about these two groups.

  • This group is given a placebo.
  • This group is deliberately infected with HPV.
  • This group is given nothing.
  • This group is given the HPV vaccine.

[reveal-answer q=”918962″] Show Answers [/reveal-answer] [hidden-answer a=”918962″]

  • a: This group is given a placebo. A placebo will be a shot, just like the HPV vaccine, but it will have no active ingredient. It may change peoples’ thinking or behavior to have such a shot given to them, but it will not stimulate the immune systems of the subjects in the same way as predicted for the vaccine itself.
  • d: This group is given the HPV vaccine. The experimental group will receive the HPV vaccine and researchers will then be able to see if it works, when compared to the control group.

Experimental Variables

A variable is a characteristic of a subject (in this case, of a person in the study) that can vary over time or among individuals. Sometimes a variable takes the form of a category, such as male or female; often a variable can be measured precisely, such as body height. Ideally, only one variable is different between the control group and the experimental group in a scientific experiment. Otherwise, the researchers will not be able to determine which variable caused any differences seen in the results. For example, imagine that the people in the control group were, on average, much more sexually active than the people in the experimental group. If, at the end of the experiment, the control group had a higher rate of HPV infection, could you confidently determine why? Maybe the experimental subjects were protected by the vaccine, but maybe they were protected by their low level of sexual contact.

To avoid this situation, experimenters make sure that their subject groups are as similar as possible in all variables except for the variable that is being tested in the experiment. This variable, or factor, will be deliberately changed in the experimental group. The one variable that is different between the two groups is called the independent variable. An independent variable is known or hypothesized to cause some outcome. Imagine an educational researcher investigating the effectiveness of a new teaching strategy in a classroom. The experimental group receives the new teaching strategy, while the control group receives the traditional strategy. It is the teaching strategy that is the independent variable in this scenario. In an experiment, the independent variable is the variable that the scientist deliberately changes or imposes on the subjects.

Dependent variables are known or hypothesized consequences; they are the effects that result from changes or differences in an independent variable. In an experiment, the dependent variables are those that the scientist measures before, during, and particularly at the end of the experiment to see if they have changed as expected. The dependent variable must be stated so that it is clear how it will be observed or measured. Rather than comparing “learning” among students (which is a vague and difficult to measure concept), an educational researcher might choose to compare test scores, which are very specific and easy to measure.

In any real-world example, many, many variables MIGHT affect the outcome of an experiment, yet only one or a few independent variables can be tested. Other variables must be kept as similar as possible between the study groups and are called control variables . For our educational research example, if the control group consisted only of people between the ages of 18 and 20 and the experimental group contained people between the ages of 30 and 35, we would not know if it was the teaching strategy or the students’ ages that played a larger role in the results. To avoid this problem, a good study will be set up so that each group contains students with a similar age profile. In a well-designed educational research study, student age will be a controlled variable, along with other possibly important factors like gender, past educational achievement, and pre-existing knowledge of the subject area.

What is the independent variable in this experiment?

  • Sex (all of the subjects will be female)
  • Presence or absence of the HPV vaccine
  • Presence or absence of HPV (the virus)

[reveal-answer q=”68680″]Show Answer[/reveal-answer] [hidden-answer a=”68680″]Answer b. Presence or absence of the HPV vaccine. This is the variable that is different between the control and the experimental groups. All the subjects in this study are female, so this variable is the same in all groups. In a well-designed study, the two groups will be of similar age. The presence or absence of the virus is what the researchers will measure at the end of the experiment. Ideally the two groups will both be HPV-free at the start of the experiment.

List three control variables other than age.

[practice-area rows=”3″][/practice-area] [reveal-answer q=”903121″]Show Answer[/reveal-answer] [hidden-answer a=”903121″]Some possible control variables would be: general health of the women, sexual activity, lifestyle, diet, socioeconomic status, etc.

What is the dependent variable in this experiment?

  • Sex (male or female)
  • Rates of HPV infection
  • Age (years)

[reveal-answer q=”907103″]Show Answer[/reveal-answer] [hidden-answer a=”907103″]Answer b. Rates of HPV infection. The researchers will measure how many individuals got infected with HPV after a given period of time.[/hidden-answer]

Contributors and Attributions

  • Revision and adaptation. Authored by : Shelli Carter and Lumen Learning. Provided by : Lumen Learning. License : CC BY-NC-SA: Attribution-NonCommercial-ShareAlike
  • Scientific Inquiry. Provided by : Open Learning Initiative. Located at : https://oli.cmu.edu/jcourse/workbook/activity/page?context=434a5c2680020ca6017c03488572e0f8 . Project : Introduction to Biology (Open + Free). License : CC BY-NC-SA: Attribution-NonCommercial-ShareAlike

explain how your hypothesis is testable

How to Write a Hypothesis: A Step-by-Step Guide

explain how your hypothesis is testable

Introduction

An overview of the research hypothesis, different types of hypotheses, variables in a hypothesis, how to formulate an effective research hypothesis, designing a study around your hypothesis.

The scientific method can derive and test predictions as hypotheses. Empirical research can then provide support (or lack thereof) for the hypotheses. Even failure to find support for a hypothesis still represents a valuable contribution to scientific knowledge. Let's look more closely at the idea of the hypothesis and the role it plays in research.

explain how your hypothesis is testable

As much as the term exists in everyday language, there is a detailed development that informs the word "hypothesis" when applied to research. A good research hypothesis is informed by prior research and guides research design and data analysis , so it is important to understand how a hypothesis is defined and understood by researchers.

What is the simple definition of a hypothesis?

A hypothesis is a testable prediction about an outcome between two or more variables . It functions as a navigational tool in the research process, directing what you aim to predict and how.

What is the hypothesis for in research?

In research, a hypothesis serves as the cornerstone for your empirical study. It not only lays out what you aim to investigate but also provides a structured approach for your data collection and analysis.

Essentially, it bridges the gap between the theoretical and the empirical, guiding your investigation throughout its course.

explain how your hypothesis is testable

What is an example of a hypothesis?

If you are studying the relationship between physical exercise and mental health, a suitable hypothesis could be: "Regular physical exercise leads to improved mental well-being among adults."

This statement constitutes a specific and testable hypothesis that directly relates to the variables you are investigating.

What makes a good hypothesis?

A good hypothesis possesses several key characteristics. Firstly, it must be testable, allowing you to analyze data through empirical means, such as observation or experimentation, to assess if there is significant support for the hypothesis. Secondly, a hypothesis should be specific and unambiguous, giving a clear understanding of the expected relationship between variables. Lastly, it should be grounded in existing research or theoretical frameworks , ensuring its relevance and applicability.

Understanding the types of hypotheses can greatly enhance how you construct and work with hypotheses. While all hypotheses serve the essential function of guiding your study, there are varying purposes among the types of hypotheses. In addition, all hypotheses stand in contrast to the null hypothesis, or the assumption that there is no significant relationship between the variables .

Here, we explore various kinds of hypotheses to provide you with the tools needed to craft effective hypotheses for your specific research needs. Bear in mind that many of these hypothesis types may overlap with one another, and the specific type that is typically used will likely depend on the area of research and methodology you are following.

Null hypothesis

The null hypothesis is a statement that there is no effect or relationship between the variables being studied. In statistical terms, it serves as the default assumption that any observed differences are due to random chance.

For example, if you're studying the effect of a drug on blood pressure, the null hypothesis might state that the drug has no effect.

Alternative hypothesis

Contrary to the null hypothesis, the alternative hypothesis suggests that there is a significant relationship or effect between variables.

Using the drug example, the alternative hypothesis would posit that the drug does indeed affect blood pressure. This is what researchers aim to prove.

explain how your hypothesis is testable

Simple hypothesis

A simple hypothesis makes a prediction about the relationship between two variables, and only two variables.

For example, "Increased study time results in better exam scores." Here, "study time" and "exam scores" are the only variables involved.

Complex hypothesis

A complex hypothesis, as the name suggests, involves more than two variables. For instance, "Increased study time and access to resources result in better exam scores." Here, "study time," "access to resources," and "exam scores" are all variables.

This hypothesis refers to multiple potential mediating variables. Other hypotheses could also include predictions about variables that moderate the relationship between the independent variable and dependent variable .

Directional hypothesis

A directional hypothesis specifies the direction of the expected relationship between variables. For example, "Eating more fruits and vegetables leads to a decrease in heart disease."

Here, the direction of heart disease is explicitly predicted to decrease, due to effects from eating more fruits and vegetables. All hypotheses typically specify the expected direction of the relationship between the independent and dependent variable, such that researchers can test if this prediction holds in their data analysis .

explain how your hypothesis is testable

Statistical hypothesis

A statistical hypothesis is one that is testable through statistical methods, providing a numerical value that can be analyzed. This is commonly seen in quantitative research .

For example, "There is a statistically significant difference in test scores between students who study for one hour and those who study for two."

Empirical hypothesis

An empirical hypothesis is derived from observations and is tested through empirical methods, often through experimentation or survey data . Empirical hypotheses may also be assessed with statistical analyses.

For example, "Regular exercise is correlated with a lower incidence of depression," could be tested through surveys that measure exercise frequency and depression levels.

Causal hypothesis

A causal hypothesis proposes that one variable causes a change in another. This type of hypothesis is often tested through controlled experiments.

For example, "Smoking causes lung cancer," assumes a direct causal relationship.

Associative hypothesis

Unlike causal hypotheses, associative hypotheses suggest a relationship between variables but do not imply causation.

For instance, "People who smoke are more likely to get lung cancer," notes an association but doesn't claim that smoking causes lung cancer directly.

Relational hypothesis

A relational hypothesis explores the relationship between two or more variables but doesn't specify the nature of the relationship.

For example, "There is a relationship between diet and heart health," leaves the nature of the relationship (causal, associative, etc.) open to interpretation.

Logical hypothesis

A logical hypothesis is based on sound reasoning and logical principles. It's often used in theoretical research to explore abstract concepts, rather than being based on empirical data.

For example, "If all men are mortal and Socrates is a man, then Socrates is mortal," employs logical reasoning to make its point.

explain how your hypothesis is testable

Let ATLAS.ti take you from research question to key insights

Get started with a free trial and see how ATLAS.ti can make the most of your data.

In any research hypothesis, variables play a critical role. These are the elements or factors that the researcher manipulates, controls, or measures. Understanding variables is essential for crafting a clear, testable hypothesis and for the stages of research that follow, such as data collection and analysis.

In the realm of hypotheses, there are generally two types of variables to consider: independent and dependent. Independent variables are what you, as the researcher, manipulate or change in your study. It's considered the cause in the relationship you're investigating. For instance, in a study examining the impact of sleep duration on academic performance, the independent variable would be the amount of sleep participants get.

Conversely, the dependent variable is the outcome you measure to gauge the effect of your manipulation. It's the effect in the cause-and-effect relationship. The dependent variable thus refers to the main outcome of interest in your study. In the same sleep study example, the academic performance, perhaps measured by exam scores or GPA, would be the dependent variable.

Beyond these two primary types, you might also encounter control variables. These are variables that could potentially influence the outcome and are therefore kept constant to isolate the relationship between the independent and dependent variables . For example, in the sleep and academic performance study, control variables could include age, diet, or even the subject of study.

By clearly identifying and understanding the roles of these variables in your hypothesis, you set the stage for a methodologically sound research project. It helps you develop focused research questions, design appropriate experiments or observations, and carry out meaningful data analysis . It's a step that lays the groundwork for the success of your entire study.

explain how your hypothesis is testable

Crafting a strong, testable hypothesis is crucial for the success of any research project. It sets the stage for everything from your study design to data collection and analysis . Below are some key considerations to keep in mind when formulating your hypothesis:

  • Be specific : A vague hypothesis can lead to ambiguous results and interpretations . Clearly define your variables and the expected relationship between them.
  • Ensure testability : A good hypothesis should be testable through empirical means, whether by observation , experimentation, or other forms of data analysis.
  • Ground in literature : Before creating your hypothesis, consult existing research and theories. This not only helps you identify gaps in current knowledge but also gives you valuable context and credibility for crafting your hypothesis.
  • Use simple language : While your hypothesis should be conceptually sound, it doesn't have to be complicated. Aim for clarity and simplicity in your wording.
  • State direction, if applicable : If your hypothesis involves a directional outcome (e.g., "increase" or "decrease"), make sure to specify this. You also need to think about how you will measure whether or not the outcome moved in the direction you predicted.
  • Keep it focused : One of the common pitfalls in hypothesis formulation is trying to answer too many questions at once. Keep your hypothesis focused on a specific issue or relationship.
  • Account for control variables : Identify any variables that could potentially impact the outcome and consider how you will control for them in your study.
  • Be ethical : Make sure your hypothesis and the methods for testing it comply with ethical standards , particularly if your research involves human or animal subjects.

explain how your hypothesis is testable

Designing your study involves multiple key phases that help ensure the rigor and validity of your research. Here we discuss these crucial components in more detail.

Literature review

Starting with a comprehensive literature review is essential. This step allows you to understand the existing body of knowledge related to your hypothesis and helps you identify gaps that your research could fill. Your research should aim to contribute some novel understanding to existing literature, and your hypotheses can reflect this. A literature review also provides valuable insights into how similar research projects were executed, thereby helping you fine-tune your own approach.

explain how your hypothesis is testable

Research methods

Choosing the right research methods is critical. Whether it's a survey, an experiment, or observational study, the methodology should be the most appropriate for testing your hypothesis. Your choice of methods will also depend on whether your research is quantitative, qualitative, or mixed-methods. Make sure the chosen methods align well with the variables you are studying and the type of data you need.

Preliminary research

Before diving into a full-scale study, it’s often beneficial to conduct preliminary research or a pilot study . This allows you to test your research methods on a smaller scale, refine your tools, and identify any potential issues. For instance, a pilot survey can help you determine if your questions are clear and if the survey effectively captures the data you need. This step can save you both time and resources in the long run.

Data analysis

Finally, planning your data analysis in advance is crucial for a successful study. Decide which statistical or analytical tools are most suited for your data type and research questions . For quantitative research, you might opt for t-tests, ANOVA, or regression analyses. For qualitative research , thematic analysis or grounded theory may be more appropriate. This phase is integral for interpreting your results and drawing meaningful conclusions in relation to your research question.

explain how your hypothesis is testable

What is a Hypothesis – Types, Examples and Writing Guide

Table of Contents

What is a Hypothesis

Definition:

Hypothesis is an educated guess or proposed explanation for a phenomenon, based on some initial observations or data. It is a tentative statement that can be tested and potentially proven or disproven through further investigation and experimentation.

Hypothesis is often used in scientific research to guide the design of experiments and the collection and analysis of data. It is an essential element of the scientific method, as it allows researchers to make predictions about the outcome of their experiments and to test those predictions to determine their accuracy.

Types of Hypothesis

Types of Hypothesis are as follows:

Research Hypothesis

A research hypothesis is a statement that predicts a relationship between variables. It is usually formulated as a specific statement that can be tested through research, and it is often used in scientific research to guide the design of experiments.

Null Hypothesis

The null hypothesis is a statement that assumes there is no significant difference or relationship between variables. It is often used as a starting point for testing the research hypothesis, and if the results of the study reject the null hypothesis, it suggests that there is a significant difference or relationship between variables.

Alternative Hypothesis

An alternative hypothesis is a statement that assumes there is a significant difference or relationship between variables. It is often used as an alternative to the null hypothesis and is tested against the null hypothesis to determine which statement is more accurate.

Directional Hypothesis

A directional hypothesis is a statement that predicts the direction of the relationship between variables. For example, a researcher might predict that increasing the amount of exercise will result in a decrease in body weight.

Non-directional Hypothesis

A non-directional hypothesis is a statement that predicts the relationship between variables but does not specify the direction. For example, a researcher might predict that there is a relationship between the amount of exercise and body weight, but they do not specify whether increasing or decreasing exercise will affect body weight.

Statistical Hypothesis

A statistical hypothesis is a statement that assumes a particular statistical model or distribution for the data. It is often used in statistical analysis to test the significance of a particular result.

Composite Hypothesis

A composite hypothesis is a statement that assumes more than one condition or outcome. It can be divided into several sub-hypotheses, each of which represents a different possible outcome.

Empirical Hypothesis

An empirical hypothesis is a statement that is based on observed phenomena or data. It is often used in scientific research to develop theories or models that explain the observed phenomena.

Simple Hypothesis

A simple hypothesis is a statement that assumes only one outcome or condition. It is often used in scientific research to test a single variable or factor.

Complex Hypothesis

A complex hypothesis is a statement that assumes multiple outcomes or conditions. It is often used in scientific research to test the effects of multiple variables or factors on a particular outcome.

Applications of Hypothesis

Hypotheses are used in various fields to guide research and make predictions about the outcomes of experiments or observations. Here are some examples of how hypotheses are applied in different fields:

  • Science : In scientific research, hypotheses are used to test the validity of theories and models that explain natural phenomena. For example, a hypothesis might be formulated to test the effects of a particular variable on a natural system, such as the effects of climate change on an ecosystem.
  • Medicine : In medical research, hypotheses are used to test the effectiveness of treatments and therapies for specific conditions. For example, a hypothesis might be formulated to test the effects of a new drug on a particular disease.
  • Psychology : In psychology, hypotheses are used to test theories and models of human behavior and cognition. For example, a hypothesis might be formulated to test the effects of a particular stimulus on the brain or behavior.
  • Sociology : In sociology, hypotheses are used to test theories and models of social phenomena, such as the effects of social structures or institutions on human behavior. For example, a hypothesis might be formulated to test the effects of income inequality on crime rates.
  • Business : In business research, hypotheses are used to test the validity of theories and models that explain business phenomena, such as consumer behavior or market trends. For example, a hypothesis might be formulated to test the effects of a new marketing campaign on consumer buying behavior.
  • Engineering : In engineering, hypotheses are used to test the effectiveness of new technologies or designs. For example, a hypothesis might be formulated to test the efficiency of a new solar panel design.

How to write a Hypothesis

Here are the steps to follow when writing a hypothesis:

Identify the Research Question

The first step is to identify the research question that you want to answer through your study. This question should be clear, specific, and focused. It should be something that can be investigated empirically and that has some relevance or significance in the field.

Conduct a Literature Review

Before writing your hypothesis, it’s essential to conduct a thorough literature review to understand what is already known about the topic. This will help you to identify the research gap and formulate a hypothesis that builds on existing knowledge.

Determine the Variables

The next step is to identify the variables involved in the research question. A variable is any characteristic or factor that can vary or change. There are two types of variables: independent and dependent. The independent variable is the one that is manipulated or changed by the researcher, while the dependent variable is the one that is measured or observed as a result of the independent variable.

Formulate the Hypothesis

Based on the research question and the variables involved, you can now formulate your hypothesis. A hypothesis should be a clear and concise statement that predicts the relationship between the variables. It should be testable through empirical research and based on existing theory or evidence.

Write the Null Hypothesis

The null hypothesis is the opposite of the alternative hypothesis, which is the hypothesis that you are testing. The null hypothesis states that there is no significant difference or relationship between the variables. It is important to write the null hypothesis because it allows you to compare your results with what would be expected by chance.

Refine the Hypothesis

After formulating the hypothesis, it’s important to refine it and make it more precise. This may involve clarifying the variables, specifying the direction of the relationship, or making the hypothesis more testable.

Examples of Hypothesis

Here are a few examples of hypotheses in different fields:

  • Psychology : “Increased exposure to violent video games leads to increased aggressive behavior in adolescents.”
  • Biology : “Higher levels of carbon dioxide in the atmosphere will lead to increased plant growth.”
  • Sociology : “Individuals who grow up in households with higher socioeconomic status will have higher levels of education and income as adults.”
  • Education : “Implementing a new teaching method will result in higher student achievement scores.”
  • Marketing : “Customers who receive a personalized email will be more likely to make a purchase than those who receive a generic email.”
  • Physics : “An increase in temperature will cause an increase in the volume of a gas, assuming all other variables remain constant.”
  • Medicine : “Consuming a diet high in saturated fats will increase the risk of developing heart disease.”

Purpose of Hypothesis

The purpose of a hypothesis is to provide a testable explanation for an observed phenomenon or a prediction of a future outcome based on existing knowledge or theories. A hypothesis is an essential part of the scientific method and helps to guide the research process by providing a clear focus for investigation. It enables scientists to design experiments or studies to gather evidence and data that can support or refute the proposed explanation or prediction.

The formulation of a hypothesis is based on existing knowledge, observations, and theories, and it should be specific, testable, and falsifiable. A specific hypothesis helps to define the research question, which is important in the research process as it guides the selection of an appropriate research design and methodology. Testability of the hypothesis means that it can be proven or disproven through empirical data collection and analysis. Falsifiability means that the hypothesis should be formulated in such a way that it can be proven wrong if it is incorrect.

In addition to guiding the research process, the testing of hypotheses can lead to new discoveries and advancements in scientific knowledge. When a hypothesis is supported by the data, it can be used to develop new theories or models to explain the observed phenomenon. When a hypothesis is not supported by the data, it can help to refine existing theories or prompt the development of new hypotheses to explain the phenomenon.

When to use Hypothesis

Here are some common situations in which hypotheses are used:

  • In scientific research , hypotheses are used to guide the design of experiments and to help researchers make predictions about the outcomes of those experiments.
  • In social science research , hypotheses are used to test theories about human behavior, social relationships, and other phenomena.
  • I n business , hypotheses can be used to guide decisions about marketing, product development, and other areas. For example, a hypothesis might be that a new product will sell well in a particular market, and this hypothesis can be tested through market research.

Characteristics of Hypothesis

Here are some common characteristics of a hypothesis:

  • Testable : A hypothesis must be able to be tested through observation or experimentation. This means that it must be possible to collect data that will either support or refute the hypothesis.
  • Falsifiable : A hypothesis must be able to be proven false if it is not supported by the data. If a hypothesis cannot be falsified, then it is not a scientific hypothesis.
  • Clear and concise : A hypothesis should be stated in a clear and concise manner so that it can be easily understood and tested.
  • Based on existing knowledge : A hypothesis should be based on existing knowledge and research in the field. It should not be based on personal beliefs or opinions.
  • Specific : A hypothesis should be specific in terms of the variables being tested and the predicted outcome. This will help to ensure that the research is focused and well-designed.
  • Tentative: A hypothesis is a tentative statement or assumption that requires further testing and evidence to be confirmed or refuted. It is not a final conclusion or assertion.
  • Relevant : A hypothesis should be relevant to the research question or problem being studied. It should address a gap in knowledge or provide a new perspective on the issue.

Advantages of Hypothesis

Hypotheses have several advantages in scientific research and experimentation:

  • Guides research: A hypothesis provides a clear and specific direction for research. It helps to focus the research question, select appropriate methods and variables, and interpret the results.
  • Predictive powe r: A hypothesis makes predictions about the outcome of research, which can be tested through experimentation. This allows researchers to evaluate the validity of the hypothesis and make new discoveries.
  • Facilitates communication: A hypothesis provides a common language and framework for scientists to communicate with one another about their research. This helps to facilitate the exchange of ideas and promotes collaboration.
  • Efficient use of resources: A hypothesis helps researchers to use their time, resources, and funding efficiently by directing them towards specific research questions and methods that are most likely to yield results.
  • Provides a basis for further research: A hypothesis that is supported by data provides a basis for further research and exploration. It can lead to new hypotheses, theories, and discoveries.
  • Increases objectivity: A hypothesis can help to increase objectivity in research by providing a clear and specific framework for testing and interpreting results. This can reduce bias and increase the reliability of research findings.

Limitations of Hypothesis

Some Limitations of the Hypothesis are as follows:

  • Limited to observable phenomena: Hypotheses are limited to observable phenomena and cannot account for unobservable or intangible factors. This means that some research questions may not be amenable to hypothesis testing.
  • May be inaccurate or incomplete: Hypotheses are based on existing knowledge and research, which may be incomplete or inaccurate. This can lead to flawed hypotheses and erroneous conclusions.
  • May be biased: Hypotheses may be biased by the researcher’s own beliefs, values, or assumptions. This can lead to selective interpretation of data and a lack of objectivity in research.
  • Cannot prove causation: A hypothesis can only show a correlation between variables, but it cannot prove causation. This requires further experimentation and analysis.
  • Limited to specific contexts: Hypotheses are limited to specific contexts and may not be generalizable to other situations or populations. This means that results may not be applicable in other contexts or may require further testing.
  • May be affected by chance : Hypotheses may be affected by chance or random variation, which can obscure or distort the true relationship between variables.

About the author

' src=

Muhammad Hassan

Researcher, Academic Writer, Web developer

You may also like

Data collection

Data Collection – Methods Types and Examples

Delimitations

Delimitations in Research – Types, Examples and...

Research Process

Research Process – Steps, Examples and Tips

Research Design

Research Design – Types, Methods and Examples

Institutional Review Board (IRB)

Institutional Review Board – Application Sample...

Evaluating Research

Evaluating Research – Process, Examples and...

Module 1: Introduction to Biology

Experiments and hypotheses, learning outcomes.

  • Form a hypothesis and use it to design a scientific experiment

Now we’ll focus on the methods of scientific inquiry. Science often involves making observations and developing hypotheses. Experiments and further observations are often used to test the hypotheses.

A scientific experiment is a carefully organized procedure in which the scientist intervenes in a system to change something, then observes the result of the change. Scientific inquiry often involves doing experiments, though not always. For example, a scientist studying the mating behaviors of ladybugs might begin with detailed observations of ladybugs mating in their natural habitats. While this research may not be experimental, it is scientific: it involves careful and verifiable observation of the natural world. The same scientist might then treat some of the ladybugs with a hormone hypothesized to trigger mating and observe whether these ladybugs mated sooner or more often than untreated ones. This would qualify as an experiment because the scientist is now making a change in the system and observing the effects.

Forming a Hypothesis

When conducting scientific experiments, researchers develop hypotheses to guide experimental design. A hypothesis is a suggested explanation that is both testable and falsifiable. You must be able to test your hypothesis, and it must be possible to prove your hypothesis true or false.

For example, Michael observes that maple trees lose their leaves in the fall. He might then propose a possible explanation for this observation: “cold weather causes maple trees to lose their leaves in the fall.” This statement is testable. He could grow maple trees in a warm enclosed environment such as a greenhouse and see if their leaves still dropped in the fall. The hypothesis is also falsifiable. If the leaves still dropped in the warm environment, then clearly temperature was not the main factor in causing maple leaves to drop in autumn.

In the Try It below, you can practice recognizing scientific hypotheses. As you consider each statement, try to think as a scientist would: can I test this hypothesis with observations or experiments? Is the statement falsifiable? If the answer to either of these questions is “no,” the statement is not a valid scientific hypothesis.

Practice Questions

Determine whether each following statement is a scientific hypothesis.

Air pollution from automobile exhaust can trigger symptoms in people with asthma.

  • No. This statement is not testable or falsifiable.
  • No. This statement is not testable.
  • No. This statement is not falsifiable.
  • Yes. This statement is testable and falsifiable.

Natural disasters, such as tornadoes, are punishments for bad thoughts and behaviors.

a: No. This statement is not testable or falsifiable. “Bad thoughts and behaviors” are excessively vague and subjective variables that would be impossible to measure or agree upon in a reliable way. The statement might be “falsifiable” if you came up with a counterexample: a “wicked” place that was not punished by a natural disaster. But some would question whether the people in that place were really wicked, and others would continue to predict that a natural disaster was bound to strike that place at some point. There is no reason to suspect that people’s immoral behavior affects the weather unless you bring up the intervention of a supernatural being, making this idea even harder to test.

Testing a Vaccine

Let’s examine the scientific process by discussing an actual scientific experiment conducted by researchers at the University of Washington. These researchers investigated whether a vaccine may reduce the incidence of the human papillomavirus (HPV). The experimental process and results were published in an article titled, “ A controlled trial of a human papillomavirus type 16 vaccine .”

Preliminary observations made by the researchers who conducted the HPV experiment are listed below:

  • Human papillomavirus (HPV) is the most common sexually transmitted virus in the United States.
  • There are about 40 different types of HPV. A significant number of people that have HPV are unaware of it because many of these viruses cause no symptoms.
  • Some types of HPV can cause cervical cancer.
  • About 4,000 women a year die of cervical cancer in the United States.

Practice Question

Researchers have developed a potential vaccine against HPV and want to test it. What is the first testable hypothesis that the researchers should study?

  • HPV causes cervical cancer.
  • People should not have unprotected sex with many partners.
  • People who get the vaccine will not get HPV.
  • The HPV vaccine will protect people against cancer.

Experimental Design

You’ve successfully identified a hypothesis for the University of Washington’s study on HPV: People who get the HPV vaccine will not get HPV.

The next step is to design an experiment that will test this hypothesis. There are several important factors to consider when designing a scientific experiment. First, scientific experiments must have an experimental group. This is the group that receives the experimental treatment necessary to address the hypothesis.

The experimental group receives the vaccine, but how can we know if the vaccine made a difference? Many things may change HPV infection rates in a group of people over time. To clearly show that the vaccine was effective in helping the experimental group, we need to include in our study an otherwise similar control group that does not get the treatment. We can then compare the two groups and determine if the vaccine made a difference. The control group shows us what happens in the absence of the factor under study.

However, the control group cannot get “nothing.” Instead, the control group often receives a placebo. A placebo is a procedure that has no expected therapeutic effect—such as giving a person a sugar pill or a shot containing only plain saline solution with no drug. Scientific studies have shown that the “placebo effect” can alter experimental results because when individuals are told that they are or are not being treated, this knowledge can alter their actions or their emotions, which can then alter the results of the experiment.

Moreover, if the doctor knows which group a patient is in, this can also influence the results of the experiment. Without saying so directly, the doctor may show—through body language or other subtle cues—their views about whether the patient is likely to get well. These errors can then alter the patient’s experience and change the results of the experiment. Therefore, many clinical studies are “double blind.” In these studies, neither the doctor nor the patient knows which group the patient is in until all experimental results have been collected.

Both placebo treatments and double-blind procedures are designed to prevent bias. Bias is any systematic error that makes a particular experimental outcome more or less likely. Errors can happen in any experiment: people make mistakes in measurement, instruments fail, computer glitches can alter data. But most such errors are random and don’t favor one outcome over another. Patients’ belief in a treatment can make it more likely to appear to “work.” Placebos and double-blind procedures are used to level the playing field so that both groups of study subjects are treated equally and share similar beliefs about their treatment.

The scientists who are researching the effectiveness of the HPV vaccine will test their hypothesis by separating 2,392 young women into two groups: the control group and the experimental group. Answer the following questions about these two groups.

  • This group is given a placebo.
  • This group is deliberately infected with HPV.
  • This group is given nothing.
  • This group is given the HPV vaccine.
  • a: This group is given a placebo. A placebo will be a shot, just like the HPV vaccine, but it will have no active ingredient. It may change peoples’ thinking or behavior to have such a shot given to them, but it will not stimulate the immune systems of the subjects in the same way as predicted for the vaccine itself.
  • d: This group is given the HPV vaccine. The experimental group will receive the HPV vaccine and researchers will then be able to see if it works, when compared to the control group.

Experimental Variables

A variable is a characteristic of a subject (in this case, of a person in the study) that can vary over time or among individuals. Sometimes a variable takes the form of a category, such as male or female; often a variable can be measured precisely, such as body height. Ideally, only one variable is different between the control group and the experimental group in a scientific experiment. Otherwise, the researchers will not be able to determine which variable caused any differences seen in the results. For example, imagine that the people in the control group were, on average, much more sexually active than the people in the experimental group. If, at the end of the experiment, the control group had a higher rate of HPV infection, could you confidently determine why? Maybe the experimental subjects were protected by the vaccine, but maybe they were protected by their low level of sexual contact.

To avoid this situation, experimenters make sure that their subject groups are as similar as possible in all variables except for the variable that is being tested in the experiment. This variable, or factor, will be deliberately changed in the experimental group. The one variable that is different between the two groups is called the independent variable. An independent variable is known or hypothesized to cause some outcome. Imagine an educational researcher investigating the effectiveness of a new teaching strategy in a classroom. The experimental group receives the new teaching strategy, while the control group receives the traditional strategy. It is the teaching strategy that is the independent variable in this scenario. In an experiment, the independent variable is the variable that the scientist deliberately changes or imposes on the subjects.

Dependent variables are known or hypothesized consequences; they are the effects that result from changes or differences in an independent variable. In an experiment, the dependent variables are those that the scientist measures before, during, and particularly at the end of the experiment to see if they have changed as expected. The dependent variable must be stated so that it is clear how it will be observed or measured. Rather than comparing “learning” among students (which is a vague and difficult to measure concept), an educational researcher might choose to compare test scores, which are very specific and easy to measure.

In any real-world example, many, many variables MIGHT affect the outcome of an experiment, yet only one or a few independent variables can be tested. Other variables must be kept as similar as possible between the study groups and are called control variables . For our educational research example, if the control group consisted only of people between the ages of 18 and 20 and the experimental group contained people between the ages of 30 and 35, we would not know if it was the teaching strategy or the students’ ages that played a larger role in the results. To avoid this problem, a good study will be set up so that each group contains students with a similar age profile. In a well-designed educational research study, student age will be a controlled variable, along with other possibly important factors like gender, past educational achievement, and pre-existing knowledge of the subject area.

What is the independent variable in this experiment?

  • Sex (all of the subjects will be female)
  • Presence or absence of the HPV vaccine
  • Presence or absence of HPV (the virus)

List three control variables other than age.

What is the dependent variable in this experiment?

  • Sex (male or female)
  • Rates of HPV infection
  • Age (years)

Contribute!

Improve this page Learn More

  • Revision and adaptation. Authored by : Shelli Carter and Lumen Learning. Provided by : Lumen Learning. License : CC BY-NC-SA: Attribution-NonCommercial-ShareAlike
  • Scientific Inquiry. Provided by : Open Learning Initiative. Located at : https://oli.cmu.edu/jcourse/workbook/activity/page?context=434a5c2680020ca6017c03488572e0f8 . Project : Introduction to Biology (Open + Free). License : CC BY-NC-SA: Attribution-NonCommercial-ShareAlike

Footer Logo Lumen Waymaker

PrepScholar

Choose Your Test

Sat / act prep online guides and tips, what is a hypothesis and how do i write one.

author image

General Education

body-glowing-question-mark

Think about something strange and unexplainable in your life. Maybe you get a headache right before it rains, or maybe you think your favorite sports team wins when you wear a certain color. If you wanted to see whether these are just coincidences or scientific fact, you would form a hypothesis, then create an experiment to see whether that hypothesis is true or not.

But what is a hypothesis, anyway? If you’re not sure about what a hypothesis is--or how to test for one!--you’re in the right place. This article will teach you everything you need to know about hypotheses, including: 

  • Defining the term “hypothesis” 
  • Providing hypothesis examples 
  • Giving you tips for how to write your own hypothesis

So let’s get started!

body-picture-ask-sign

What Is a Hypothesis?

Merriam Webster defines a hypothesis as “an assumption or concession made for the sake of argument.” In other words, a hypothesis is an educated guess . Scientists make a reasonable assumption--or a hypothesis--then design an experiment to test whether it’s true or not. Keep in mind that in science, a hypothesis should be testable. You have to be able to design an experiment that tests your hypothesis in order for it to be valid. 

As you could assume from that statement, it’s easy to make a bad hypothesis. But when you’re holding an experiment, it’s even more important that your guesses be good...after all, you’re spending time (and maybe money!) to figure out more about your observation. That’s why we refer to a hypothesis as an educated guess--good hypotheses are based on existing data and research to make them as sound as possible.

Hypotheses are one part of what’s called the scientific method .  Every (good) experiment or study is based in the scientific method. The scientific method gives order and structure to experiments and ensures that interference from scientists or outside influences does not skew the results. It’s important that you understand the concepts of the scientific method before holding your own experiment. Though it may vary among scientists, the scientific method is generally made up of six steps (in order):

  • Observation
  • Asking questions
  • Forming a hypothesis
  • Analyze the data
  • Communicate your results

You’ll notice that the hypothesis comes pretty early on when conducting an experiment. That’s because experiments work best when they’re trying to answer one specific question. And you can’t conduct an experiment until you know what you’re trying to prove!

Independent and Dependent Variables 

After doing your research, you’re ready for another important step in forming your hypothesis: identifying variables. Variables are basically any factor that could influence the outcome of your experiment . Variables have to be measurable and related to the topic being studied.

There are two types of variables:  independent variables and dependent variables. I ndependent variables remain constant . For example, age is an independent variable; it will stay the same, and researchers can look at different ages to see if it has an effect on the dependent variable. 

Speaking of dependent variables... dependent variables are subject to the influence of the independent variable , meaning that they are not constant. Let’s say you want to test whether a person’s age affects how much sleep they need. In that case, the independent variable is age (like we mentioned above), and the dependent variable is how much sleep a person gets. 

Variables will be crucial in writing your hypothesis. You need to be able to identify which variable is which, as both the independent and dependent variables will be written into your hypothesis. For instance, in a study about exercise, the independent variable might be the speed at which the respondents walk for thirty minutes, and the dependent variable would be their heart rate. In your study and in your hypothesis, you’re trying to understand the relationship between the two variables.

Elements of a Good Hypothesis

The best hypotheses start by asking the right questions . For instance, if you’ve observed that the grass is greener when it rains twice a week, you could ask what kind of grass it is, what elevation it’s at, and if the grass across the street responds to rain in the same way. Any of these questions could become the backbone of experiments to test why the grass gets greener when it rains fairly frequently.

As you’re asking more questions about your first observation, make sure you’re also making more observations . If it doesn’t rain for two weeks and the grass still looks green, that’s an important observation that could influence your hypothesis. You'll continue observing all throughout your experiment, but until the hypothesis is finalized, every observation should be noted.

Finally, you should consult secondary research before writing your hypothesis . Secondary research is comprised of results found and published by other people. You can usually find this information online or at your library. Additionally, m ake sure the research you find is credible and related to your topic. If you’re studying the correlation between rain and grass growth, it would help you to research rain patterns over the past twenty years for your county, published by a local agricultural association. You should also research the types of grass common in your area, the type of grass in your lawn, and whether anyone else has conducted experiments about your hypothesis. Also be sure you’re checking the quality of your research . Research done by a middle school student about what minerals can be found in rainwater would be less useful than an article published by a local university.

body-pencil-notebook-writing

Writing Your Hypothesis

Once you’ve considered all of the factors above, you’re ready to start writing your hypothesis. Hypotheses usually take a certain form when they’re written out in a research report.

When you boil down your hypothesis statement, you are writing down your best guess and not the question at hand . This means that your statement should be written as if it is fact already, even though you are simply testing it.

The reason for this is that, after you have completed your study, you'll either accept or reject your if-then or your null hypothesis. All hypothesis testing examples should be measurable and able to be confirmed or denied. You cannot confirm a question, only a statement! 

In fact, you come up with hypothesis examples all the time! For instance, when you guess on the outcome of a basketball game, you don’t say, “Will the Miami Heat beat the Boston Celtics?” but instead, “I think the Miami Heat will beat the Boston Celtics.” You state it as if it is already true, even if it turns out you’re wrong. You do the same thing when writing your hypothesis.

Additionally, keep in mind that hypotheses can range from very specific to very broad.  These hypotheses can be specific, but if your hypothesis testing examples involve a broad range of causes and effects, your hypothesis can also be broad.  

body-hand-number-two

The Two Types of Hypotheses

Now that you understand what goes into a hypothesis, it’s time to look more closely at the two most common types of hypothesis: the if-then hypothesis and the null hypothesis.

#1: If-Then Hypotheses

First of all, if-then hypotheses typically follow this formula:

If ____ happens, then ____ will happen.

The goal of this type of hypothesis is to test the causal relationship between the independent and dependent variable. It’s fairly simple, and each hypothesis can vary in how detailed it can be. We create if-then hypotheses all the time with our daily predictions. Here are some examples of hypotheses that use an if-then structure from daily life: 

  • If I get enough sleep, I’ll be able to get more work done tomorrow.
  • If the bus is on time, I can make it to my friend’s birthday party. 
  • If I study every night this week, I’ll get a better grade on my exam. 

In each of these situations, you’re making a guess on how an independent variable (sleep, time, or studying) will affect a dependent variable (the amount of work you can do, making it to a party on time, or getting better grades). 

You may still be asking, “What is an example of a hypothesis used in scientific research?” Take one of the hypothesis examples from a real-world study on whether using technology before bed affects children’s sleep patterns. The hypothesis read s:

“We hypothesized that increased hours of tablet- and phone-based screen time at bedtime would be inversely correlated with sleep quality and child attention.”

It might not look like it, but this is an if-then statement. The researchers basically said, “If children have more screen usage at bedtime, then their quality of sleep and attention will be worse.” The sleep quality and attention are the dependent variables and the screen usage is the independent variable. (Usually, the independent variable comes after the “if” and the dependent variable comes after the “then,” as it is the independent variable that affects the dependent variable.) This is an excellent example of how flexible hypothesis statements can be, as long as the general idea of “if-then” and the independent and dependent variables are present.

#2: Null Hypotheses

Your if-then hypothesis is not the only one needed to complete a successful experiment, however. You also need a null hypothesis to test it against. In its most basic form, the null hypothesis is the opposite of your if-then hypothesis . When you write your null hypothesis, you are writing a hypothesis that suggests that your guess is not true, and that the independent and dependent variables have no relationship .

One null hypothesis for the cell phone and sleep study from the last section might say: 

“If children have more screen usage at bedtime, their quality of sleep and attention will not be worse.” 

In this case, this is a null hypothesis because it’s asking the opposite of the original thesis! 

Conversely, if your if-then hypothesis suggests that your two variables have no relationship, then your null hypothesis would suggest that there is one. So, pretend that there is a study that is asking the question, “Does the amount of followers on Instagram influence how long people spend on the app?” The independent variable is the amount of followers, and the dependent variable is the time spent. But if you, as the researcher, don’t think there is a relationship between the number of followers and time spent, you might write an if-then hypothesis that reads:

“If people have many followers on Instagram, they will not spend more time on the app than people who have less.”

In this case, the if-then suggests there isn’t a relationship between the variables. In that case, one of the null hypothesis examples might say:

“If people have many followers on Instagram, they will spend more time on the app than people who have less.”

You then test both the if-then and the null hypothesis to gauge if there is a relationship between the variables, and if so, how much of a relationship. 

feature_tips

4 Tips to Write the Best Hypothesis

If you’re going to take the time to hold an experiment, whether in school or by yourself, you’re also going to want to take the time to make sure your hypothesis is a good one. The best hypotheses have four major elements in common: plausibility, defined concepts, observability, and general explanation.

#1: Plausibility

At first glance, this quality of a hypothesis might seem obvious. When your hypothesis is plausible, that means it’s possible given what we know about science and general common sense. However, improbable hypotheses are more common than you might think. 

Imagine you’re studying weight gain and television watching habits. If you hypothesize that people who watch more than  twenty hours of television a week will gain two hundred pounds or more over the course of a year, this might be improbable (though it’s potentially possible). Consequently, c ommon sense can tell us the results of the study before the study even begins.

Improbable hypotheses generally go against  science, as well. Take this hypothesis example: 

“If a person smokes one cigarette a day, then they will have lungs just as healthy as the average person’s.” 

This hypothesis is obviously untrue, as studies have shown again and again that cigarettes negatively affect lung health. You must be careful that your hypotheses do not reflect your own personal opinion more than they do scientifically-supported findings. This plausibility points to the necessity of research before the hypothesis is written to make sure that your hypothesis has not already been disproven.

#2: Defined Concepts

The more advanced you are in your studies, the more likely that the terms you’re using in your hypothesis are specific to a limited set of knowledge. One of the hypothesis testing examples might include the readability of printed text in newspapers, where you might use words like “kerning” and “x-height.” Unless your readers have a background in graphic design, it’s likely that they won’t know what you mean by these terms. Thus, it’s important to either write what they mean in the hypothesis itself or in the report before the hypothesis.

Here’s what we mean. Which of the following sentences makes more sense to the common person?

If the kerning is greater than average, more words will be read per minute.

If the space between letters is greater than average, more words will be read per minute.

For people reading your report that are not experts in typography, simply adding a few more words will be helpful in clarifying exactly what the experiment is all about. It’s always a good idea to make your research and findings as accessible as possible. 

body-blue-eye

Good hypotheses ensure that you can observe the results. 

#3: Observability

In order to measure the truth or falsity of your hypothesis, you must be able to see your variables and the way they interact. For instance, if your hypothesis is that the flight patterns of satellites affect the strength of certain television signals, yet you don’t have a telescope to view the satellites or a television to monitor the signal strength, you cannot properly observe your hypothesis and thus cannot continue your study.

Some variables may seem easy to observe, but if you do not have a system of measurement in place, you cannot observe your hypothesis properly. Here’s an example: if you’re experimenting on the effect of healthy food on overall happiness, but you don’t have a way to monitor and measure what “overall happiness” means, your results will not reflect the truth. Monitoring how often someone smiles for a whole day is not reasonably observable, but having the participants state how happy they feel on a scale of one to ten is more observable. 

In writing your hypothesis, always keep in mind how you'll execute the experiment.

#4: Generalizability 

Perhaps you’d like to study what color your best friend wears the most often by observing and documenting the colors she wears each day of the week. This might be fun information for her and you to know, but beyond you two, there aren’t many people who could benefit from this experiment. When you start an experiment, you should note how generalizable your findings may be if they are confirmed. Generalizability is basically how common a particular phenomenon is to other people’s everyday life.

Let’s say you’re asking a question about the health benefits of eating an apple for one day only, you need to realize that the experiment may be too specific to be helpful. It does not help to explain a phenomenon that many people experience. If you find yourself with too specific of a hypothesis, go back to asking the big question: what is it that you want to know, and what do you think will happen between your two variables?

body-experiment-chemistry

Hypothesis Testing Examples

We know it can be hard to write a good hypothesis unless you’ve seen some good hypothesis examples. We’ve included four hypothesis examples based on some made-up experiments. Use these as templates or launch pads for coming up with your own hypotheses.

Experiment #1: Students Studying Outside (Writing a Hypothesis)

You are a student at PrepScholar University. When you walk around campus, you notice that, when the temperature is above 60 degrees, more students study in the quad. You want to know when your fellow students are more likely to study outside. With this information, how do you make the best hypothesis possible?

You must remember to make additional observations and do secondary research before writing your hypothesis. In doing so, you notice that no one studies outside when it’s 75 degrees and raining, so this should be included in your experiment. Also, studies done on the topic beforehand suggested that students are more likely to study in temperatures less than 85 degrees. With this in mind, you feel confident that you can identify your variables and write your hypotheses:

If-then: “If the temperature in Fahrenheit is less than 60 degrees, significantly fewer students will study outside.”

Null: “If the temperature in Fahrenheit is less than 60 degrees, the same number of students will study outside as when it is more than 60 degrees.”

These hypotheses are plausible, as the temperatures are reasonably within the bounds of what is possible. The number of people in the quad is also easily observable. It is also not a phenomenon specific to only one person or at one time, but instead can explain a phenomenon for a broader group of people.

To complete this experiment, you pick the month of October to observe the quad. Every day (except on the days where it’s raining)from 3 to 4 PM, when most classes have released for the day, you observe how many people are on the quad. You measure how many people come  and how many leave. You also write down the temperature on the hour. 

After writing down all of your observations and putting them on a graph, you find that the most students study on the quad when it is 70 degrees outside, and that the number of students drops a lot once the temperature reaches 60 degrees or below. In this case, your research report would state that you accept or “failed to reject” your first hypothesis with your findings.

Experiment #2: The Cupcake Store (Forming a Simple Experiment)

Let’s say that you work at a bakery. You specialize in cupcakes, and you make only two colors of frosting: yellow and purple. You want to know what kind of customers are more likely to buy what kind of cupcake, so you set up an experiment. Your independent variable is the customer’s gender, and the dependent variable is the color of the frosting. What is an example of a hypothesis that might answer the question of this study?

Here’s what your hypotheses might look like: 

If-then: “If customers’ gender is female, then they will buy more yellow cupcakes than purple cupcakes.”

Null: “If customers’ gender is female, then they will be just as likely to buy purple cupcakes as yellow cupcakes.”

This is a pretty simple experiment! It passes the test of plausibility (there could easily be a difference), defined concepts (there’s nothing complicated about cupcakes!), observability (both color and gender can be easily observed), and general explanation ( this would potentially help you make better business decisions ).

body-bird-feeder

Experiment #3: Backyard Bird Feeders (Integrating Multiple Variables and Rejecting the If-Then Hypothesis)

While watching your backyard bird feeder, you realized that different birds come on the days when you change the types of seeds. You decide that you want to see more cardinals in your backyard, so you decide to see what type of food they like the best and set up an experiment. 

However, one morning, you notice that, while some cardinals are present, blue jays are eating out of your backyard feeder filled with millet. You decide that, of all of the other birds, you would like to see the blue jays the least. This means you'll have more than one variable in your hypothesis. Your new hypotheses might look like this: 

If-then: “If sunflower seeds are placed in the bird feeders, then more cardinals will come than blue jays. If millet is placed in the bird feeders, then more blue jays will come than cardinals.”

Null: “If either sunflower seeds or millet are placed in the bird, equal numbers of cardinals and blue jays will come.”

Through simple observation, you actually find that cardinals come as often as blue jays when sunflower seeds or millet is in the bird feeder. In this case, you would reject your “if-then” hypothesis and “fail to reject” your null hypothesis . You cannot accept your first hypothesis, because it’s clearly not true. Instead you found that there was actually no relation between your different variables. Consequently, you would need to run more experiments with different variables to see if the new variables impact the results.

Experiment #4: In-Class Survey (Including an Alternative Hypothesis)

You’re about to give a speech in one of your classes about the importance of paying attention. You want to take this opportunity to test a hypothesis you’ve had for a while: 

If-then: If students sit in the first two rows of the classroom, then they will listen better than students who do not.

Null: If students sit in the first two rows of the classroom, then they will not listen better or worse than students who do not.

You give your speech and then ask your teacher if you can hand out a short survey to the class. On the survey, you’ve included questions about some of the topics you talked about. When you get back the results, you’re surprised to see that not only do the students in the first two rows not pay better attention, but they also scored worse than students in other parts of the classroom! Here, both your if-then and your null hypotheses are not representative of your findings. What do you do?

This is when you reject both your if-then and null hypotheses and instead create an alternative hypothesis . This type of hypothesis is used in the rare circumstance that neither of your hypotheses is able to capture your findings . Now you can use what you’ve learned to draft new hypotheses and test again! 

Key Takeaways: Hypothesis Writing

The more comfortable you become with writing hypotheses, the better they will become. The structure of hypotheses is flexible and may need to be changed depending on what topic you are studying. The most important thing to remember is the purpose of your hypothesis and the difference between the if-then and the null . From there, in forming your hypothesis, you should constantly be asking questions, making observations, doing secondary research, and considering your variables. After you have written your hypothesis, be sure to edit it so that it is plausible, clearly defined, observable, and helpful in explaining a general phenomenon.

Writing a hypothesis is something that everyone, from elementary school children competing in a science fair to professional scientists in a lab, needs to know how to do. Hypotheses are vital in experiments and in properly executing the scientific method . When done correctly, hypotheses will set up your studies for success and help you to understand the world a little better, one experiment at a time.

body-whats-next-post-it-note

What’s Next?

If you’re studying for the science portion of the ACT, there’s definitely a lot you need to know. We’ve got the tools to help, though! Start by checking out our ultimate study guide for the ACT Science subject test. Once you read through that, be sure to download our recommended ACT Science practice tests , since they’re one of the most foolproof ways to improve your score. (And don’t forget to check out our expert guide book , too.)

If you love science and want to major in a scientific field, you should start preparing in high school . Here are the science classes you should take to set yourself up for success.

If you’re trying to think of science experiments you can do for class (or for a science fair!), here’s a list of 37 awesome science experiments you can do at home

author image

Ashley Sufflé Robinson has a Ph.D. in 19th Century English Literature. As a content writer for PrepScholar, Ashley is passionate about giving college-bound students the in-depth information they need to get into the school of their dreams.

Student and Parent Forum

Our new student and parent forum, at ExpertHub.PrepScholar.com , allow you to interact with your peers and the PrepScholar staff. See how other students and parents are navigating high school, college, and the college admissions process. Ask questions; get answers.

Join the Conversation

Ask a Question Below

Have any questions about this article or other topics? Ask below and we'll reply!

Improve With Our Famous Guides

  • For All Students

The 5 Strategies You Must Be Using to Improve 160+ SAT Points

How to Get a Perfect 1600, by a Perfect Scorer

Series: How to Get 800 on Each SAT Section:

Score 800 on SAT Math

Score 800 on SAT Reading

Score 800 on SAT Writing

Series: How to Get to 600 on Each SAT Section:

Score 600 on SAT Math

Score 600 on SAT Reading

Score 600 on SAT Writing

Free Complete Official SAT Practice Tests

What SAT Target Score Should You Be Aiming For?

15 Strategies to Improve Your SAT Essay

The 5 Strategies You Must Be Using to Improve 4+ ACT Points

How to Get a Perfect 36 ACT, by a Perfect Scorer

Series: How to Get 36 on Each ACT Section:

36 on ACT English

36 on ACT Math

36 on ACT Reading

36 on ACT Science

Series: How to Get to 24 on Each ACT Section:

24 on ACT English

24 on ACT Math

24 on ACT Reading

24 on ACT Science

What ACT target score should you be aiming for?

ACT Vocabulary You Must Know

ACT Writing: 15 Tips to Raise Your Essay Score

How to Get Into Harvard and the Ivy League

How to Get a Perfect 4.0 GPA

How to Write an Amazing College Essay

What Exactly Are Colleges Looking For?

Is the ACT easier than the SAT? A Comprehensive Guide

Should you retake your SAT or ACT?

When should you take the SAT or ACT?

Stay Informed

explain how your hypothesis is testable

Get the latest articles and test prep tips!

Looking for Graduate School Test Prep?

Check out our top-rated graduate blogs here:

GRE Online Prep Blog

GMAT Online Prep Blog

TOEFL Online Prep Blog

Holly R. "I am absolutely overjoyed and cannot thank you enough for helping me!”

IMAGES

  1. What Makes A Hypothesis Testable

    explain how your hypothesis is testable

  2. 13 Different Types of Hypothesis (2024)

    explain how your hypothesis is testable

  3. PPT

    explain how your hypothesis is testable

  4. What is Hypothesis? Functions- Characteristics-types-Criteria

    explain how your hypothesis is testable

  5. Hypothesis Testing Steps & Examples

    explain how your hypothesis is testable

  6. Hypothesis Testing Solved Examples(Questions and Solutions)

    explain how your hypothesis is testable

VIDEO

  1. Testable Hypothesis / Scientific Phenomena

  2. CHEMIOSMOTIC HYPOTHESIS || HINDI EXPLANATION

  3. Hypothesis explain biology book

  4. How to Validate Your Hypothesis for Business Success #shorts

  5. Understanding "Develop a Hypothesis"

  6. Biological Method

COMMENTS

  1. What Is a Testable Hypothesis?

    A hypothesis is a tentative answer to a scientific question. A testable hypothesis is a hypothesis that can be proved or disproved as a result of testing, data collection, or experience. Only testable hypotheses can be used to conceive and perform an experiment using the scientific method .

  2. Hypothesis Testing

    Present the findings in your results and discussion section. Though the specific details might vary, the procedure you will use when testing a hypothesis will always follow some version of these steps. Table of contents. Step 1: State your null and alternate hypothesis. Step 2: Collect data. Step 3: Perform a statistical test.

  3. How to Write a Strong Hypothesis

    5. Phrase your hypothesis in three ways. To identify the variables, you can write a simple prediction in if…then form. The first part of the sentence states the independent variable and the second part states the dependent variable. If a first-year student starts attending more lectures, then their exam scores will improve.

  4. What is a Research Hypothesis: How to Write it, Types, and Examples

    It seeks to explore and understand a particular aspect of the research subject. In contrast, a research hypothesis is a specific statement or prediction that suggests an expected relationship between variables. It is formulated based on existing knowledge or theories and guides the research design and data analysis. 7.

  5. The scientific method (article)

    The scientific method. At the core of biology and other sciences lies a problem-solving approach called the scientific method. The scientific method has five basic steps, plus one feedback step: Make an observation. Ask a question. Form a hypothesis, or testable explanation. Make a prediction based on the hypothesis.

  6. Formulating Hypotheses for Different Study Designs

    Formulating Hypotheses for Different Study Designs. Generating a testable working hypothesis is the first step towards conducting original research. Such research may prove or disprove the proposed hypothesis. Case reports, case series, online surveys and other observational studies, clinical trials, and narrative reviews help to generate ...

  7. How to Write a Great Hypothesis

    What is a hypothesis and how can you write a great one for your research? A hypothesis is a tentative statement about the relationship between two or more variables that can be tested empirically. Find out how to formulate a clear, specific, and testable hypothesis with examples and tips from Verywell Mind, a trusted source of psychology and mental health information.

  8. Research Hypothesis In Psychology: Types, & Examples

    A research hypothesis, in its plural form "hypotheses," is a specific, testable prediction about the anticipated results of a study, established at its outset. It is a key component of the scientific method. Hypotheses connect theory to data and guide the research process towards expanding scientific understanding.

  9. How to Write a Testable Hypothesis

    A testable hypothesis is one that can be used as the basis for an experiment. It predicts the correlation between two variables and can be tested by varying one of the variables. If the variables cannot be measured, the hypothesis cannot be proved or disproved. If one of the variables cannot be varied, it is impossible to conduct an experiment.

  10. The scientific method (video)

    The scientific method. The scientific method is a logical approach to understanding the world. It starts with an observation, followed by a question. A testable explanation or hypothesis is then created. An experiment is designed to test the hypothesis, and based on the results, the hypothesis is refined.

  11. 7.1: Basics of Hypothesis Testing

    Test Statistic: z = x¯¯¯ −μo σ/ n−−√ z = x ¯ − μ o σ / n since it is calculated as part of the testing of the hypothesis. Definition 7.1.4 7.1. 4. p - value: probability that the test statistic will take on more extreme values than the observed test statistic, given that the null hypothesis is true.

  12. Hypothesis Testing

    The Four Steps in Hypothesis Testing. STEP 1: State the appropriate null and alternative hypotheses, Ho and Ha. STEP 2: Obtain a random sample, collect relevant data, and check whether the data meet the conditions under which the test can be used. If the conditions are met, summarize the data using a test statistic.

  13. 4.14: Experiments and Hypotheses

    Forming a Hypothesis. When conducting scientific experiments, researchers develop hypotheses to guide experimental design. A hypothesis is a suggested explanation that is both testable and falsifiable. You must be able to test your hypothesis, and it must be possible to prove your hypothesis true or false.

  14. How to Write a Hypothesis

    Use simple language: While your hypothesis should be conceptually sound, it doesn't have to be complicated. Aim for clarity and simplicity in your wording. State direction, if applicable: If your hypothesis involves a directional outcome (e.g., "increase" or "decrease"), make sure to specify this.

  15. How to Write a Hypothesis in 6 Steps, With Examples

    A hypothesis is a statement that explains the predictions and reasoning of your research—an "educated guess" about how your scientific experiments will end. Use this guide to learn how to write a hypothesis and read successful and unsuccessful examples of a testable hypotheses.

  16. Writing a Hypothesis for Your Science Fair Project

    A hypothesis is the best answer to a question based on what is known. Scientists take that best answer and do experiments to see if it still makes sense or if a better answer can be made. When a scientist has a question they want to answer, they research what is already known about the topic. Then, they come up with their best answer to the ...

  17. What is a Hypothesis

    Definition: Hypothesis is an educated guess or proposed explanation for a phenomenon, based on some initial observations or data. It is a tentative statement that can be tested and potentially proven or disproven through further investigation and experimentation. Hypothesis is often used in scientific research to guide the design of experiments ...

  18. Scientific hypothesis

    hypothesis. science. scientific hypothesis, an idea that proposes a tentative explanation about a phenomenon or a narrow set of phenomena observed in the natural world. The two primary features of a scientific hypothesis are falsifiability and testability, which are reflected in an "If…then" statement summarizing the idea and in the ...

  19. Experiments and Hypotheses

    A hypothesis is a suggested explanation that is both testable and falsifiable. You must be able to test your hypothesis, and it must be possible to prove your hypothesis true or false. For example, Michael observes that maple trees lose their leaves in the fall. He might then propose a possible explanation for this observation: "cold weather ...

  20. How To Develop a Hypothesis (With Elements, Types and Examples)

    4. Formulate your hypothesis. After collecting background information and making a prediction based on your question, plan a statement that lays out your variables, subjects and predicted outcome. Whether you write it as an "if/then" or declarative statement, your hypothesis should include the prediction to be tested.

  21. What Is a Hypothesis and How Do I Write One?

    Merriam Webster defines a hypothesis as "an assumption or concession made for the sake of argument.". In other words, a hypothesis is an educated guess. Scientists make a reasonable assumption--or a hypothesis--then design an experiment to test whether it's true or not.

  22. SCI 100 Module Four Activity Template

    Explain how your hypothesis is testable (in 1-3 sentences). It's in testing phase now and so far, it has been a success with the animal hearts. So, testing the combination of COVID-19 and cancer immunotherapy on a heart that has fibrotic scarring it is possible to see the hearts healed. Explain how your hypothesis is falsifiable (in 1-3 ...

  23. SCI 100 4-2 Activity Turning Questions Into Hypotheses

    Explain how your hypothesis is testable (in 1-3 sentences). To investigate this hypothesis, researchers chose to model two merging black in a growing universe, rather than the static universes other research teams build for the sake of simplifying the complex equations (Einstein's theory of general relativity) that provide the foundations ...

  24. De Coordinatione Motus Humani: The Synergy Expansion Hypothesis

    The search for an answer to Bernstein's degrees of freedom problem has propelled a large portion of research studies in human motor control over the past six decades. Different theories have been developed to explain how humans might use their incredibly complex neuro-musculo-skeletal system with astonishing ease. Among these theories, motor synergies appeared as one possible explanation. In ...