How to Write a Strong Hypothesis | Steps & Examples

Published on May 6, 2022 by Shona McCombes. Revised on November 20, 2023.

A hypothesis is a statement that can be tested by scientific research. If you want to test a relationship between two or more variables, you need to write hypotheses before you start your experiment or data collection.

Example: Hypothesis

Daily apple consumption leads to fewer doctor’s visits.

Table of contents

  • What is a hypothesis?
  • Developing a hypothesis (with example)
  • Hypothesis examples
  • Other interesting articles
  • Frequently asked questions about writing hypotheses

A hypothesis states your predictions about what your research will find. It is a tentative answer to your research question that has not yet been tested. For some research projects, you might have to write several hypotheses that address different aspects of your research question.

A hypothesis is not just a guess – it should be based on existing theories and knowledge. It also has to be testable, which means you can support or refute it through scientific research methods (such as experiments, observations and statistical analysis of data).

Variables in hypotheses

Hypotheses propose a relationship between two or more types of variables.

  • An independent variable is something the researcher changes or controls.
  • A dependent variable is something the researcher observes and measures.

If there are any control variables, extraneous variables, or confounding variables, be sure to jot those down as you go to minimize the chances that research bias will affect your results.

For example, take the hypothesis "Increased exposure to the sun leads to increased levels of happiness." Here the independent variable is exposure to the sun (the assumed cause), and the dependent variable is the level of happiness (the assumed effect).


Step 1. Ask a question

Writing a hypothesis begins with a research question that you want to answer. The question should be focused, specific, and researchable within the constraints of your project.

Step 2. Do some preliminary research

Your initial answer to the question should be based on what is already known about the topic. Look for theories and previous studies to help you form educated assumptions about what your research will find.

At this stage, you might construct a conceptual framework to ensure that you’re embarking on a relevant topic . This can also help you identify which variables you will study and what you think the relationships are between them. Sometimes, you’ll have to operationalize more complex constructs.

Step 3. Formulate your hypothesis

Now you should have some idea of what you expect to find. Write your initial answer to the question in a clear, concise sentence.

Step 4. Refine your hypothesis

You need to make sure your hypothesis is specific and testable. There are various ways of phrasing a hypothesis, but all the terms you use should have clear definitions, and the hypothesis should contain:

  • The relevant variables
  • The specific group being studied
  • The predicted outcome of the experiment or analysis

Step 5. Phrase your hypothesis in three ways

To identify the variables, you can write a simple prediction in if…then form. The first part of the sentence states the independent variable and the second part states the dependent variable. For example: "If a first-year student attends more lectures, then their exam scores will improve."

In academic research, hypotheses are more commonly phrased in terms of correlations or effects, where you directly state the predicted relationship between variables. For example: "The number of lectures attended by first-year students has a positive effect on their exam scores."

If you are comparing two groups, the hypothesis can state what difference you expect to find between them. For example: "First-year students who attended most lectures will have better exam scores than those who attended few lectures."

Step 6. Write a null hypothesis

If your research involves statistical hypothesis testing, you will also have to write a null hypothesis. The null hypothesis is the default position that there is no association between the variables. The null hypothesis is written as H0, while the alternative hypothesis is H1 or Ha.

  • H0: The number of lectures attended by first-year students has no effect on their final exam scores.
  • H1: The number of lectures attended by first-year students has a positive effect on their final exam scores.
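As a quick illustration of how this pair of hypotheses might be tested, here is a minimal Python sketch using scipy's Pearson correlation test. The attendance and score figures are invented for illustration only; a real study would use collected data.

```python
# Hypothetical data for the lecture-attendance example (invented numbers).
from scipy import stats

lectures_attended = [2, 5, 8, 10, 12, 15, 18, 20]    # lectures per student
final_exam_scores = [52, 55, 60, 64, 68, 70, 75, 80]  # matching exam scores

# H0: no association between attendance and scores; H1: positive association.
r, p = stats.pearsonr(lectures_attended, final_exam_scores)
print(f"r = {r:.3f}, p = {p:.4f}")
if p < 0.05:
    print("Reject H0: evidence of an association between attendance and scores")
```

Because H1 here is directional (a positive effect), a one-sided test could also be used; the two-sided default is the more conservative choice.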

If you want to know more about the research process, methodology, research bias, or statistics, make sure to check out some of our other articles with explanations and examples.

  • Sampling methods
  • Simple random sampling
  • Stratified sampling
  • Cluster sampling
  • Likert scales
  • Reproducibility

Statistics

  • Null hypothesis
  • Statistical power
  • Probability distribution
  • Effect size
  • Poisson distribution

Research bias

  • Optimism bias
  • Cognitive bias
  • Implicit bias
  • Hawthorne effect
  • Anchoring bias
  • Explicit bias

Null and alternative hypotheses are used in statistical hypothesis testing . The null hypothesis of a test always predicts no effect or no relationship between variables, while the alternative hypothesis states your research prediction of an effect or relationship.

Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses , by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.

Cite this Scribbr article

McCombes, S. (2023, November 20). How to Write a Strong Hypothesis | Steps & Examples. Scribbr. Retrieved April 1, 2024, from https://www.scribbr.com/methodology/hypothesis/


Business Insights

Harvard Business School Online's Business Insights Blog provides the career insights you need to achieve your goals and gain confidence in your business skills.

A Beginner’s Guide to Hypothesis Testing in Business


Published: 30 Mar 2021

Becoming a more data-driven decision-maker can bring several benefits to your organization, enabling you to identify new opportunities to pursue and threats to abate. Rather than allowing subjective thinking to guide your business strategy, backing your decisions with data can empower your company to become more innovative and, ultimately, profitable.

If you’re new to data-driven decision-making, you might be wondering how data translates into business strategy. The answer lies in generating a hypothesis and verifying or rejecting it based on what various forms of data tell you.

Below is a look at hypothesis testing and the role it plays in helping businesses become more data-driven.


What Is Hypothesis Testing?

To understand what hypothesis testing is, it’s important first to understand what a hypothesis is.

A hypothesis or hypothesis statement seeks to explain why something has happened, or what might happen, under certain conditions. It can also be used to understand how different variables relate to each other. Hypotheses are often written as if-then statements; for example, “If this happens, then this will happen.”

Hypothesis testing , then, is a statistical means of testing an assumption stated in a hypothesis. While the specific methodology leveraged depends on the nature of the hypothesis and data available, hypothesis testing typically uses sample data to extrapolate insights about a larger population.

Hypothesis Testing in Business

When it comes to data-driven decision-making, there’s a certain amount of risk that can mislead a professional. This could be due to flawed thinking or observations, incomplete or inaccurate data , or the presence of unknown variables. The danger in this is that, if major strategic decisions are made based on flawed insights, it can lead to wasted resources, missed opportunities, and catastrophic outcomes.

The real value of hypothesis testing in business is that it allows professionals to test their theories and assumptions before putting them into action. This essentially allows an organization to verify its analysis is correct before committing resources to implement a broader strategy.

As one example, consider a company that wishes to launch a new marketing campaign to revitalize sales during a slow period. Doing so could be an incredibly expensive endeavor, depending on the campaign’s size and complexity. The company, therefore, may wish to test the campaign on a smaller scale to understand how it will perform.

In this example, the hypothesis that’s being tested would fall along the lines of: “If the company launches a new marketing campaign, then it will translate into an increase in sales.” It may even be possible to quantify how much of a lift in sales the company expects to see from the effort. Pending the results of the pilot campaign, the business would then know whether it makes sense to roll it out more broadly.
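A pilot like this is often analyzed as a two-sample comparison. The sketch below (with entirely invented sales figures, and scipy's independent t-test standing in for whatever analysis the company would actually run) compares markets that saw the pilot campaign with markets that did not:

```python
# Hypothetical weekly sales (in $000s) in pilot vs. control markets.
from scipy import stats

campaign_markets = [118, 125, 131, 122, 128, 135, 126, 130]
control_markets  = [110, 115, 112, 118, 114, 117, 111, 116]

# H0: the campaign has no effect on sales; Ha: sales differ between groups.
t_stat, p_value = stats.ttest_ind(campaign_markets, control_markets)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# A small p-value would support rolling the campaign out more broadly.
```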


Key Considerations for Hypothesis Testing

1. Alternative Hypothesis and Null Hypothesis

In hypothesis testing, the hypothesis that’s being tested is known as the alternative hypothesis . Often, it’s expressed as a correlation or statistical relationship between variables. The null hypothesis , on the other hand, is a statement that’s meant to show there’s no statistical relationship between the variables being tested. It’s typically the exact opposite of whatever is stated in the alternative hypothesis.

For example, consider a company’s leadership team that historically and reliably sees $12 million in monthly revenue. They want to understand if reducing the price of their services will attract more customers and, in turn, increase revenue.

In this case, the alternative hypothesis may take the form of a statement such as: “If we reduce the price of our flagship service by five percent, then we’ll see an increase in sales and realize revenues greater than $12 million in the next month.”

The null hypothesis, on the other hand, would indicate that revenues wouldn’t increase from the base of $12 million, or might even decrease.
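One way this pair of hypotheses could be evaluated is with a one-sample, one-sided t-test against the $12 million baseline. The monthly revenue figures below are invented for illustration:

```python
# Hypothetical monthly revenues (in $M) after the price reduction.
from scipy import stats

monthly_revenue = [12.4, 12.9, 12.1, 13.2, 12.6, 12.8]

# H0: mean revenue <= 12; Ha: mean revenue > 12 (one-sided).
t_stat, p_value = stats.ttest_1samp(monthly_revenue, popmean=12.0,
                                    alternative='greater')
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```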


2. Significance Level and P-Value

Statistically speaking, if you were to run the same scenario 100 times, you’d likely receive somewhat different results each time. If you were to plot these results in a distribution plot, you’d see the most likely outcome is at the tallest point in the graph, with less likely outcomes falling to the right and left of that point.


With this in mind, imagine you’ve completed your hypothesis test and have your results, which indicate there may be a correlation between the variables you were testing. To understand your results' significance, you’ll need to identify a p-value for the test, which helps note how confident you are in the test results.

In statistics, the p-value depicts the probability that, assuming the null hypothesis is correct, you might still observe results that are at least as extreme as the results of your hypothesis test. The smaller the p-value, the stronger the evidence against the null hypothesis, and the greater the significance of your results.
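This definition can be made concrete with a small simulation. The sketch below (all numbers invented) estimates a two-sided p-value as the fraction of simulated results under the null hypothesis that are at least as extreme as a hypothetical observed difference:

```python
import numpy as np

rng = np.random.default_rng(42)

observed_diff = 4.0   # hypothetical observed difference between two group means
null_sd = 2.0         # hypothetical spread of that difference when H0 is true

# Simulate the experiment many times under the null (true difference = 0).
simulated = rng.normal(loc=0.0, scale=null_sd, size=100_000)

# Two-sided p-value: fraction of null results at least as extreme as observed.
p_value = np.mean(np.abs(simulated) >= observed_diff)
print(f"simulated p-value: {p_value:.4f}")
```

With these made-up numbers the observed difference sits two standard deviations out, so the simulated p-value lands near 0.045.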

3. One-Sided vs. Two-Sided Testing

When it’s time to test your hypothesis, it’s important to leverage the correct testing method. The two most common hypothesis testing methods are one-sided and two-sided tests , or one-tailed and two-tailed tests, respectively.

Typically, you’d leverage a one-sided test when you have a strong conviction about the direction of change you expect to see due to your hypothesis test. You’d leverage a two-sided test when you’re less confident in the direction of change.
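In scipy this choice is just the `alternative` argument of the test. A minimal sketch on invented before/after data, assuming you predict a decrease:

```python
from scipy import stats

# Hypothetical measurements before and after a change.
before = [21.0, 22.5, 20.8, 23.1, 21.7, 22.0, 21.2, 22.8]
after  = [20.1, 21.0, 19.8, 21.5, 20.4, 20.9, 20.0, 21.2]

# Two-sided: is there a difference in either direction?
_, p_two = stats.ttest_ind(after, before, alternative='two-sided')

# One-sided: did the measurement specifically decrease?
_, p_one = stats.ttest_ind(after, before, alternative='less')

print(f"two-sided p = {p_two:.4f}, one-sided p = {p_one:.4f}")
```

For a result in the predicted direction, the one-sided p-value is half the two-sided one, which is why a one-sided test should only be used when the direction was specified in advance.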


4. Sampling

To perform hypothesis testing in the first place, you need to collect a sample of data to be analyzed. Depending on the question you’re seeking to answer or investigate, you might collect samples through surveys, observational studies, or experiments.

A survey involves asking a series of questions to a random population sample and recording self-reported responses.

Observational studies involve a researcher observing a sample population and collecting data as it occurs naturally, without intervention.

Finally, an experiment involves dividing a sample into multiple groups, one of which acts as the control group. For each non-control group, the variable being studied is manipulated to determine how the data collected differs from that of the control group.


Learn How to Perform Hypothesis Testing

Hypothesis testing is a complex process involving different moving pieces that can allow an organization to effectively leverage its data and inform strategic decisions.

If you’re interested in better understanding hypothesis testing and the role it can play within your organization, one option is to complete a course that focuses on the process. Doing so can lay the statistical and analytical foundation you need to succeed.



Guide: Hypothesis Testing

Learn Lean Sigma | Last updated: September 8, 2023

About the author: Daniel Croft is an experienced continuous improvement manager with a Lean Six Sigma Black Belt and a Bachelor's degree in Business Management. With more than ten years of experience applying his skills across various industries, Daniel specializes in optimizing processes and improving efficiency. His approach combines practical experience with a deep understanding of business fundamentals to drive meaningful change.

In the world of data-driven decision-making, Hypothesis Testing stands as a cornerstone methodology. It serves as the statistical backbone for a multitude of sectors, from manufacturing and logistics to healthcare and finance. But what exactly is Hypothesis Testing, and why is it so indispensable? Simply put, it’s a technique that allows you to validate or invalidate claims about a population based on sample data. Whether you’re looking to streamline a manufacturing process, optimize logistics, or improve customer satisfaction, Hypothesis Testing offers a structured approach to reach conclusive, data-supported decisions.

A simple way to picture a hypothesis test is a bell curve divided into decision regions:

  • The curve represents a standard normal distribution, often encountered in hypothesis tests.
  • The green-shaded area signifies the "Acceptance Region," where you would fail to reject the null hypothesis (H0).
  • The red-shaded areas are the "Rejection Regions," where you would reject H0 in favor of the alternative hypothesis (Ha).
  • The blue dashed lines indicate the "Critical Values" (±1.96), which are the thresholds for rejecting H0.

This graphical representation serves as a conceptual foundation for understanding the mechanics of hypothesis testing. It visually illustrates what it means to accept or reject a hypothesis based on a predefined level of significance.
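The ±1.96 thresholds can be reproduced numerically with scipy; this is a minimal sketch in which the z-statistic is a hypothetical value chosen for illustration:

```python
from scipy import stats

alpha = 0.05
# Two-sided critical value for the standard normal distribution.
critical = stats.norm.ppf(1 - alpha / 2)
print(f"critical value: ±{critical:.3f}")  # about ±1.96

z = 2.4  # hypothetical standardized test statistic
if abs(z) > critical:
    print("z falls in a rejection region: reject H0")
else:
    print("z falls in the acceptance region: fail to reject H0")
```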


What is Hypothesis Testing?

Hypothesis testing is a structured procedure in statistics used for drawing conclusions about a larger population based on a subset of that population, known as a sample. The method is widely used across different industries and sectors for a variety of purposes. Below, we’ll dissect the key components of hypothesis testing to provide a more in-depth understanding.

The Hypotheses: H0 and Ha

In every hypothesis test, there are two competing statements:

  • Null Hypothesis (H0): This is the "status quo" hypothesis that you are trying to test against. It is a statement that asserts that there is no effect or difference. For example, in a manufacturing setting, the null hypothesis might state that a new production process does not improve the average output quality.
  • Alternative Hypothesis (Ha or H1): This is what you aim to prove by conducting the hypothesis test. It is the statement that there is an effect or difference. Using the same manufacturing example, the alternative hypothesis might state that the new process does improve the average output quality.

Significance Level (α)

Before conducting the test, you decide on a "Significance Level" (α), typically set at 0.05 or 5%. This level represents the probability of rejecting the null hypothesis when it is actually true. Lower α values make the test more stringent, reducing the chances of a 'false positive'.
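One way to see what α means in practice is a simulation: when the null hypothesis is true, a test run at α = 0.05 should produce a false positive roughly 5% of the time. A sketch with synthetic data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha = 0.05
false_positives = 0
n_trials = 2000

for _ in range(n_trials):
    # Both samples come from the same distribution, so H0 is true by design.
    a = rng.normal(10, 2, size=30)
    b = rng.normal(10, 2, size=30)
    _, p = stats.ttest_ind(a, b)
    if p < alpha:
        false_positives += 1  # rejecting H0 here is a false positive

rate = false_positives / n_trials
print(f"false-positive rate: {rate:.3f}")  # close to alpha = 0.05
```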

Data Collection

You then proceed to gather data, which is usually a sample from a larger population. The quality of your test heavily relies on how well this sample represents the population. The data can be collected through various means such as surveys, observations, or experiments.

Statistical Test

Depending on the nature of the data and what you’re trying to prove, different statistical tests can be applied (e.g., t-test, chi-square test, ANOVA, etc.). These tests will compute a test statistic (e.g., t, χ², F, etc.) based on your sample data.

Here are graphical examples of the distributions commonly used in three different types of statistical tests: the t-test, the Chi-square test, and ANOVA (Analysis of Variance).

t-test

  • Graph 1 (Leftmost): This graph represents a t-distribution, often used in t-tests. The t-distribution is similar to the normal distribution but tends to have heavier tails. It is commonly used when the sample size is small or the population variance is unknown.

Chi-square Test

  • Graph 2 (Middle): The Chi-square distribution is used in Chi-square tests, often for testing independence or goodness-of-fit. Unlike the t-distribution, the Chi-square distribution is not symmetrical and only takes on positive values.

ANOVA (F-distribution)

  • Graph 3 (Rightmost): The F-distribution is used in Analysis of Variance (ANOVA), a statistical test used to analyze the differences between group means. Like the Chi-square distribution, the F-distribution is also not symmetrical and takes only positive values.

These visual representations provide an intuitive understanding of the different statistical tests and their underlying distributions. Knowing which test to use and when is crucial for conducting accurate and meaningful hypothesis tests.
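As a minimal sketch, all three test families are available in scipy.stats; the small datasets below are invented for illustration only:

```python
from scipy import stats

# t-test: compare two group means.
group_a = [5.1, 4.9, 5.4, 5.0, 5.2]
group_b = [5.6, 5.8, 5.5, 5.9, 5.7]
t_stat, t_p = stats.ttest_ind(group_a, group_b)

# Chi-square: goodness-of-fit of observed counts to expected counts.
observed = [18, 22, 20, 40]
expected = [25, 25, 25, 25]
chi2, chi_p = stats.chisquare(observed, expected)

# ANOVA: compare means across three or more groups.
group_c = [6.1, 6.3, 6.0, 6.2, 6.4]
f_stat, f_p = stats.f_oneway(group_a, group_b, group_c)

print(f"t-test p={t_p:.4f}, chi-square p={chi_p:.4f}, ANOVA p={f_p:.4f}")
```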

Decision Making

The test statistic is then compared to a critical value determined by the significance level (α) and the sample size. The test also yields a p-value. If the p-value is less than α, you reject the null hypothesis in favor of the alternative hypothesis. Otherwise, you fail to reject the null hypothesis.

Interpretation

Finally, you interpret the results in the context of what you were investigating. Rejecting the null hypothesis might mean implementing a new process or strategy, while failing to reject it might lead to a continuation of current practices.

To sum it up, hypothesis testing is not just a set of formulas but a methodical approach to problem-solving and decision-making based on data. It’s a crucial tool for anyone interested in deriving meaningful insights from data to make informed decisions.

Why is Hypothesis Testing Important?

Hypothesis testing is a cornerstone of statistical and empirical research, serving multiple functions in various fields. Let’s delve into each of the key areas where hypothesis testing holds significant importance:

Data-Driven Decisions

In today’s complex business environment, making decisions based on gut feeling or intuition is not enough; you need data to back up your choices. Hypothesis testing serves as a rigorous methodology for making decisions based on data. By setting up a null hypothesis and an alternative hypothesis, you can use statistical methods to determine which is more likely to be true given a data sample. This structured approach eliminates guesswork and adds empirical weight to your decisions, thereby increasing their credibility and effectiveness.

Risk Management

Hypothesis testing allows you to assign a 'p-value' to your findings, which is essentially the probability of observing results at least as extreme as the given sample data if the null hypothesis is true. This p-value can be directly used to quantify risk. For instance, rejecting the null hypothesis at a p-value of 0.05 means accepting a 5% risk of rejecting it when it's actually true. This is invaluable in scenarios like product launches or changes in operational processes, where understanding the risk involved can be as crucial as the decision itself.

Here’s an example to help you understand the concept better.

The graph above serves as a graphical representation to help explain the concept of a ‘p-value’ and its role in quantifying risk in hypothesis testing. Here’s how to interpret the graph:

Elements of the Graph

  • The curve represents a Standard Normal Distribution , which is often used to represent z-scores in hypothesis testing.
  • The red-shaded area on the right represents the Rejection Region. It corresponds to a 5% risk (α = 0.05) of rejecting the null hypothesis when it is actually true. This is the area where, if your test statistic falls, you would reject the null hypothesis.
  • The green-shaded area represents the Acceptance Region , with a 95% level of confidence. If your test statistic falls in this region, you would fail to reject the null hypothesis.
  • The blue dashed line is the Critical Value (approximately 1.645 in this example). If your standardized test statistic (z-value) exceeds this point, you enter the rejection region, and your p-value becomes less than 0.05, leading you to reject the null hypothesis.

Relating to Risk Management

The p-value can be directly related to risk management. For example, if you’re considering implementing a new manufacturing process, the p-value quantifies the risk of that decision. A low p-value (less than α ) would mean that the risk of rejecting the null hypothesis (i.e., going ahead with the new process) when it’s actually true is low, thus indicating a lower risk in implementing the change.
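A minimal numeric sketch of this one-sided setup, in which the z-statistic is a hypothetical value standing in for a real test result:

```python
from scipy import stats

alpha = 0.05
# One-sided critical value for the standard normal distribution.
critical = stats.norm.ppf(1 - alpha)   # about 1.645
z = 2.1                                # hypothetical standardized test statistic
p_value = stats.norm.sf(z)             # one-sided p-value (upper tail)

print(f"critical value: {critical:.3f}, p = {p_value:.4f}")
if p_value < alpha:
    print("Reject H0: the quantified risk of a false positive is below 5%")
```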

Quality Control

In sectors like manufacturing, automotive, and logistics, maintaining a high level of quality is not just an option but a necessity. Hypothesis testing is often employed in quality assurance and control processes to test whether a certain process or product conforms to standards. For example, if a car manufacturing line claims its error rate is below 5%, hypothesis testing can confirm or disprove this claim based on a sample of products. This ensures that quality is not compromised and that stakeholders can trust the end product.

Resource Optimization

Resource allocation is a significant challenge for any organization. Hypothesis testing can be a valuable tool in determining where resources will be most effectively utilized. For instance, in a manufacturing setting, you might want to test whether a new piece of machinery significantly increases production speed. A hypothesis test could provide the statistical evidence needed to decide whether investing in more of such machinery would be a wise use of resources.

In the realm of research and development, hypothesis testing can be a game-changer. When developing a new product or process, you’ll likely have various theories or hypotheses. Hypothesis testing allows you to systematically test these, filtering out the less likely options and focusing on the most promising ones. This not only speeds up the innovation process but also makes it more cost-effective by reducing the likelihood of investing in ideas that are statistically unlikely to be successful.

In summary, hypothesis testing is a versatile tool that adds rigor, reduces risk, and enhances the decision-making and innovation processes across various sectors and functions.

This graphical representation makes it easier to grasp how the p-value is used to quantify the risk involved in making a decision based on a hypothesis test.

Step-by-Step Guide to Hypothesis Testing

To make this guide practical for newcomers, we will explain each step of the process and then apply it to a worked example: a manufacturing line where you want to test whether a new process reduces the average time it takes to assemble a product.

Step 1: State the Hypotheses

The first and foremost step in hypothesis testing is to clearly define your hypotheses. This sets the stage for your entire test and guides the subsequent steps, from data collection to decision-making. At this stage, you formulate two competing hypotheses:

Null Hypothesis (H0)

The null hypothesis is a statement that there is no effect or no difference, and it serves as the hypothesis that you are trying to test against. It’s the default assumption that any kind of effect or difference you suspect is not real, and is due to chance. Formulating a clear null hypothesis is crucial, as your statistical tests will be aimed at challenging this hypothesis.

In a manufacturing context, if you’re testing whether a new assembly line process has reduced the time it takes to produce an item, your null hypothesis (H0) could be:

H0: “The new process does not reduce the average assembly time.”

Alternative Hypothesis (Ha or H1)

The alternative hypothesis is what you want to prove. It is a statement that there is an effect or difference. This hypothesis is considered only after you find enough evidence against the null hypothesis.

Continuing with the manufacturing example, the alternative hypothesis (Ha) could be:

Ha: “The new process reduces the average assembly time.”

Types of Alternative Hypothesis

Depending on what exactly you are trying to prove, the alternative hypothesis can be:

  • Two-Sided : You’re interested in deviations in either direction (greater or smaller).
  • One-Sided : You’re interested in deviations only in one direction (either greater or smaller).

Scenario: Reducing Assembly Time in a Car Manufacturing Plant

You are a continuous improvement manager at a car manufacturing plant. One of the assembly lines has been struggling with longer assembly times, affecting the overall production schedule. A new assembly process has been proposed, promising to reduce the assembly time per car. Before rolling it out on the entire line, you decide to conduct a hypothesis test to see if the new process actually makes a difference.

Null Hypothesis (H0): In this context, the null hypothesis is the status quo, asserting that the new assembly process doesn’t reduce the assembly time per car. Mathematically, you could state it as:

H0: The average assembly time per car with the new process ≥ the average assembly time per car with the old process.

Or simply: “The new process does not reduce the average assembly time per car.”

Alternative Hypothesis (Ha or H1): The alternative hypothesis is what you aim to prove: that the new process is more efficient. Mathematically:

Ha: The average assembly time per car with the new process < the average assembly time per car with the old process.

Or simply: “The new process reduces the average assembly time per car.”

Type of Alternative Hypothesis: In this example, you’re only interested in knowing if the new process reduces the time, making it a one-sided alternative hypothesis.
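The assembly-time scenario can be sketched as a one-sided two-sample t-test in Python. The timing figures (minutes per car) are invented for illustration:

```python
from scipy import stats

# Hypothetical assembly times in minutes per car.
old_process = [54.2, 55.1, 53.8, 56.0, 54.7, 55.4, 54.9, 55.6]
new_process = [52.1, 51.8, 53.0, 52.4, 51.5, 52.8, 52.2, 51.9]

# H0: the new process does not reduce assembly time; Ha: it does (one-sided).
t_stat, p_value = stats.ttest_ind(new_process, old_process, alternative='less')
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("Reject H0: evidence the new process reduces assembly time")
```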

Step 2: Determine the Significance Level (α)

Once you’ve clearly stated your null and alternative hypotheses, the next step is to decide on the significance level, often denoted by α. The significance level is the threshold for the p-value below which the null hypothesis will be rejected. It quantifies the level of risk you’re willing to accept when making a decision based on the hypothesis test.

What is a Significance Level?

The significance level, usually expressed as a percentage, represents the probability of rejecting the null hypothesis when it is actually true. Common choices for α are 0.05, 0.01, and 0.10, representing 5%, 1%, and 10% levels of significance, respectively.

  • 5% Significance Level (α = 0.05): This is the most commonly used level and implies that you are willing to accept a 5% chance of rejecting the null hypothesis when it is true.
  • 1% Significance Level (α = 0.01): This is a more stringent level, used when you want to be more sure of your decision. The risk of falsely rejecting the null hypothesis is reduced to 1%.
  • 10% Significance Level (α = 0.10): This is a more lenient level, used when you are willing to take a higher risk. Here, the chance of falsely rejecting the null hypothesis is 10%.

Continuing with the manufacturing example, let’s say you decide to set α = 0.05, meaning you’re willing to take a 5% risk of concluding that the new process is effective when it might not be.

How to Choose the Right Significance Level?

Choosing the right significance level depends on the context and the consequences of making a wrong decision. Here are some factors to consider:

  • Criticality of Decision : For highly critical decisions with severe consequences if wrong, a lower α like 0.01 may be appropriate.
  • Resource Constraints : If the cost of collecting more data is high, you may choose a higher α to make a decision based on a smaller sample size.
  • Industry Standards : Sometimes, the choice of α may be dictated by industry norms or regulatory guidelines.

By the end of Step 2, you should have a well-defined significance level that will guide the rest of your hypothesis testing process. This level serves as the cut-off for determining whether the observed effect or difference in your sample is statistically significant or not.

Continuing the Scenario: Reducing Assembly Time in a Car Manufacturing Plant

After formulating the hypotheses, the next step is to set the significance level (α) that will be used to interpret the results of the hypothesis test. This is a critical decision because it quantifies the level of risk you're willing to accept when drawing a conclusion from the test.

Setting the Significance Level

Given that assembly time is a critical factor affecting the production schedule, and ultimately the company's bottom line, you decide to be fairly stringent in your test. You opt for a commonly used significance level:

α = 0.05

This means you are willing to accept a 5% chance of rejecting the null hypothesis when it is actually true. In practical terms, if you find that the p-value of the test is less than 0.05, you will conclude that the new process significantly reduces assembly time and consider implementing it across the entire line.

Why α = 0.05?

  • Industry Standard: A 5% significance level is widely accepted in many industries, including manufacturing, for hypothesis testing.
  • Risk Management: By setting α = 0.05, you limit the risk of concluding that the new process is effective when it may not be to just 5%.
  • Balanced Approach: This level offers a balance between being too lenient (e.g., α = 0.10) and too stringent (e.g., α = 0.01), making it a reasonable choice for this scenario.

Step 3: Collect and Prepare the Data

After stating your hypotheses and setting the significance level, the next vital step is data collection. The data you collect serves as the basis for your hypothesis test, so it’s essential to gather accurate and relevant data.

Types of Data

Depending on your hypothesis, you’ll need to collect either:

  • Quantitative Data : Numerical data that can be measured. Examples include height, weight, and temperature.
  • Qualitative Data : Categorical data that represent characteristics. Examples include colors, gender, and material types.

Data Collection Methods

Various methods can be used to collect data, such as:

  • Surveys and Questionnaires : Useful for collecting qualitative data and opinions.
  • Observation : Collecting data through direct or participant observation.
  • Experiments : Especially useful in scientific research where control over variables is possible.
  • Existing Data : Utilizing databases, records, or any other data previously collected.

Sample Size

The sample size ( n ) is another crucial factor. A larger sample size generally gives more accurate results, but it’s often constrained by resources like time and money. The choice of sample size might also depend on the statistical test you plan to use.

Continuing with the manufacturing example, suppose you decide to collect data on the assembly time of 30 randomly chosen products, 15 made using the old process and 15 made using the new process. Here, your sample size n =30.

Data Preparation

Once data is collected, it often needs to be cleaned and prepared for analysis. This could involve:

  • Removing Outliers : Outliers can skew the results and provide an inaccurate picture.
  • Data Transformation : Converting data into a format suitable for statistical analysis.
  • Data Coding : Categorizing or labeling data, necessary for qualitative data.

By the end of Step 3, you should have a dataset that is ready for statistical analysis. This dataset should be representative of the population you’re interested in and prepared in a way that makes it suitable for hypothesis testing.

With the hypotheses stated and the significance level set, you're now ready to collect the data that will serve as the foundation for your hypothesis test. Given that you're testing a change in a manufacturing process, the data will most likely be quantitative, representing the assembly time of cars produced on the line.

Data Collection Plan

You decide to use a Random Sampling Method for your data collection. For two weeks, assembly times for randomly selected cars will be recorded: one week using the old process and another week using the new process. Your aim is to collect data for 40 cars from each process, giving you a total sample size (n) of 80 cars.

Types of Data

  • Quantitative Data: In this case, you're collecting numerical data representing the assembly time in minutes for each car.

Data Preparation

  • Data Cleaning: Once the data is collected, inspect it for any anomalies or outliers that could skew your results. For example, if a significant machine breakdown happened during one of the weeks, you may need to adjust your data or collect more.
  • Data Transformation: Given that you're dealing with time, you may not need to transform your data, but it's worth considering, depending on the statistical test you plan to use.
  • Data Coding: Since the data is quantitative, coding is likely unnecessary unless you plan to categorize assembly times into bins (e.g., 'fast', 'medium', 'slow') for some reason.

Example Data Points:

Car_ID | Process_Type | Assembly_Time_Minutes
1      | Old          | 38.53
2      | Old          | 35.80
3      | Old          | 36.96
4      | Old          | 39.48
5      | Old          | 38.74
6      | Old          | 33.05
7      | Old          | 36.90
8      | Old          | 34.70
9      | Old          | 34.79
…      | …            | …

The complete dataset would contain 80 rows: 40 for the old process and 40 for the new process.
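To make the scenario concrete, a dataset like the one above can be simulated in Python with NumPy. The means and spread below are illustrative assumptions chosen to resemble the example values, not real plant data:

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # fixed seed so the sketch is reproducible

# Hypothetical assembly times in minutes: 40 cars per process.
# The means and standard deviation are assumptions for illustration only.
old_process = rng.normal(loc=36.0, scale=2.0, size=40)
new_process = rng.normal(loc=33.5, scale=2.0, size=40)

n_total = old_process.size + new_process.size  # 80 cars in total
```

In practice you would, of course, load the recorded times from your CSV file (e.g., with `np.loadtxt` or pandas) rather than simulate them.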

Step 4: Conduct the Statistical Test

After you have your hypotheses, significance level, and collected data, the next step is to actually perform the statistical test. This step involves calculations that will lead to a test statistic, which you’ll then use to make your decision regarding the null hypothesis.

Choose the Right Test

The first task is to decide which statistical test to use. The choice depends on several factors:

  • Type of Data : Quantitative or Qualitative
  • Sample Size : Large or Small
  • Number of Groups or Categories : One-sample, Two-sample, or Multiple groups

For instance, you might choose a t-test for comparing means of two groups when you have a small sample size. Chi-square tests are often used for categorical data, and ANOVA is used for comparing means across more than two groups.
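As a rough sketch of how these choices map to code, `scipy.stats` provides each of the tests just mentioned; the data here are made up purely for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
group_a = rng.normal(50, 5, size=20)
group_b = rng.normal(52, 5, size=20)
group_c = rng.normal(55, 5, size=20)

# t-test: compare the means of two groups.
t_stat, t_p = stats.ttest_ind(group_a, group_b)

# Chi-square test of independence: categorical counts in a contingency table.
table = np.array([[30, 10],
                  [20, 25]])
chi2, chi_p, dof, expected = stats.chi2_contingency(table)

# One-way ANOVA: compare means across more than two groups.
f_stat, anova_p = stats.f_oneway(group_a, group_b, group_c)
```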

Calculation of Test Statistic

Once you’ve chosen the appropriate statistical test, the next step is to calculate the test statistic. This involves using the sample data in a specific formula for the chosen test.

Obtain the p-value

After calculating the test statistic, the next step is to find the p-value associated with it. The p-value represents the probability of observing a test statistic at least as extreme as the one calculated, assuming the null hypothesis is true.

  • A small p-value (< α ) indicates strong evidence against the null hypothesis, so you reject the null hypothesis.
  • A large p-value (> α ) indicates weak evidence against the null hypothesis, so you fail to reject the null hypothesis.

Make the Decision

You now compare the p-value to the predetermined significance level ( α ):

  • If p < α , you reject the null hypothesis in favor of the alternative hypothesis.
  • If p > α , you fail to reject the null hypothesis.

In the manufacturing case, if your calculated p-value is 0.03 and your α is 0.05, you would reject the null hypothesis, concluding that the new process effectively reduces the average assembly time.

By the end of Step 4, you will have either rejected or failed to reject the null hypothesis, providing a statistical basis for your decision-making process.

Now that you have collected and prepared your data, the next step is to conduct the actual statistical test to evaluate the null and alternative hypotheses. In this case, you'll be comparing the mean assembly times between cars produced using the old and new processes to determine whether the new process is statistically significantly faster.

Choosing the Right Test

Given that you have two sets of independent samples (old process and new process), a Two-sample t-test for Equality of Means is appropriate for comparing the average assembly times.

Preparing Data for Minitab

First, prepare your data in an Excel sheet or CSV file with one column for the assembly times using the old process and another column for the assembly times using the new process. Import this file into Minitab.

Steps to Perform the Two-sample t-test in Minitab

1. Open Minitab: Launch the Minitab software on your computer.
2. Import Data: Navigate to File > Open and import your data file.
3. Navigate to the t-test Menu: Go to Stat > Basic Statistics > 2-Sample t...
4. Select Columns: In the dialog box, specify the columns corresponding to the old and new process assembly times under "Sample 1" and "Sample 2."
5. Options: Click Options and set the confidence level to 95% (which corresponds to α = 0.05).
6. Run the Test: Click OK to run the test.

In this example output, the p-value is 0.0012, which is less than the significance level α = 0.05, so you reject the null hypothesis. The t-statistic is -3.45, indicating that the mean of the new process is statistically significantly less than the mean of the old process, which aligns with your alternative hypothesis. A box plot of the data makes it easy to see that the new process is statistically significantly better.
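If you prefer code to a GUI, the same kind of one-sided two-sample t-test can be run in Python with `scipy`. The data below are simulated stand-ins for the 40 + 40 recorded assembly times, so the exact statistics will differ from the Minitab output quoted above (the `alternative` keyword requires SciPy 1.6 or newer):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
old = rng.normal(loc=36.0, scale=2.0, size=40)  # hypothetical old-process times (min)
new = rng.normal(loc=34.5, scale=2.0, size=40)  # hypothetical new-process times (min)

# One-sided Welch t-test: Ha says the new-process mean is LESS than the old one.
t_stat, p_value = stats.ttest_ind(new, old, equal_var=False, alternative='less')

alpha = 0.05
if p_value < alpha:
    print(f"p = {p_value:.4f} < {alpha}: reject H0, the new process is faster")
else:
    print(f"p = {p_value:.4f} >= {alpha}: fail to reject H0")
```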

Why do a Hypothesis test?

You might ask: after all this, why do a hypothesis test at all instead of just comparing the averages? It's a good question. While looking at average times might give you a general idea of which process is faster, hypothesis testing provides several advantages that a simple comparison of averages doesn't offer:

Statistical Significance

Account for Random Variability : Hypothesis testing considers not just the averages, but also the variability within each group. This allows you to make more robust conclusions that account for random chance.

Quantify the Evidence : With hypothesis testing, you obtain a p-value that quantifies the strength of the evidence against the null hypothesis. A simple comparison of averages doesn’t provide this level of detail.

Control Type I Error : Hypothesis testing allows you to control the probability of making a Type I error (i.e., rejecting a true null hypothesis). This is particularly useful in settings where the consequences of such an error could be costly or risky.

Quantify Risk : Hypothesis testing provides a structured way to make decisions based on a predefined level of risk (the significance level, α ).

Decision-making Confidence

Objective Decision Making : The formal structure of hypothesis testing provides an objective framework for decision-making. This is especially useful in a business setting where decisions often have to be justified to stakeholders.

Replicability : The statistical rigor ensures that the results are replicable. Another team could perform the same test and expect to get similar results, which is not necessarily the case when comparing only averages.

Additional Insights

Understanding of Variability : Hypothesis testing often involves looking at measures of spread and distribution, not just the mean. This can offer additional insights into the processes you’re comparing.

Basis for Further Analysis : Once you’ve performed a hypothesis test, you can often follow it up with other analyses (like confidence intervals for the difference in means, or effect size calculations) that offer more detailed information.

In summary, while comparing averages is quicker and simpler, hypothesis testing provides a more reliable, nuanced, and objective basis for making data-driven decisions.

Step 5: Interpret the Results and Make Conclusions

Having conducted the statistical test and obtained the p-value, you’re now at a stage where you can interpret these results in the context of the problem you’re investigating. This step is crucial for transforming the statistical findings into actionable insights.

Interpret the p-value

The p-value you obtained tells you the significance of your results:

  • Low p-value ( p < α ) : Indicates that the results are statistically significant, and it’s unlikely that the observed effects are due to random chance. In this case, you generally reject the null hypothesis.
  • High p-value ( p > α ) : Indicates that the results are not statistically significant, and the observed effects could well be due to random chance. Here, you generally fail to reject the null hypothesis.

Relate to Real-world Context

You should then relate these statistical conclusions to the real-world context of your problem. This is where your expertise in your specific field comes into play.

In our manufacturing example, if you’ve found a statistically significant reduction in assembly time with a p-value of 0.03 (which is less than the α level of 0.05), you can confidently conclude that the new manufacturing process is more efficient. You might then consider implementing this new process across the entire assembly line.

Make Recommendations

Based on your conclusions, you can make recommendations for action or further study. For example:

  • Implement Changes : If the test results are significant, consider making the changes on a larger scale.
  • Further Research : If the test results are not clear or not significant, you may recommend further studies or data collection.
  • Review Methodology : If you find that the results are not as expected, it might be useful to review the methodology and see if the test was conducted under the right conditions and with the right test parameters.

Document the Findings

Lastly, it’s essential to document all the steps taken, the methodology used, the data collected, and the conclusions drawn. This documentation is not only useful for any further studies but also for auditing purposes or for stakeholders who may need to understand the process and the findings.

By the end of Step 5, you’ll have turned the raw statistical findings into meaningful conclusions and actionable insights. This is the final step in the hypothesis testing process, making it a complete, robust method for informed decision-making.

You've successfully conducted the hypothesis test and found strong evidence to reject the null hypothesis in favor of the alternative: the new assembly process is statistically significantly faster than the old one. It's now time to interpret these results in the context of your business operations and make actionable recommendations.

Interpretation of Results

  • Statistical Significance: The p-value of 0.0012 is well below the significance level of α = 0.05, indicating that the results are statistically significant.
  • Practical Significance: The box plot and t-statistic (-3.45) suggest not just statistical but also practical significance. The new process appears to be both consistently and substantially faster.
  • Risk Assessment: The low p-value allows you to reject the null hypothesis with a high degree of confidence, meaning the risk of making a Type I error is minimal.

Business Implications

  • Increased Productivity: Implementing the new process could increase the number of cars produced, thereby enhancing productivity.
  • Cost Savings: Faster assembly time likely translates to lower labor costs.
  • Quality Control: Consider monitoring the quality of cars produced under the new process closely to ensure that the speedier assembly does not compromise quality.

Recommendations

  • Implement New Process: Given the statistical and practical significance of the findings, recommend implementing the new process across the entire assembly line.
  • Monitor and Adjust: Implement a control phase where the new process is monitored for both speed and quality. This could involve additional hypothesis tests or control charts.
  • Communicate Findings: Share the results and recommendations with stakeholders through a formal presentation or report, emphasizing both the statistical rigor and the potential business benefits.
  • Review Resource Allocation: Given the likely increase in productivity, assess whether resources like labor and parts need to be reallocated to optimize the workflow further.

By following this step-by-step guide, you’ve journeyed through the rigorous yet enlightening process of hypothesis testing. From stating clear hypotheses to interpreting the results, each step has paved the way for making informed, data-driven decisions that can significantly impact your projects, business, or research.

Hypothesis testing is more than just a set of formulas or calculations; it’s a holistic approach to problem-solving that incorporates context, statistics, and strategic decision-making. While the process may seem daunting at first, each step serves a crucial role in ensuring that your conclusions are both statistically sound and practically relevant.


Q: What is hypothesis testing in the context of Lean Six Sigma?

A: Hypothesis testing is a statistical method used in Lean Six Sigma to determine whether there is enough evidence in a sample of data to infer that a certain condition holds true for the entire population. In the Lean Six Sigma process, it's commonly used to validate the effectiveness of process improvements by comparing performance metrics before and after changes are implemented. A null hypothesis (H0) usually represents no change or effect, while the alternative hypothesis (H1) indicates a significant change or effect.

Q: How do I determine which statistical test to use for my hypothesis?

A: The choice of statistical test for hypothesis testing depends on several factors, including the type of data (nominal, ordinal, interval, or ratio), the sample size, the number of samples (one sample, two samples, paired), and whether the data distribution is normal. For example, a t-test is used for comparing the means of two groups when the data is normally distributed, while a Chi-square test is suitable for categorical data to test the relationship between two variables. It’s important to choose the right test to ensure the validity of your hypothesis testing results.

Q: What is a p-value, and how does it relate to hypothesis testing?

A: A p-value is a probability value that helps you determine the significance of your results in hypothesis testing. It represents the likelihood of obtaining a result at least as extreme as the one observed during the test, assuming that the null hypothesis is true. In hypothesis testing, if the p-value is lower than the predetermined significance level (commonly α = 0.05 ), you reject the null hypothesis, suggesting that the observed effect is statistically significant. If the p-value is higher, you fail to reject the null hypothesis, indicating that there is not enough evidence to support the alternative hypothesis.

Q: Can you explain Type I and Type II errors in hypothesis testing?

A: Type I and Type II errors are potential errors that can occur in hypothesis testing. A Type I error, also known as a "false positive," occurs when the null hypothesis is true but is incorrectly rejected; it is equivalent to a false alarm. A Type II error, or "false negative," happens when the null hypothesis is false but you fail to reject it, meaning a real effect or difference was missed. The risk of a Type I error is represented by the significance level (α), while the risk of a Type II error is denoted by β. Minimizing these errors is crucial for the reliability of hypothesis tests in continuous improvement projects.

Daniel Croft is a seasoned continuous improvement manager with a Black Belt in Lean Six Sigma. With over 10 years of real-world application experience across diverse sectors, Daniel has a passion for optimizing processes and fostering a culture of efficiency. He's not just a practitioner but also an avid learner, constantly seeking to expand his knowledge. Outside of his professional life, Daniel has a keen interest in investing, statistics, and knowledge-sharing, which led him to create the website learnleansigma.com, a platform dedicated to Lean Six Sigma and process improvement insights.



Hypothesis Testing – A Deep Dive into Hypothesis Testing, The Backbone of Statistical Inference

  • September 21, 2023

Explore the intricacies of hypothesis testing, a cornerstone of statistical analysis. Dive into methods, interpretations, and applications for making data-driven decisions.


In this Blog post we will learn:

  • What is Hypothesis Testing?
  • Steps in Hypothesis Testing
    2.1. Set up Hypotheses: Null and Alternative
    2.2. Choose a Significance Level (α)
    2.3. Calculate a test statistic and P-Value
    2.4. Make a Decision
  • Example : Testing a new drug.
  • Example in python

1. What is Hypothesis Testing?

In simple terms, hypothesis testing is a method used to make decisions or inferences about population parameters based on sample data. Imagine being handed a dice and asked if it’s biased. By rolling it a few times and analyzing the outcomes, you’d be engaging in the essence of hypothesis testing.

Think of hypothesis testing as the scientific method of the statistics world. Suppose you hear claims like “This new drug works wonders!” or “Our new website design boosts sales.” How do you know if these statements hold water? Enter hypothesis testing.

2. Steps in Hypothesis Testing

  • Set up Hypotheses : Begin with a null hypothesis (H0) and an alternative hypothesis (Ha).
  • Choose a Significance Level (α) : Typically 0.05, this is the probability of rejecting the null hypothesis when it’s actually true. Think of it as the chance of accusing an innocent person.
  • Calculate Test statistic and P-Value : Gather evidence (data) and calculate a test statistic.
  • p-value : This is the probability of observing the data, given that the null hypothesis is true. A small p-value (typically ≤ 0.05) suggests the data is inconsistent with the null hypothesis.
  • Decision Rule : If the p-value is less than or equal to α, you reject the null hypothesis in favor of the alternative.

2.1. Set up Hypotheses: Null and Alternative

Before diving into testing, we must formulate hypotheses. The null hypothesis (H0) represents the default assumption, while the alternative hypothesis (H1) challenges it.

For instance, in drug testing, H0: "The new drug is no better than the existing one," and H1: "The new drug is superior."

2.2. Choose a Significance Level (α)

You collect and analyze data to test the H0 and H1 hypotheses. Based on your analysis, you decide whether to reject the null hypothesis in favor of the alternative, or fail to reject it.

The significance level, often denoted by α, represents the probability of rejecting the null hypothesis when it is actually true.

In other words, it’s the risk you’re willing to take of making a Type I error (false positive).

Type I Error (False Positive) :

  • Symbolized by the Greek letter alpha (α).
  • Occurs when you incorrectly reject a true null hypothesis . In other words, you conclude that there is an effect or difference when, in reality, there isn’t.
  • The probability of making a Type I error is denoted by the significance level of a test. Commonly, tests are conducted at the 0.05 significance level , which means there’s a 5% chance of making a Type I error .
  • Commonly used significance levels are 0.01, 0.05, and 0.10, but the choice depends on the context of the study and the level of risk one is willing to accept.

Example : If a drug is not effective (truth), but a clinical trial incorrectly concludes that it is effective (based on the sample data), then a Type I error has occurred.

Type II Error (False Negative) :

  • Symbolized by the Greek letter beta (β).
  • Occurs when you fail to reject a false null hypothesis . This means you conclude there is no effect or difference when, in reality, there is.
  • The probability of making a Type II error is denoted by β. The power of a test (1 – β) represents the probability of correctly rejecting a false null hypothesis.

Example : If a drug is effective (truth), but a clinical trial incorrectly concludes that it is not effective (based on the sample data), then a Type II error has occurred.

Balancing the Errors :


In practice, there’s a trade-off between Type I and Type II errors. Reducing the risk of one typically increases the risk of the other. For example, if you want to decrease the probability of a Type I error (by setting a lower significance level), you might increase the probability of a Type II error unless you compensate by collecting more data or making other adjustments.

It’s essential to understand the consequences of both types of errors in any given context. In some situations, a Type I error might be more severe, while in others, a Type II error might be of greater concern. This understanding guides researchers in designing their experiments and choosing appropriate significance levels.
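This trade-off can be made concrete with a small Monte-Carlo sketch (the sample size, effect size, and trial count below are arbitrary choices for illustration): at a fixed sample size, tightening α lowers the Type I error rate but also lowers power, i.e., raises the Type II error rate.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n_trials, n = 2000, 20  # simulated experiments, samples per group

def rejection_rate(true_effect, alpha):
    """Fraction of simulated experiments in which H0 (no difference) is rejected."""
    rejections = 0
    for _ in range(n_trials):
        a = rng.normal(0.0, 1.0, n)
        b = rng.normal(true_effect, 1.0, n)
        _, p = stats.ttest_ind(a, b)
        rejections += p < alpha
    return rejections / n_trials

type1_rate = rejection_rate(true_effect=0.0, alpha=0.05)    # hovers near 0.05
power_strict = rejection_rate(true_effect=0.8, alpha=0.01)  # stricter alpha...
power_loose = rejection_rate(true_effect=0.8, alpha=0.05)   # ...means lower power
```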

2.3. Calculate a test statistic and P-Value

Test statistic : A test statistic is a single number that helps us understand how far our sample data is from what we’d expect under a null hypothesis (a basic assumption we’re trying to test against). Generally, the larger the test statistic, the more evidence we have against our null hypothesis. It helps us decide whether the differences we observe in our data are due to random chance or if there’s an actual effect.

P-value : The P-value tells us how likely we would get our observed results (or something more extreme) if the null hypothesis were true. It's a value between 0 and 1.

  • A smaller P-value (typically below 0.05) means the observation is rare under the null hypothesis, so we might reject the null hypothesis.
  • A larger P-value suggests that what we observed could easily happen by random chance, so we might not reject the null hypothesis.

2.4. Make a Decision

Relationship between α and the P-Value

When conducting a hypothesis test, we first choose a significance level α, then calculate the p-value from the sample data and the test statistic. Finally, we compare the p-value to our chosen α:

  • If p-value ≤ α: We reject the null hypothesis in favor of the alternative hypothesis. The result is said to be statistically significant.
  • If p-value > α: We fail to reject the null hypothesis. There isn't enough statistical evidence to support the alternative hypothesis.
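In code, this final comparison is a one-line check; the α and p-value below are just illustrative numbers:

```python
alpha = 0.05
p_value = 0.03  # illustrative p-value from some test

if p_value <= alpha:
    decision = "reject H0"          # statistically significant
else:
    decision = "fail to reject H0"  # not enough evidence

print(decision)  # prints "reject H0"
```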

3. Example : Testing a new drug.

Imagine we are investigating whether a new drug is effective at treating headaches faster than a placebo.

Setting Up the Experiment : You gather 100 people who suffer from headaches. Half of them (50 people) are given the new drug (let's call this the 'Drug Group'), and the other half are given a sugar pill containing no medication (the 'Placebo Group').

  • Set up Hypotheses : Before starting, you make a prediction:
  • Null Hypothesis (H0): The new drug has no effect. Any difference in healing time between the two groups is just due to random chance.
  • Alternative Hypothesis (H1): The new drug does have an effect. The difference in healing time between the two groups is significant and not just by chance.

Calculate Test statistic and P-Value : After the experiment, you analyze the data. The “test statistic” is a number that helps you understand the difference between the two groups in terms of standard units.

For instance, let’s say:

  • The average healing time in the Drug Group is 2 hours.
  • The average healing time in the Placebo Group is 3 hours.

The test statistic helps you understand how significant this 1-hour difference is. If the groups are large and the spread of healing times in each group is small, then this difference might be significant. But if there’s a huge variation in healing times, the 1-hour difference might not be so special.

Imagine the P-value as answering this question: “If the new drug had NO real effect, what’s the probability that I’d see a difference as extreme (or more extreme) as the one I found, just by random chance?”

For instance:

  • P-value of 0.01 means there’s a 1% chance that the observed difference (or a more extreme difference) would occur if the drug had no effect. That’s pretty rare, so we might consider the drug effective.
  • P-value of 0.5 means there’s a 50% chance you’d see this difference just by chance. That’s pretty high, so we might not be convinced the drug is doing much.
  • If the P-value is less than α (0.05): the results are "statistically significant," and we might reject the null hypothesis, believing the new drug has an effect.
  • If the P-value is greater than α (0.05): the results are not statistically significant, and we don't reject the null hypothesis, remaining unsure whether the drug has a genuine effect.

4. Example in Python

For simplicity, let’s say we’re using a t-test (common for comparing means). Let’s dive into Python:
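A minimal sketch of such a t-test, using simulated healing times: the group means (2 h vs. 3 h) follow the example above, while the spread (0.8 h), normality, and group size of 50 per arm are assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Simulated healing times in hours, 50 people per group; the means follow
# the example above, the standard deviation (0.8 h) is an assumption.
drug_group = rng.normal(loc=2.0, scale=0.8, size=50)
placebo_group = rng.normal(loc=3.0, scale=0.8, size=50)

# Two-sample t-test comparing the group means.
t_stat, p_value = stats.ttest_ind(drug_group, placebo_group)

alpha = 0.05
if p_value < alpha:
    print("The results are statistically significant! The drug seems to have an effect!")
else:
    print("Looks like the drug isn't as miraculous as we thought.")
```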

Making a Decision : If the p-value is less than 0.05, we'd say, "The results are statistically significant! The drug seems to have an effect!" If not, we'd say, "Looks like the drug isn't as miraculous as we thought."

5. Conclusion

Hypothesis testing is an indispensable tool in data science, allowing us to make data-driven decisions with confidence. By understanding its principles, conducting tests properly, and considering real-world applications, you can harness the power of hypothesis testing to unlock valuable insights from your data.



© Machinelearningplus. All rights reserved.


LEARN STATISTICS EASILY


A Comprehensive Guide to Hypothesis Tests in Statistics

You will learn the essentials of hypothesis tests, from fundamental concepts to practical applications in statistics.

  • Null and alternative hypotheses guide hypothesis tests.
  • Significance level and p-value aid decision-making.
  • Parametric tests assume specific probability distributions.
  • Non-parametric tests offer flexible assumptions.
  • Confidence intervals provide estimate precision.  

Introduction to Hypothesis Tests

Hypothesis testing is a statistical tool used to make decisions based on data.

It involves stating an assumption about a population parameter and testing that assumption using a sample drawn from the population.

Hypothesis tests help us draw conclusions and make informed decisions in various fields like business, research, and science.

Null and Alternative Hypotheses

The null hypothesis (H0) is an initial claim about a population parameter, typically representing no effect or no difference.

The alternative hypothesis (H1) opposes the null hypothesis, suggesting an effect or difference.

Hypothesis tests aim to determine whether there is sufficient evidence to reject the null hypothesis in favor of the alternative hypothesis.

Significance Levels and P-values

The significance level (α), often set at 0.05 or 5%, serves as a threshold for determining if we should reject the null hypothesis.

A p-value, calculated during hypothesis testing, represents the probability of observing a test statistic at least as extreme as the one obtained, assuming the null hypothesis is true.

If the p-value is less than the significance level, we reject the null hypothesis, indicating that the data are more consistent with the alternative hypothesis.

Parametric and Non-Parametric Tests

Parametric tests assume the data follows a specific probability distribution, usually the normal distribution. Examples include the Student’s t-test.

Non-parametric tests do not require such assumptions and are helpful when dealing with data that do not meet the assumptions of parametric tests. Examples include the Mann-Whitney U test.
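As a sketch of the difference, the same two-group comparison can be run with both a parametric test (Student's t-test) and a non-parametric alternative (Mann-Whitney U); the skewed data below are simulated for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
group_a = rng.exponential(scale=1.0, size=40)  # skewed, clearly non-normal data
group_b = rng.exponential(scale=2.0, size=40)

t_res = stats.ttest_ind(group_a, group_b)     # parametric: assumes normality
u_res = stats.mannwhitneyu(group_a, group_b)  # non-parametric: rank-based

print(f"t-test p = {t_res.pvalue:.4f}, Mann-Whitney U p = {u_res.pvalue:.4f}")
```

With data this skewed, the rank-based test's conclusion is generally more trustworthy than the t-test's.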


Commonly Used Hypothesis Tests

Independent samples t-test:  This analysis compares the means of two independent groups.

Paired samples t-test:  Compares the means of two related groups (e.g., before and after treatment).

Chi-squared test:  Determines whether there is a significant association between two categorical variables in a contingency table.

Analysis of Variance (ANOVA):  Compares the means of three or more independent groups to determine whether significant differences exist.

Pearson’s Correlation Coefficient (Pearson’s r):  Quantifies the strength and direction of a linear association between two continuous variables.

Simple Linear Regression:  Evaluates whether a significant linear relationship exists between a predictor variable (X) and a continuous outcome variable (y).

Logistic Regression:  Determines the relationship between one or more predictor variables (continuous or categorical) and a binary outcome variable (e.g., success or failure).

Levene’s Test:  Tests the equality of variances between two or more groups, often used as an assumption check for ANOVA.

Shapiro-Wilk Test:  Assesses the null hypothesis that a data sample is drawn from a population with a normal distribution.
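Several of the tests listed above are available in scipy.stats; a brief tour on made-up data might look like this:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(50, 5, size=30)
y = 0.8 * x + rng.normal(0, 3, size=30)

shapiro_stat, shapiro_p = stats.shapiro(x)  # H0: x is drawn from a normal distribution
levene_stat, levene_p = stats.levene(x, y)  # H0: x and y have equal variances
r, r_p = stats.pearsonr(x, y)               # linear association between x and y

# Chi-squared test of independence on a made-up 2x2 contingency table
table = np.array([[30, 10],
                  [20, 25]])
chi2, chi2_p, dof, expected = stats.chi2_contingency(table)

print(f"Shapiro p={shapiro_p:.3f}, Levene p={levene_p:.3f}, "
      f"Pearson r={r:.2f} (p={r_p:.4f}), chi-squared p={chi2_p:.4f}")
```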

Interpreting the Results of Hypothesis Tests

To interpret the hypothesis test results, compare the p-value to the chosen significance level.

If the p-value falls below the significance level, reject the null hypothesis and infer that a notable effect or difference exists.

Otherwise, fail to reject the null hypothesis, meaning there is insufficient evidence to support the alternative hypothesis.

Other Relevant Information

In addition to understanding the basics of hypothesis tests, it’s crucial to consider other relevant information when interpreting the results.

For example, factors such as effect size, statistical power, and confidence intervals can provide valuable insights and help you make more informed decisions.

Effect size

The effect size is a quantitative measure of the strength or magnitude of the observed relationship or effect between variables. It helps evaluate the practical significance of the results: a statistically significant outcome may not imply practical relevance, while, conversely, a substantial effect size can suggest meaningful findings even when statistical significance appears marginal.
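As a sketch, Cohen's d, a common effect-size measure for two independent groups, can be computed from summary statistics; all numbers below are invented:

```python
import math

def cohens_d(mean1, mean2, sd1, sd2, n1, n2):
    """Cohen's d using the pooled standard deviation."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)

# Invented summary statistics for two groups of 40:
d = cohens_d(mean1=75, mean2=70, sd1=10, sd2=10, n1=40, n2=40)
print(f"Cohen's d = {d:.2f}")  # 0.50: a "medium" effect by common rules of thumb
```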

Statistical power

The power of a test is the probability of correctly rejecting the null hypothesis when it is false. In other words, it's the likelihood that the test will detect an effect when one exists. Factors affecting the power of a test include the sample size, effect size, and significance level. Higher power reduces the likelihood of making a Type II error: failing to reject the null hypothesis when it ought to be rejected.
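Power can also be estimated by simulation. The sketch below repeatedly draws two samples with a known true difference and counts how often a t-test rejects H0; the effect of 5 units, sd of 10, and n = 30 per group are illustrative choices:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
alpha, n, n_sims = 0.05, 30, 2000
true_diff, sd = 5, 10  # illustrative effect and spread

rejections = 0
for _ in range(n_sims):
    control = rng.normal(100, sd, n)
    treated = rng.normal(100 + true_diff, sd, n)
    if stats.ttest_ind(control, treated).pvalue < alpha:
        rejections += 1

power = rejections / n_sims
print(f"estimated power ~ {power:.2f}")  # analytic power is roughly 0.48 here
```

Rerunning with larger n or a larger true difference shows power rising toward 1, which is why underpowered studies so often miss real effects.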

Confidence intervals

A confidence interval represents a range where the true population parameter is expected to be found with a specified confidence level (e.g., 95%). Confidence intervals provide additional context to hypothesis testing, helping to assess the estimate’s precision and offering a better understanding of the uncertainty surrounding the results.
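A sketch of a 95% confidence interval for a mean using the t-distribution (the sample values are made up):

```python
import numpy as np
from scipy import stats

sample = np.array([9.2, 9.8, 10.1, 9.5, 10.4, 9.9, 10.0, 9.6, 10.2, 9.7])
mean = sample.mean()
sem = stats.sem(sample)  # standard error of the mean

# 95% CI using the t-distribution with n - 1 degrees of freedom
ci_low, ci_high = stats.t.interval(0.95, df=len(sample) - 1, loc=mean, scale=sem)
print(f"mean = {mean:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```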

By considering these additional aspects when interpreting the results of hypothesis tests, you can gain a more comprehensive understanding of the data and make more informed conclusions.

Hypothesis testing is an indispensable statistical tool for drawing meaningful inferences and making informed data-based decisions.

By comprehending the essential concepts such as null and alternative hypotheses, significance levels, p-values, and the distinction between parametric and non-parametric tests, you can proficiently apply hypothesis testing to a wide range of real-world situations.

Additionally, understanding the importance of effect sizes, statistical power, and confidence intervals will enhance your ability to interpret the results and make better decisions.

With many applications across various fields, including medicine, psychology, business, and environmental sciences, hypothesis testing is a versatile and valuable method for research and data analysis.

A comprehensive grasp of hypothesis testing techniques will enable professionals and researchers to strengthen their decision-making processes, optimize strategies, and deepen their understanding of the relationships between variables, leading to more impactful results and discoveries.




S.3 Hypothesis Testing

In reviewing hypothesis tests, we start first with the general idea. Then, we keep returning to the basic procedures of hypothesis testing, each time adding a little more detail.

The general idea of hypothesis testing involves:

  • Making an initial assumption.
  • Collecting evidence (data).
  • Based on the available evidence (data), deciding whether to reject or not reject the initial assumption.

Every hypothesis test — regardless of the population parameter involved — requires the above three steps.

Example S.3.1

Is Normal Body Temperature Really 98.6 Degrees F?

Consider the population of many, many adults. A researcher hypothesized that the average adult body temperature is lower than the often-advertised 98.6 degrees F. That is, the researcher wants an answer to the question: "Is the average adult body temperature 98.6 degrees? Or is it lower?" To answer his research question, the researcher starts by assuming that the average adult body temperature is 98.6 degrees F.

Then, the researcher went out and tried to find evidence that refutes his initial assumption. In doing so, he selects a random sample of 130 adults. The average body temperature of the 130 sampled adults is 98.25 degrees.

Then, the researcher uses the data he collected to make a decision about his initial assumption. It is either likely or unlikely that the researcher would collect the evidence he did given his initial assumption that the average adult body temperature is 98.6 degrees:

  • If it is likely , then the researcher does not reject his initial assumption that the average adult body temperature is 98.6 degrees. There is not enough evidence to do otherwise.
  • If it is unlikely , then:
      • either the researcher's initial assumption is correct and he experienced a very unusual event;
      • or the researcher's initial assumption is incorrect.

In statistics, we generally don't make claims that require us to believe that a very unusual event happened. That is, in the practice of statistics, if the evidence (data) we collected is unlikely in light of the initial assumption, then we reject our initial assumption.
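The body-temperature example can be sketched as a one-sample t-test. The text does not give the sample standard deviation, so the data below are simulated with an assumed sd of 0.73 degrees F purely for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
temps = rng.normal(loc=98.25, scale=0.73, size=130)  # sd of 0.73 is an assumption

# H0: mu = 98.6 vs. HA: mu < 98.6 (one-sided, matching the research question)
t_stat, p_value = stats.ttest_1samp(temps, popmean=98.6, alternative='less')
print(f"t = {t_stat:.2f}, one-sided p = {p_value:.4g}")
```

A p-value this small says the observed sample mean would be very unlikely if the true average were 98.6, so we reject the initial assumption.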

Example S.3.2

Criminal Trial Analogy

One place where you can consistently see the general idea of hypothesis testing in action is in criminal trials held in the United States. Our criminal justice system assumes "the defendant is innocent until proven guilty." That is, our initial assumption is that the defendant is innocent.

In the practice of statistics, we make our initial assumption when we state our two competing hypotheses -- the null hypothesis ( H 0 ) and the alternative hypothesis ( H A ). Here, our hypotheses are:

  • H 0 : Defendant is not guilty (innocent)
  • H A : Defendant is guilty

In statistics, we always assume the null hypothesis is true . That is, the null hypothesis is always our initial assumption.

The prosecution team then collects evidence — such as finger prints, blood spots, hair samples, carpet fibers, shoe prints, ransom notes, and handwriting samples — with the hopes of finding "sufficient evidence" to make the assumption of innocence refutable.

In statistics, the data are the evidence.

The jury then makes a decision based on the available evidence:

  • If the jury finds sufficient evidence — beyond a reasonable doubt — to make the assumption of innocence refutable, the jury rejects the null hypothesis and deems the defendant guilty. We behave as if the defendant is guilty.
  • If there is insufficient evidence, then the jury does not reject the null hypothesis . We behave as if the defendant is innocent.

In statistics, we always make one of two decisions. We either "reject the null hypothesis" or we "fail to reject the null hypothesis."

Errors in Hypothesis Testing

Did you notice the use of the phrase "behave as if" in the previous discussion? We "behave as if" the defendant is guilty; we do not "prove" that the defendant is guilty. And, we "behave as if" the defendant is innocent; we do not "prove" that the defendant is innocent.

This is a very important distinction! We make our decision based on evidence not on 100% guaranteed proof. Again:

  • If we reject the null hypothesis, we do not prove that the alternative hypothesis is true.
  • If we do not reject the null hypothesis, we do not prove that the null hypothesis is true.

We merely state that there is enough evidence to behave one way or the other. This is always true in statistics! Because of this, whatever the decision, there is always a chance that we made an error .

Let's review the two types of errors that can be made in criminal trials:

Table S.3.2 shows how this corresponds to the two types of errors in hypothesis testing.

Note that, in statistics, we call the two types of errors by two different names -- one is called a "Type I error," and the other is called a "Type II error." Here are the formal definitions of the two types of errors:

There is always a chance of making one of these errors. But, a good scientific study will minimize the chance of doing so!

Making the Decision

Recall that it is either likely or unlikely that we would observe the evidence we did given our initial assumption. If it is likely , we do not reject the null hypothesis. If it is unlikely , then we reject the null hypothesis in favor of the alternative hypothesis. Effectively, then, making the decision reduces to determining "likely" or "unlikely."

In statistics, there are two ways to determine whether the evidence is likely or unlikely given the initial assumption:

  • We could take the " critical value approach " (favored in many of the older textbooks).
  • Or, we could take the " P -value approach " (what is used most often in research, journal articles, and statistical software).

In the next two sections, we review the procedures behind each of these two approaches. To make our review concrete, let's imagine that \(\mu\) is the average grade point average of all American students who major in mathematics. We first review the critical value approach for conducting each of the following three hypothesis tests about the population mean \(\mu\):

  • \(H_0 : \mu = 3\) versus \(H_A : \mu > 3\)
  • \(H_0 : \mu = 3\) versus \(H_A : \mu < 3\)
  • \(H_0 : \mu = 3\) versus \(H_A : \mu \neq 3\)

In Practice

  • We would want to conduct the first hypothesis test if we were interested in concluding that the average grade point average of the group is more than 3.
  • We would want to conduct the second hypothesis test if we were interested in concluding that the average grade point average of the group is less than 3.
  • And, we would want to conduct the third hypothesis test if we were only interested in concluding that the average grade point average of the group differs from 3 (without caring whether it is more or less than 3).

Upon completing the review of the critical value approach, we review the P -value approach for conducting each of the above three hypothesis tests about the population mean \(\mu\). The procedures that we review here for both approaches easily extend to hypothesis tests about any other population parameter.
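The two approaches can be sketched side by side for the one-sided test H0: mu = 3 versus HA: mu > 3 (the sample numbers below are invented):

```python
import math
from scipy import stats

# Invented sample: n = 25 math majors, sample mean GPA 3.12, sample sd 0.30
n, xbar, s, mu0, alpha = 25, 3.12, 0.30, 3.0, 0.05

t_stat = (xbar - mu0) / (s / math.sqrt(n))  # observed t-statistic
t_crit = stats.t.ppf(1 - alpha, df=n - 1)   # critical value approach
p_value = stats.t.sf(t_stat, df=n - 1)      # P-value approach

print(f"t = {t_stat:.2f}, critical value = {t_crit:.2f}, p = {p_value:.4f}")
# The two approaches always agree: t > t_crit exactly when p < alpha.
```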

Making Decisions with Data: Understanding Hypothesis Testing & Statistical Significance


Robert A. Cooper; Making Decisions with Data: Understanding Hypothesis Testing & Statistical Significance. The American Biology Teacher 1 October 2019; 81 (8): 535–542. doi: https://doi.org/10.1525/abt.2019.81.8.535


Statistical methods are indispensable to the practice of science. But statistical hypothesis testing can seem daunting, with P -values, null hypotheses, and the concept of statistical significance. This article explains the concepts associated with statistical hypothesis testing using the story of “the lady tasting tea,” then walks the reader through an application of the independent-samples t -test using data from Peter and Rosemary Grant's investigations of Darwin's finches. Understanding how scientists use statistics is an important component of scientific literacy, and students should have opportunities to use statistical methods like this in their science classes.

Statistical methods are indispensable to the practice of science, and understanding science includes understanding the role statistics play in its practice. Students must be given opportunities to analyze data in their science classes, using statistical methods that are suited to the data and age-appropriate. Middle school science students should be able to construct and interpret graphs, understand variation, calculate a mean, and understand what standard deviation tells us about a distribution. High school and college biology students should be able to construct and interpret error bars, and perform and interpret statistical hypothesis tests like chi-square and the independent-samples t -test. Here, I explain the meaning of statistical significance and related terms associated with hypothesis testing, using an application of the independent-samples t -test as an example.

As we engage students with inquiry labs, situations arise where students must make decisions based on data. Statistics allow us to organize data for interpretation and deal with variation in the data. There are many sources of variation. Some variation, like the genotypic and phenotypic differences between organisms, is characteristic of the systems we study. But some of the variation we see is induced by data collection ( Wild, 2006 ). Figure 1 distinguishes the sources of induced variation from the real variation in which we are interested. Measurement error can arise from mistakes made by the person making the measurements or from limitations or flaws in the measuring devices. Other errors can occur during the collection and processing of data. For example, a number could be entered in the wrong column on a data table or spreadsheet. Finally, there is always sampling error. Sampling error results when a sample that is intended to represent the entire population does not adequately do so.

Figure 1. Sources of variation in data (Wild, 2006).


Being meticulous in your data collection and sampling methods may reduce or eliminate many of these sources of induced variation. For example, careful attention to detail can reduce or eliminate the chance of measurement errors or accidents occurring during data collection and processing. But measuring devices will always have limitations resulting in some degree of variation and uncertainty, however small. And unless we only deal with cases where populations are very small and we can measure every individual, there will always be some sampling error. Statistical methods help us filter out any real variation in sample data from the surrounding noise caused by induced variation so that we can learn something about our population of interest.

A common situation that arises in inquiry labs involves determining whether the difference between two groups, for example a treatment and a control group, is statistically significant. But statistical methods can seem daunting, with their P -values and null hypotheses. And for that matter, what does “statistical significance” actually mean? Many instructors and students struggle to understand these concepts. A story from the history of statistics about a lady tasting tea should make significance testing and related concepts more accessible.

The idea of a test of significance was conceived by Ronald Fisher (1890–1962). He played a major role in developing experimental designs and statistical methods that helped to revolutionize the practice of science in the twentieth century ( Salsburg, 2001 ). In his book The Design of Experiments (1971; first published in 1935), Fisher introduced the concept of a test of significance by recounting the following story. One summer afternoon in the late 1920s, Fisher and several colleagues were having tea. When Fisher handed Lady Muriel Bristol a cup, she declined because Fisher had poured the tea into the cup first. Lady Bristol declared that tea tasted different depending on whether the milk was poured into the tea or the tea poured into the milk. Fisher and the other scientists were skeptical and began to discuss how they could test Lady Bristol's claim.

The scientists decided to arrange eight cups, four with the milk poured into the tea and four with the tea poured into the milk. The cups were presented to Lady Bristol for tasting one at a time in random order, and she was told that she had to identify the four milk-first cups. In his book, Fisher explains how to determine the probabilities associated with having the lady evaluate eight cups of tea. Figure 2 shows the probabilities he calculated for each number of milk-first cups the lady could potentially identify correctly ( Fisher, 1971 ; Gorroochurn, 2012 ). With eight cups of tea presented in random order, the probability of the lady correctly identifying all four milk-first cups by guessing is 1 in 70, or 0.014. Assuming that she is unable to distinguish the cups, the most likely outcome will be that she guesses two out of four cups correctly. In repeated experiments, this will happen by chance 51.4% of the time. If she cannot really tell the difference, the probability of Fisher being fooled by a random occurrence where the lady happens to guess all four of the cups correctly is 0.014, or 1.4%. So if Fisher puts her to the test and she evaluates all the cups correctly – an outcome of the experiment that is unlikely to have occurred by chance – he can tentatively conclude that she can tell which liquid was poured first.
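Fisher's probabilities follow a hypergeometric count and can be reproduced in a few lines:

```python
from math import comb

total = comb(8, 4)  # 70 equally likely ways to choose the 4 "milk-first" cups
for k in range(5):  # k = number of milk-first cups identified correctly
    ways = comb(4, k) * comb(4, 4 - k)
    print(f"{k} correct: {ways}/{total} = {ways / total:.3f}")
# 4 correct: 1/70 = 0.014; 2 correct: 36/70 = 0.514, matching the text
```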

Figure 2. Possible outcomes of the “lady tasting tea” experiment.


While Fisher described the experimental design for the lady tasting tea, he did not tell us the outcome. However, Salsburg (2001) has it from a reliable source that the lady correctly identified all the cups. It is important to understand that by setting up this experiment and having her demonstrate her talent on one occasion, Fisher has not proved that the lady can make the necessary distinction. Even if she can't tell the difference, there is still a 1.4% chance that the outcome of the experiment was a random occurrence, albeit a very unlikely one. However, getting an unlikely result in an experiment like this is good evidence that Fisher's initial assumption, that she cannot tell the difference, may not be true. So, we can be reasonably safe in rejecting the idea that her success is just by chance and conclude, based on the result of this experiment, that she can tell the difference.

Fisher termed outcomes of well-designed experiments that are unlikely to have occurred by chance statistically significant outcomes, and a test designed to demonstrate this is a test of significance . Methods of inference commonly used in science to support or reject claims based on data, like the chi-square test or the independent-samples t -test, are also tests of significance ( Moore et al., 2009 ).

One key point to remember from the story of Fisher and the lady is that before performing an experiment, we should consider all possible outcomes and how to interpret them. Fisher used the distribution in Figure 2 to forecast all possible results of the lady tasting tea because that was an appropriate distribution to model an experiment in which a series of random events occurred, with each event having one of two possible outcomes ( Gorroochurn, 2012 ). In The Design of Experiments , he wrote:

In considering the appropriateness of any proposed experimental design, it is always needful to forecast all possible results of the experiment, and to have decided without ambiguity what interpretation shall be placed on each one of them. ( Fisher, 1971 , p. 12)

Fisher called the distribution of all possible random outcomes the null distribution . In evaluating the outcome of an experiment, a scientist is testing whether or not the actual outcome conforms to one of the most likely outcomes predicted by the null distribution. In other words, he is testing the assumption that the result of the experiment is a highly probable outcome and is therefore likely governed mainly by chance. This assumption is called the null hypothesis . If the outcome of an experiment deviates significantly from the most likely outcomes predicted by the null distribution, then the result is statistically significant.

The null distribution we use depends on the design of our experiment and the type of data we collect. For example, biology teachers are familiar with chi-square distributions as they are used in classical genetics. Chi-square distributions ( Figure 3 ) provide models of discrete data (that is, data derived from counting) where each datum randomly falls into one of two or more categories. Another important family of distributions are normal distributions that model continuous data where the measurements, like the heights of a group of people, can be meaningfully subdivided into smaller and smaller increments. Data that fit a normal distribution fall randomly around an average value, the mean, but tend to cluster near the mean and tail off to either side. To use the normal distribution, we need a large sample size and we need to know the standard deviation of the population, but this is usually not the case. The t -distribution, which we will use here in a case related to Darwin's finches, is a null distribution used as a stand-in for the normal distribution when we have a small sample. The t -distribution more closely resembles the normal distribution as sample size approaches 30.

Figure 3. Distribution of chi-square values with different degrees of freedom. Modified from Engineering 360 (https://www.globalspec.com/reference/69568/203279/11-8-the-chi-square-distribution).


Using the t -Test to Make Inferences from Data

McDonald (2014 , p. 14) summarizes the logic of inferential statistical procedures as follows:

The basic idea of a statistical test is to identify a null hypothesis, collect some data, then estimate the probability of getting the observed data if the null hypothesis were true. If the probability of getting a result like the observed one is low under the null hypothesis, you conclude that the null hypothesis is probably not true.

In the following, I use data on the morphological characteristics of Darwin's finches to illustrate the logic of inferential statistics described by McDonald. The appropriate statistical test to analyze the finch data is the independent-samples t -test (referred to hereafter as the t -test). In the concluding section, I explain the parallels between Fisher's tea experiment and the t -test.

The activity provided by HHMI BioInteractive (2014) includes a spreadsheet with data from a randomized sample of 100 medium ground finches collected and measured by Peter and Rosemary Grant in the Galápagos; 50 of the birds survived the 1977 drought on Daphne Major, and the other 50 died as a result of the drought. The spreadsheet records data on a number of the birds' traits, among them beak depth. We want to know whether the birds' beak depth made a difference in their survival during the drought. We will use the t -test to compare the mean beak depths of survivors and nonsurvivors, to see if there is a statistically significant difference between them.

The assumption we begin with is our null hypothesis, that beak depth made no difference in the survival of the birds. If that is the case, then we expect that the group of survivors will have a very similar mean beak depth to that of the nonsurvivors, and that any difference we see between the survivor and the nonsurvivor group is due to sampling error. (To understand this expectation, it helps to look at HHMI BioInteractive [2017] . Readers are encouraged to experiment with this interactive activity. The related student worksheet provides guided instruction in some foundational statistics concepts.)

Figure 4 , an image captured from HHMI BioInteractive (2017) , shows two graphs plotting means of random samples drawn from a large population. In each case, 500 samples were selected at random, their means were calculated, and the means were plotted on the graphs to produce the distributions shown. Each of the 500 sample means provides an estimate of the actual mean of the population, which is 50 kg. The key point here is that we can expect any two random samples taken from the same population to have similar means. If you compare distribution A, consisting of 500 samples of 25 individuals each, with distribution B, consisting of 500 samples of 100 individuals each, you can see that the estimates cluster around the population mean of 50 kg in each case, but more tightly so in B. So, if we could take 500 random samples of 50 birds from our finch population and plot them, what would we expect to see? We should see a similar normal distribution as in A and B, but with a range intermediate between A and B. So, if beak depth had no bearing on survival, our two samples should have very similar means, just as any two random samples drawn from the same population should have.
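This sampling behavior can be reproduced by simulation; only the 50 kg population mean is given in the text, so the population sd of 8 kg below is an assumption for illustration:

```python
import numpy as np

rng = np.random.default_rng(5)
pop_mean, pop_sd, n_draws = 50, 8, 500  # sd of 8 kg is assumed for illustration

means_25 = [rng.normal(pop_mean, pop_sd, 25).mean() for _ in range(n_draws)]
means_100 = [rng.normal(pop_mean, pop_sd, 100).mean() for _ in range(n_draws)]

# Larger samples give tighter estimates of the population mean:
print(f"sd of sample means, n=25:  {np.std(means_25):.2f}")   # theory: 8/sqrt(25)  = 1.6
print(f"sd of sample means, n=100: {np.std(means_100):.2f}")  # theory: 8/sqrt(100) = 0.8
```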

Figure 4. Means of random samples drawn from a large population. Image captured from HHMI BioInteractive (2017; copyright 2015 Howard Hughes Medical Institute, used with permission; https://www.BioInteractive.org).


It is possible to get two random samples that have a statistically significant difference just by chance, but it is unlikely. Figure 5 shows two groups randomly selected from a large population that, by chance, have a statistically significant difference between their means as determined by a t -test. If these two samples were selected for an experiment, one as a control and the other as a treatment group, and the independent variable actually had no effect, a t -test would incorrectly result in the rejection of the null hypothesis. Statisticians call this a Type I error.

Figure 5. Two groups, G1(n1 = 25, x¯1=51 kg, s1 = 8 kg) and G2(n2 = 25, x¯2=46 kg, s2 = 8 kg), were randomly selected from a large population, yet they still have a statistically significant difference in their means. This is a Type I error. The probability of getting a difference this large between these two random samples by chance is P = 0.032 (P < 0.05). Image captured from HHMI BioInteractive (2017; copyright 2015 Howard Hughes Medical Institute, used with permission; https://www.BioInteractive.org).

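A quick way to see how often chance alone produces a "significant" result is to simulate it. This sketch uses assumed population parameters (mean 50 kg, SD 8 kg, as in Figure 5) and a two-tailed critical t of 2.011 for df = 25 + 25 − 2 = 48 taken from standard tables; it repeatedly draws two samples of 25 from the same population and counts the false rejections:

```python
import math
import random
import statistics

random.seed(7)

def t_value(a, b):
    """Independent-samples t-value: difference in means over the
    square root of the summed (variance / n) terms."""
    num = statistics.mean(a) - statistics.mean(b)
    den = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return num / den

# Two-tailed critical t for alpha = 0.05 and df = 48 (standard tables).
CRITICAL = 2.011

# Repeatedly draw two random samples of 25 from the SAME population
# and test them against each other; every rejection is a Type I error.
trials = 2000
false_rejections = 0
for _ in range(trials):
    g1 = [random.gauss(50, 8) for _ in range(25)]
    g2 = [random.gauss(50, 8) for _ in range(25)]
    if abs(t_value(g1, g2)) > CRITICAL:
        false_rejections += 1

rate = false_rejections / trials
print(rate)  # close to alpha = 0.05
```

About 5% of the comparisons reject the null even though both samples always come from one population, which is exactly the error rate α was chosen to allow.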

From the data provided in HHMI BioInteractive (2017), the nonsurviving finches have a mean beak depth of 9.11 mm, with a standard deviation of 0.88 mm. The surviving finches have a mean beak depth of 9.67 mm, with a standard deviation of 0.84 mm. Is this difference between the groups large enough to be considered statistically significant, or are they just random samples from the same population that, due to sampling error, have a difference in their means? To answer this question, we perform a t-test on our data. Inferential statistical procedures like the t-test have five basic steps. The steps in the t-test are applied to the finch data as follows.

Steps 1 & 2: Choose the Appropriate Statistical Test & State the Hypotheses

The t-test is appropriate for comparing the mean beak depths of two small samples of continuous data points. The null hypothesis states that any differences between the two groups will be small and attributable mainly to chance factors, in this case primarily genetic variation and sampling error. The alternative hypothesis states that some factor caused a difference between the two groups; in this case, differences in the mean beak depths of the survivors and nonsurvivors influenced their survival. The formal statements of the hypotheses are as follows.

Null hypothesis (H0): There is no significant difference in mean beak depth between the nonsurviving finches and finches that survived the drought.

Alternative hypothesis (H1): There is a significant difference in mean beak depth between the nonsurviving finches and finches that survived the drought.

Step 3: Choose the Decision Criterion

Next we determine what criterion to use in deciding whether the difference between means is large enough to be statistically significant. As Fisher wrote, we must “forecast all possible results of the experiment, and [decide] what interpretation shall be placed on each one of them” (1971, p. 12). The selection of the t-distribution as our null distribution forecasts all possible results. In order to interpret each of them, we need two numbers – a significance level and the degrees of freedom. The significance level, designated by the Greek letter alpha (α), is a choice you make as the scientist conducting the research. It represents the probability of incorrectly rejecting the null hypothesis, so it plays a role similar to Fisher's 1.4% probability that the lady guesses all the cups correctly by chance.

In other words, we are deciding how tolerant we will be of getting two random samples with a large difference just by chance and incorrectly rejecting the null hypothesis – a Type I error. The null assumes that any differences between the two groups will be small and attributable primarily to sampling error. It is customary to use α = 0.05. This means that even if there really is no significant difference between the two groups, 5% of the time when we perform this experiment, we will get a result that indicates that there is a significant difference. We will reject the null hypothesis, but we will have been fooled by randomness (as in Figure 5).

Why is it customary to use α = 0.05? Why not choose a smaller alpha level, like 0.01 or 0.001, to make it even less likely that we will incorrectly reject the null hypothesis and commit a Type I error? The reason is that a smaller alpha level will increase the probability of a Type II error, failing to reject the null hypothesis when there actually is a significant difference between the sample means. Choosing α = 0.05 allows us to strike a balance between the risks of Type I and Type II errors.

The second number we need to set the decision criterion is the degrees of freedom – the number of values in the final calculation of a statistic that are free to vary. For the independent-samples t-test, the degrees of freedom equals the sum of the number of measurements in the two groups minus 2. There are 50 birds in each of the two groups, so the degrees of freedom (df) = 50 + 50 − 2 = 98.

Now we take α = 0.05 and df = 98 to a table of critical t-values like Table 1. Notice that df = 98 is not on our table. When this is the case, we choose the critical t-value associated with the next lowest level on the table, in this case df = 80, with a critical t-value of 1.990. If you examine the table, you will see that the lower the degrees of freedom, the higher the critical t-value. The higher the critical value, the harder it is to achieve statistical significance. So, by choosing the next lower value for degrees of freedom, we make it less likely that we will commit a Type I error and reject the null hypothesis when we should not do so. We want to avoid being fooled by a random occurrence if possible.
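The lookup rule can be expressed in a few lines. The table values below are standard two-tailed critical t-values for α = 0.05 (a partial table, as in Table 1); the fallback to the next lower df is the conservative choice just described:

```python
# Partial table of two-tailed critical t-values for alpha = 0.05,
# values as found in standard t-tables.
CRITICAL_T = {1: 12.706, 10: 2.228, 20: 2.086, 40: 2.021,
              60: 2.000, 80: 1.990, 100: 1.984}

def critical_t(df):
    """When df itself is absent from the table, fall back to the next
    LOWER df: its critical value is larger, which makes rejecting the
    null harder and guards against Type I errors."""
    usable = [d for d in CRITICAL_T if d <= df]
    return CRITICAL_T[max(usable)]

n1 = n2 = 50
df = n1 + n2 - 2           # 98
print(df, critical_t(df))  # df = 98 falls back to df = 80 -> 1.990
```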

Ultimately, the t-test comes down to comparing two numbers, the critical t-value that we just determined (1.990) and the observed t-value we will soon calculate using the finch data. Figure 6 shows a t-distribution centered on t = 0; this is the most likely outcome when we compare the means of any two random samples taken from the same population. (Recall the example distributions from HHMI BioInteractive [2017] in Figure 4.) To calculate a t-value, we subtract the means (x̄1 − x̄2 in Figure 7), so the t-value will equal zero when the two means are identical, and the value will be close to zero when their difference is small. The greater the difference between the two means, the farther to the left or right of center the observed t-value will fall. If x̄1 in the formula shown in Figure 7 is significantly greater than x̄2, then the calculated t-value will be greater than the critical value of 1.990. If x̄1 is significantly less than x̄2, then the calculated t-value will be less than −1.990.

Figure 6. The t-distribution, which assumes that the null is true. Modified from Statistics By Jim (https://statisticsbyjim.com/hypothesis-testing/t-tests-t-values-t-distributions-probabilities/).


Figure 7. Equation for calculating a t-value: t = (x̄1 − x̄2) / SQRT(s1²/n1 + s2²/n2).


The critical values for the finch study are marked as vertical lines on the graph in Figure 6. If the observed t-value is less than −1.990 or greater than 1.990, we reject the null hypothesis and conclude that the difference is statistically significant – that is, that the difference is large enough that it is unlikely to have occurred by chance. We used α = 0.05, so 95% of the area under the curve falls in the center of the distribution between the two lines marking the critical values, and 5% of the area falls to the left and right of those lines. We have effectively divided the distribution into two regions: (1) the region under the center of the t-distribution, representing a small difference between the sample means that is compatible with the null hypothesis; and (2) the region under the extreme tails of the t-distribution, representing large differences between the sample means that are very unlikely to occur if the null hypothesis is true.

If we repeatedly take two random samples from the same population that should have no differences between them and compare their means with a t-test, 95% of the time the t-value we calculate will fall in the 95% area of the distribution. Five percent of the time, we will get a t-value that falls to the left or right of the critical value indicating statistical significance (as in Figure 5), even though both samples were chosen randomly from the same population and there is no actual difference between them.

Step 4: Calculate the t-Statistic

The t-statistic can be thought of as a ratio of “signal to noise.” The expression in the numerator, (x̄1 − x̄2), the difference between the two means, is the signal. The greater the difference, the stronger the signal, the larger the t-value will be, and the more likely we will achieve statistical significance.

But statistical significance also depends on the noise: the variability in the two sample datasets (Figure 8). The smaller the variability in the sample data, the more likely we are to find statistical significance. The sample variances, the squares of the standard deviations (s1² and s2²), represent the variability in the beak depth data: s1² = (0.84 mm)² = 0.71 mm² and s2² = (0.88 mm)² = 0.77 mm². Smaller variances make a smaller denominator, making the t-value larger and making it more likely that we will achieve statistical significance. Sample size also influences the outcome of the test. Smaller sample sizes will typically result in a larger denominator and more noise, and make it less likely that we find a statistically significant result. Larger sample sizes will increase the likelihood of achieving statistical significance.

Figure 8. Examples of sample distributions with differing degrees of variability (from Web Center for Social Research Methods, http://www.socialresearchmethods.net/kb/stat_t.php).


Using the formula for calculating the observed t-value in our finch case, we get t = 3.26. The calculations are as follows (SQRT = square root):

t = (9.67 − 9.11) / SQRT((0.71 mm² / 50) + (0.77 mm² / 50)) = 0.56 / SQRT(0.0296) = 0.56 / 0.172 ≈ 3.26
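The arithmetic can be checked with a short script built from the summary statistics reported above. This is a sketch; it gives 3.25 rather than 3.26 because the article rounds intermediate values:

```python
import math

# Summary statistics from the finch data (HHMI BioInteractive, 2017).
n1, mean1, sd1 = 50, 9.67, 0.84  # survivors
n2, mean2, sd2 = 50, 9.11, 0.88  # nonsurvivors

signal = mean1 - mean2                        # 0.56 mm difference in means
noise = math.sqrt(sd1**2 / n1 + sd2**2 / n2)  # ~0.172 mm
t = signal / noise
print(round(t, 2))  # ~3.26 (3.25 without intermediate rounding)
```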

Step 5: Evaluate the Result

In the final step, we compare the calculated t-value of 3.26 with the critical t-value of 1.990. If the calculated t-value falls in the 5% region we marked in Figure 6, we reject the null hypothesis, because it indicates that getting a difference between the two means as large as we observed is unlikely to have occurred by chance, assuming under the null that our two groups of finches actually are just two random samples from the same population.

This is where the P-value comes into play. The P-value tells us how much uncertainty we have when we conclude that two samples have a significant difference between them. In other words, with a significance level of α = 0.05, there is a 5% or lower probability that the difference we found is attributable to sampling error.

Using a spreadsheet or statistical analysis software, we can get the actual P-value (just as Fisher was able to calculate for the tea experiment). For the finch data, P = 0.0015. As a reminder, the null hypothesis is that beak depth had no bearing on survival of the birds during the drought. So, if the null hypothesis is true, and we repeatedly drew two random samples from the same finch population to compare their means with a t-test, for every 1000 times we did this experiment we would get a significant difference only about one or two times by chance – not a very likely outcome. We conclude that the difference we observed between the survivors and nonsurvivors is statistically significant. It is unlikely to have occurred by chance and so, like Fisher in the tea experiment, we reject the null hypothesis and tentatively conclude that birds with larger beaks were, on average, better able to survive the drought.
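A spreadsheet or statistics package computes the exact P-value from the t-distribution. With only the Python standard library, the normal distribution is a close stand-in at df = 98; this is an approximation, not the exact t-based value of 0.0015:

```python
import math

def normal_two_tailed_p(t):
    """Two-tailed P under a standard normal distribution, a good
    approximation to the t-distribution once df is large (here df = 98)."""
    # P(|Z| > t) = 1 - erf(t / sqrt(2))
    return 1 - math.erf(t / math.sqrt(2))

p = normal_two_tailed_p(3.26)
print(round(p, 4))  # ~0.0011; the exact t-distribution value is P = 0.0015
```

Either way, P is far below α = 0.05, so the decision to reject the null hypothesis is the same.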

How do we know that this isn't just a rare chance occurrence, like winning a lottery? How can we establish our conclusion with greater certainty? We repeat the experiment. If the significant difference we found was just a chance occurrence, it is unlikely to be repeated in subsequent experiments. The Grants have repeated the experiment. They have observed similar fluctuations in beak depth as droughts have alternated with rainy periods over multiple years. Figure 9 shows the fluctuations in mean beak depth of medium ground finches correlated with fluctuations in weather over a period of eight years.

Figure 9. Fluctuations in mean beak depth of medium ground finches correlated with fluctuations in weather over a period of eight years. FREEMAN, SCOTT; HERRON, JON C., EVOLUTIONARY ANALYSIS, 4th, ©2007. Reprinted by permission of Pearson Education, Inc., New York, New York. (http://bodell.mtchs.org/OnlineBio/BIOCD/text/chapter14/concept14.4.html).


What do Fisher's test of the lady tasting tea and the t-test on the Grants' finch data have in common? In both cases, we started by assuming that there was no effect. The lady cannot tell the difference between milk-first and tea-first; beak depth has no bearing on survival. These are null hypotheses. Given these assumptions, we selected an appropriate null distribution to predict the outcomes of random events we expect during the experiment: the distribution in Figure 2 for the tea experiment; the t-distribution for the finches. Experiments were carried out and data were collected and analyzed. In each case, the outcome deviated from what the null distribution predicted, by an amount that we previously determined in the design of the experiment. Fisher arranged matters so that the lady had a 1.4% probability of getting all cups correct by guessing, an outcome that was very unlikely to occur by chance. This was the P-value for Fisher's experiment. With the finches, we chose an alpha level of 0.05 as a decision point, so that if the difference between the means, given the variation in the samples, was great enough to produce a calculated t-value with a probability less than or equal to 0.05, we could say that a difference that large was unlikely to have occurred by chance. The actual P-value was 0.0015. In each case the outcomes were very unlikely, so we rejected the null hypothesis and concluded that there likely was a causal relationship.

Students in my AP Biology class performed the activity described here as one part of a unit introducing evolution as a major theme of biology. Emphasis in the unit was placed on understanding fundamental concepts, and constructing evidence-based arguments and explanations. Student understanding of statistics was not directly assessed; however, students did show improvement in their explanations of natural selection as the cause of adaptive changes in populations, and their explanations were more likely to reference empirical evidence. Additional discussions of how HHMI resources were used in the unit can be found in Cooper (2016) and in Lucci and Cooper (2019) .

Scientists use statistics like those illustrated here to organize and analyze data so that they can make inferences from the dataset and use it as evidence ( AAAS, 2011 ; NGSS Lead States, 2013 ; College Board, 2019 ). Understanding how scientists use statistics is an important component of biological literacy, and students should have opportunities to use statistical methods like this in their science classes.

I thank Brad Williamson, who pointed out the significance of the “lady tasting tea” story, which inspired me to write this paper, and Sydney Bergman, who read an early draft and provided valuable feedback. Disclaimer: I have received support from HHMI to present professional development workshops for educators featuring the use of HHMI BioInteractive resources. This publication was prepared and submitted independent of any HHMI support.


  • Online ISSN 1938-4211
  • Print ISSN 0002-7685
  • Copyright © 2024 National Association of Biology Teachers. All rights reserved.


Open Cardiovasc Med J

Insights in Hypothesis Testing and Making Decisions in Biomedical Research

Sacha Varin

1 Collège de Villamont, Lausanne, Switzerland

Demosthenes B. Panagiotakos

2 School of Health Science and Education, Harokopio University, Athens, Greece

It is a fact that p values are commonly used for inference in biomedical and other social fields of research. Unfortunately, the p value is very often misused and misinterpreted; that is why the use of resampling methods, like the bootstrap method, has been recommended for calculating confidence intervals, which provide more robust results for inference than the p value does. In this review, a discussion is made of the use of p values through hypothesis testing and of its alternatives, using resampling methods to develop confidence intervals for the tested statistic or effect measure.

BRIEF HISTORY OF HYPOTHESIS TESTING

At first it has to be clarified that a "significance test" is different from a "hypothesis test". Many textbooks, especially in the social and biomedical sciences, mix these two approaches into a logically flawed mishmash, which is referred to as the "null-hypothesis significance test". The null-hypothesis significance test is a combination of ideas developed in the 1920s and 1930s, primarily by Ronald Fisher (1925) and Jerzy Neyman & Egon Pearson (1933) [ 1 ]. These two testing approaches are not philosophically compatible, even if they are technically related. Fisher developed tests of significance as an inferential tool. The main reason was to walk away from the subjectivism inherent in Bayesian inference (namely, in the form of giving equal prior probabilities to hypotheses) and substitute a more objective approach. However, Fisher's tests also depend on two other important elements: research methodology (Fisher pioneered experimental control, random allocation to groups, etc.) and small samples. Neyman & Pearson liked Fisher's approach, although it lacked a strong mathematical foundation. As their theory progressed, their approach stopped being an improvement on Fisher's and became a different approach. The main differences between the Fisher and Neyman & Pearson approaches are both philosophical and technical. Philosophically, Neyman and Pearson's approach assumes known hypotheses, is based on repeated sampling from the same population, focuses on decision making, and aims to control decision errors in the long run. Thus, it can be considered less inferential and more deductive. Technically, Neyman and Pearson's approach uses Fisher's tests of significance, but also incorporates other elements, like effect sizes, Type II errors, and the power of the statistical test. Neyman and Pearson also incorporated other methodological improvements, such as random sampling [ 2 - 15 ].

Significance tests and hypothesis tests are based on the assumption of a (statistical) null hypothesis, i.e., a statement that there is no relationship, e.g., no difference between treatment effects on an outcome. This is a mere technical requirement giving a statistical context that is required to apply probabilistic calculations. In the approach suggested by Fisher, a "significance test" considers only the null hypothesis and gives a p value, which is a continuous empirical measure of the "significance of the results" (given the considered null hypothesis). This measure has no particular meaning and is not calibrated to some kind of relevance. It is just a value between 0 and 1, reflecting how likely it is to observe "more extreme results" given the null hypothesis. According to the approach suggested by Neyman & Pearson, a "hypothesis test" is actually a test about an alternative hypothesis, which refers to a "minimally relevant effect" (and not to "some non-zero effect", as the null hypothesis does). These tests are designed to control error rates and allow a balance of the expected cost/benefit ratios associated with the actions taken based on the test results. To perform such tests, a minimally relevant effect must be specified, along with acceptable error rates. After the experiment or study is conducted, the decision is actually about rejecting (or not) a hypothesis. So either the "null hypothesis" is not rejected, which means that the assumed effect was not relevant, or the alternative hypothesis is accepted, which means that the effect was relevant. Note that there is no point at which the "truthfulness" of an effect is discussed. This does not matter in statistical hypothesis testing. The only thing that matters is what actions are taken based on an effect that is considered relevant [ 2 - 15 ].

Major Problems Using the p Value as the Result of a Hypothesis Test

Many investigators, in various research fields, refer to Neyman & Pearson hypothesis tests and their associated p values. Indeed, the p value is a widely used tool for inference in studies. However, despite the numerous books, papers and other scientific literature published on this topic, there still seem to be serious misuses and misinterpretations of the p value. According to Daniel Goodman, "a p value is the right answer to the wrong question" [ 1 ]. Joseph Lawrence summarized at least four different major problems associated with the use of p values [ 16 ]:

  • "P values are often misinterpreted as the probability of the null hypothesis, given the data, when in fact they are calculated assuming the null hypothesis to be true."
  • "Researchers often use p values to “dichotomize” results into “important” or “unimportant” depending on whether p is less or greater than a significance level, e.g., 5%, respectively. However, there is not much difference between p-values of 0.049 and 0.051, so the cutoff of 0.05 is considered arbitrary."
  • "P values concentrate attention away from the magnitude of the actual effect sizes. For example, one could have a p value that is very small, but is associated with a clinically unimportant difference. This is especially prone to occur in cases where the sample size is large. Conversely, results of potentially great clinical interest are not necessarily ruled out if p > 0.05, especially in studies with small sample sizes. Therefore, one should not confuse statistical significance with practical or clinical importance."
  • "The null hypothesis is almost never exactly true. In fact, it is hard to believe that the null hypothesis, H0: µ = µ0, is correct! Since the null hypothesis is almost surely false to begin with, it makes little sense to test it. Instead, it seems rational to start with the question “by how much are the two treatments different?”"

There are so many major problems related to p values that most statisticians now recommend against their use, in favour of, for example, confidence intervals. In a previous publication entitled “The value of p -value in biomedical research” alternatives for evaluating the observed evidence were briefly discussed [ 17 ]. Here, a thorough review on hypothesis testing is presented.

Hypothesis Testing Versus Confidence Intervals

Researchers from many fields are very familiar with calculating and interpreting the outcome of empirical research based solely on the p value [ 18 ]. The commonly suggested alternative to the use of hypothesis tests is the use of confidence intervals [ 19 - 26 ]. As suggested by Wood (2014), "the idea of confidence intervals is to use the data to derive an interval within which the population parameter will lie, with a specified level of confidence" [ 19 ]. Two-sided hypothesis tests are dual to two-sided confidence intervals: a parameter value lies in the (1-α)x100% confidence interval if-and-only-if the hypothesis test that assumes that value under the null hypothesis accepts the null at level α. This principle is called the duality of hypothesis testing and confidence intervals [ 20 ]. Likewise, there is a one-to-one relationship between one-sided tests and one-sided confidence intervals. In addition, the relationship is exact only if the standard error used in the confidence interval and in the statistical test is identical.
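The duality can be illustrated numerically. This sketch uses hypothetical summary numbers (a difference in means of 0.56 with standard error 0.172 and a two-tailed critical value of 1.990 for α = 0.05) and checks that the test and the interval reach the same decision:

```python
# Hypothetical summary data: difference in means, its standard error,
# and the two-tailed critical value at alpha = 0.05.
diff, se, crit = 0.56, 0.172, 1.990

# (1 - alpha) x 100% confidence interval for the true difference.
ci = (diff - crit * se, diff + crit * se)

# Two-sided test of H0: true difference = 0, at the same alpha.
t = diff / se
reject_h0 = abs(t) > crit

# Duality: H0 is rejected exactly when 0 lies outside the interval.
zero_outside_ci = not (ci[0] <= 0 <= ci[1])
print(ci, reject_h0, zero_outside_ci)
```

Here the interval is roughly (0.22, 0.90); zero is excluded, and the test rejects H0, so the two procedures agree, as the duality requires.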

However, many statisticians nowadays avoid using any hypothesis tests, since their interpretations may vary and the derived p values cannot, generally, be interpreted in meaningful ways. Moreover, it is argued that by calculating the confidence interval, researchers may gain "insights" into the nature of their data and the evaluated associations, whereas p values tell absolutely nothing. Criticism of hypothesis testing, most of it dating back more than 50 years, suggests that "they (hypothesis tests) are not a contribution to science" (Savage, 1957, in Gerrodette, 2011, p. 404), "a serious impediment to the interpretation of data" (Skipper et al., 1967, in Gerrodette, 2011, p. 404), "worse than irrelevant" (Nelder, 1985, in Gerrodette, 2011, p. 404), or "completely devoid of practical utility" (Finney, 1989, in Gerrodette, 2011, p. 404) [ 1 ].

Nevertheless, and despite all the criticism, hypothesis tests and their associated p values are still widely prevalent. According to Lesaffre (2008) [ 21 ], it is important to note that a 95% confidence interval bears more information than a p value, since the confidence interval has a much easier interpretation and allows better comparability of results across different trials. Moreover, in meta-analyses, the confidence interval is the preferred tool for making statistical inference. According to Wood (2014) [ 19 ], a (1-α)x100% confidence interval directly provides the strength of the effect, as well as the uncertainty due to sampling error, conveyed in an obvious way by the width of the interval. The information displayed is not trivial, and misinterpretations seem far less likely than for NHSTs. Thus, the use of confidence intervals has the potential to avoid many of the widely acknowledged problems of NHSTs and p values [ 19 ]. Moreover, several high-impact journals, especially in the health sciences, as well as societies (e.g., the American Psychological Association's (APA) Task Force on Statistical Inference (TFSI)), have strongly discouraged the use of p values, preferring point and interval estimates of the effect size (i.e., odds ratios, relative risks, etc.) as an expression of the uncertainty resulting from limited sample size, and also encouraging the use of Bayesian methodology [ 21 - 22 ]. It is not surprising to note that, a century following its introduction, many researchers still poorly understand the exact meaning of the p value, resulting in many misinterpretations [ 17 ].

Advantages of The Confidence Interval Versus p Value

It is now a common belief that researchers should be interested in estimating the size of the effect of a measured outcome, rather than in a simple indication of whether or not it is statistically significant [ 23 ]. On the basis of the sample data, confidence intervals present a range of alternative values in which the unknown population value for such an effect is likely to lie. Indeed, confidence intervals give different information and have a different interpretation than p values, since they specify a range of plausible values for the actual effect size (presenting the results directly on the scale of the measurement), while p values don't. Moreover, confidence intervals make the extent of uncertainty salient, which a p value cannot do. Since the mid-1980s, Gardner & Altman have suggested that "a confidence interval produces a move from a single value estimate - such as the sample mean, difference between sample means, etc - to a range of values that are considered to be plausible for the population" [ 24 ].

Resampling Techniques

It is known from basic statistics that many statistical criteria (e.g., the t-test) are asymptotically normally distributed, but the normal distribution may not always be a good approximation to their actual sampling distribution in the empirical samples derived from experiments, clinical trials or observational surveys. Indeed, the validity of traditional statistical inference is mostly based on a theorem known as the Central Limit Theorem, which stipulates that, under fairly general conditions, the sampling distribution of the test statistic can be approximated by a normal distribution, or under more limited assumptions by the t- or chi-square distributions. Based on these assumptions, confidence intervals and p values are then calculated; however, with a considerable level of doubt and concern.

The point of resampling methods is not to rely on the Gaussian assumptions. Resampling is a methodology suggested in the early 1940s in order to estimate the precision of statistics, like means, medians, proportions, odds ratios, relative risks, etc., by using k subsets of size m (< n) of the originally collected data (i.e., the jackknife method) or by drawing random sets of data with replacement from the original set (i.e., the bootstrap method). Indeed, when the Gaussian assumptions are not true, the validity of classical inferential statistics tends to be undermined. It is in these situations that resampling methods really come to the rescue. The main idea of resampling is to obtain an empirical distribution of the test statistic based on what is observed and use it to approximate the true, but unknown, distribution of the test statistic. An important advantage of this approach is that it can be applied to many statistics (e.g., means, medians, etc.) and effect size measures (e.g., correlation coefficients, odds ratios, relative risks, etc.) with the use of computer software. Specifically, there are different types of resampling methods, i.e., the bootstrap, the jackknife, cross-validation (also called rotation estimation), and the permutation test (or randomization exact test). In classical parametric tests the observed statistics are compared to theoretical sampling distributions, while resampling methods build the sampling distribution empirically from the observed data, which makes them innovative approaches [ 25 ]. Among all resampling methods, the bootstrap is certainly the most frequently used procedure [ 26 ]. So, resampling methods can be a substantial improvement over traditional inference, since a confidence interval for the true value of an unknown statistic or effect size measure has a much more concrete interpretation than the p value from a statistical test, although there is still no guarantee.

However, it should be noted that the sampling distributions of many effect size measures are often highly skewed, so traditional confidence intervals, which are constrained to be symmetric, will not work well. Symmetric confidence intervals are appropriate for a few quantities, such as means and linear regression coefficients, but they are inappropriate for many other measures [ 27 ]. It is therefore better not to assume a symmetric confidence interval for a measure of association, and to start instead from the assumption that it is not normally distributed. The empirical distribution derived, for example, from the bootstrap method does not assume that the distribution is symmetrical.
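The asymmetry can be seen directly. In this Python sketch (hypothetical data, our illustration), the percentile bootstrap interval for the mean of right-skewed data has two arms of different lengths around the point estimate, something a symmetric normal-approximation interval cannot capture.

```python
# Illustration (hypothetical data, not from the article): for a skewed
# statistic, the percentile bootstrap interval need not be symmetric
# about the point estimate.
import random
import statistics

random.seed(1)

# Hypothetical right-skewed data (e.g., hospital lengths of stay in days).
data = [1, 1, 2, 2, 2, 3, 3, 4, 5, 7, 9, 21]
point = statistics.fmean(data)

n = len(data)
reps = sorted(
    statistics.fmean(random.choice(data) for _ in range(n))
    for _ in range(5000)
)
# 2.5th and 97.5th percentiles of the 5000 bootstrap replicates.
lo, hi = reps[124], reps[4874]

# The two arms differ in length, reflecting the skew; a normal-approximation
# interval would force them to be equal.
print(f"estimate {point:.2f}, 95% CI ({lo:.2f}, {hi:.2f})")
print(f"lower arm {point - lo:.2f}, upper arm {hi - point:.2f}")
```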

In conclusion, for inferential purposes we recommend presenting study results as confidence intervals for the statistics and effect size measures of interest, rather than as hypothesis tests with their associated p values. Moreover, depending on the statistic of interest, the bootstrap or other resampling methods are also recommended, because these techniques are independent of the shape of the underlying distribution and can easily be performed using software.

CONFLICT OF INTEREST

The authors confirm that this article content has no conflict of interest.

ACKNOWLEDGEMENTS

Declared None.


How to Write a Great Hypothesis

Hypothesis Format, Examples, and Tips

Kendra Cherry, MS, is a psychosocial rehabilitation specialist, psychology educator, and author of the "Everything Psychology Book."


Amy Morin, LCSW, is a psychotherapist and international bestselling author. Her books, including "13 Things Mentally Strong People Don't Do," have been translated into more than 40 languages. Her TEDx talk,  "The Secret of Becoming Mentally Strong," is one of the most viewed talks of all time.



A hypothesis is a tentative statement about the relationship between two or more  variables. It is a specific, testable prediction about what you expect to happen in a study.

For example, a study designed to look at the relationship between sleep deprivation and test performance might have the hypothesis: "Sleep-deprived people will perform worse on a test than individuals who are not sleep-deprived."

This article explores how a hypothesis is used in psychology research, how to write a good hypothesis, and the different types of hypotheses you might use.

The Hypothesis in the Scientific Method

In the scientific method , whether it involves research in psychology, biology, or some other area, a hypothesis represents what the researchers think will happen in an experiment. The scientific method involves the following steps:

  • Forming a question
  • Performing background research
  • Creating a hypothesis
  • Designing an experiment
  • Collecting data
  • Analyzing the results
  • Drawing conclusions
  • Communicating the results

The hypothesis is a prediction, but it involves more than a guess. Most of the time, the hypothesis begins with a question which is then explored through background research. It is only at this point that researchers begin to develop a testable hypothesis. Unless you are creating an exploratory study, your hypothesis should always explain what you  expect  to happen.

In a study exploring the effects of a particular drug, the hypothesis might be that researchers expect the drug to have some type of effect on the symptoms of a specific illness. In psychology, the hypothesis might focus on how a certain aspect of the environment might influence a particular behavior.

Remember, a hypothesis does not have to be correct. While the hypothesis predicts what the researchers expect to see, the goal of the research is to determine whether this guess is right or wrong. When conducting an experiment, researchers might explore a number of factors to determine which ones might contribute to the ultimate outcome.

In many cases, researchers may find that the results of an experiment  do not  support the original hypothesis. When writing up these results, the researchers might suggest other options that should be explored in future studies.

In many cases, researchers might draw a hypothesis from a specific theory or build on previous research. For example, prior research has shown that stress can impact the immune system. So a researcher might hypothesize: "People with high stress levels will be more likely to contract a common cold after being exposed to the virus than people who have low stress levels."

In other instances, researchers might look at commonly held beliefs or folk wisdom. "Birds of a feather flock together" is one example of folk wisdom that a psychologist might try to investigate. The researcher might pose a specific hypothesis that "People tend to select romantic partners who are similar to them in interests and educational level."

Elements of a Good Hypothesis

So how do you write a good hypothesis? When trying to come up with a hypothesis for your research or experiments, ask yourself the following questions:

  • Is your hypothesis based on your research on a topic?
  • Can your hypothesis be tested?
  • Does your hypothesis include independent and dependent variables?

Before you come up with a specific hypothesis, spend some time doing background research. Once you have completed a literature review, start thinking about potential questions you still have. Pay attention to the discussion section in the  journal articles you read . Many authors will suggest questions that still need to be explored.

To form a hypothesis, you should take these steps:

  • Collect as many observations about a topic or problem as you can.
  • Evaluate these observations and look for possible causes of the problem.
  • Create a list of possible explanations that you might want to explore.
  • After you have developed some possible hypotheses, think of ways that you could confirm or disprove each hypothesis through experimentation. This is known as falsifiability.

Falsifiability of a Hypothesis

In the scientific method, falsifiability is an important part of any valid hypothesis. In order to test a claim scientifically, it must be possible that the claim could be proven false.

Students sometimes confuse falsifiability with the idea that a claim is false, which is not the case. Falsifiability means that  if  a claim were false, it would be possible to demonstrate that it is false.

One of the hallmarks of pseudoscience is that it makes claims that cannot be refuted or proven false.

Operational Definitions

A variable is a factor or element that can be changed and manipulated in ways that are observable and measurable. The researcher must also define how each variable will be manipulated and measured in the study; this is known as an operational definition.

For example, a researcher might operationally define the variable " test anxiety " as the results of a self-report measure of anxiety experienced during an exam. A "study habits" variable might be defined by the amount of studying that actually occurs as measured by time.

These precise descriptions are important because many things can be measured in a number of different ways. One of the basic principles of any type of scientific research is that the results must be replicable.   By clearly detailing the specifics of how the variables were measured and manipulated, other researchers can better understand the results and repeat the study if needed.

Some variables are more difficult than others to define. How would you operationally define a variable such as aggression ? For obvious ethical reasons, researchers cannot create a situation in which a person behaves aggressively toward others.

In order to measure this variable, the researcher must devise a measurement that assesses aggressive behavior without harming other people. In this situation, the researcher might utilize a simulated task to measure aggressiveness.

Hypothesis Checklist

  • Does your hypothesis focus on something that you can actually test?
  • Does your hypothesis include both an independent and dependent variable?
  • Can you manipulate the variables?
  • Can your hypothesis be tested without violating ethical standards?

Types of Hypotheses

The hypothesis you use will depend on what you are investigating and hoping to find. Some of the main types of hypotheses that you might use include:

  • Simple hypothesis : This type of hypothesis suggests that there is a relationship between one independent variable and one dependent variable.
  • Complex hypothesis : This type of hypothesis suggests a relationship between three or more variables, such as two independent variables and a dependent variable.
  • Null hypothesis : This hypothesis suggests no relationship exists between two or more variables.
  • Alternative hypothesis : This hypothesis states the opposite of the null hypothesis.
  • Statistical hypothesis : This hypothesis uses statistical analysis to evaluate a representative sample of the population and then generalizes the findings to the larger group.
  • Logical hypothesis : This hypothesis assumes a relationship between variables without collecting data or evidence.

Hypothesis Format

A hypothesis often follows a basic format of "If {this happens} then {this will happen}." One way to structure your hypothesis is to describe what will happen to the  dependent variable  if you change the  independent variable .

The basic format might be: "If {these changes are made to a certain independent variable}, then we will observe {a change in a specific dependent variable}."

A few examples of simple hypotheses:

  • "Students who eat breakfast will perform better on a math exam than students who do not eat breakfast."
  • "Students who experience test anxiety before an English exam will get lower scores than students who do not experience test anxiety."
  • "Motorists who talk on the phone while driving will be more likely to make errors on a driving course than those who do not talk on the phone."

Examples of a complex hypothesis include:

  • "People with high-sugar diets and sedentary activity levels are more likely to develop depression."
  • "Younger people who are regularly exposed to green, outdoor areas have better subjective well-being than older adults who have limited exposure to green spaces."

Examples of a null hypothesis include:

  • "Children who receive a new reading intervention will have scores that are no different from students who do not receive the intervention."
  • "There will be no difference in scores on a memory recall task between children and adults."

Examples of an alternative hypothesis:

  • "Children who receive a new reading intervention will perform better than students who did not receive the intervention."
  • "Adults will perform better on a memory task than children." 
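To connect the null and alternative hypotheses above to an actual statistical test, here is a hedged Python sketch using simulated, invented memory scores (not data from any real study) and Welch's t statistic.

```python
# A sketch with simulated (invented) data: testing the null hypothesis
# "no difference in memory scores between adults and children" against
# the alternative "adults score higher", via Welch's t statistic.
import math
import random
import statistics

random.seed(7)

# Hypothetical memory-recall scores (number of items recalled).
adults = [random.gauss(12, 2) for _ in range(40)]
children = [random.gauss(9, 2) for _ in range(40)]

def welch_t(a, b):
    """Welch's t statistic for two independent samples with
    possibly unequal variances."""
    se = math.sqrt(statistics.variance(a) / len(a) +
                   statistics.variance(b) / len(b))
    return (statistics.fmean(a) - statistics.fmean(b)) / se

t = welch_t(adults, children)
# A t statistic far above ~2 would lead us to reject the null hypothesis
# at the conventional 0.05 level (degrees of freedom omitted for brevity).
print(f"t = {t:.2f}")
```

In practice a researcher would report the p value and an effect size alongside t, but the logic is the one described above: the data either are or are not compatible with the null hypothesis.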

Collecting Data on Your Hypothesis

Once a researcher has formed a testable hypothesis, the next step is to select a research design and start collecting data. The research method depends largely on exactly what they are studying. There are two basic types of research methods: descriptive research and experimental research.

Descriptive Research Methods

Descriptive research such as  case studies ,  naturalistic observations , and surveys are often used when it would be impossible or difficult to  conduct an experiment . These methods are best used to describe different aspects of a behavior or psychological phenomenon.

Once a researcher has collected data using descriptive methods, a correlational study can then be used to look at how the variables are related. This type of research method might be used to investigate a hypothesis that is difficult to test experimentally.

Experimental Research Methods

Experimental methods  are used to demonstrate causal relationships between variables. In an experiment, the researcher systematically manipulates a variable of interest (known as the independent variable) and measures the effect on another variable (known as the dependent variable).

Unlike correlational studies, which can only be used to determine if there is a relationship between two variables, experimental methods can be used to determine the actual nature of the relationship—whether changes in one variable actually  cause  another to change.

A Word From Verywell

The hypothesis is a critical part of any scientific exploration. It represents what researchers expect to find in a study or experiment. In situations where the hypothesis is unsupported by the research, the research still has value. Such research helps us better understand how different aspects of the natural world relate to one another. It also helps us develop new hypotheses that can then be tested in the future.

Some examples of how to write a hypothesis include:

  • "Staying up late will lead to worse test performance the next day."
  • "People who consume one apple each day will visit the doctor fewer times each year."
  • "Breaking study sessions up into three 20-minute sessions will lead to better test results than a single 60-minute study session."

The four parts of a hypothesis are:

  • The research question
  • The independent variable (IV)
  • The dependent variable (DV)
  • The proposed relationship between the IV and DV

Castillo M. The scientific method: a need for something better? AJNR Am J Neuroradiol. 2013;34(9):1669-1671. doi:10.3174/ajnr.A3401

Nevid J. Psychology: Concepts and Applications. Wadsworth; 2013.


How Do You Formulate (Important) Hypotheses?

  • Open Access
  • First Online: 03 December 2022


  • James Hiebert,
  • Jinfa Cai,
  • Stephen Hwang,
  • Anne K. Morris &
  • Charles Hohensee

Part of the book series: Research in Mathematics Education ((RME))


Building on the ideas in Chap. 1, we describe formulating, testing, and revising hypotheses as a continuing cycle of clarifying what you want to study, making predictions about what you might find together with developing your reasons for these predictions, imagining tests of these predictions, revising your predictions and rationales, and so on. Many resources feed this process, including reading what others have found about similar phenomena, talking with colleagues, conducting pilot studies, and writing drafts as you revise your thinking. Although you might think you cannot predict what you will find, it is always possible—with enough reading and conversations and pilot studies—to make some good guesses. And, once you guess what you will find and write out the reasons for these guesses you are on your way to scientific inquiry. As you refine your hypotheses, you can assess their research importance by asking how connected they are to problems your research community really wants to solve.


Part I. Getting Started

We want to begin by addressing a question you might have had as you read the title of this chapter. You are likely to hear, or read in other sources, that the research process begins by asking research questions . For reasons we gave in Chap. 1 , and more we will describe in this and later chapters, we emphasize formulating, testing, and revising hypotheses. However, it is important to know that asking and answering research questions involve many of the same activities, so we are not describing a completely different process.

We acknowledge that many researchers do not actually begin by formulating hypotheses. In other words, researchers rarely get a researchable idea by writing out a well-formulated hypothesis. Instead, their initial ideas for what they study come from a variety of sources. Then, after they have the idea for a study, they do lots of background reading and thinking and talking before they are ready to formulate a hypothesis. So, for readers who are at the very beginning and do not yet have an idea for a study, let’s back up. Where do research ideas come from?

There are no formulas or algorithms that spawn a researchable idea. But as you begin the process, you can ask yourself some questions. Your answers to these questions can help you move forward.

What are you curious about? What are you passionate about? What have you wondered about as an educator? These are questions that look inward, questions about yourself.

What do you think are the most pressing educational problems? Which problems are you in the best position to address? What change(s) do you think would help all students learn more productively? These are questions that look outward, questions about phenomena you have observed.

What are the main areas of research in the field? What are the big questions that are being asked? These are questions about the general landscape of the field.

What have you read about in the research literature that caught your attention? What have you read that prompted you to think about extending the profession’s knowledge about this? What have you read that made you ask, “I wonder why this is true?” These are questions about how you can build on what is known in the field.

What are some research questions or testable hypotheses that have been identified by other researchers for future research? This, too, is a question about how you can build on what is known in the field. Taking up such questions or hypotheses can help by providing some existing scaffolding that others have constructed.

What research is being done by your immediate colleagues or your advisor that is of interest to you? These are questions about topics for which you will likely receive local support.

Exercise 2.1

Brainstorm some answers for each set of questions. Record them. Then step back and look at the places of intersection. Did you have similar answers across several questions? Write out, as clearly as you can, the topic that captures your primary interest, at least at this point. We will give you a chance to update your responses as you study this book.

Part II. Paths from a General Interest to an Informed Hypothesis

There are many different paths you might take from conceiving an idea for a study, maybe even a vague idea, to formulating a prediction that leads to an informed hypothesis that can be tested. We will explore some of the paths we recommend.

We will assume you have completed Exercise 2.1 in Part I and have some written answers to the six questions that preceded it as well as a statement that describes your topic of interest. This very first statement could take several different forms: a description of a problem you want to study, a question you want to address, or a hypothesis you want to test. We recommend that you begin with one of these three forms, the one that makes most sense to you. There is an advantage to using all three and flexibly choosing the one that is most meaningful at the time and for a particular study. You can then move from one to the other as you think more about your research study and you develop your initial idea. To get a sense of how the process might unfold, consider the following alternative paths.

Beginning with a Prediction If You Have One

Sometimes, when you notice an educational problem or have a question about an educational situation or phenomenon, you quickly have an idea that might help solve the problem or answer the question. Here are three examples.

You are a teacher, and you noticed a problem with the way the textbook presented two related concepts in two consecutive lessons. Almost as soon as you noticed the problem, it occurred to you that the two lessons could be taught more effectively in the reverse order. You predicted better outcomes if the order was reversed, and you even had a preliminary rationale for why this would be true.

You are a graduate student and you read that students often misunderstand a particular aspect of graphing linear functions. You predicted that, by listening to small groups of students working together, you could hear new details that would help you understand this misconception.

You are a curriculum supervisor and you observed sixth-grade classrooms where students were learning about decimal fractions. After talking with several experienced teachers, you predicted that beginning with percentages might be a good way to introduce students to decimal fractions.

We begin with the path of making predictions because we see the other two paths as leading into this one at some point in the process (see Fig. 2.1 ). Starting with this path does not mean you did not sense a problem you wanted to solve or a question you wanted to answer.

Fig. 2.1 Three Pathways to Formulating Informed Hypotheses. The diagram starts from a problem situation and leads, via predictions and questions, to hypotheses.

Notice that your predictions can come from a variety of sources—your own experience, reading, and talking with colleagues. Most likely, as you write out your predictions you also think about the educational problem for which your prediction is a potential solution. Writing a clear description of the problem will be useful as you proceed. Notice also that it is easy to change each of your predictions into a question. When you formulate a prediction, you are actually answering a question, even though the question might be implicit. Making that implicit question explicit can generate a first draft of the research question that accompanies your prediction. For example, suppose you are the curriculum supervisor who predicts that teaching percentages first would be a good way to introduce decimal fractions. In an obvious shift in form, you could ask, “In what ways would teaching percentages benefit students’ initial learning of decimal fractions?”

A question simply asks what you will find, whereas a prediction also says what you expect to find.

There are advantages to starting with the prediction form if you can make an educated guess about what you will find. Making a prediction forces you to think now about several things you will need to think about at some point anyway. It is better to think about them earlier rather than later. If you state your prediction clearly and explicitly, you can begin to ask yourself three questions about your prediction: Why do I expect to observe what I am predicting? Why did I make that prediction? (These two questions essentially ask what your rationale is for your prediction.) And, how can I test to see if it’s right? This is where the benefits of making predictions begin.

Asking yourself why you predicted what you did, and then asking yourself why you answered the first “why” question as you did, can be a powerful chain of thought that lays the groundwork for an increasingly accurate prediction and an increasingly well-reasoned rationale. For example, suppose you are the curriculum supervisor above who predicted that beginning by teaching percentages would be a good way to introduce students to decimal fractions. Why did you make this prediction? Maybe because students are familiar with percentages in everyday life so they could use what they know to anchor their thinking about hundredths. Why would that be helpful? Because if students could connect hundredths in percentage form with hundredths in decimal fraction form, they could bring their meaning of percentages into decimal fractions. But how would that help? If students understood that a decimal fraction like 0.35 meant 35 of 100, then they could use their understanding of hundredths to explore the meaning of tenths, thousandths, and so on. Why would that be useful? By continuing to ask yourself why you gave the previous answer, you can begin building your rationale and, as you build your rationale, you will find yourself revisiting your prediction, often making it more precise and explicit. If you were the curriculum supervisor and continued the reasoning in the previous sentences, you might elaborate your prediction by specifying the way in which percentages should be taught in order to have a positive effect on particular aspects of students’ understanding of decimal fractions.

Developing a Rationale for Your Predictions

Keeping your initial predictions in mind, you can read what others already know about the phenomenon. Your reading can now become targeted with a clear purpose.

By reading and talking with colleagues, you can develop more complete reasons for your predictions. It is likely that you will also decide to revise your predictions based on what you learn from your reading. As you develop sound reasons for your predictions, you are creating your rationales, and your predictions together with your rationales become your hypotheses. The more you learn about what is already known about your research topic, the more refined will be your predictions and the clearer and more complete your rationales. We will use the term more informed hypotheses to describe this evolution of your hypotheses.


Developing more informed hypotheses is a good thing because it means: (1) you understand the reasons for your predictions; (2) you will be able to imagine how you can test your hypotheses; (3) you can more easily convince your colleagues that they are important hypotheses—they are hypotheses worth testing; and (4) at the end of your study, you will be able to more easily interpret the results of your test and to revise your hypotheses to demonstrate what you have learned by conducting the study.

Imagining Testing Your Hypotheses

Because we have tied together predictions and rationales to constitute hypotheses, testing hypotheses means testing predictions and rationales. Testing predictions means comparing empirical observations, or findings, with the predictions. Testing rationales means using these comparisons to evaluate the adequacy or soundness of the rationales.

Imagining how you might test your hypotheses does not mean working out the details for exactly how you would test them. Rather, it means thinking ahead about how you could do this. Recall the descriptor of scientific inquiry: “experience carefully planned in advance” (Fisher, 1935). Asking whether predictions are testable and whether rationales can be evaluated is simply planning in advance.

You might read that testing hypotheses means simply assessing whether predictions are correct or incorrect. In our view, it is more useful to think of testing as a means of gathering enough information to compare your findings with your predictions, revise your rationales, and propose more accurate predictions. So, asking yourself whether hypotheses can be tested means asking whether information could be collected to assess the accuracy of your predictions and whether the information will show you how to revise your rationales to sharpen your predictions.

Cycles of Building Rationales and Planning to Test Your Predictions

Scientific reasoning is a dialogue between the possible and the actual, an interplay between hypotheses and the logical expectations they give rise to: there is a restless to-and-fro motion of thought, the formulation and rectification of hypotheses (Medawar, 1982, p. 72).

As you ask yourself about how you could test your predictions, you will inevitably revise your rationales and sharpen your predictions. Your hypotheses will become more informed, more targeted, and more explicit. They will make clearer to you and others what, exactly, you plan to study.

When will you know that your hypotheses are clear and precise enough? Because of the way we define hypotheses, this question asks about both rationales and predictions. If a rationale you are building lets you make a number of quite different predictions that are equally plausible rather than a single, primary prediction, then your hypothesis needs further refinement by building a more complete and precise rationale. Also, if you cannot briefly describe to your colleagues a believable way to test your prediction, then you need to phrase it more clearly and precisely.

Each time you strengthen your rationales, you might need to adjust your predictions. And, each time you clarify your predictions, you might need to adjust your rationales. The cycle of going back and forth to keep your predictions and rationales tightly aligned has many payoffs down the road. Every decision you make from this point on will be in the interests of providing a transparent and convincing test of your hypotheses and explaining how the results of your test dictate specific revisions to your hypotheses. As you make these decisions (described in the succeeding chapters), you will probably return to clarify your hypotheses even further. But, you will be in a much better position, at each point, if you begin with well-informed hypotheses.

Beginning by Asking Questions to Clarify Your Interests

Instead of starting with predictions, a second path you might take devotes more time at the beginning to asking questions as you zero in on what you want to study. Some researchers suggest you start this way (e.g., Gournelos et al., 2019 ). Specifically, with this second path, the first statement you write to express your research interest would be a question. For example, you might ask, “Why do ninth-grade students change the way they think about linear equations after studying quadratic equations?” or “How do first graders solve simple arithmetic problems before they have been taught to add and subtract?”

The first phrasing of your question might be quite general or vague. As you think about your question and what you really want to know, you are likely to ask follow-up questions. These questions will almost always be more specific than your first question. The questions will also express more clearly what you want to know. So, the question "How do first graders solve simple arithmetic problems before they have been taught to add and subtract?" might evolve into "Before first graders have been taught to solve arithmetic problems, what strategies do they use to solve arithmetic problems with sums and products below 20?" As you read and learn about what others already know about your questions, you will continually revise your questions toward clearer, more explicit, and more precise versions that zero in on what you really want to know. The question above might become, "Before they are taught to solve arithmetic problems, what strategies do beginning first graders use to solve arithmetic problems with sums and products below 20 if they are read story problems and given physical counters to help them keep track of the quantities?"

Imagining Answers to Your Questions

If you monitor your own thinking as you ask questions, you are likely to begin forming some guesses about answers, even to the early versions of the questions. What do students learn about quadratic functions that influences changes in their proportional reasoning when dealing with linear functions? It could be that if you analyze the moments during instruction on quadratic equations that are extensions of the proportional reasoning involved in solving linear equations, there are times when students receive further experience reasoning proportionally. You might predict that these are the experiences that have a “backward transfer” effect (Hohensee, 2014 ).

These initial guesses about answers to your questions are your first predictions. The first predicted answers are likely to be hunches or fuzzy, vague guesses. This simply means you do not know very much yet about the question you are asking. Your first predictions, no matter how unfocused or tentative, represent the most you know at the time about the question you are asking. They help you gauge where you are in your thinking.

Shifting to the Hypothesis Formulation and Testing Path

Research questions can play an important role in the research process. They provide a succinct way of capturing your research interests and communicating them to others. When colleagues want to know about your work, they will often ask “What are your research questions?” It is good to have a ready answer.

However, research questions have limitations. They do not capture the three images of scientific inquiry presented in Chap. 1. Due, in part, to this less expansive depiction of the process, research questions do not take you very far. They do not provide a guide that leads you through the phases of conducting a study.

Consequently, when you can imagine an answer to your research question, we recommend that you move on to the hypothesis formulation and testing path. Imagining an answer to your question means you can make plausible predictions. You can now begin clarifying the reasons for your predictions and transform your early predictions into hypotheses (predictions along with rationales). We recommend you do this as soon as you have guesses about the answers to your questions because formulating, testing, and revising hypotheses offers a tool that puts you squarely on the path of scientific inquiry. It is a tool that can guide you through the entire process of conducting a research study.

This does not mean you are finished asking questions. Predictions are often created as answers to questions. So, we encourage you to continue asking questions to clarify what you want to know. But your target shifts from only asking questions to also proposing predictions for the answers and developing reasons the answers will be accurate predictions. It is by predicting answers, and explaining why you made those predictions, that you become engaged in scientific inquiry.

Cycles of Refining Questions and Predicting Answers

An example might provide a sense of how this process plays out. Suppose you are reading about Vygotsky’s (1987) zone of proximal development (ZPD), and you realize this concept might help you understand why your high school students had trouble learning exponential functions. Maybe they were outside this zone when you tried to teach exponential functions. In order to recognize students who would benefit from instruction, you might ask, “How can I identify students who are within the ZPD around exponential functions?” What would you predict? Maybe students in this ZPD are those who already had knowledge of related functions. You could write out some reasons for this prediction, like “students who understand linear and quadratic functions are more likely to extend their knowledge to exponential functions.” But what kind of data would you need to test this? What would count as “understanding”? Are linear and quadratic the functions you should assess? Even if they are, how could you tell whether students who scored well on tests of linear and quadratic functions were within the ZPD of exponential functions? How, in the end, would you measure what it means to be in this ZPD? So, asking a series of reasonable questions raised some red flags about the way your initial question was phrased, and you decide to revise it.

You set the stage for revising your question by defining ZPD as the zone within which students can solve an exponential function problem by making only one additional conceptual connection between what they already know and exponential functions. Your revised question is, “Based on students’ knowledge of linear and quadratic functions, which students are within the ZPD of exponential functions?” This time you know what kind of data you need: the number of conceptual connections students need to bridge from their knowledge of related functions to exponential functions. How can you collect these data? Would you need to see into the minds of the students? Or, are there ways to test the number of conceptual connections someone makes to move from one topic to another? Do methods exist for gathering these data? You decide this is not realistic, so you now have a choice: revise the question further or move your research in a different direction.

Notice that we do not use the term research question for all these early versions of questions that begin clarifying for yourself what you want to study. These early versions are too vague and general to be called research questions. In this book, we save the term research question for a question that comes near the end of the work and captures exactly what you want to study. By the time you are ready to specify a research question, you will be thinking about your study in terms of hypotheses and tests. When your hypotheses are in final form and include clear predictions about what you will find, it will be easy to state the research questions that accompany your predictions.

To reiterate one of the key points of this chapter: hypotheses carry much more information than research questions. Using our definition, hypotheses include predictions about what the answer might be to the question plus reasons for why you think so. Unlike research questions, hypotheses capture all three images of scientific inquiry presented in Chap. 1 (planning, observing and explaining, and revising one’s thinking). Your hypotheses represent the most you know, at the moment, about your research topic. The same cannot be said for research questions.

Beginning with a Research Problem

When you wrote answers to the six questions at the end of Part I of this chapter, you might have identified a research interest by stating it as a problem. This is the third path you might take to begin your research. Perhaps your description of your problem might look something like this: “When I tried to teach my middle school students by presenting them with a challenging problem without showing them how to solve similar problems, they didn’t exert much effort trying to find a solution but instead waited for me to show them how to solve the problem.” You do not have a specific question in mind, and you do not have an idea for why the problem exists, so you do not have a prediction about how to solve it. Writing a statement of this problem as clearly as possible could be the first step in your research journey.

As you think more about this problem, it will feel natural to ask questions about it. For example, why did some students show more initiative than others? What could I have done to get them started? How could I have encouraged the students to keep trying without giving away the solution? You are now on the path of asking questions—not research questions yet, but questions that are helping you focus your interest.

As you continue to think about these questions, reflect on your own experience, and read what others know about this problem, you will likely develop some guesses about the answers to the questions. They might be somewhat vague answers, and you might not have lots of confidence they are correct, but they are guesses that you can turn into predictions. Now you are on the hypothesis-formulation-and-testing path. This means you are on the path of asking yourself why you believe the predictions are correct, developing rationales for the predictions, asking what kinds of empirical observations would test your predictions, and refining your rationales and predictions as you read the literature and talk with colleagues.

A simple diagram that summarizes the three paths we have described is shown in Fig. 2.1. Each row of arrows represents one pathway for formulating an informed hypothesis. The dotted arrows in the first two rows represent parts of the pathways that a researcher may have implicitly travelled through already (without an intent to form a prediction) but that ultimately inform the researcher’s development of a question or prediction.

Part III. One Researcher’s Experience Launching a Scientific Inquiry

Martha was in her third year of her doctoral program and beginning to identify a topic for her dissertation. Based on (a) her experience as a high school mathematics teacher and a curriculum supervisor, (b) the reading she has done to this point, and (c) her conversations with her colleagues, she has developed an interest in what kinds of professional development experiences (let’s call them learning opportunities [LOs] for teachers) are most effective. Where does she go from here?

Exercise 2.2

Before you continue reading, please write down some suggestions for Martha about where she should start.

A natural thing for Martha to do at this point is to ask herself some additional questions, questions that specify further what she wants to learn: What kinds of LOs do most teachers experience? How do these experiences change teachers’ practices and beliefs? Are some LOs more effective than others? What makes them more effective?

To focus her questions and decide what she really wants to know, she continues reading but now targets her reading toward everything she can find that suggests possible answers to these questions. She also talks with her colleagues to get more ideas about possible answers to these or related questions. Over several weeks or months, she finds herself being drawn to questions about what makes LOs effective, especially for helping teachers teach more conceptually. She zeroes in on the question, “What makes LOs for teachers effective for improving their teaching for conceptual understanding?”

This question is more focused than her first questions, but it is still too general for Martha to define a research study. How does she know it is too general? She uses two criteria. First, she notices that the predictions she makes about the answers to the question are all over the place; they are not constrained by the reasons she has assembled for her predictions. One prediction is that LOs are more effective when they help teachers learn content. Martha makes this guess because previous research suggests that effective LOs for teachers include attention to content. But this rationale allows lots of different predictions. For example, LOs are more effective when they focus on the content teachers will teach; LOs are more effective when they focus on content beyond what teachers will teach so teachers see how their instruction fits with what their students will encounter later; and LOs are more effective when they are tailored to the level of content knowledge participants have when they begin the LOs. The rationale she can provide at this point does not point to a particular prediction.

The second criterion Martha uses to decide that her question is too general is that the predictions she can make regarding the answers seem very difficult to test. How could she test, for example, whether LOs should focus on content beyond what teachers will teach? What does “content beyond what teachers teach” mean? How could you tell whether teachers use their new knowledge of later content to inform their teaching?

Before anticipating what Martha’s next question might be, it is important to pause and recognize how predicting the answers to her questions moved Martha into a new phase in the research process. As she makes predictions, works out the reasons for them, and imagines how she might test them, she is immersed in scientific inquiry. This intellectual work is the main engine that drives the research process. Also notice that revisions in the questions asked, the predictions made, and the rationales built represent the updated thinking (Chap. 1) that occurs as Martha continues to define her study.

Based on all these considerations and her continued reading, Martha revises the question again. The question now reads, “Do LOs that engage middle school mathematics teachers in studying mathematics content help teachers teach this same content with more of a conceptual emphasis?” Although she feels like the question is more specific, she realizes that the answer to the question is either “yes” or “no.” This, by itself, is a red flag. Answers of “yes” or “no” would not contribute much to understanding the relationships between these LOs for teachers and changes in their teaching. Recall from Chap. 1 that understanding how things work, explaining why things work, is the goal of scientific inquiry.

Martha continues by trying to understand why she believes the answer is “yes.” When she tries to write out reasons for predicting “yes,” she realizes that her prediction depends on a variety of factors. If teachers already have deep knowledge of the content, the LOs might not affect them as much as they affect other teachers. If the LOs do not help teachers develop their own conceptual understanding, they are not likely to change their teaching. By trying to build the rationale for her prediction—thus formulating a hypothesis—Martha realizes that the question still is not precise and clear enough.

Martha uses what she learned when developing the rationale and rephrases the question as follows: “Under what conditions do LOs that engage middle school mathematics teachers in studying mathematics content help teachers teach this same content with more of a conceptual emphasis?” Through several additional cycles of thinking through the rationale for her predictions and how she might test them, Martha specifies her question even further: “Under what conditions do middle school teachers who lack conceptual knowledge of linear functions benefit from LOs that engage them in conceptual learning of linear functions as assessed by changes in their teaching toward a more conceptual emphasis on linear functions?”

Each version of Martha’s question has become more specific. This has occurred as she has (a) identified a starting condition for the teachers—they lack conceptual knowledge of linear functions, (b) specified the mathematics content as linear functions, and (c) included a condition or purpose of the LO—it is aimed at conceptual learning.

Because of the way Martha’s question is now phrased, her predictions will require thinking about the conditions that could influence what teachers learn from the LOs and how this learning could affect their teaching. She might predict that if teachers engaged in LOs that extended over multiple sessions, they would develop deeper understanding which would, in turn, prompt changes in their teaching. Or she might predict that if the LOs included examples of how their conceptual learning could translate into different instructional activities for their students, teachers would be more likely to change their teaching. Reasons for these predictions would likely come from research about the effects of professional development on teachers’ practice.

As Martha thinks about testing her predictions, she realizes it will probably be easier to measure the conditions under which teachers are learning than the changes in the conceptual emphasis in their instruction. She makes a note to continue searching the literature for ways to measure the “conceptualness” of teaching.

As she refines her predictions and expresses her reasons for the predictions, she formulates a hypothesis (in this case several hypotheses) that will guide her research. As she makes predictions and develops the rationales for these predictions, she will probably continue revising her question. She might decide, for example, that she is not interested in studying the condition of different numbers of LO sessions and so decides to remove this condition from consideration by including in her question something like “. . . over five 2-hour sessions . . .”

At this point, Martha has developed a research question, articulated a number of predictions, and developed rationales for them. Her current question is: “Under what conditions do middle school teachers who lack conceptual knowledge of linear functions benefit from five 2-hour LO sessions that engage them in conceptual learning of linear functions as assessed by changes in their teaching toward a more conceptual emphasis on linear functions?” Her hypothesis is:

Prediction: Participating teachers will show changes in their teaching with a greater emphasis on conceptual understanding, with larger changes on linear function topics directly addressed in the LOs than on other topics.

Brief Description of Rationale: (1) Past research has shown correlations between teachers’ specific mathematics knowledge of a topic and the quality of their teaching of that topic. This does not mean an increase in knowledge causes higher-quality teaching, but it allows for that possibility. (2) Transfer is usually difficult for teachers, but the examples developed during the LO sessions will help them use what they learned to teach for conceptual understanding. This is because the examples developed during the LO sessions are much like those that will be used by the teachers. So larger changes will be found when teachers are teaching the linear function topics addressed in the LOs.

Notice it is more straightforward to imagine how Martha could test this prediction because it is more precise than previous predictions. Notice also that by asking how to test a particular prediction, Martha will be faced with a decision about whether testing this prediction will tell her something she wants to learn. If not, she can return to the research question and consider how to specify it further and, perhaps, constrain further the conditions that could affect the data.

As Martha formulates her hypotheses and goes through multiple cycles of refining her question(s), articulating her predictions, and developing her rationales, she is constantly building the theoretical framework for her study. Because the theoretical framework is the topic for Chap. 3, we will pause here and pick up Martha’s story in the next chapter. Spoiler alert: Martha’s experience contains some surprising twists and turns.

Before leaving Martha, however, we point out two aspects of the process in which she has been engaged. First, it can be useful to think about the process as identifying (1) the variables targeted in her predictions, (2) the mechanisms she believes explain the relationships among the variables, and (3) the definitions of all the terms that are special to her educational problem. By variables, we mean things that can be measured and, when measured, can take on different values. In Martha’s case, the variables are the conceptualness of teaching and the content topics addressed in the LOs. The mechanisms are cognitive processes that enable teachers to see the relevance of what they learn in the LOs to their own teaching and that enable the transfer of learning from one setting to another. Definitions are the precise descriptions of how the important ideas relevant to the research are conceptualized. In Martha’s case, definitions must be provided for terms like conceptual understanding, linear functions, LOs, each of the topics related to linear functions, instructional setting, and knowledge transfer.

A second aspect of the process is a practice that Martha acquired as part of her graduate program, a practice that can go unnoticed. Martha writes out, in full sentences, her thinking as she wrestles with her research question, her predictions of the answers, and the rationales for her predictions. Writing is a tool for organizing thinking and we recommend you use it throughout the scientific inquiry process. We say more about this at the end of the chapter.

Here are the questions Martha wrote as she developed a clearer sense of what question she wanted to answer and what answer she predicted. The list shows the increasing refinement that occurred as she continued to read, think, talk, and write.

Early questions: What kinds of LOs do most teachers experience? How do these experiences change teachers’ practices and beliefs? Are some LOs more effective than others? What makes them more effective?

First focused question: What makes LOs for teachers effective for improving their teaching for conceptual understanding?

Question after trying to predict the answer and imagining how to test the prediction: Do LOs that engage middle school mathematics teachers in studying mathematics content help teachers teach this same content with more of a conceptual emphasis?

Question after developing an initial rationale for her prediction: Under what conditions do LOs that engage middle school mathematics teachers in studying mathematics content help teachers teach this same content with more of a conceptual emphasis?

Question after developing a more precise prediction and richer rationale: Under what conditions do middle school teachers who lack conceptual knowledge of linear functions benefit from five 2-hour LO sessions that engage them in conceptual learning of linear functions as assessed by changes in their teaching toward a more conceptual emphasis on linear functions?

Part IV. An Illustrative Dialogue

The story of Martha described the major steps she took to refine her thinking. However, there is a lot of work that went on behind the scenes that wasn’t part of the story. For example, Martha had conversations with fellow students and professors that sharpened her thinking. What do these conversations look like? Because they are such an important part of the inquiry process, it will be helpful to “listen in” on the kinds of conversations that students might have with their advisors.

Here is a dialogue between a beginning student, Sam (S), and their advisor, Dr. Avery (A). They are meeting to discuss data Sam collected for a course project. The dialogue below is happening very early on in Sam’s conceptualization of the study, prior even to systematic reading of the literature.

S: Thanks for meeting with me today. As you know, I was able to collect some data for a course project a few weeks ago, but I’m having trouble analyzing the data, so I need your help. Let me try to explain the problem. As you know, I wanted to understand what middle-school teachers do to promote girls’ achievement in a mathematics class. I conducted four observations in each of three teachers’ classrooms. I also interviewed each teacher once about the four lessons I observed, and I interviewed two girls from each of the teachers’ classes. Obviously, I have a ton of data. But when I look at all these data, I don’t really know what I learned about my topic. When I was observing the teachers, I thought I might have observed some ways the teachers were promoting girls’ achievement, but then I wasn’t sure how to interpret my data. I didn’t know if the things I was observing were actually promoting girls’ achievement.

A: What were some of your observations?

S: Well, in a couple of my classroom observations, teachers called on girls to give an answer, even when the girls didn’t have their hands up. I thought that this might be a way that teachers were promoting the girls’ achievement. But then the girls didn’t say anything about that when I interviewed them and also the teachers didn’t do it in every class. So, it’s hard to know what effect, if any, this might have had on their learning or their motivation to learn. I didn’t want to ask the girls during the interview specifically about the teacher calling on them, and without the girls bringing it up themselves, I didn’t know if it had any effect.

A: Well, why didn’t you want to ask the girls about being called on?

S: Because I wanted to leave it as open as possible; I didn’t want to influence what they were going to say. I didn’t want to put words in their mouths. I wanted to know what they thought the teacher was doing that promoted their mathematical achievement and so I only asked the girls general questions, like “Do you think the teacher does things to promote girls’ mathematical achievement?” and “Can you describe specific experiences you have had that you believe do and do not promote your mathematical achievement?”

A: So then, how did they answer those general questions?

S: Well, with very general answers, such as that the teacher knows their names, offers review sessions, grades their homework fairly, gives them opportunities to earn extra credit, lets them ask questions, and always answers their questions. Nothing specific that helps me know what teaching actions specifically target girls’ mathematics achievement.

A: OK. Any ideas about what you might do next?

S: Well, I remember that when I was planning this data collection for my course, you suggested I might want to be more targeted and specific about what I was looking for. I can see now that more targeted questions would have made my data more interpretable in terms of connecting teaching actions to the mathematical achievement of girls. But I just didn’t want to influence what the girls would say.

A: Yes, I remember when you were planning your course project, you wanted to keep it open. You didn’t want to miss out on discovering something new and interesting. What do you think now about this issue?

S: Well, I still don’t want to put words in their mouths. I want to know what they think. But I see that if I ask really open questions, I have no guarantee they will talk about what I want them to talk about. I guess I still like the idea of an open study, but I see that it’s a risky approach. Leaving the questions too open meant I didn’t constrain their responses and there were too many ways they could interpret and answer the questions. And there are too many ways I could interpret their responses.

By this point in the dialogue, Sam has realized that open data (i.e., data not testing a specific prediction) is difficult to interpret. In the next part, Dr. Avery explains why collecting open data was not helping Sam achieve the goals for their study that had motivated collecting open data in the first place.

A: Yes, I totally agree. Even for an experienced researcher, it can be difficult to make sense of this kind of open, messy data. However, if you design a study with a more specific focus, you can create questions for participants that are more targeted because you will be interested in their answers to these specific questions. Let’s reflect back on your data collection. What can you learn from it for the future?

S: When I think about it now, I realize that I didn’t think about the distinction between all the different constructs at play in my study, and I didn’t choose which one I was focusing on. One construct was the teaching moves that teachers think could be promoting achievement. Another is what teachers deliberately do to promote girls’ mathematics achievement, if anything. Another was the teaching moves that actually do support girls’ mathematics achievement. Another was what teachers were doing that supported girls’ mathematics achievement versus the mathematics achievement of all students. Another was students’ perception of what their teacher was doing to promote girls’ mathematics achievement. I now see that any one of these constructs could have been the focus of a study and that I didn’t really decide which of these was the focus of my course project prior to collecting data.

A: So, since you told me that the topic of this course project is probably what you’ll eventually want to study for your dissertation, which of these constructs are you most interested in?

S: I think I’m more interested in the teacher moves that teachers deliberately do to promote girls’ achievement. But I’m still worried about asking teachers directly and getting too specific about what they do because I don’t want to bias what they will say. And I chose qualitative methods and an exploratory design because I thought it would allow for a more open approach, an approach that helps me see what’s going on and that doesn’t bias or predetermine the results.

A: Well, it seems to me you are conflating three issues. One issue is how to conduct an unbiased study. Another issue is how specific to make your study. And the third issue is whether or not to choose an exploratory or qualitative study design. Those three issues are not the same. For example, designing a study that’s more open or more exploratory is not how researchers make studies fair and unbiased. In fact, it would be quite easy to create an open study that is biased. For example, you could ask very open questions and then interpret the responses in a way that unintentionally, and even unknowingly, aligns with what you were hoping the findings would say. Actually, you could argue that by adding more specificity and narrowing your focus, you’re creating constraints that prevent bias. The same goes for an exploratory or qualitative study; they can be biased or unbiased. So, let’s talk about what is meant by getting more specific. Within your new focus on what teachers deliberately do, there are many things that would be interesting to look at, such as teacher moves that address math anxiety, moves that allow girls to answer questions more frequently, moves that are specifically fitted to student thinking about specific mathematical content, and so on. What are one or two things that are most interesting to you? One way to answer this question is by thinking back to where your interest in this topic began.

In the preceding part of the dialogue, Dr. Avery explained how the goals Sam had for their study were not being met with open data. In the next part, Sam begins to articulate a prediction, which Sam and Dr. Avery then sharpen.

S: Actually, I became interested in this topic because of an experience I had in college when I was in a class of mostly girls. During whole class discussions, we were supposed to critically evaluate each other’s mathematical thinking, but we were too polite to do that. Instead, we just praised each other’s work. But it was so different in our small groups. It seemed easier to critique each other’s thinking and to push each other to better solutions in small groups. I began wondering how to get girls to be more critical of each other’s thinking in a whole class discussion in order to push everyone’s thinking.

A: Okay, this is great information. Why not use this idea to zoom in on a more manageable and interpretable study? You could look specifically at how teachers support girls in critically evaluating each other’s thinking during whole class discussions. That would be a much more targeted and specific topic. Do you have predictions about what teachers could do in that situation, keeping in mind that you are looking specifically at girls’ mathematical achievement, not students in general?

S: Well, what I noticed was that small groups provided more social and emotional support for girls, whereas the whole class discussion did not provide that same support. The girls felt more comfortable critiquing each other’s thinking in small groups. So, I guess I predict that when the social and emotional supports that are present in small groups are extended to the whole class discussion, girls would be more willing to evaluate each other’s mathematical thinking critically during whole class discussion. I guess ultimately, I’d like to know how the whole class discussion could be used to enhance, rather than undermine, the social and emotional support that is present in the small groups.

A: Okay, then where would you start? Would you start with a study of what the teachers say they will do during whole class discussion and then observe if that happens during whole class discussion?

S: But part of my prediction also involves the small groups. So, I’d also like to include small groups in my study if possible. If I focus on whole groups, I won’t be exploring what I am interested in. My interest is broader than just the whole class discussion.

That makes sense, but there are many different things you could look at as part of your prediction, more than you can do in one study. For instance, if your prediction is that when the social and emotional supports that are present in small groups are extended to whole class discussions, girls would be more willing to evaluate each other’s mathematical thinking critically during whole class discussions, then you could ask the following questions: What are the social and emotional supports that are present in small groups? In which small groups do they exist? Is it groups that are made up only of girls? Does every small group do this, and for groups that do this, when do these supports get created? What kinds of small group activities that teachers ask them to work on are associated with these supports? Do the same social and emotional supports that apply to small groups even apply to whole group discussion?

All your questions make me realize that my prediction about extending social and emotional supports to whole class discussions first requires me to have a better understanding of the social and emotional supports that exist in small groups. In fact, I first need to find out whether those supports commonly exist in small groups or whether that is just my experience working in small groups. So, I think I will first have to figure out what small groups do to support each other and then, in a later study, I could ask a teacher to implement those supports during whole class discussions and find out how you can do that. Yeah, now I’m seeing that.

The previous part of the dialogue illustrates how continuing to ask questions about one’s initial prediction is a good way to make it more and more precise (and researchable). In the next part, we see how developing a precise prediction has the added benefit of setting the researcher up for future studies.

Yes, I agree that for your first study, you should probably look at small groups. In other words, you should focus on only a part of your prediction for now, namely the part that says there are social and emotional supports in small groups that support girls in critiquing each other’s thinking. That begins to sharpen the focus of your prediction, but you’ll want to continue to refine it. For example, right now, the question that this prediction leads to is a question with a yes or no answer, but what you’ve said so far suggests to me that you are looking for more than that.

Yes, I want to know more than just whether there are supports. I’d like to know what kinds. That’s why I wanted to do a qualitative study.

Okay, this aligns more with my thinking about research as being prediction-driven. It’s about collecting data that would help you revise your existing predictions into better ones. What I mean is that you would focus on collecting data that would allow you to refine your prediction, make it more nuanced, and go beyond what is already known. Does that make sense, and if so, what would that look like for your prediction?

Oh yes, I like that. I guess that would mean that, based on the data I collect for this next study, I could develop a more refined prediction that, for example, more specifically identifies and differentiates between different kinds of social and emotional supports that are present in small groups, or maybe that identifies the kinds of small groups that they occur in, or that predicts when and how frequently or infrequently they occur, or about the features of the small group tasks in which they occur, etc. I now realize that, although I chose qualitative research to make my study be more open, really the reason qualitative research fits my purposes is because it will allow me to explore fine-grained aspects of social and emotional supports that may exist for girls in small groups.

Yes, exactly! And then, based on the data you collect, you can include in your revised prediction those new fine-grained aspects. Furthermore, you will have a story to tell about your study in your written report, namely the story about your evolving prediction. In other words, your written report can largely tell how you filled out and refined your prediction as you learned more from carrying out the study. And even though you might not use them right away, you are also going to develop new predictions, about social and emotional supports in small groups and about your aim of extending them to whole-class discussions, that you would not have thought of had you not done this study. That will set you up to follow up on those new predictions in future studies. For example, you might have more refined ideas after you collect the data about the goals for critiquing student thinking in small groups versus the goals for critiquing student thinking during whole class discussion. You might even begin to think that some of the social and emotional supports you observe are not replicable or even applicable to or appropriate for whole-class discussions, because the supports play different roles in different contexts. So, to summarize what I’m saying, what you look at in this study, even though it will be very focused, sets you up for a research program that will allow you to more fully investigate your broader interest in this topic, where each new study builds on your prior body of work. That’s why it is so important to be explicit about the best place to start this research, so that you can build on it.

I see what you are saying. We started this conversation talking about my course project data. What I think I should have done was figure out explicitly what I needed to learn with that study with the intention of then taking what I learned and using it as the basis for the next study. I didn’t do that, and so I didn’t collect data that pushed forward my thinking in ways that would guide my next study. It would be as if I was starting over with my next study.

Sam and Dr. Avery have just explored how specifying a prediction reveals additional complexities that could become fodder for developing a systematic research program. Next, we watch Sam beginning to recognize the level of specificity required for a prediction to be testable.

One thing that would have really helped would have been if you had had a specific prediction going into your data collection for your course project.

Well, I didn’t really have much of an explicit prediction in mind when I designed my methods.

Think back, you must have had some kind of prediction, even if it was implicit.

Well, yes, I guess I was predicting that teachers would enact moves that supported girls’ mathematical achievement. And I observed classrooms to identify those teacher moves, I interviewed teachers to ask them about the moves I observed, and I interviewed students to see if they mentioned those moves as promoting their mathematical achievement. The goal of my course project was to identify teacher moves that support girls’ mathematical achievement. And my specific research question was: What teacher moves support girls’ mathematical achievement?

So, really you were asking the teacher and students to show and tell you what those moves are and the effects of those moves, as a result putting the onus on your participants to provide the answers to your research question for you. I have an idea: let’s try a thought experiment. You come up with data collection methods for testing the prediction that there are social and emotional supports in small groups that support girls in critiquing each other’s thinking that still put the onus on the participants. And then I’ll see if I can think of data collection methods that would not put the onus on the participants.

Hmm, well ... I guess I could simply interview girls who participated in small groups and ask them “are there social and emotional supports that you use in small groups that support your group in critiquing each other’s thinking and if so, what are they?” In that case, I would be putting the onus on them to be aware of the social dynamics of small groups and to have thought about these constructs as much as I have. Okay, now can you continue the thought experiment? What might the data collection methods look like if I didn’t put the onus on the participants?

First, I would pick a setting in which it was only girls at this point to reduce the number of variables. Then, personally I would want to observe a lot of groups of girls interacting in groups around tasks. I would be looking for instances when the conversation about students’ ideas was shut down and instances when the conversation about students’ ideas involved critiquing of ideas and building on each other’s thinking. I would also look at what happened just before and during those instances, such as whether the student continued to talk after their thinking was critiqued, whether other students did anything to encourage the student to build on their own thinking (i.e., constructive criticism), and how they supported or shut down continued participation. In fact, now that I think about it, “critiquing each other’s thinking” can be defined in a number of different ways. It could mean just commenting on someone’s thinking, judging correctness and incorrectness, constructive criticism that moves the thinking forward, etc. If you put the onus on the participants to answer your research question, you are stuck with their definition, and they won’t have thought about this very much, if at all.

I think that what you are also saying is that my definitions would affect my data collection. If I think that critiquing each other’s thinking means that the group moves their thinking forward toward more valid and complete mathematical solutions, then I’m going to focus on different moves than if I define it another way, such as just making a comment on each other’s thinking and making each other feel comfortable enough to keep participating. In fact, am I going to look at individual instances of critiquing or look at entire sequences in which the critiquing leads to a goal? This seems like a unit of analysis question, and I would need to develop a more nuanced prediction that would make explicit what that unit of analysis is.

I agree, your definition of “critiquing each other’s thinking” could entirely change what you are predicting. One prediction could be based on defining critiquing as a one-shot event in which someone makes one comment on another person’s thinking. In this case, the prediction would be that there are social and emotional supports in small groups that support girls in making an evaluative comment on another student’s thinking. Another prediction could be based on defining critiquing as a back-and-forth process in which the thinking gets built on and refined. In that case, the prediction would be something like this: there are social and emotional supports in small groups that support girls in critiquing each other’s thinking in ways that do not shut down the conversation but that lead to sustained conversations that move each other toward more valid and complete solutions.

Well, I think I am more interested in the second prediction because it is more compatible with my long-term interests, which are that I’m interested in extending small group supports to whole class discussions. The second prediction is more appropriate for eventually looking at girls in whole class discussion. During whole class discussion, the teacher tries to get a sustained conversation going that moves the students’ thinking forward. So, if I learn about small group supports that lead to sustained conversations that move each other toward more valid and complete solutions, those supports might transfer to whole class discussions.

In the previous part of the dialogue, Dr. Avery and Sam showed how narrowing down a prediction to one that is testable requires making numerous important decisions, including how to define the constructs referred to in the prediction. In the final part of the dialogue, Dr. Avery and Sam begin to outline the reading Sam will have to do to develop a rationale for the specific prediction.

Do you see how your prediction and definitions are getting more and more specific? You now need to read extensively to further refine your prediction.

Well, I should probably read about micro dynamics of small group interactions, anything about interactions in small groups, and what is already known about small group interactions that support sustained conversations that move students’ thinking toward more valid and complete solutions. I guess I could also look at research on whole-class discussion methods that support sustained conversations that move the class to more mathematically valid and complete solutions, because it might give me ideas for what to look for in the small groups. I might also need to focus on research about how learners develop understandings about a particular subject matter so that I know what “more valid and complete solutions” look like. I also need to read about social and emotional supports but focus on how they support students cognitively, rather than in other ways.

Sounds good, let’s get together after you have processed some of this literature and we can talk about refining your prediction based on what you read and also the methods that will best suit testing that prediction.

Great! Thanks for meeting with me. I feel like I have a much better set of tools that push my own thinking forward and allow me to target something specific that will lead to more interpretable data.

Part V. Is It Always Possible to Formulate Hypotheses?

In Chap. 1, we noted you are likely to read that research does not require formulating hypotheses. Some sources describe doing research without making predictions or developing rationales for those predictions. Some researchers say you cannot always make predictions because you do not know enough about the situation. In fact, some argue for the value of not making predictions (e.g., Glaser & Holton, 2004; Merton, 1968; Nemirovsky, 2011). These are important points of view, so we will devote this section to discussing them.

Can You Always Predict What You Will Find?

One reason some researchers say you do not need to make predictions is that it can be difficult to imagine what you will find. This argument comes up most often for descriptive studies. Suppose you want to describe the nature of a situation you do not know much about. Can you still make a prediction about what you will find? We believe that, although you do not know exactly what you will find, you probably have a hunch or, at a minimum, a very fuzzy idea. It would be unusual to ask a question about a situation you want to know about without at least a fuzzy inkling of what you might find. Otherwise, the original question just would not occur to you. We acknowledge you might have only a vague idea of what you will find and you might not have much confidence in your prediction. However, we expect if you monitor your own thinking you will discover you have developed a suspicion along the way, regardless of how vague the suspicion might be. Through the cyclic process we discussed above, that suspicion or hunch gradually evolves and turns into a prediction.

The Benefits of Making Predictions Even When They Are Wrong: An Example from the 1970s

One of us was a graduate student at the University of Wisconsin in the late 1970s, assigned as a research assistant to a project that was investigating young children’s thinking about simple arithmetic. A new curriculum was being written, and the developers wanted to know how to introduce the earliest concepts and skills to kindergarten and first-grade children. The directors of the project did not know what to expect because, at the time, there was little research on five- and six-year-olds’ pre-instruction strategies for adding and subtracting.

After consulting what literature was available, talking with teachers, analyzing the nature of different types of addition and subtraction problems, and debating with each other, the research team formulated some hypotheses about children’s performance. Following the usual assumptions at the time and recognizing the new curriculum would introduce the concepts, the researchers predicted that, before instruction, most children would not be able to solve the problems. Based on the rationale that some young children did not yet recognize the simple form for written problems (e.g., 5 + 3 = ___), the researchers predicted that the best chance for success would be to read problems as stories (e.g., Jesse had 5 apples and then found 3 more. How many does she have now?). They reasoned that, even though children would have difficulty on all the problems, some story problems would be easier because the semantic structure is easier to follow. For example, they predicted the above story about adding 3 apples to 5 would be easier than a problem like, “Jesse had some apples in the refrigerator. She put in 2 more and now has 6. How many were in the refrigerator at the beginning?” Based on the rationale that children would need to count to solve the problems and that it can be difficult to keep track of the numbers, they predicted children would be more successful if they were given counters. Finally, accepting the common reasoning that larger numbers are more difficult than smaller numbers, they predicted children would be more successful if all the numbers in a problem were below 10.

Although these predictions were not very precise and the rationales were not strongly convincing, these hypotheses prompted the researchers to design the study to test their predictions. This meant they would collect data by presenting a variety of problems under a variety of conditions. Because the goal was to describe children’s thinking, problems were presented to students in individual interviews. Problems with different semantic structures were included, counters were available for some problems but not others, and some problems had sums to 9 whereas others had sums to 20 or more.

The punchline of this story is that gathering data under these conditions, prompted by the predictions, made all the difference in what the researchers learned. Contrary to predictions, children could solve addition and subtraction problems before instruction. Counters were important because almost all the solution strategies were based on counting, which meant that memory was an issue: many strategies require counting in two ways simultaneously. For example, subtracting 4 from 7 was usually solved by counting down from 7 while counting up from 1 to 4 to keep track of the counting down. Because children acted out the stories with their counters, the semantic structure of the story was also important. Stories that were easier to read and write were also easier to solve.

To make a very long story very short, other researchers were, at about the same time, reporting similar results about children’s pre-instruction arithmetic capabilities. A clear pattern emerged regarding the relative difficulty of different problem types (semantic structures) and the strategies children used to solve each type. As the data were replicated, the researchers recognized that kindergarten and first-grade teachers could make good use of this information when they introduced simple arithmetic. This is how Cognitively Guided Instruction (CGI) was born (Carpenter et al., 1989; Fennema et al., 1996).

To reiterate, the point of this example is that the study conducted to describe children’s thinking would have looked quite different if the researchers had made no predictions. They would have had no reason to choose the particular problems and present them under different conditions. The fact that some of the predictions were completely wrong is not the point. The predictions created the conditions under which they could be tested which, in turn, created learning opportunities for the researchers that would not have existed without the predictions. The lesson is that even research that aims simply to describe a phenomenon can benefit from hypotheses. As signaled in Chap. 1, this also serves as another example of “failing productively.”

Suggestions for What to Do When You Do Not Have Predictions

There likely are exceptions to our claim about being able to make a prediction about what you will find. For example, there could be rare cases where researchers truly have no idea what they will find, can come up with no predictions or even hunches, and can locate no reported research on related phenomena that would offer some guidance. If you find yourself in this position, we suggest one of three approaches: revise your question, conduct a pilot study, or choose another question.

Because there are many advantages to making predictions explicit and then writing out the reasons for these predictions, one approach is to adjust your question just enough to allow you to make a prediction. Perhaps you can build on descriptions that other researchers have provided for related situations and consider how you can extend this work. Building on previous descriptions will enable you to make predictions about the situation you want to describe.

A second approach is to conduct a small pilot study or, better, a series of small pilot studies to develop some preliminary ideas of what you might find. If you can identify a small sample of participants who are similar to those in your study, you can try out at least some of your research plans to help make and refine your predictions. As we detail later, you can also use pilot studies to check whether key aspects of your methods (e.g., tasks, interview questions, data collection methods) work as you expect.

A third approach is to return to your list of interests and choose one that has been studied previously. Sometimes this is the wisest choice. It is very difficult for beginning researchers to conduct research in brand-new areas where no hunches or predictions are possible. In addition, the contributions of this research can be limited. Recall the earlier story about one of us “failing productively” by completing a dissertation in a somewhat new area. If, after an exhaustive search, you find that no one has investigated the phenomenon in which you are interested or even related phenomena, it can be best to move in a different direction. You will read recommendations in other sources to find a “gap” in the research and develop a study to “fill the gap.” This can be helpful advice if the gap is very small. However, if the gap is large, too large to predict what you might find, the study will present severe challenges. It will be more productive to extend work that has already been done than to launch into an entirely new area.

Should You Always Try to Predict What You Will Find?

In short, our answer to the question in the heading is “yes.” But this calls for further explanation.

Suppose you want to observe a second-grade classroom in order to investigate how students talk about adding and subtracting whole numbers. You might think, “I don’t want to bias my thinking; I want to be completely open to what I see in the classroom.” Sam shared a similar point of view at the beginning of the dialogue: “I wanted to leave it as open as possible; I didn’t want to influence what they were going to say.” Some researchers say that beginning your research study by making predictions is inappropriate precisely because it will bias your observations and results. The argument is that by bringing a set of preconceptions, you will confirm what you expected to find and be blind to other observations and outcomes. The following quote illustrates this view: “The first step in gaining theoretical sensitivity is to enter the research setting with as few predetermined ideas as possible—especially logically deducted, a priori hypotheses. In this posture, the analyst is able to remain sensitive to the data by being able to record events and detect happenings without first having them filtered through and squared with pre-existing hypotheses and biases” (Glaser, 1978, pp. 2–3).

We take a different point of view. In fact, we believe there are several compelling reasons for making your predictions explicit.

Making Your Predictions Explicit Increases Your Chances of Productive Observations

Because your predictions are an extension of what is already known, they prepare you to identify more nuanced relationships that can advance our understanding of a phenomenon. For example, rather than simply noticing, in a general sense, that students talking about addition and subtraction leads them to better understandings, you might, based on your prediction, make the specific observation that talking about addition and subtraction in a particular way helps students to think more deeply about a particular concept related to addition and subtraction. Going into a study without predictions can bring less sensitivity to a phenomenon, not more. Drawing on knowledge about related phenomena by reading the literature and conducting pilot studies allows you to be much more sensitive and your observations to be more productive.

Making Your Predictions Explicit Allows You to Guard Against Biases

Some genres and methods of educational research are, in fact, rooted in philosophical traditions (e.g., Husserl, 1929/1973) that explicitly call for researchers to temporarily “bracket” or set aside existing theory as well as their prior knowledge and experience to better enter into the experience of the participants in the research. However, this does not mean ignoring one’s own knowledge and experience or turning a blind eye to what has been learned by others. Much more than the simplistic image of emptying one’s mind of preconceptions and implicit biases (arguably an impossible feat to begin with), the goal is to be as reflective as possible about one’s prior knowledge and conceptions and as transparent as possible about how they may guide observations and shape interpretations (Levitt et al., 2018).

We believe it is better to be honest about the predictions you are almost sure to have because then you can deliberately plan to minimize the chances they will influence what you find and how you interpret your results. For starters, it is important to recognize that acknowledging you have some guesses about what you will find does not make them more influential. Because you are likely to have them anyway, we recommend being explicit about what they are. It is easier to deal with biases that are explicit than those that lurk in the background and are not acknowledged.

What do we mean by “deal with biases”? Some journals require you to include a statement about your “positionality” with respect to the participants in your study and the observations you are making to gather data. Formulating clear hypotheses is, in our view, a direct response to this request. The reasons for your predictions are your explicit statements about your positionality. Often there are methodological strategies you can use to protect the study from undue influences of bias. In other words, making your vague predictions explicit can help you design your study so you minimize the bias of your findings.

Making Your Predictions Explicit Can Help You See What You Did Not Predict

Making your predictions explicit does not need to blind you to what is different from what you expected. It does not need to force you to see only what you want to see. Instead, it can actually increase your sensitivity to noticing features of the situation that are surprising, features you did not predict. Results can stand out when you did not expect to see them.

In contrast, not bringing your biases to consciousness might subtly shift your attention away from these unexpected results in ways you are not aware of. This path can lead you to claim there were no biases and no unexpected findings without being conscious of either. You cannot observe everything, and some things inevitably will be overlooked. If you have predicted what you will see, you can design your study so that the unexpected results become more salient rather than less.

Returning to the example of observing a second-grade classroom, we note that the field already knows a great deal about how students talk about addition and subtraction. Being cognizant of what others have observed allows you to enter the classroom with some clear predictions about what will happen. The rationales for these predictions are based on all the related knowledge you have before stepping into the classroom, and the predictions and rationales help you to better deal with what you see. This is partly because you are likely to be surprised by the things you did not anticipate. There is almost always something that will surprise you because your predictions will almost always be incomplete or too general. This sensitivity to the unanticipated—the sense of surprise that sparks your curiosity—is an indication of your openness to the phenomenon you are studying.

Making Your Predictions Explicit Allows You to Plan in Advance

Recall from Chap. 1 the descriptor of scientific inquiry: “Experience carefully planned in advance.” If you make no predictions about what might happen, it is very difficult, if not impossible, to plan your study in advance. Again, you cannot observe everything, so you must make decisions about what you will observe. What kind of data will you plan to collect? Why would you collect these data instead of others? If you have no idea what to expect, on what basis will you make these consequential decisions? Even if your predictions are vague and your rationales for the predictions are a bit shaky, at least they provide a direction for your plan. They allow you to explain why you are planning this study and collecting these data. They allow you to “carefully plan in advance.”

Making Your Predictions Explicit Allows You to Put Your Rationales in Harm’s Way

Rationales are developed to justify the predictions. Rationales represent your best reasoning about the research problem you are studying. How can you tell whether your reasoning is sound? You can try it out with colleagues. However, the best way to test it is to put it in “harm’s way” (Cobb, Confrey, diSessa, Lehrer, & Schauble, 2003, p. 10). And the best approach to putting your reasoning in harm’s way is to test the predictions it generates. Regardless of whether you are conducting a qualitative or quantitative study, rationales can be improved only if they generate testable predictions. This is possible only if predictions are explicit and precise. As we described earlier, rationales are evaluated for their soundness and refined in light of the specific differences between predictions and empirical observations.

Making Your Predictions Explicit Forces You to Organize and Extend Your (and the Field’s) Thinking

By writing out your predictions (even hunches or fuzzy guesses) and by reflecting on why you have these predictions and making these reasons explicit for yourself, you are advancing your thinking about the questions you really want to answer. This means you are making progress toward formulating your research questions and your final hypotheses. Making more progress in your own thinking before you conduct your study increases the chances your study will be of higher quality and will be exactly the study you intended. Making predictions, developing rationales, and imagining tests are tools you can use to push your thinking forward before you even collect data.

Suppose you wonder how preservice teachers in your university’s teacher preparation program will solve particular kinds of math problems. You are interested in this question because you have noticed several PSTs solve them in unexpected ways. As you ask the question you want to answer, you make predictions about what you expect to see. When you reflect on why you made these predictions, you realize that some PSTs might use particular solution strategies because they were taught to use some of them in an earlier course, and they might believe you expect them to solve the problems in these ways. By being explicit about why you are making particular predictions, you realize that you might be answering a different question than you intend (“How much do PSTs remember from previous courses?” or even “To what extent do PSTs believe different instructors have similar expectations?”). Now you can either change your question or change the design of your study (i.e., the sample of students you will use) or both. You are advancing your thinking by being explicit about your predictions and why you are making them.

The Costs of Not Making Predictions

Avoiding making predictions, for whatever reason, comes with significant costs. It prevents you from learning very much about your research topic. It would require not reading related research, not talking with your colleagues, and not conducting pilot studies because, if you do, you are likely to find a prediction creeping into your thinking. Not doing these things would forego the benefits of advancing your thinking before you collect data. It would amount to conducting the study with as little forethought as possible.

Part VI. How Do You Formulate Important Hypotheses?

We provided a partial answer in Chap. 1 to the question of a hypothesis’ importance when we encouraged considering the ultimate goal to which a study’s findings might contribute. You might want to reread Part III of Chap. 1, where we offered our opinions about the purposes of doing research. We also recommend reading the March 2019 editorial in the Journal for Research in Mathematics Education (Cai et al., 2019b), in which we address what constitutes important educational research.

As we argued in Chap. 1 and in the March 2019 editorial, a worthy ultimate goal for educational research is to improve the learning opportunities for all students. However, arguments can be made for other ultimate goals as well. To gauge the importance of your hypotheses, think about how clearly you can connect them to a goal the educational community considers important. In addition, given the descriptors of scientific inquiry proposed in Chap. 1 , think about how testing your hypotheses will help you (and the community) understand what you are studying. Will you have a better explanation for the phenomenon after your study than before?

Although we address the question of importance again, and in more detail, in Chap. 5 , it is useful to know here that you can determine the significance or importance of your hypotheses when you formulate them. The importance need not depend on the data you collect or the results you report. The importance can come from the fact that, based on the results of your study, you will be able to offer revised hypotheses that help the field better understand an important issue. In large part, it is these revised hypotheses rather than the data that determine a study’s importance.

A critical caveat to this discussion is that few hypotheses are self-evidently important. They are important only if you make the case for their importance. Even if you follow closely the guidelines we suggest for formulating an important hypothesis, you must develop an argument that convinces others. This argument will be presented in the research paper you write.

Consider Martha’s hypothesis presented earlier. When we left Martha, she predicted that “Participating teachers will show changes in their teaching with a greater emphasis on conceptual understanding with larger changes on linear function topics directly addressed in the LOs than on other topics.” For researchers and educators not intimately familiar with this area of research, it is not apparent why someone should spend a year or more conducting a dissertation to test this prediction. Her rationale, summarized earlier, begins to describe why this could be an important hypothesis. But it is by writing a clear argument that explains her rationale to readers that she will convince them of its importance.

How Martha fills in her rationale so she can create a clear written argument for its importance is taken up in Chap. 3 . As we indicated, Martha’s work in this regard led her to make some interesting decisions, in part due to her own assessment of what was important.

Part VII. Beginning to Write the Research Paper for Your Study

It is common to think that researchers conduct a study and then, after the data are collected and analyzed, begin writing the paper about the study. We recommend an alternative, especially for beginning researchers. We believe it is better to write drafts of the paper at the same time you are planning and conducting your study. The paper will gradually evolve as you work through successive phases of the scientific inquiry process. Consequently, we will call this paper your evolving research paper .

You will use your evolving research paper to communicate your study, but you can also use writing as a tool for thinking and organizing your thinking while planning and conducting the study. Used as a tool for thinking, you can write drafts of your ideas to check on the clarity of your thinking, and then you can step back and reflect on how to clarify it further. Be sure to avoid jargon and general terms that are not well defined. Ask yourself whether someone not in your field, maybe a sibling, a parent, or a friend, would be able to understand what you mean. You are likely to write multiple drafts with lots of scribbling, crossing out, and revising.

Used as a tool for communicating, writing the best version of what you know before moving to the next phase will help you record your decisions and the reasons for them before you forget important details. This best-version-for-now paper also provides the basis for your thinking about the next phase of your scientific inquiry.

At this point in the process, you will be writing your (research) questions, the answers you predict, and the rationales for your predictions. The predictions you make should be direct answers to your research questions and should flow logically from (or be directly supported by) the rationales you present. In addition, you will have a written statement of the study’s purpose or, said another way, an argument for the importance of the hypotheses you will be testing. It is in the early sections of your paper that you will convince your audience about the importance of your hypotheses.

In our experience, presenting research questions is a more common form of stating the goal of a research study than presenting well-formulated hypotheses. Authors sometimes present a hypothesis, often as a simple prediction of what they might find. The hypothesis is then forgotten and not used to guide the analysis or interpretations of the findings. In other words, authors seldom use hypotheses to do the kind of work we describe. This means that many research articles you read will not treat hypotheses as we suggest. We believe these are missed opportunities to present research in a more compelling and informative way. We intend to provide enough guidance in the remaining chapters for you to feel comfortable organizing your evolving research paper around formulating, testing, and revising hypotheses.

While we were editing one of the leading research journals in mathematics education (JRME), we conducted a study of reviewers’ critiques of papers submitted to the journal. Two of the five most common concerns were: (1) the research questions were unclear, and (2) the answers to the questions did not make a substantial contribution to the field. These are likely to be major concerns for the reviewers of all research journals. We hope the knowledge and skills you have acquired working through this chapter will allow you to write the opening to your evolving research paper in a way that addresses these concerns. Much of the chapter should help make your research questions clear, and the prior section on formulating “important hypotheses” will help you convey the contribution of your study.

Exercise 2.3

Look back at your answers to the sets of questions before part II of this chapter.

Think about how you would argue for the importance of your current interest.

Write your interest in the form of (1) a research problem, (2) a research question, and (3) a prediction with the beginnings of a rationale. You will update these as you read the remaining chapters.

Part VIII. The Heart of Scientific Inquiry

In this chapter, we have described the process of formulating hypotheses. This process is at the heart of scientific inquiry. It is where doing research begins. Conducting research always involves formulating, testing, and revising hypotheses. This is true regardless of your research questions and whether you are using qualitative, quantitative, or mixed methods. Without engaging in this process in a deliberate, intense, relentless way, your study will reveal less than it could. By engaging in this process, you are maximizing what you, and others, can learn from conducting your study.

In the next chapter, we build on the ideas we have developed in the first two chapters to describe the purpose and nature of theoretical frameworks. The term theoretical framework, along with closely related terms like conceptual framework, can be somewhat mysterious for beginning researchers and can seem like a requirement for writing a paper rather than an aid for conducting research. We will show how theoretical frameworks grow from formulating hypotheses—from developing rationales for the predicted answers to your research questions. We will propose some practical suggestions for building theoretical frameworks and show how useful they can be. In addition, we will continue Martha’s story from the point at which we paused earlier—developing her theoretical framework.

Cai, J., Morris, A., Hohensee, C., Hwang, S., Robison, V., Cirillo, M., Kramer, S. L., & Hiebert, J. (2019b). Posing significant research questions. Journal for Research in Mathematics Education, 50 (2), 114–120. https://doi.org/10.5951/jresematheduc.50.2.0114

Carpenter, T. P., Fennema, E., Peterson, P. L., Chiang, C. P., & Loef, M. (1989). Using knowledge of children’s mathematics thinking in classroom teaching: An experimental study. American Educational Research Journal, 26 (4), 385–531.

Fennema, E., Carpenter, T. P., Franke, M. L., Levi, L., Jacobs, V. R., & Empson, S. B. (1996). A longitudinal study of learning to use children’s thinking in mathematics instruction. Journal for Research in Mathematics Education, 27 (4), 403–434.

Glaser, B. G., & Holton, J. (2004). Remodeling grounded theory. Forum: Qualitative Social Research, 5(2). https://www.qualitative-research.net/index.php/fqs/article/view/607/1316

Gournelos, T., Hammonds, J. R., & Wilson, M. A. (2019). Doing academic research: A practical guide to research methods and analysis . Routledge.

Hohensee, C. (2014). Backward transfer: An investigation of the influence of quadratic functions instruction on students’ prior ways of reasoning about linear functions. Mathematical Thinking and Learning, 16 (2), 135–174.

Husserl, E. (1973). Cartesian meditations: An introduction to phenomenology (D. Cairns, Trans.). Martinus Nijhoff. (Original work published 1929).

Levitt, H. M., Bamberg, M., Creswell, J. W., Frost, D. M., Josselson, R., & Suárez-Orozco, C. (2018). Journal article reporting standards for qualitative primary, qualitative meta-analytic, and mixed methods research in psychology: The APA Publications and Communications Board Task Force report. American Psychologist, 73 (1), 26–46.

Medawar, P. (1982). Pluto’s republic [no typo]. Oxford University Press.

Merton, R. K. (1968). Social theory and social structure (Enlarged edition). Free Press.

Nemirovsky, R. (2011). Episodic feelings and transfer of learning. Journal of the Learning Sciences, 20 (2), 308–337. https://doi.org/10.1080/10508406.2011.528316

Vygotsky, L. (1987). The development of scientific concepts in childhood: The design of a working hypothesis. In A. Kozulin (Ed.), Thought and language (pp. 146–209). The MIT Press.

Author information

Authors and Affiliations

School of Education, University of Delaware, Newark, DE, USA

James Hiebert, Anne K Morris & Charles Hohensee

Department of Mathematical Sciences, University of Delaware, Newark, DE, USA

Jinfa Cai & Stephen Hwang

Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License ( http://creativecommons.org/licenses/by/4.0/ ), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Copyright information

© 2023 The Author(s)

About this chapter

Hiebert, J., Cai, J., Hwang, S., Morris, A.K., Hohensee, C. (2023). How Do You Formulate (Important) Hypotheses?. In: Doing Research: A New Researcher’s Guide. Research in Mathematics Education. Springer, Cham. https://doi.org/10.1007/978-3-031-19078-0_2

DOI: https://doi.org/10.1007/978-3-031-19078-0_2

Published: 03 December 2022

Publisher Name: Springer, Cham

Print ISBN: 978-3-031-19077-3

Online ISBN: 978-3-031-19078-0

eBook Packages: Education (R0)


PrepScholar

What Is a Hypothesis and How Do I Write One?

Think about something strange and unexplainable in your life. Maybe you get a headache right before it rains, or maybe you think your favorite sports team wins when you wear a certain color. If you wanted to see whether these are just coincidences or scientific fact, you would form a hypothesis, then create an experiment to see whether that hypothesis is true or not.

But what is a hypothesis, anyway? If you’re not sure about what a hypothesis is--or how to test for one!--you’re in the right place. This article will teach you everything you need to know about hypotheses, including: 

  • Defining the term “hypothesis” 
  • Providing hypothesis examples 
  • Giving you tips for how to write your own hypothesis

So let’s get started!

What Is a Hypothesis?

Merriam Webster defines a hypothesis as “an assumption or concession made for the sake of argument.” In other words, a hypothesis is an educated guess . Scientists make a reasonable assumption--or a hypothesis--then design an experiment to test whether it’s true or not. Keep in mind that in science, a hypothesis should be testable. You have to be able to design an experiment that tests your hypothesis in order for it to be valid. 

As you could assume from that statement, it’s easy to make a bad hypothesis. But when you’re holding an experiment, it’s even more important that your guesses be good...after all, you’re spending time (and maybe money!) to figure out more about your observation. That’s why we refer to a hypothesis as an educated guess--good hypotheses are based on existing data and research to make them as sound as possible.

Hypotheses are one part of what’s called the scientific method .  Every (good) experiment or study is based in the scientific method. The scientific method gives order and structure to experiments and ensures that interference from scientists or outside influences does not skew the results. It’s important that you understand the concepts of the scientific method before holding your own experiment. Though it may vary among scientists, the scientific method is generally made up of six steps (in order):

  • Making an observation
  • Asking questions
  • Forming a hypothesis
  • Conducting an experiment
  • Analyzing the data
  • Communicating your results

You’ll notice that the hypothesis comes pretty early on when conducting an experiment. That’s because experiments work best when they’re trying to answer one specific question. And you can’t conduct an experiment until you know what you’re trying to prove!

Independent and Dependent Variables 

After doing your research, you’re ready for another important step in forming your hypothesis: identifying variables. Variables are basically any factor that could influence the outcome of your experiment . Variables have to be measurable and related to the topic being studied.

There are two types of variables: independent variables and dependent variables. Independent variables are the factors the researcher changes, controls, or selects; they are not influenced by the other variables in the study. For example, age is an independent variable: it won’t change in response to the experiment, and researchers can look at different ages to see if age has an effect on the dependent variable. 

Speaking of dependent variables... dependent variables are subject to the influence of the independent variable , meaning that they are not constant. Let’s say you want to test whether a person’s age affects how much sleep they need. In that case, the independent variable is age (like we mentioned above), and the dependent variable is how much sleep a person gets. 

Variables will be crucial in writing your hypothesis. You need to be able to identify which variable is which, as both the independent and dependent variables will be written into your hypothesis. For instance, in a study about exercise, the independent variable might be the speed at which the respondents walk for thirty minutes, and the dependent variable would be their heart rate. In your study and in your hypothesis, you’re trying to understand the relationship between the two variables.
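
To make the two roles concrete, here is a minimal Python sketch of the exercise example above. The walking speeds and heart rates are invented for illustration (the article itself gives no data or code); this just shows one way to line up an independent and a dependent variable and measure their relationship:

```python
from statistics import mean, stdev

# Hypothetical data: each participant walks for thirty minutes at an
# assigned speed (independent variable, set by the researcher) and we
# record their heart rate (dependent variable, observed outcome).
# All numbers are invented for illustration.
speeds = [2.0, 2.5, 3.0, 3.5, 4.0, 4.5]      # mph
heart_rates = [88, 94, 101, 108, 115, 123]   # bpm

# Pearson correlation, written out by hand so the formula is visible:
# the sum of products of deviations, scaled by the sample standard deviations.
n = len(speeds)
mx, my = mean(speeds), mean(heart_rates)
sx, sy = stdev(speeds), stdev(heart_rates)
r = sum((x - mx) * (y - my) for x, y in zip(speeds, heart_rates)) / ((n - 1) * sx * sy)

print(f"r = {r:.3f}")  # a value near +1 means faster walking goes with higher heart rate
```

An r near +1 or -1 signals a strong linear relationship between the two variables; a real study would also report a significance test before drawing conclusions.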

Elements of a Good Hypothesis

The best hypotheses start by asking the right questions . For instance, if you’ve observed that the grass is greener when it rains twice a week, you could ask what kind of grass it is, what elevation it’s at, and if the grass across the street responds to rain in the same way. Any of these questions could become the backbone of experiments to test why the grass gets greener when it rains fairly frequently.

As you’re asking more questions about your first observation, make sure you’re also making more observations . If it doesn’t rain for two weeks and the grass still looks green, that’s an important observation that could influence your hypothesis. You'll continue observing all throughout your experiment, but until the hypothesis is finalized, every observation should be noted.

Finally, you should consult secondary research before writing your hypothesis. Secondary research consists of results found and published by other people. You can usually find this information online or at your library. Additionally, make sure the research you find is credible and related to your topic. If you’re studying the correlation between rain and grass growth, it would help you to research rain patterns over the past twenty years for your county, published by a local agricultural association. You should also research the types of grass common in your area, the type of grass in your lawn, and whether anyone else has conducted experiments about your hypothesis. Also be sure you’re checking the quality of your research. Research done by a middle school student about what minerals can be found in rainwater would be less useful than an article published by a local university.

Writing Your Hypothesis

Once you’ve considered all of the factors above, you’re ready to start writing your hypothesis. Hypotheses usually take a certain form when they’re written out in a research report.

When you boil down your hypothesis statement, you are writing down your best guess and not the question at hand . This means that your statement should be written as if it is fact already, even though you are simply testing it.

The reason for this is that, after you have completed your study, you'll either accept or reject your if-then or your null hypothesis. All hypothesis testing examples should be measurable and able to be confirmed or denied. You cannot confirm a question, only a statement! 

In fact, you come up with hypothesis examples all the time! For instance, when you guess on the outcome of a basketball game, you don’t say, “Will the Miami Heat beat the Boston Celtics?” but instead, “I think the Miami Heat will beat the Boston Celtics.” You state it as if it is already true, even if it turns out you’re wrong. You do the same thing when writing your hypothesis.

Additionally, keep in mind that hypotheses can range from very specific to very broad. If your hypothesis targets one narrow cause and effect, it should be specific; if it involves a broad range of causes and effects, it can be correspondingly broad.

The Two Types of Hypotheses

Now that you understand what goes into a hypothesis, it’s time to look more closely at the two most common types of hypothesis: the if-then hypothesis and the null hypothesis.

#1: If-Then Hypotheses

First of all, if-then hypotheses typically follow this formula:

If ____ happens, then ____ will happen.

The goal of this type of hypothesis is to test the causal relationship between the independent and dependent variable. It’s fairly simple, and each hypothesis can vary in how detailed it can be. We create if-then hypotheses all the time with our daily predictions. Here are some examples of hypotheses that use an if-then structure from daily life: 

  • If I get enough sleep, I’ll be able to get more work done tomorrow.
  • If the bus is on time, I can make it to my friend’s birthday party. 
  • If I study every night this week, I’ll get a better grade on my exam. 

In each of these situations, you’re making a guess on how an independent variable (sleep, time, or studying) will affect a dependent variable (the amount of work you can do, making it to a party on time, or getting better grades). 

You may still be asking, “What is an example of a hypothesis used in scientific research?” Take one of the hypothesis examples from a real-world study on whether using technology before bed affects children’s sleep patterns. The hypothesis reads:

“We hypothesized that increased hours of tablet- and phone-based screen time at bedtime would be inversely correlated with sleep quality and child attention.”

It might not look like it, but this is an if-then statement. The researchers basically said, “If children have more screen usage at bedtime, then their quality of sleep and attention will be worse.” The sleep quality and attention are the dependent variables and the screen usage is the independent variable. (Usually, the independent variable comes after the “if” and the dependent variable comes after the “then,” as it is the independent variable that affects the dependent variable.) This is an excellent example of how flexible hypothesis statements can be, as long as the general idea of “if-then” and the independent and dependent variables are present.

#2: Null Hypotheses

Your if-then hypothesis is not the only one needed to complete a successful experiment, however. You also need a null hypothesis to test it against. In its most basic form, the null hypothesis is the opposite of your if-then hypothesis . When you write your null hypothesis, you are writing a hypothesis that suggests that your guess is not true, and that the independent and dependent variables have no relationship .

One null hypothesis for the cell phone and sleep study from the last section might say: 

“If children have more screen usage at bedtime, their quality of sleep and attention will not be worse.” 

In this case, this is a null hypothesis because it’s stating the opposite of the original hypothesis! 

Conversely, if your if-then hypothesis suggests that your two variables have no relationship, then your null hypothesis would suggest that there is one. So, pretend that there is a study that is asking the question, “Does the amount of followers on Instagram influence how long people spend on the app?” The independent variable is the amount of followers, and the dependent variable is the time spent. But if you, as the researcher, don’t think there is a relationship between the number of followers and time spent, you might write an if-then hypothesis that reads:

“If people have many followers on Instagram, they will not spend more time on the app than people who have fewer.”

In this case, the if-then suggests there isn’t a relationship between the variables. In that case, one of the null hypothesis examples might say:

“If people have many followers on Instagram, they will spend more time on the app than people who have fewer.”

You then test both the if-then and the null hypothesis to gauge if there is a relationship between the variables, and if so, how much of a relationship. 
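
Testing a null hypothesis usually means asking how likely the observed difference would be if the null were true. As a hedged sketch (the follower groups and minutes below are invented, and a permutation test is just one of several valid approaches), here is how that idea looks in Python:

```python
import random
from statistics import mean

# Invented data: minutes per day spent on the app for two groups of users.
many_followers = [62, 75, 58, 80, 71, 66]
few_followers = [60, 73, 57, 79, 70, 64]

observed = mean(many_followers) - mean(few_followers)

# Under the null hypothesis (no relationship), the group labels are
# interchangeable, so we shuffle them repeatedly and count how often a
# difference at least as large as the observed one appears by chance alone.
random.seed(0)  # fixed seed so the sketch is reproducible
combined = many_followers + few_followers
n = len(many_followers)
extreme = 0
trials = 10_000
for _ in range(trials):
    random.shuffle(combined)
    diff = mean(combined[:n]) - mean(combined[n:])
    if abs(diff) >= abs(observed):
        extreme += 1

p_value = extreme / trials
print(f"observed difference: {observed:.1f} minutes, p = {p_value:.3f}")
# A large p-value means we fail to reject the null hypothesis for this data.
```

With these invented numbers the difference between groups is small relative to the spread, so the test fails to reject the null; real data could go either way.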

4 Tips to Write the Best Hypothesis

If you’re going to take the time to hold an experiment, whether in school or by yourself, you’re also going to want to take the time to make sure your hypothesis is a good one. The best hypotheses have four major elements in common: plausibility, defined concepts, observability, and general explanation.

#1: Plausibility

At first glance, this quality of a hypothesis might seem obvious. When your hypothesis is plausible, that means it’s possible given what we know about science and general common sense. However, improbable hypotheses are more common than you might think. 

Imagine you’re studying weight gain and television watching habits. If you hypothesize that people who watch more than twenty hours of television a week will gain two hundred pounds or more over the course of a year, this might be improbable (though it’s potentially possible). Consequently, common sense can tell us the results of the study before the study even begins.

Improbable hypotheses generally go against  science, as well. Take this hypothesis example: 

“If a person smokes one cigarette a day, then they will have lungs just as healthy as the average person’s.” 

This hypothesis is obviously untrue, as studies have shown again and again that cigarettes negatively affect lung health. You must be careful that your hypotheses do not reflect your own personal opinion more than they do scientifically-supported findings. The plausibility requirement also points to the necessity of doing research before the hypothesis is written, to make sure that your hypothesis has not already been disproven.

#2: Defined Concepts

The more advanced you are in your studies, the more likely that the terms you’re using in your hypothesis are specific to a limited set of knowledge. One of the hypothesis testing examples might include the readability of printed text in newspapers, where you might use words like “kerning” and “x-height.” Unless your readers have a background in graphic design, it’s likely that they won’t know what you mean by these terms. Thus, it’s important to either write what they mean in the hypothesis itself or in the report before the hypothesis.

Here’s what we mean. Which of the following sentences makes more sense to the common person?

If the kerning is greater than average, more words will be read per minute.

If the space between letters is greater than average, more words will be read per minute.

For people reading your report that are not experts in typography, simply adding a few more words will be helpful in clarifying exactly what the experiment is all about. It’s always a good idea to make your research and findings as accessible as possible. 

Good hypotheses ensure that you can observe the results. 

#3: Observability

In order to measure the truth or falsity of your hypothesis, you must be able to see your variables and the way they interact. For instance, if your hypothesis is that the flight patterns of satellites affect the strength of certain television signals, yet you don’t have a telescope to view the satellites or a television to monitor the signal strength, you cannot properly observe your hypothesis and thus cannot continue your study.

Some variables may seem easy to observe, but if you do not have a system of measurement in place, you cannot observe your hypothesis properly. Here’s an example: if you’re experimenting on the effect of healthy food on overall happiness, but you don’t have a way to monitor and measure what “overall happiness” means, your results will not reflect the truth. Monitoring how often someone smiles for a whole day is not reasonably observable, but having the participants state how happy they feel on a scale of one to ten is more observable. 

In writing your hypothesis, always keep in mind how you'll execute the experiment.

#4: Generalizability 

Perhaps you’d like to study what color your best friend wears the most often by observing and documenting the colors she wears each day of the week. This might be fun information for her and you to know, but beyond you two, there aren’t many people who could benefit from this experiment. When you start an experiment, you should note how generalizable your findings may be if they are confirmed. Generalizability is basically how widely a particular phenomenon applies to other people’s everyday lives.

Let’s say you’re asking a question about the health benefits of eating an apple for one day only. You need to realize that the experiment may be too specific to be helpful: it does not help to explain a phenomenon that many people experience. If you find yourself with too specific of a hypothesis, go back to asking the big question: what is it that you want to know, and what do you think will happen between your two variables?

Hypothesis Testing Examples

We know it can be hard to write a good hypothesis unless you’ve seen some good hypothesis examples. We’ve included four hypothesis examples based on some made-up experiments. Use these as templates or launch pads for coming up with your own hypotheses.

Experiment #1: Students Studying Outside (Writing a Hypothesis)

You are a student at PrepScholar University. When you walk around campus, you notice that, when the temperature is above 60 degrees, more students study in the quad. You want to know when your fellow students are more likely to study outside. With this information, how do you make the best hypothesis possible?

You must remember to make additional observations and do secondary research before writing your hypothesis. In doing so, you notice that no one studies outside when it’s 75 degrees and raining, so this should be included in your experiment. Also, studies done on the topic beforehand suggested that students are more likely to study in temperatures less than 85 degrees. With this in mind, you feel confident that you can identify your variables and write your hypotheses:

If-then: “If the temperature in Fahrenheit is less than 60 degrees, significantly fewer students will study outside.”

Null: “If the temperature in Fahrenheit is less than 60 degrees, the same number of students will study outside as when it is more than 60 degrees.”

These hypotheses are plausible, as the temperatures are reasonably within the bounds of what is possible. The number of people in the quad is also easily observable. It is also not a phenomenon specific to only one person or at one time, but instead can explain a phenomenon for a broader group of people.

To complete this experiment, you pick the month of October to observe the quad. Every day (except on days when it’s raining), from 3 to 4 PM, when most classes have released for the day, you observe how many people are on the quad. You measure how many people come and how many leave. You also write down the temperature on the hour. 

After writing down all of your observations and putting them on a graph, you find that the most students study on the quad when it is 70 degrees outside, and that the number of students drops sharply once the temperature reaches 60 degrees or below. In this case, your research report would state that your findings support your first hypothesis, and that you reject the null hypothesis.
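A quick sketch of how the quad tallies might be summarized; every temperature and head count below is invented for illustration:

```python
# Hypothetical October observations: (temperature in °F, students on the quad).
observations = [
    (72, 48), (68, 45), (70, 52), (61, 33), (58, 12),
    (55, 9), (63, 30), (71, 50), (59, 11), (66, 38),
]

# Split the counts by the 60-degree threshold from the hypothesis.
below_60 = [count for temp, count in observations if temp < 60]
at_or_above_60 = [count for temp, count in observations if temp >= 60]

avg_below = sum(below_60) / len(below_60)
avg_above = sum(at_or_above_60) / len(at_or_above_60)

print(f"Average students when below 60°F: {avg_below:.1f}")
print(f"Average students at or above 60°F: {avg_above:.1f}")
```

With made-up numbers like these, the gap between the two averages is what you would then examine statistically before deciding whether to reject the null hypothesis.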

Experiment #2: The Cupcake Store (Forming a Simple Experiment)

Let’s say that you work at a bakery. You specialize in cupcakes, and you make only two colors of frosting: yellow and purple. You want to know what kind of customers are more likely to buy what kind of cupcake, so you set up an experiment. Your independent variable is the customer’s gender, and the dependent variable is the color of the frosting. What is an example of a hypothesis that might answer the question of this study?

Here’s what your hypotheses might look like: 

If-then: “If customers’ gender is female, then they will buy more yellow cupcakes than purple cupcakes.”

Null: “If customers’ gender is female, then they will be just as likely to buy purple cupcakes as yellow cupcakes.”

This is a pretty simple experiment! It passes the test of plausibility (there could easily be a difference), defined concepts (there’s nothing complicated about cupcakes!), observability (both color and gender can be easily observed), and general explanation (this would potentially help you make better business decisions).


Experiment #3: Backyard Bird Feeders (Integrating Multiple Variables and Rejecting the If-Then Hypothesis)

While watching your backyard bird feeder, you realized that different birds come on the days when you change the types of seeds. You decide that you want to see more cardinals in your backyard, so you decide to see what type of food they like the best and set up an experiment. 

However, one morning, you notice that, while some cardinals are present, blue jays are eating out of your backyard feeder filled with millet. You decide that, of all of the other birds, you would like to see the blue jays the least. This means you'll have more than one variable in your hypothesis. Your new hypotheses might look like this: 

If-then: “If sunflower seeds are placed in the bird feeders, then more cardinals will come than blue jays. If millet is placed in the bird feeders, then more blue jays will come than cardinals.”

Null: “If either sunflower seeds or millet are placed in the bird feeders, equal numbers of cardinals and blue jays will come.”

Through simple observation, you actually find that cardinals come as often as blue jays when either sunflower seeds or millet is in the bird feeder. In this case, you would reject your “if-then” hypothesis and “fail to reject” your null hypothesis . You cannot accept your first hypothesis, because it’s clearly not true. Instead, you found that there was actually no relation between your variables. Consequently, you would need to run more experiments with different variables to see if the new variables impact the results.
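The feeder comparison above amounts to a test of independence between food type and bird species. Here is a minimal sketch with invented counts, computing the chi-square statistic by hand (a real analysis would more likely call a library routine such as scipy.stats.chi2_contingency):

```python
import math

# Hypothetical feeder counts:        (cardinals, blue jays)
table = {"sunflower": (40, 38), "millet": (35, 37)}

row_totals = {food: sum(counts) for food, counts in table.items()}
col_totals = [sum(counts[i] for counts in table.values()) for i in range(2)]
grand = sum(row_totals.values())

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi2 = 0.0
for food, counts in table.items():
    for i, observed in enumerate(counts):
        expected = row_totals[food] * col_totals[i] / grand
        chi2 += (observed - expected) ** 2 / expected

# For a 2x2 table (1 degree of freedom) the p-value reduces to erfc(sqrt(x/2)).
p_value = math.erfc(math.sqrt(chi2 / 2))
print(f"chi-square = {chi2:.3f}, p-value = {p_value:.3f}")
```

With counts this close, the p-value comes out large, so you fail to reject the null hypothesis, matching the narrative above.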

Experiment #4: In-Class Survey (Including an Alternative Hypothesis)

You’re about to give a speech in one of your classes about the importance of paying attention. You want to take this opportunity to test a hypothesis you’ve had for a while: 

If-then: If students sit in the first two rows of the classroom, then they will listen better than students who do not.

Null: If students sit in the first two rows of the classroom, then they will not listen better or worse than students who do not.

You give your speech and then ask your teacher if you can hand out a short survey to the class. On the survey, you’ve included questions about some of the topics you talked about. When you get back the results, you’re surprised to see that not only did the students in the first two rows not pay better attention, but they also scored worse than students in other parts of the classroom! Here, neither your if-then nor your null hypothesis is representative of your findings. What do you do?

This is when you reject both your if-then and null hypotheses and instead create an alternative hypothesis . This type of hypothesis is used in the rare circumstance that neither of your hypotheses is able to capture your findings. Now you can use what you’ve learned to draft new hypotheses and test again!
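The survey comparison above can be tallied in a few lines; the scores below are hypothetical:

```python
from statistics import mean

# Hypothetical survey scores (out of 10), invented for illustration.
front_two_rows = [5, 6, 4, 5, 6, 5]
rest_of_class = [7, 8, 6, 7, 9, 8, 7]

print(f"front rows mean: {mean(front_two_rows):.2f}")
print(f"rest of class mean: {mean(rest_of_class):.2f}")
# The front rows scoring *lower* contradicts both the if-then and the
# null hypothesis, which is what motivates drafting an alternative hypothesis.
```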

Key Takeaways: Hypothesis Writing

The more comfortable you become with writing hypotheses, the better they will become. The structure of hypotheses is flexible and may need to be changed depending on what topic you are studying. The most important thing to remember is the purpose of your hypothesis and the difference between the if-then and the null . From there, in forming your hypothesis, you should constantly be asking questions, making observations, doing secondary research, and considering your variables. After you have written your hypothesis, be sure to edit it so that it is plausible, clearly defined, observable, and helpful in explaining a general phenomenon.

Writing a hypothesis is something that everyone, from elementary school children competing in a science fair to professional scientists in a lab, needs to know how to do. Hypotheses are vital in experiments and in properly executing the scientific method . When done correctly, hypotheses will set up your studies for success and help you to understand the world a little better, one experiment at a time.


What’s Next?

If you’re studying for the science portion of the ACT, there’s definitely a lot you need to know. We’ve got the tools to help, though! Start by checking out our ultimate study guide for the ACT Science subject test. Once you read through that, be sure to download our recommended ACT Science practice tests , since they’re one of the most foolproof ways to improve your score. (And don’t forget to check out our expert guide book , too.)

If you love science and want to major in a scientific field, you should start preparing in high school . Here are the science classes you should take to set yourself up for success.

If you’re trying to think of science experiments you can do for class (or for a science fair!), here’s a list of 37 awesome science experiments you can do at home.


Ashley Sufflé Robinson has a Ph.D. in 19th Century English Literature. As a content writer for PrepScholar, Ashley is passionate about giving college-bound students the in-depth information they need to get into the school of their dreams.



Statistics LibreTexts

9.1: Null and Alternative Hypotheses


The actual test begins by considering two hypotheses . They are called the null hypothesis and the alternative hypothesis . These hypotheses contain opposing viewpoints.

\(H_0\): The null hypothesis: It is a statement of no difference between the variables; they are not related. This can often be considered the status quo, and as a result, rejecting the null calls for some action.

\(H_a\): The alternative hypothesis: It is a claim about the population that is contradictory to \(H_0\) and what we conclude when we reject \(H_0\). This is usually what the researcher is trying to prove.

Since the null and alternative hypotheses are contradictory, you must examine evidence to decide if you have enough evidence to reject the null hypothesis or not. The evidence is in the form of sample data.

After you have determined which hypothesis the sample supports, you make a decision. There are two options for a decision. They are "reject \(H_0\)" if the sample information favors the alternative hypothesis or "do not reject \(H_0\)" or "decline to reject \(H_0\)" if the sample information is insufficient to reject the null hypothesis.

\(H_{0}\) always has a symbol with an equal in it. \(H_{a}\) never has a symbol with an equal in it. The choice of symbol depends on the wording of the hypothesis test. However, be aware that many researchers (including one of the co-authors in research work) use = in the null hypothesis, even with > or < as the symbol in the alternative hypothesis. This practice is acceptable because we only make the decision to reject or not reject the null hypothesis.

Example \(\PageIndex{1}\)

  • \(H_{0}\): No more than 30% of the registered voters in Santa Clara County voted in the primary election. \(p \leq 0.30\)
  • \(H_{a}\): More than 30% of the registered voters in Santa Clara County voted in the primary election. \(p > 0.30\)
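The voter-turnout hypotheses above can be checked with a one-sided proportion z-test. A minimal sketch, using a hypothetical sample (950 of 3,000 registered voters) and writing the proportions as decimals:

```python
import math

p0 = 0.30                 # boundary value under H0: p <= 0.30
successes, n = 950, 3000  # hypothetical sample, invented for illustration
p_hat = successes / n

# Standard error of the sample proportion under H0.
se = math.sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se

# Upper-tail p-value for Ha: p > 0.30, via the standard normal survival function.
p_value = 0.5 * math.erfc(z / math.sqrt(2))

alpha = 0.05
decision = "reject H0" if p_value < alpha else "do not reject H0"
print(f"z = {z:.3f}, p-value = {p_value:.4f} -> {decision}")
```

With this made-up sample, the p-value falls below 0.05, so the data would favor the alternative hypothesis.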

Exercise \(\PageIndex{1}\)

A medical trial is conducted to test whether or not a new medicine reduces cholesterol by 25%. State the null and alternative hypotheses.

  • \(H_{0}\): The drug reduces cholesterol by 25%. \(p = 0.25\)
  • \(H_{a}\): The drug does not reduce cholesterol by 25%. \(p \neq 0.25\)

Example \(\PageIndex{2}\)

We want to test whether the mean GPA of students in American colleges is different from 2.0 (out of 4.0). The null and alternative hypotheses are:

  • \(H_{0}: \mu = 2.0\)
  • \(H_{a}: \mu \neq 2.0\)

Exercise \(\PageIndex{2}\)

We want to test whether the mean height of eighth graders is 66 inches. State the null and alternative hypotheses. Fill in the correct symbol \((=, \neq, \geq, <, \leq, >)\) for the null and alternative hypotheses.

  • \(H_{0}: \mu \_ 66\)
  • \(H_{a}: \mu \_ 66\)
  • \(H_{0}: \mu = 66\)
  • \(H_{a}: \mu \neq 66\)

Example \(\PageIndex{3}\)

We want to test if college students take less than five years to graduate from college, on the average. The null and alternative hypotheses are:

  • \(H_{0}: \mu \geq 5\)
  • \(H_{a}: \mu < 5\)

Exercise \(\PageIndex{3}\)

We want to test if it takes fewer than 45 minutes to teach a lesson plan. State the null and alternative hypotheses. Fill in the correct symbol ( =, ≠, ≥, <, ≤, >) for the null and alternative hypotheses.

  • \(H_{0}: \mu \_ 45\)
  • \(H_{a}: \mu \_ 45\)
  • \(H_{0}: \mu \geq 45\)
  • \(H_{a}: \mu < 45\)

Example \(\PageIndex{4}\)

In an issue of U.S. News and World Report, an article on school standards stated that about half of all students in France, Germany, and Israel take advanced placement exams and a third pass. The same article stated that 6.6% of U.S. students take advanced placement exams and 4.4% pass. Test if the percentage of U.S. students who take advanced placement exams is more than 6.6%. State the null and alternative hypotheses.

  • \(H_{0}: p \leq 0.066\)
  • \(H_{a}: p > 0.066\)

Exercise \(\PageIndex{4}\)

On a state driver’s test, about 40% pass the test on the first try. We want to test if more than 40% pass on the first try. Fill in the correct symbol (\(=, \neq, \geq, <, \leq, >\)) for the null and alternative hypotheses.

  • \(H_{0}: p \_ 0.40\)
  • \(H_{a}: p \_ 0.40\)
  • \(H_{0}: p = 0.40\)
  • \(H_{a}: p > 0.40\)

COLLABORATIVE EXERCISE

Bring to class a newspaper, some news magazines, and some Internet articles. In groups, find articles from which your group can write null and alternative hypotheses. Discuss your hypotheses with the rest of the class.

In a hypothesis test , sample data is evaluated in order to arrive at a decision about some type of claim. If certain conditions about the sample are satisfied, then the claim can be evaluated for a population. In a hypothesis test, we:

  • Evaluate the null hypothesis , typically denoted with \(H_{0}\). The null is not rejected unless the hypothesis test shows otherwise. The null statement must always contain some form of equality \((=, \leq, \text{or } \geq)\).
  • Always write the alternative hypothesis , typically denoted with \(H_{a}\) or \(H_{1}\), using less than, greater than, or not equals symbols, i.e., \((\neq, >, \text{or} <)\).
  • If we reject the null hypothesis, then we can assume there is enough evidence to support the alternative hypothesis.
  • Never state that a claim is proven true or false. Keep in mind the underlying fact that hypothesis testing is based on probability laws; therefore, we can talk only in terms of non-absolute certainties.

Formula Review

\(H_{0}\) and \(H_{a}\) are contradictory.

  • If \(\alpha \leq p\)-value, then do not reject \(H_{0}\).
  • If \(\alpha > p\)-value, then reject \(H_{0}\).

\(\alpha\) is preconceived. Its value is set before the hypothesis test starts. The \(p\)-value is calculated from the data.
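The decision rule above can be written out as a tiny helper:

```python
def decide(alpha: float, p_value: float) -> str:
    """Return the hypothesis-test decision for a given alpha and p-value."""
    if alpha <= p_value:
        return "do not reject H0"
    return "reject H0"

print(decide(0.05, 0.02))   # small p-value: evidence against H0
print(decide(0.05, 0.30))   # large p-value: insufficient evidence
```

Note that the boundary case \(\alpha = p\)-value falls on the "do not reject" side, matching the rule as stated.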



Business Analyst Career Guide: What You Should Know

Companies and organizations across all industries rely on knowledgeable business analysts to help them make informed business decisions. If you are interested in a career path that allows you to put your critical thinking and analytical skills to use, then working as a business analyst could be an ideal fit. As modern businesses continue to evolve and collect more data than ever, demand for business analysts who can derive valuable insights from that data is projected to grow by 10 percent (more detail below).

The University of Minnesota's certificate in business analysis is a great starting point for those interested in a business analyst career. This fully online program can be completed in as little as three months and provides the foundation you need to identify key business metrics, confidently analyze financial data, and develop actionable insights to drive business growth.

Still wondering whether a career in a business analyst role is right for you? With a better understanding of the key responsibilities of a business analyst, as well as the skills required and growth opportunities available, you can more confidently decide if this is the path best suited to your professional goals.

Core Responsibilities of a Business Analyst

What exactly is a business analyst, and what do these professionals do as part of their daily work? Specifically, a business analyst is a professional who helps a business, company, or organization make data-driven decisions.

The exact roles of somebody in this position can vary significantly, depending on the given industry and company by which an analyst is employed. However, some of the most common business analyst responsibilities include:

  • making sense of large amounts of data (including data visualization) using any number of data analysis techniques and business analyst tools.
  • identifying problems and issues within a business and proposing solutions.
  • communicating results and findings to others within a business.
  • forecasting potential outcomes of business decisions.
  • aligning business activities and decisions with the overall company goals and mission.
  • collaborating with developers and other team members.

Business analysts also tend to be responsible for leading and spearheading special projects within a company, especially during periods of transition or change. During these times, business analysts have a particularly important responsibility to carry out responsible change management while effectively collaborating with and coaching other team members.

Skill Set Required for Business Analysts

To execute common business analyst responsibilities, these professionals must possess several technical and soft skills. On the technical side, business analysts need to be proficient in the various tools and programs used in data analysis (such as Power BI or SAS). Proficiency in database software also serves business analysts well in this role.

Other skills business analysts should have include:

  • Mind mapping
  • SWOT (strength, weakness, opportunity, and threat) analysis
  • PESTLE (political, economic, social, technological, legal, and environmental) analysis
  • Wireframing to communicate vision for a product
  • Use of a customer relationship management system

In addition to hard skills, business analysts must possess some soft skills that are crucial to success on the job. For example, business analysts need strong cross-functional team collaboration skills, meaning they should be able to work independently while remaining team players. Likewise, solid problem-solving skills go a long way in this line of work, as business analysts are constantly identifying problems and brainstorming ways to solve them for the sake of the business.

Last but not least, effective communication and people skills come in handy. Whether presenting findings to higher-ups or being able to "translate" complex jargon into novice terms, verbal and written communication are a must in the business analyst field.

Industries and Sectors Hiring Business Analysts

Businesses across all industries require skilled and knowledgeable business analysts. To make sense of increasing quantities of data and use it to make sound business decisions, companies are turning to strategic business planning professionals and business analysts.

This is perhaps particularly true in industries where data collection has seen an increase in recent years. Examples of these include manufacturing and transportation. According to the United States Bureau of Labor Statistics (BLS),  35 percent of business analysts or management analysts work in professional, scientific, and technical services. From there, the most common industries hiring these professionals include:

  • Government (17 percent)
  • Finance and insurance (12 percent)
  • Management of companies and enterprises (4 percent)

It is worth noting, too, that an estimated 14 percent of business analysts are self-employed, meaning they may work as independent contractors for any number of private clients.

Career Path and Growth Opportunities

So, how do you get started working as a business analyst, and what does the typical progression look like in this career? In most cases, businesses prefer hiring candidates with at least a bachelor's degree in business or a related field. From there, having additional credentials (such as a certificate in business analysis) could help you stand out from other job applicants while potentially qualifying you for more jobs in the field. 

According to the BLS, the median business/management analyst salary in 2022 was  $95,290 per year , with the highest 10 percent of earners in this field making more than $167,650. Meanwhile, the demand for these professionals continues to rise, with the projected job outlook  expected to grow by 10 percent between 2022 and 2032 alone. That's much faster than the national average for all occupations.

Impact of Technology on Business Analysis

There's no denying the role evolving technology plays in the business analyst profession. In many ways, innovations in technology and software are making the job of the business analyst easier, in the sense that data can be more readily processed, analyzed, interpreted, and even visualized. At the same time, however, the role has grown progressively more complex as the volume of data collected has risen. Today, business analysts are expected to work with more data than ever before, so knowing how to use the latest software and tools to process and analyze it is a must.

Networking and Professional Development

Even with the necessary credentials and skills, aspiring business analysts also need to be committed to networking and professional development if they seek success in this career path. As is the case in numerous industries, encountering opportunities for growth and advancement in business analysis is very much about who you know. Going out of your way to build professional connections could help improve career prospects down the road.

The same applies to ongoing professional development. To stay ahead of the latest advancements and innovations in this dynamic field, business analysts need to be proactive about learning new skills and staying on top of change. With this in mind, a lifelong commitment to learning and growing is a must if you want to find success as a business analyst.

Learn More Today

Working as a business analyst could be a rewarding career path for those who enjoy making sense of vast sets of data while making a real difference when it comes to strategic business planning and business process optimization.

Whether you are looking to develop business analyst skills or in need of a formal business analyst certificate to take your career to the next level, the University of Minnesota's  Business Analysis Certificate could help you achieve your goals. With courses developed in alignment with the Guide to Business Analysis Body of Knowledge (BABOK™), we are proud to be an Endorsed Education Provider (EEP) with the IIBA®.

To learn more about our online business analysis certificate,  reach out to our team. If you're ready to get the ball rolling, you can also  enroll today.

Additional Sources

  • Business Analyst Career Path: What's the Trajectory? (Forage)
  • Business Analyst Career Explained (Villanova University)
  • What Does a Business Analyst Do? An Overview of Roles and Responsibilities (Indeed)

Decision-Making

Decision Strategies: 4 Steps to Success

What is important is making the decision rather than obsessing over it.

Posted March 23, 2024 | Reviewed by Ray Parker

  • Research shows decisions involve balancing thinking things through and trusting your gut feeling.
  • A four-step approach can help make stress-free decisions.
  • By journaling, you can learn how to manage stress and potentially identify your intuition's "edge."


Whether at home or in the workplace, the choices we make range from snap decisions to thoughtful, strategic ones. Styles include trusting your instincts, weighing the pros and cons, and asking friends for advice. Some people simply choose not to decide, hoping that a particular situation will resolve itself. What is the most successful strategy? According to a research report in the European Management Journal :

“Rationality and intuition are important dimensions of the strategic decision process...the interplay between rationality and intuition [was] based on a sample of 103 strategic decisions made by service firms in Greece.” (Thanos 2023)

Why Relying on Intuition or Trusting Instincts Is a Valid Decision-Making Strategy

While researching a book on decision-making for women, I found the power of intuition evident. Sometimes called a sixth sense or women’s intuition, this strategy has been documented by researchers in men as well.

In Frontiers in Psychology, "If it feels right, do it," a preliminary investigation regarding high-level coaches and intuition determined:

“Initially intuitive rather than deliberate decision-making was a particular feature, offering participants an immediate check on the accuracy and validity of the decision....Irrespective of how they may best be developed, intuition and analysis are both important components of expertise...." (Collins 2016)

While it may seem that relying on intuition is risky, experience often gives substance to intuition.

When children want something, they ask. As adults, we often become tangled in the confusion of what we want for ourselves and what we think others would like for us. We tend to forget the simplicity of stating what we wish.

4 Steps to Making Stress-Free Decisions

When faced with a major decision, these steps might be helpful:

1. Define what you want to achieve.
2. Assess the pros and cons, or what you perceive as risks and benefits.
3. Consider alternatives if you are concerned about the opinions of others.
4. Make a choice and follow through without second-guessing yourself.

1. Be honest with yourself.

Think of what you want. If you know the answer, then why not just say so? You might consider the feelings or opinions of others, whether family, friends, or colleagues. Despite your consideration, you might be sabotaging yourself.

2. Define the pros and cons

Assess the situation by making a pros and cons checklist. Write all the reasons that a decision will benefit you alone. Then, write the reasons that your decision might make others uncomfortable or unsupportive.

3. Consider alternatives

Ask yourself if there is a way to please yourself and others. If not, is there a compromise? In decision-making groups, women who hid their feelings later admitted that they were afraid of making the wrong decision. Very often, when asked what they meant by "the wrong decision," they said they were afraid that their decision would not please others.

4. Make a decision and follow through

Once you have made your decision known, follow through instead of second-guessing yourself or asking your friends for approval or their opinions.

The Value of Keeping a Record

Using a journal will help guide you and may give you an idea as to the patterns of decision-making that are stressful and how to handle these stresses. While logical steps to decision-making combined with intuition are valuable, it’s your intuition that may give you the edge.


What about the times you were wrong when you trusted your instincts? It can happen, and for this reason, intuition combined with a logical process is beneficial.

Copyright Rita Watson, MPH, 2024

Thanos, C., “The complementary effects of rationality and intuition on strategic decision quality,” European Management Journal, Volume 41, Issue 3, June 2023, Pages 366-374.

Collins, Collins, and Carson, “‘If It Feels Right, Do It’: Intuitive Decision Making in a Sample of High-Level Sport Coaches,” Frontiers in Psychology, 2016, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4830814/

Rita Watson MPH

Rita Watson, MPH , is an associate fellow at Yale's Ezra Stiles College, a former columnist for The Providence Journal, and the author of Italian Kisses: Rose-Colored Words and Love from the Old Country .


Are You a Micromanager?

  • Julia Milner

For new leaders, it’s a trap that’s easy to fall into.

New managers who are still building confidence and exploring the best way to lead can unintentionally develop controlling behaviors, hoping to live up to the expectations of their roles. Unfortunately, these behaviors usually have the opposite effect, negatively impacting employee morale and performance. To avoid micromanagement behaviors, check in with yourself by asking these three questions:

  • Are you always giving your team “advice”? While there’s nothing wrong with giving your team members advice in situations that truly require it — high-stakes projects, urgent issues, or new processes that require more hands-on guidance — your goal should be to help people develop solutions on their own.
  • Do you need to approve every decision your team members make? If you’re on every email thread and in every Slack channel just to give your nod of approval, you likely need a better process. Try figuring out at which stage you need to weigh in, instead of weighing in at every stage.
  • Do you consider feedback a one-way street? Feedback should be a two-way conversation. Recognize people’s strengths, treat mistakes like learning opportunities, and seek out opinions from your team to show them you value their perspectives.

Nobody wants to be a micromanager, but when you’re new to leading a team, it’s a trap that’s easy to fall into. The pressure to prove yourself to your direct reports while simultaneously delivering strong outcomes to the organization can sometimes result in an overly hands-on leadership style.

  • Julia Milner is a professor of leadership at EDHEC Business School in Nice, France, and has been named one of the World’s Top 40 Business Professors Under 40. She has extensive experience as a management consultant and coach, working internationally with executives and organizations on how to create empowering leadership and organizational cultures. Julia hosts a YouTube channel on leadership and careers, and has given two TEDx talks, on how to be a great leader and how to turn regrets into change. Connect with Julia on LinkedIn.

COMMENTS

  1. Hypothesis Testing

    There are 5 main steps in hypothesis testing: State your research hypothesis as a null hypothesis (H0) and an alternate hypothesis (Ha or H1). Collect data in a way designed to test the hypothesis. Perform an appropriate statistical test. Decide whether to reject or fail to reject your null hypothesis. Present the findings in your results ...
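
The five steps above can be walked through in code. This is a minimal sketch, not from the quoted source: it assumes a one-sample z-test with a known population standard deviation, a two-sided alternative, and invented data values.

```python
from statistics import NormalDist, mean

# Step 1: state the hypotheses.
# H0: mean nightly sleep = 7.0 hours; Ha: mean nightly sleep != 7.0 hours.
mu0 = 7.0
sigma = 0.5          # population SD, assumed known for a z-test
alpha = 0.05

# Step 2: collect data (hypothetical sample of 8 people).
data = [6.1, 6.8, 7.4, 6.5, 6.9, 7.1, 6.3, 6.6]

# Step 3: perform the test -- compute the z statistic.
n = len(data)
z = (mean(data) - mu0) / (sigma / n ** 0.5)

# Step 4: decide, using the two-sided p-value.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
reject = p_value <= alpha

# Step 5: report the findings.
print(f"z = {z:.3f}, p = {p_value:.3f}, reject H0: {reject}")
```

With this made-up sample the p-value comes out above 0.05, so the test fails to reject the null hypothesis.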

  2. How to Write a Strong Hypothesis

    4. Refine your hypothesis. You need to make sure your hypothesis is specific and testable. There are various ways of phrasing a hypothesis, but all the terms you use should have clear definitions, and the hypothesis should contain: The relevant variables; The specific group being studied; The predicted outcome of the experiment or analysis; 5.

  3. A Beginner's Guide to Hypothesis Testing in Business

    3. One-Sided vs. Two-Sided Testing. When it's time to test your hypothesis, it's important to leverage the correct testing method. The two most common hypothesis testing methods are one-sided and two-sided tests, or one-tailed and two-tailed tests, respectively. Typically, you'd leverage a one-sided test when you have a strong conviction ...
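
The one-sided/two-sided distinction is easy to see numerically. In this illustrative sketch (the test statistic value is invented), the same observed z yields half the p-value when the alternative predicts a specific direction:

```python
from statistics import NormalDist

z = 1.75  # hypothetical observed test statistic

# Two-sided: Ha says the mean differs from the null value in either direction.
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))

# One-sided: Ha says the mean is specifically greater than the null value.
p_one_sided = 1 - NormalDist().cdf(z)

print(f"one-sided p = {p_one_sided:.4f}, two-sided p = {p_two_sided:.4f}")
# At alpha = 0.05, the one-sided test rejects H0 while the two-sided one does not.
```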

  4. Guide: Hypothesis Testing

    In summary, hypothesis testing is a versatile tool that adds rigor, reduces risk, and enhances the decision-making and innovation processes across various sectors and functions. This graphical representation makes it easier to grasp how the p-value is used to quantify the risk involved in making a decision based on a hypothesis test.

  5. Hypothesis Testing

    Make a Decision: p-value: This is the probability of observing the data, given that the null hypothesis is true. A small p-value (typically ≤ 0.05) suggests the data is inconsistent with the null hypothesis. Decision Rule: If the p-value is less than or equal to α, you reject the null hypothesis in favor of the alternative.
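
The decision rule itself is a one-line comparison; the helper below is a hypothetical illustration, not from the quoted source:

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the decision rule: reject H0 when the p-value is <= alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"

print(decide(0.031))              # -> reject H0 (0.031 <= 0.05)
print(decide(0.031, alpha=0.01))  # -> fail to reject H0 (0.031 > 0.01)
```

Note that the same p-value can lead to different decisions depending on the significance level chosen before the test.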

  6. Hypothesis Tests: A Comprehensive Guide

    Introduction to Hypothesis Tests. Hypothesis testing is a statistical tool used to make decisions based on data. It involves making assumptions about a population parameter and testing its validity using a population sample. Hypothesis tests help us draw conclusions and make informed decisions in various fields like business, research, and science.

  7. S.3 Hypothesis Testing

    The general idea of hypothesis testing involves: Making an initial assumption. Collecting evidence (data). Based on the available evidence (data), deciding whether to reject or not reject the initial assumption. Every hypothesis test — regardless of the population parameter involved — requires the above three steps.

  8. 9.4: Making Decisions

    At this point, our hypothesis test is essentially complete: (1) we choose an α level (e.g., α = .05), (2) come up with some test statistic (e.g., X) that does a good job (in some meaningful sense) of comparing H0 to H1, (3) figure out the sampling distribution of the test statistic on the assumption that the null hypothesis is true (in this case ...
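
The three ingredients named above (an α level, a test statistic X, and its sampling distribution under H0) can be made concrete with a coin-flip example. This is a hypothetical sketch using only the standard library: under H0 the coin is fair, so X, the number of heads in n flips, follows Binomial(n, 0.5).

```python
from math import comb

# H0: the coin is fair (theta = 0.5); H1: it is not.
n = 100          # number of flips
x_obs = 62       # observed number of heads (invented for illustration)
alpha = 0.05     # chosen significance level

# Sampling distribution of X under H0: Binomial(n, 0.5).
pmf = [comb(n, k) * 0.5 ** n for k in range(n + 1)]

# Two-sided p-value: total probability of outcomes at least as far
# from the expected count n/2 as the observed count.
p_value = sum(p for k, p in enumerate(pmf)
              if abs(k - n / 2) >= abs(x_obs - n / 2))

print(f"p = {p_value:.4f}, reject H0: {p_value <= alpha}")
```

Here 62 heads in 100 flips gives a p-value of roughly 0.02, so at α = .05 the fair-coin hypothesis is rejected.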

  9. Making Decisions with Data: Understanding Hypothesis Testing

    Statistical methods are indispensable to the practice of science. But statistical hypothesis testing can seem daunting, with P-values, null hypotheses, and the concept of statistical significance. This article explains the concepts associated with statistical hypothesis testing using the story of "the lady tasting tea," then walks the reader through an application of the independent ...

  10. 9.1: Introduction to Hypothesis Testing

    In hypothesis testing, the goal is to see if there is sufficient statistical evidence to reject a presumed null hypothesis in favor of a conjectured alternative hypothesis. The null hypothesis is usually denoted H0 while the alternative hypothesis is usually denoted H1. A hypothesis test is a statistical decision; the conclusion will either be to reject the null hypothesis in favor ...

  11. Hypothesis Testing for Decision-Making

    In hypothesis testing, the decision about rejecting or not rejecting the null hypothesis depends upon the value of the test statistic. A test statistic is a random variable X whose value is tested against the critical value to arrive at a decision. If a random sample of size n is drawn from the normal population with mean μ and variance σ², then the sampling distribution of the mean will also be ...
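
The critical-value approach described above can be sketched as follows (a hypothetical example with invented numbers, assuming a known population SD so the sampling distribution of the mean is N(μ, σ²/n)):

```python
from statistics import NormalDist

mu0, sigma = 50, 10      # null mean and known population SD (hypothetical)
n, xbar = 36, 53.4       # sample size and observed sample mean (hypothetical)
alpha = 0.05

# Under H0 the sample mean is ~ N(mu0, sigma^2 / n), so standardize it:
z = (xbar - mu0) / (sigma / n ** 0.5)

# Critical value for a two-sided test at level alpha.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96

print(f"z = {z:.3f}, critical value = {z_crit:.3f}, reject H0: {abs(z) > z_crit}")
```

Because the test statistic exceeds the critical value here, the decision is to reject H0.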

  12. Statistical Hypothesis Testing Overview

    Hypothesis testing is a crucial procedure to perform when you want to make inferences about a population using a random sample. These inferences include estimating population properties such as the mean, differences between means, proportions, and the relationships between variables. This post provides an overview of statistical hypothesis testing.

  13. Insights in Hypothesis Testing and Making Decisions in Biomedical

    Two-sided hypothesis tests are dual to two-sided confidence intervals. A parameter value is in the (1−α)×100% confidence interval if and only if the hypothesis test whose assumed value under the null hypothesis is that parameter value accepts the null at level α. The principle is called the duality of hypothesis testing and confidence ...
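
The duality can be checked directly in code. This is an illustrative sketch with invented sample numbers, assuming a z-based interval with known σ: a candidate mean μ0 lies inside the 95% confidence interval exactly when the two-sided test of H0: mean = μ0 fails to reject at α = .05.

```python
from statistics import NormalDist

xbar, sigma, n, alpha = 53.4, 10, 36, 0.05   # hypothetical sample summary
se = sigma / n ** 0.5
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# (1 - alpha) confidence interval for the mean.
ci = (xbar - z_crit * se, xbar + z_crit * se)

def rejects(mu0: float) -> bool:
    """Two-sided z-test of H0: mean = mu0 at level alpha."""
    return abs((xbar - mu0) / se) > z_crit

# Duality: mu0 lies inside the CI exactly when the test fails to reject it.
for mu0 in (50.0, 52.0, 55.0, 57.0):
    inside = ci[0] <= mu0 <= ci[1]
    assert inside == (not rejects(mu0))

print(f"95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
```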

  14. 8.4: Making Decisions

    There's one more thing to point out about the hypothesis test that we've just constructed. If we take a moment to think about the statistical hypotheses we've been using, H0: θ = .5 and H1: θ ≠ .5, we notice that the alternative hypothesis covers both the possibility that θ < .5 and the possibility that θ > .5.

  15. Hypothesis Testing

    Now, as you might know, the area under the curve denotes the total probability, so the area beyond the critical value will be 5%. Or we can say that we have a confidence interval of 95%. If you decrease your significance level to 0.01, you make your hypothesis test more strict, as you now want the new p-value to be even lower (less than 0.01) if ...

  16. The Hypothesis Test: Your Guide to Making Data-Driven Decisions

    In summary, hypothesis testing is a statistical procedure used to evaluate the validity of a claim about a population parameter. It involves stating the null and alternative hypotheses, collecting ...

  17. How to Write a Great Hypothesis

    What is a hypothesis and how can you write a great one for your research? A hypothesis is a tentative statement about the relationship between two or more variables that can be tested empirically. Find out how to formulate a clear, specific, and testable hypothesis with examples and tips from Verywell Mind, a trusted source of psychology and mental health information.

  18. How Do You Formulate (Important) Hypotheses?

    Building on the ideas in Chap. 1, we describe formulating, testing, and revising hypotheses as a continuing cycle of clarifying what you want to study, making predictions about what you might find together with developing your reasons for these predictions, imagining tests of these predictions, revising your predictions and rationales, and so ...

  19. What Is a Hypothesis and How Do I Write One?

    Merriam-Webster defines a hypothesis as "an assumption or concession made for the sake of argument." In other words, a hypothesis is an educated guess. Scientists make a reasonable assumption--or a hypothesis--then design an experiment to test whether it's true or not.

  20. 20 Pros and Cons of Hypothesis Testing

    Pros of Hypothesis Testing. Objective Decision Making: Hypothesis testing provides a structured approach to decision making. Instead of relying on intuition or subjective judgments, decisions are based on objective criteria and test results. For instance, when determining if a new drug is effective, hypothesis testing will rely on the collected ...

  21. Hypothesis Testing: What, How and Why [+ 5 Learning Resources]

    Final Word. Hypothesis testing helps verify an assumption and then develop statistical data based on the assessment. It is being utilized in many sectors, from manufacturing and agriculture to clinical trials and IT. This method is not only accurate but also helps you make data-driven decisions for your organization.

  22. How can you optimize business outcomes with hypothesis testing?

    Hypothesis testing is a powerful tool for data analysis that can help you make informed decisions and optimize your business outcomes. It allows you to compare different scenarios, measure the ...

  23. 9.1: Null and Alternative Hypotheses

    Review. In a hypothesis test, sample data is evaluated in order to arrive at a decision about some type of claim. If certain conditions about the sample are satisfied, then the claim can be evaluated for a population. In a hypothesis test, we: Evaluate the null hypothesis, typically denoted with H0. The null is not rejected unless the hypothesis test shows otherwise.

  24. Business Analysts Help Make Data-Driven Decisions

    Companies and organizations across all industries rely on knowledgeable business analysts to help them make informed business decisions. If you are interested in a career path that allows you to put your critical thinking and analytical skills to use, then working as a business analyst could be an ideal fit. As modern businesses continue to ...

  25. Decision Strategies: 4 Steps to Success

    Key points. Research shows decisions involve balancing thinking things through and trusting your gut feeling. A 4-step approach can help make stress-free decisions.

  26. Are You a Micromanager?

    Do you consider feedback a one-way street? Feedback should be a two-way conversation. Recognize people's strengths, treat mistakes like learning opportunities, and seek out opinions from your ...