Research Methods--Quantitative, Qualitative, and More: Overview

  • Quantitative Research
  • Qualitative Research
  • Data Science Methods (Machine Learning, AI, Big Data)
  • Text Mining and Computational Text Analysis
  • Evidence Synthesis/Systematic Reviews
  • Get Data, Get Help!

About Research Methods

This guide provides an overview of research methods, how to choose and use them, and supports and resources at UC Berkeley. 

As Patten and Newhart note in the book Understanding Research Methods, "Research methods are the building blocks of the scientific enterprise. They are the 'how' for building systematic knowledge. The accumulation of knowledge through research is by its nature a collective endeavor. Each well-designed study provides evidence that may support, amend, refute, or deepen the understanding of existing knowledge... Decisions are important throughout the practice of research and are designed to help researchers collect evidence that includes the full spectrum of the phenomenon under study, to maintain logical rules, and to mitigate or account for possible sources of bias. In many ways, learning research methods is learning how to see and make these decisions."

The choice of methods varies by discipline, by the kind of phenomenon being studied and the data being used to study it, by the technology available, and more. This guide is an introduction; if you don't see what you need here, contact your subject librarian and/or check whether a library research guide will answer your question.

Suggestions for changes and additions to this guide are welcome! 

START HERE: SAGE Research Methods

Without question, the most comprehensive resource available from the library is SAGE Research Methods. An online guide to this one-stop collection is available, and some helpful links are below:

  • SAGE Research Methods
  • Little Green Books (Quantitative Methods)
  • Little Blue Books (Qualitative Methods)
  • Dictionaries and Encyclopedias
  • Case studies of real research projects
  • Sample datasets for hands-on practice
  • Streaming video--see methods come to life
  • Methodspace--a community for researchers
  • SAGE Research Methods Course Mapping

Library Data Services at UC Berkeley

Library Data Services Program and Digital Scholarship Services

The LDSP offers a variety of services and tools! From this link, check out pages for each of the following topics: discovering data, managing data, collecting data, GIS data, text data mining, publishing data, digital scholarship, open science, and the Research Data Management Program.

Be sure also to check out the visual guide to where to seek assistance on campus with any research question you may have!

Library GIS Services

Other Data Services at Berkeley

  • D-Lab: Supports Berkeley faculty, staff, and graduate students with research in data-intensive social science, including a wide range of training and workshop offerings
  • Dryad: A simple self-service tool for researchers to use in publishing their datasets. It provides tools for the effective publication of and access to research data.
  • Geospatial Innovation Facility (GIF): Provides leadership and training across a broad array of integrated mapping technologies on campus
  • Research Data Management: A UC Berkeley guide and consulting service for research data management issues

General Research Methods Resources

Here are some general resources for assistance:

  • Assistance from ICPSR (must create an account to access): Getting Help with Data, and Resources for Students
  • Wiley Stats Ref for background information on statistics topics
  • Survey Documentation and Analysis (SDA), a program for easy web-based analysis of survey data

Consultants

  • D-Lab/Data Science Discovery Consultants Request help with your research project from peer consultants.
  • Research Data Management (RDM) consulting Meet with RDM consultants before designing the data security, storage, and sharing aspects of your qualitative project.
  • Statistics Department Consulting Services A service in which advanced graduate students, under faculty supervision, are available to consult during specified hours in the Fall and Spring semesters.

Related Resources

  • IRB / CPHS Qualitative research projects with human subjects often require that you go through an ethics review.
  • OURS (Office of Undergraduate Research and Scholarships) OURS supports undergraduates who want to embark on research projects and assistantships. In particular, check out their "Getting Started in Research" workshops
  • Sponsored Projects Sponsored Projects works with researchers applying for major external grants.
  • Last Updated: Apr 3, 2023 3:14 PM
  • URL: https://guides.lib.berkeley.edu/researchmethods

2.2 Research Methods

Learning Objectives

By the end of this section, you should be able to:

  • Recall the 6 Steps of the Scientific Method
  • Differentiate between four kinds of research methods: surveys, field research, experiments, and secondary data analysis.
  • Explain the appropriateness of specific research approaches for specific topics.

Sociologists examine the social world, see a problem or interesting pattern, and set out to study it. They use research methods to design a study. Planning the research design is a key step in any sociological study. Sociologists generally choose from widely used methods of social investigation: primary source data collection, such as surveys, participant observation, ethnography, case studies, unobtrusive observations, and experiments, or secondary data analysis, the use of existing sources. Every research method comes with pluses and minuses, and the topic of study strongly influences which method or methods are put to use. When you are conducting research, think about the best way to gather knowledge about your topic, and think of yourself as an architect. An architect needs a blueprint to build a house; as a sociologist, your blueprint is your research design, including your data collection method.

When entering a particular social environment, a researcher must be careful. There are times to remain anonymous and times to be overt. There are times to conduct interviews and times to simply observe. Some participants need to be thoroughly informed; others should not know they are being observed. A researcher wouldn’t stroll into a crime-ridden neighborhood at midnight, calling out, “Any gang members around?”

Making sociologists’ presence invisible is not always realistic for other reasons. That option is not available to a researcher studying prison behaviors, early education, or the Ku Klux Klan. Researchers can’t just stroll into prisons, kindergarten classrooms, or Klan meetings and unobtrusively observe behaviors without attracting attention. In situations like these, other methods are needed. Researchers choose methods that best suit their study topics, protect research participants or subjects, and fit with their overall approaches to research.

Surveys

As a research method, a survey collects data from subjects who respond to a series of questions about behaviors and opinions, often in the form of a questionnaire or an interview. The survey is one of the most widely used scientific research methods. The standard survey format allows individuals a level of anonymity in which they can express personal ideas.

At some point, most people in the United States respond to some type of survey. The 2020 U.S. Census is an excellent example of a large-scale survey intended to gather sociological data. Since 1790, the United States has conducted a census to collect demographic data about its residents. The original census consisted of six questions; the current version, received by residents of the United States and five territories, consists of twelve questions.

Not all surveys are considered sociological research, however, and many surveys people commonly encounter focus on identifying marketing needs and strategies rather than testing a hypothesis or contributing to social science knowledge. Questions such as, “How many hot dogs do you eat in a month?” or “Were the staff helpful?” are not usually designed as scientific research. The Nielsen Ratings determine the popularity of television programming through scientific market research. However, polls conducted by television programs such as American Idol or So You Think You Can Dance cannot be generalized, because they are administered to an unrepresentative population, a specific show’s audience. You might receive polls through your cell phone or email, from grocery stores, restaurants, and retail stores. They often provide you incentives for completing the survey.

Sociologists conduct surveys under controlled conditions for specific purposes. Surveys gather different types of information from people. While surveys are not great at capturing the ways people really behave in social situations, they are a great method for discovering how people feel, think, and act—or at least how they say they feel, think, and act. Surveys can track preferences for presidential candidates or reported individual behaviors (such as sleeping, driving, or texting habits) or information such as employment status, income, and education levels.

A survey targets a specific population , people who are the focus of a study, such as college athletes, international students, or teenagers living with type 1 (juvenile-onset) diabetes. Most researchers choose to survey a small sector of the population, or a sample , a manageable number of subjects who represent a larger population. The success of a study depends on how well a population is represented by the sample. In a random sample , every person in a population has the same chance of being chosen for the study. As a result, a Gallup Poll, if conducted as a nationwide random sampling, should be able to provide an accurate estimate of public opinion whether it contacts 2,000 or 10,000 people.
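The claim that a nationwide random sample of 2,000 people can estimate public opinion about as well as one of 10,000 can be made concrete with a standard back-of-envelope calculation. The sketch below assumes a simple random sample and the usual 95% normal approximation for a proportion; neither assumption is spelled out in the text above.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion estimated
    from a simple random sample of size n (worst case at p = 0.5)."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (2_000, 10_000):
    print(f"n={n}: +/- {margin_of_error(n):.1%}")
# n=2000:  +/- 2.2%
# n=10000: +/- 1.0%
```

Quintupling the sample only shaves about a percentage point off the uncertainty, which is why well-drawn samples of a few thousand suffice for national polls.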

After selecting subjects, the researcher develops a specific plan to ask questions and record responses. It is important to inform subjects of the nature and purpose of the survey up front. If they agree to participate, researchers thank subjects and offer them a chance to see the results of the study if they are interested. The researcher presents the subjects with an instrument, which is a means of gathering the information.

A common instrument is a questionnaire. Subjects often answer a series of closed-ended questions . The researcher might ask yes-or-no or multiple-choice questions, allowing subjects to choose possible responses to each question. This kind of questionnaire collects quantitative data —data in numerical form that can be counted and statistically analyzed. Just count up the number of “yes” and “no” responses or correct answers, and chart them into percentages.
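The tallying described above can be sketched in a few lines. The list of responses here is hypothetical, standing in for the answers to one closed-ended yes-or-no question:

```python
from collections import Counter

# Hypothetical responses to a single closed-ended survey question
responses = ["yes", "no", "yes", "yes", "no", "yes", "no", "yes"]

counts = Counter(responses)          # tally each answer
total = len(responses)
percentages = {answer: count / total * 100 for answer, count in counts.items()}
print(percentages)  # {'yes': 62.5, 'no': 37.5}
```

This is exactly the kind of routine counting that makes quantitative data straightforward to chart and analyze statistically.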

Questionnaires can also ask more complex questions with more complex answers—beyond “yes,” “no,” or checkbox options. These types of inquiries use open-ended questions that require short essay responses. Participants willing to take the time to write those answers might convey personal religious beliefs, political views, goals, or morals. The answers are subjective and vary from person to person. “How do you plan to use your college education?” is one such question.

Some topics that investigate internal thought processes are impossible to observe directly and are difficult to discuss honestly in a public forum. People are more likely to share honest answers if they can respond to questions anonymously. This type of personal explanation is qualitative data —conveyed through words. Qualitative information is harder to organize and tabulate. The researcher will end up with a wide range of responses, some of which may be surprising. The benefit of written opinions, though, is the wealth of in-depth material that they provide.

An interview is a one-on-one conversation between the researcher and the subject, and it is a way of conducting surveys on a topic. However, participants are free to respond as they wish, without being limited by predetermined choices. In the back-and-forth conversation of an interview, a researcher can ask for clarification, spend more time on a subtopic, or ask additional questions. In an interview, a subject will ideally feel free to open up and answer questions that are often complex. There are no right or wrong answers. The subject might not even know how to answer the questions honestly.

Questions such as “How does society’s view of alcohol consumption influence your decision whether or not to take your first sip of alcohol?” or “Did you feel that the divorce of your parents would put a social stigma on your family?” involve so many factors that the answers are difficult to categorize. A researcher needs to avoid steering or prompting the subject to respond in a specific way; otherwise, the results will prove to be unreliable. The researcher will also benefit from gaining a subject’s trust, from empathizing or commiserating with a subject, and from listening without judgment.

Surveys often collect both quantitative and qualitative data. For example, a researcher interviewing people who are incarcerated might receive quantitative data, such as demographics (race, age, sex), that can be analyzed statistically. For example, the researcher might discover that 20 percent of incarcerated people are above the age of 50. The researcher might also collect qualitative data, such as why people take advantage of educational opportunities during their sentence and other explanatory information.

The survey can be carried out online, over the phone, by mail, or face-to-face. When researchers collect data outside a laboratory, library, or workplace setting, they are conducting field research, which is our next topic.

Field Research

The work of sociology rarely happens in limited, confined spaces. Rather, sociologists go out into the world. They meet subjects where they live, work, and play. Field research refers to gathering primary data from a natural environment. To conduct field research, the sociologist must be willing to step into new environments and observe, participate, or experience those worlds. In field work, the sociologists, rather than the subjects, are the ones out of their element.

The researcher interacts with or observes people and gathers data along the way. The key point in field research is that it takes place in the subject’s natural environment, whether it’s a coffee shop or tribal village, a homeless shelter or the DMV, a hospital, airport, mall, or beach resort.

While field research often begins in a specific setting, the study’s purpose is to observe specific behaviors in that setting. Field work is optimal for observing how people think and behave. It seeks to understand why they behave that way. However, researchers may struggle to narrow down cause and effect when there are so many variables floating around in a natural environment. And while field research looks for correlation, its small sample size does not allow for establishing a causal relationship between two variables. Indeed, much of the data gathered in sociology do not identify a cause and effect but a correlation.

Sociology in the Real World

Beyoncé and Lady Gaga as Sociological Subjects

Sociologists have studied Lady Gaga and Beyoncé and their impact on music, movies, social media, fan participation, and social equality. In their studies, researchers have used several research methods including secondary analysis, participant observation, and surveys from concert participants.

In their study, Click, Lee & Holiday (2013) interviewed 45 Lady Gaga fans who utilized social media to communicate with the artist. These fans viewed Lady Gaga as a mirror of themselves and a source of inspiration. Like her, they embrace not being a part of mainstream culture. Many of Lady Gaga’s fans are members of the LGBTQ community. They see the song “Born This Way” as a rallying cry and answer her calls for “Paws Up” with a physical expression of solidarity: outstretched arms and fingers bent and curled to resemble monster claws.

Sascha Buchanan (2019) made use of participant observation to study the relationship between two fan groups, that of Beyoncé and that of Rihanna. She observed award shows sponsored by iHeartRadio, MTV EMA, and BET that pit one group against another as they competed for Best Fan Army, Biggest Fans, and FANdemonium. Buchanan argues that the media thus sustains a myth of rivalry between the two most commercially successful Black women vocal artists.

Participant Observation

In 2000, a comic writer named Rodney Rothman wanted an insider’s view of white-collar work. He slipped into the sterile, high-rise offices of a New York “dot com” agency. Every day for two weeks, he pretended to work there. His main purpose was simply to see whether anyone would notice him or challenge his presence. No one did. The receptionist greeted him. The employees smiled and said good morning. Rothman was accepted as part of the team. He even went so far as to claim a desk, inform the receptionist of his whereabouts, and attend a meeting. He published an article about his experience in The New Yorker called “My Fake Job” (2000). Later, he was discredited for allegedly fabricating some details of the story and The New Yorker issued an apology. However, Rothman’s entertaining article still offered fascinating descriptions of the inside workings of a “dot com” company and exemplified the lengths to which a writer, or a sociologist, will go to uncover material.

Rothman had conducted a form of study called participant observation , in which researchers join people and participate in a group’s routine activities for the purpose of observing them within that context. This method lets researchers experience a specific aspect of social life. A researcher might go to great lengths to get a firsthand look into a trend, institution, or behavior. A researcher might work as a waitress in a diner, experience homelessness for several weeks, or ride along with police officers as they patrol their regular beat. Often, these researchers try to blend in seamlessly with the population they study, and they may not disclose their true identity or purpose if they feel it would compromise the results of their research.

At the beginning of a field study, researchers might have a question: “What really goes on in the kitchen of the most popular diner on campus?” or “What is it like to be homeless?” Participant observation is a useful method if the researcher wants to explore a certain environment from the inside.

Field researchers simply want to observe and learn. In such a setting, the researcher will be alert and open minded to whatever happens, recording all observations accurately. Soon, as patterns emerge, questions will become more specific, observations will lead to hypotheses, and hypotheses will guide the researcher in analyzing data and generating results.

In a study of small towns in the United States conducted by sociological researchers Robert S. Lynd and Helen Merrell Lynd, the team altered their purpose as they gathered data. They initially planned to focus their study on the role of religion in U.S. towns. As they gathered observations, they realized that the effect of industrialization and urbanization was the more relevant topic of this social group. The Lynds did not change their methods, but they revised the purpose of their study.

This shaped the structure of Middletown: A Study in Modern American Culture , their published results (Lynd & Lynd, 1929).

The Lynds were upfront about their mission. The townspeople of Muncie, Indiana, knew why the researchers were in their midst. But some sociologists prefer not to alert people to their presence. The main advantage of covert participant observation is that it allows the researcher access to authentic, natural behaviors of a group’s members. The challenge, however, is gaining access to a setting without disrupting the pattern of others’ behavior. Becoming an inside member of a group, organization, or subculture takes time and effort. Researchers must pretend to be something they are not. The process could involve role playing, making contacts, networking, or applying for a job.

Once inside a group, some researchers spend months or even years pretending to be one of the people they are observing. However, as observers, they cannot get too involved. They must keep their purpose in mind and apply the sociological perspective. That way, they illuminate social patterns that are often unrecognized. Because information gathered during participant observation is mostly qualitative, rather than quantitative, the end results are often descriptive or interpretive. The researcher might present findings in an article or book and describe what he or she witnessed and experienced.

This type of research is what journalist Barbara Ehrenreich conducted for her book Nickel and Dimed. One day over lunch with her editor, Ehrenreich mentioned an idea. How can people exist on minimum-wage work? How do low-income workers get by? she wondered. Someone should do a study. To her surprise, her editor responded, Why don’t you do it?

That’s how Ehrenreich found herself joining the ranks of the working class. For several months, she left her comfortable home and lived and worked among people who lacked, for the most part, higher education and marketable job skills. Undercover, she applied for and worked minimum wage jobs as a waitress, a cleaning woman, a nursing home aide, and a retail chain employee. During her participant observation, she used only her income from those jobs to pay for food, clothing, transportation, and shelter.

She discovered the obvious, that it’s almost impossible to get by on minimum wage work. She also experienced and observed attitudes many middle and upper-class people never think about. She witnessed firsthand the treatment of working class employees. She saw the extreme measures people take to make ends meet and to survive. She described fellow employees who held two or three jobs, worked seven days a week, lived in cars, could not pay to treat chronic health conditions, got randomly fired, submitted to drug tests, and moved in and out of homeless shelters. She brought aspects of that life to light, describing difficult working conditions and the poor treatment that low-wage workers suffer.

The book she wrote upon her return to her real life as a well-paid writer has been widely read and used in many college classrooms.

Ethnography

Ethnography is the immersion of the researcher in the natural setting of an entire social community to observe and experience their everyday life and culture. The heart of an ethnographic study focuses on how subjects view their own social standing and how they understand themselves in relation to a social group.

An ethnographic study might observe, for example, a small U.S. fishing town, an Inuit community, a village in Thailand, a Buddhist monastery, a private boarding school, or an amusement park. These places all have borders. People live, work, study, or vacation within those borders. People are there for a certain reason and therefore behave in certain ways and respect certain cultural norms. An ethnographer would commit to spending a determined amount of time studying every aspect of the chosen place, taking in as much as possible.

A sociologist studying a tribe in the Amazon might watch the way villagers go about their daily lives and then write a paper about it. To observe a spiritual retreat center, an ethnographer might sign up for a retreat and attend as a guest for an extended stay, observe and record data, and collate the material into results.

Institutional Ethnography

Institutional ethnography is an extension of basic ethnographic research principles that focuses intentionally on everyday concrete social relationships. Developed by Canadian sociologist Dorothy E. Smith (1990), institutional ethnography is often considered a feminist-inspired approach to social analysis and primarily considers women’s experiences within male-dominated societies and power structures. Smith’s work is seen to challenge sociology’s exclusion of women, both academically and in the study of women’s lives (Fenstermaker, n.d.).

Historically, social science research tended to objectify women and ignore their experiences except as viewed from the male perspective. Modern feminists note that describing women, and other marginalized groups, as subordinates helps those in authority maintain their own dominant positions (Social Sciences and Humanities Research Council of Canada, n.d.). Smith’s three major works explored what she called “the conceptual practices of power” and are still considered seminal works in feminist theory and ethnography (Fenstermaker, n.d.).

Sociological Research

The Making of Middletown: A Study in Modern U.S. Culture

In 1924, a young married couple named Robert and Helen Lynd undertook an unprecedented ethnography: to apply sociological methods to the study of one U.S. city in order to discover what “ordinary” people in the United States did and believed. Choosing Muncie, Indiana (population about 30,000) as their subject, they moved to the small town and lived there for eighteen months.

Ethnographers had been examining other cultures for decades—groups considered minorities or outsiders—like gangs, immigrants, and the poor. But no one had studied the so-called average American.

Recording interviews and using surveys to gather data, the Lynds objectively described what they observed. Researching existing sources, they compared Muncie in 1890 to the Muncie they observed in 1924. Most Muncie adults, they found, had grown up on farms but now lived in homes inside the city. As a result, the Lynds focused their study on the impact of industrialization and urbanization.

They observed that Muncie was divided into business and working class groups. They defined business class as dealing with abstract concepts and symbols, while working class people used tools to create concrete objects. The two classes led different lives with different goals and hopes. However, the Lynds observed, mass production offered both classes the same amenities. Like wealthy families, the working class was now able to own radios, cars, washing machines, telephones, vacuum cleaners, and refrigerators. This was an emerging material reality of the 1920s.

As the Lynds worked, they divided their manuscript into six chapters: Getting a Living, Making a Home, Training the Young, Using Leisure, Engaging in Religious Practices, and Engaging in Community Activities.

When the study was completed, the Lynds encountered a big problem. The Rockefeller Foundation, which had commissioned the book, claimed it was useless and refused to publish it. The Lynds asked if they could seek a publisher themselves.

Middletown: A Study in Modern American Culture was not only published in 1929 but also became an instant bestseller, a status unheard of for a sociological study. The book sold out six printings in its first year of publication, and has never gone out of print (Caplow, Hicks, & Wattenberg, 2000).

Nothing like it had ever been done before. Middletown was reviewed on the front page of the New York Times. Readers in the 1920s and 1930s identified with the citizens of Muncie, Indiana, but they were equally fascinated by the sociological methods and the use of scientific data to define ordinary people in the United States. The book was proof that social data was important—and interesting—to the U.S. public.

Case Study

Sometimes a researcher wants to study one specific person or event. A case study is an in-depth analysis of a single event, situation, or individual. To conduct a case study, a researcher examines existing sources like documents and archival records, conducts interviews, and engages in direct observation and even participant observation, if possible.

Researchers might use this method to study a single case of a foster child, drug lord, cancer patient, criminal, or rape victim. However, a major criticism of the case study as a method is that while offering depth on a topic, it does not provide enough evidence to form a generalized conclusion. In other words, it is difficult to make universal claims based on just one person, since one person does not verify a pattern. This is why most sociologists do not use case studies as a primary research method.

However, case studies are useful when the single case is unique. In these instances, a single case study can contribute tremendous insight. For example, a feral child, also called “wild child,” is one who grows up isolated from human beings. Feral children grow up without social contact and language, which are elements crucial to a “civilized” child’s development. These children mimic the behaviors and movements of animals, and often invent their own language. There are only about one hundred cases of “feral children” in the world.

As you may imagine, a feral child is a subject of great interest to researchers. Feral children provide unique information about child development because they have grown up outside of the parameters of “normal” growth and nurturing. And since there are very few feral children, the case study is the most appropriate method for researchers to use in studying the subject.

At age three, a Ukrainian girl named Oxana Malaya suffered severe parental neglect. She lived in a shed with dogs, and she ate raw meat and scraps. Five years later, a neighbor called authorities and reported seeing a girl who ran on all fours, barking. Officials brought Oxana into society, where she was cared for and taught some human behaviors, but she never became fully socialized. She has been designated as unable to support herself and now lives in a mental institution (Grice 2011). Case studies like this offer a way for sociologists to collect data that may not be obtained by any other method.

Experiments

You have probably tested some of your own personal social theories. “If I study at night and review in the morning, I’ll improve my retention skills.” Or, “If I stop drinking soda, I’ll feel better.” Cause and effect. If this, then that. When you test the theory, your results either prove or disprove your hypothesis.

One way researchers test social theories is by conducting an experiment , meaning they investigate relationships to test a hypothesis—a scientific approach.

There are two main types of experiments: lab-based experiments and natural or field experiments. In a lab setting, the research can be controlled so that more data can be recorded in a limited amount of time. In a natural or field-based experiment, the time it takes to gather the data cannot be controlled, but the information might be considered more accurate since it was collected without interference or intervention by the researcher.

As a research method, either type of sociological experiment is useful for testing if-then statements: if a particular thing happens (cause), then another particular thing will result (effect). To set up a lab-based experiment, sociologists create artificial situations that allow them to manipulate variables.

Classically, the sociologist selects a set of people with similar characteristics, such as age, class, race, or education. Those people are divided into two groups. One is the experimental group and the other is the control group. The experimental group is exposed to the independent variable(s) and the control group is not. To test the benefits of tutoring, for example, the sociologist might provide tutoring to the experimental group of students but not to the control group. Then both groups would be tested for differences in performance to see if tutoring had an effect on the experimental group of students. As you can imagine, in a case like this, the researcher would not want to jeopardize the accomplishments of either group of students, so the setting would be somewhat artificial. The test would not be for a grade reflected on a student’s permanent record, for example.
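The comparison at the heart of the tutoring example can be sketched numerically. The scores below are invented purely for illustration; a real study would also need a significance test before claiming the tutoring (the independent variable) caused the difference:

```python
import statistics

# Hypothetical post-test scores from the tutoring experiment
experimental = [78, 85, 82, 90, 74, 88]  # students who received tutoring
control      = [71, 80, 69, 77, 75, 72]  # students who did not

# The effect is the difference in group means on the dependent variable
effect = statistics.mean(experimental) - statistics.mean(control)
print(f"Mean difference (experimental - control): {effect:.1f} points")
```

A positive difference would be consistent with the if-then statement being tested (if tutoring, then higher scores), though with samples this small it could easily arise by chance.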

And if a researcher told the students they would be observed as part of a study on measuring the effectiveness of tutoring, the students might not behave naturally. This is called the Hawthorne effect, which occurs when people change their behavior because they know they are being watched as part of a study. The Hawthorne effect is unavoidable in some research studies because sociologists have to make the purpose of the study known. Subjects must be aware that they are being observed, and a certain amount of artificiality may result (Sonnenfeld 1985).

A real-life example will help illustrate the process. In 1971, Frances Heussenstamm, a sociology professor at California State University at Los Angeles, had a theory about police prejudice. To test her theory, she conducted research. She chose fifteen students from three ethnic backgrounds: Black, White, and Hispanic. She chose students who routinely drove to and from campus along Los Angeles freeway routes, and who had had perfect driving records for longer than a year.

Next, she placed a Black Panther bumper sticker on each car. That sticker, a representation of a social value, was the independent variable. In the 1970s, the Black Panthers were a revolutionary group actively fighting racism. Heussenstamm asked the students to follow their normal driving patterns. She wanted to see whether seeming support for the Black Panthers would change how these good drivers were treated by the police patrolling the highways. The dependent variable would be the number of traffic stops/citations.

The first arrest, for an incorrect lane change, was made two hours after the experiment began. One participant was pulled over three times in three days. He quit the study. After seventeen days, the fifteen drivers had collected a total of thirty-three traffic citations. The research was halted. The funding to pay traffic fines had run out, and so had the enthusiasm of the participants (Heussenstamm, 1971).

Secondary Data Analysis

While sociologists often engage in original research studies, they also contribute knowledge to the discipline through secondary data analysis. Secondary data is not the result of firsthand research collected from primary sources, but is the already completed work of other researchers or data collected by an agency or organization. Sociologists might study works written by historians, economists, teachers, or early sociologists. They might search through periodicals, newspapers, or magazines, or organizational data from any period in history.

Using available information not only saves time and money but can also add depth to a study. Sociologists often interpret findings in a new way, a way that was not part of an author’s original purpose or intention. To study how women were encouraged to act and behave in the 1960s, for example, a researcher might watch movies, television shows, and situation comedies from that period. Or to research changes in behavior and attitudes due to the emergence of television in the late 1950s and early 1960s, a sociologist would rely on new interpretations of secondary data. Decades from now, researchers will most likely conduct similar studies on the advent of mobile phones, the Internet, or social media.

Social scientists also learn by analyzing the research of a variety of agencies. Governmental departments and global groups, like the U.S. Bureau of Labor Statistics or the World Health Organization (WHO), publish studies with findings that are useful to sociologists. A public statistic like the foreclosure rate might be useful for studying the effects of a recession. A racial demographic profile might be compared with data on education funding to examine the resources accessible by different groups.

One of the advantages of secondary data like old movies or WHO statistics is that it is nonreactive research (or unobtrusive research), meaning that it does not involve direct contact with subjects and will not alter or influence people’s behaviors. Unlike studies requiring direct contact with people, using previously published data does not require entering a population and the investment and risks inherent in that research process.

Using available data does have its challenges. Public records are not always easy to access. A researcher will need to do some legwork to track them down and gain access to records. To guide the search through a vast library of materials and avoid wasting time reading unrelated sources, sociologists employ content analysis, applying a systematic approach to record and value information gleaned from secondary data as it relates to the study at hand.

Also, in some cases, there is no way to verify the accuracy of existing data. It is easy to count how many drunk drivers, for example, are pulled over by the police. But how many are not? While it’s possible to discover the percentage of teenage students who drop out of high school, it might be more challenging to determine the number who return to school or get their GED later.

Another problem arises when data are unavailable in the exact form needed or do not survey the topic from the precise angle the researcher seeks. For example, the average salaries paid to professors at a public school are public record. But these figures do not necessarily reveal how long it took each professor to reach the salary range, what their educational backgrounds are, or how long they’ve been teaching.

When conducting content analysis, it is important to consider the date of publication of an existing source and to take into account attitudes and common cultural ideals that may have influenced the research. For example, when Robert S. Lynd and Helen Merrell Lynd gathered research in the 1920s, attitudes and cultural norms were vastly different from those of today. Beliefs about gender roles, race, education, and work have changed significantly since then. At the time, the study’s purpose was to reveal insights about small U.S. communities. Today, it is an illustration of 1920s attitudes and values.



Want to cite, share, or modify this book? This book uses the Creative Commons Attribution License and you must attribute OpenStax.

Access for free at https://openstax.org/books/introduction-sociology-3e/pages/1-introduction
  • Authors: Tonja R. Conerly, Kathleen Holmes, Asha Lal Tamang
  • Publisher/website: OpenStax
  • Book title: Introduction to Sociology 3e
  • Publication date: Jun 3, 2021
  • Location: Houston, Texas
  • Book URL: https://openstax.org/books/introduction-sociology-3e/pages/1-introduction
  • Section URL: https://openstax.org/books/introduction-sociology-3e/pages/2-2-research-methods

© Jan 18, 2024 OpenStax. Textbook content produced by OpenStax is licensed under a Creative Commons Attribution License . The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the Creative Commons license and may not be reproduced without the prior and express written consent of Rice University.

Research Methods In Psychology

Saul Mcleod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul Mcleod, PhD., is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.

Learn about our Editorial Process

Olivia Guy-Evans, MSc

Associate Editor for Simply Psychology

BSc (Hons) Psychology, MSc Psychology of Education

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.

Research methods in psychology are systematic procedures used to observe, describe, predict, and explain behavior and mental processes. They include experiments, surveys, case studies, and naturalistic observations, ensuring data collection is objective and reliable to understand and explain psychological phenomena.


Hypotheses are testable statements that predict the results of an investigation and can be supported or refuted by evidence.

There are four types of hypotheses:
  • Null Hypotheses (H0) – these predict that no difference will be found in the results between the conditions. Typically these are written ‘There will be no difference…’
  • Alternative Hypotheses (Ha or H1) – these predict that there will be a significant difference in the results between the two conditions. This is also known as the experimental hypothesis.
  • One-tailed (directional) hypotheses – these state the specific direction the researcher expects the results to move in, e.g. higher, lower, more, less. In a correlation study, the predicted direction of the correlation can be either positive or negative.
  • Two-tailed (non-directional) hypotheses – these state that a difference will be found between the conditions of the independent variable but do not state the direction of the difference or relationship. Typically these are written ‘There will be a difference…’

All research has an alternative hypothesis (either one-tailed or two-tailed) and a corresponding null hypothesis.

Once the research is conducted and results are found, psychologists must either reject or retain the null hypothesis.

So, if a difference is found, the psychologist would reject the null hypothesis and accept the alternative. The opposite applies if no difference is found.

Sampling techniques

Sampling is the process of selecting a representative group from the population under study.


A sample is the participants you select from a target population (the group you are interested in) to make generalizations about.

Representativeness is the extent to which a sample mirrors a researcher’s target population and reflects its characteristics.

Generalisability is the extent to which a study’s findings can be applied to the larger population from which the sample was drawn.

  • Volunteer sample: where participants pick themselves through newspaper adverts, noticeboards or online.
  • Opportunity sampling: also known as convenience sampling, uses people who are available at the time the study is carried out and willing to take part. It is based on convenience.
  • Random sampling: when every person in the target population has an equal chance of being selected. An example of random sampling would be picking names out of a hat.
  • Systematic sampling: when a system is used to select participants, such as picking every Nth person from all possible participants, where N = the number of people in the research population / the number of people needed for the sample.
  • Stratified sampling: when you identify the subgroups and select participants in proportion to their occurrence in the population.
  • Snowball sampling: when researchers find a few participants, and then ask them to find further participants themselves, and so on.
  • Quota sampling: when researchers are told to ensure the sample fits certain quotas; for example, they might be told to find 90 participants, with 30 of them being unemployed.
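Several of these techniques can be sketched in a few lines of Python. This is an illustrative sketch, not a procedure from the sources above; the population, names, and subgroup sizes are all hypothetical:

```python
import random

# Hypothetical sampling frame of 100 people; all names are made up.
population = [f"person_{i}" for i in range(100)]
sample_size = 10

random.seed(42)  # fixed seed so the sketch is reproducible

# Random sampling: every member has an equal chance of selection.
random_sample = random.sample(population, sample_size)

# Systematic sampling: pick every Nth person,
# N = population size / sample size.
n = len(population) // sample_size  # N = 10
systematic_sample = population[::n][:sample_size]

# Stratified sampling: sample each subgroup in proportion to its size
# (here, a hypothetical 70/30 employed/unemployed split).
strata = {"employed": population[:70], "unemployed": population[70:]}
stratified_sample = []
for group in strata.values():
    share = round(sample_size * len(group) / len(population))
    stratified_sample.extend(random.sample(group, share))

print(len(random_sample), len(systematic_sample), len(stratified_sample))
```

Note that the stratified sketch draws 7 employed and 3 unemployed people, mirroring their shares of the population.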

Experiments always have an independent and a dependent variable.

  • The independent variable is the one the experimenter manipulates (the thing that changes between the conditions the participants are placed into). It is assumed to have a direct effect on the dependent variable.
  • The dependent variable is the thing being measured, or the results of the experiment.


Operationalization of variables means making them measurable/quantifiable. We must use operationalization to ensure that variables are in a form that can be easily tested.

For instance, we can’t really measure ‘happiness’, but we can measure how many times a person smiles within a two-hour period. 

By operationalizing variables, we make it easy for someone else to replicate our research. Remember, this is important because we can check if our findings are reliable.

Extraneous variables are all variables other than the independent variable that could affect the results of the experiment.

They can be natural characteristics of the participant, such as intelligence, gender, or age, or situational features of the environment, such as lighting or noise.

Demand characteristics are a type of extraneous variable that occurs when participants work out the aims of the research study and begin to behave in a certain way as a result.

For example, in Milgram’s research , critics argued that participants worked out that the shocks were not real and they administered them as they thought this was what was required of them. 

Extraneous variables must be controlled so that they do not affect (confound) the results.

Randomly allocating participants to their conditions or using a matched pairs experimental design can help to reduce participant variables. 

Situational variables are controlled by using standardized procedures, ensuring every participant in a given condition is treated in the same way.

Experimental Design

Experimental design refers to how participants are allocated to each condition of the independent variable, such as a control or experimental group.
  • Independent design (between-groups design): each participant is selected for only one group. With the independent design, the most common way of deciding which participants go into which group is by means of randomization.
  • Matched participants design: each participant is selected for only one group, but the participants in the two groups are matched for some relevant factor or factors (e.g. ability, sex, age).
  • Repeated measures design (within-groups design): each participant appears in both groups, so that exactly the same participants are in each group.
  • The main problem with the repeated measures design is that there may well be order effects. Their experiences during the experiment may change the participants in various ways.
  • They may perform better when they appear in the second group because they have gained useful information about the experiment or about the task. On the other hand, they may perform less well on the second occasion because of tiredness or boredom.
  • Counterbalancing is the best way of preventing order effects from disrupting the findings of an experiment, and involves ensuring that each condition is equally likely to be used first and second by the participants.

If we wish to compare two groups with respect to a given independent variable, it is essential to make sure that the two groups do not differ in any other important way. 
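Random allocation and counterbalancing, as described above, can be sketched in Python. The participant IDs and group sizes here are hypothetical:

```python
import random

# Hypothetical participant IDs for a repeated measures experiment.
participants = [f"P{i}" for i in range(1, 9)]

random.seed(1)
random.shuffle(participants)  # randomly allocate participants to orders

# Counterbalancing: half complete condition A then B, half B then A,
# so order effects are spread evenly across the two conditions.
half = len(participants) // 2
orders = {p: ("A", "B") for p in participants[:half]}
orders.update({p: ("B", "A") for p in participants[half:]})

a_first = sum(1 for o in orders.values() if o == ("A", "B"))
print(f"{a_first} participants do A first, {len(orders) - a_first} do B first")
```

Because each order is used equally often, any practice or fatigue effect falls on both conditions alike.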

Experimental Methods

All experimental methods involve an IV (independent variable) and a DV (dependent variable).

  • Field experiments are conducted in the everyday (natural) environment of the participants. The experimenter still manipulates the IV, but in a real-life setting. It may be possible to control extraneous variables, though such control is more difficult than in a lab experiment.
  • Natural experiments are when a naturally occurring IV is investigated, one that isn’t deliberately manipulated but exists anyway. Participants are not randomly allocated, and the natural event may only occur rarely.

Case studies are in-depth investigations of a person, group, event, or community. They use information from a range of sources, such as the person concerned and also their family and friends.

Many techniques may be used such as interviews, psychological tests, observations and experiments. Case studies are generally longitudinal: in other words, they follow the individual or group over an extended period of time. 

Case studies are widely used in psychology and among the best-known ones carried out were by Sigmund Freud . He conducted very detailed investigations into the private lives of his patients in an attempt to both understand and help them overcome their illnesses.

Case studies provide rich qualitative data and have high levels of ecological validity. However, it is difficult to generalize from individual cases as each one has unique characteristics.

Correlational Studies

Correlation means association; it is a measure of the extent to which two variables are related. One of the variables can be regarded as the predictor variable with the other one as the outcome variable.

Correlational studies typically involve obtaining two different measures from a group of participants, and then assessing the degree of association between the measures. 

The predictor variable can be seen as occurring before the outcome variable in some sense. It is called the predictor variable, because it forms the basis for predicting the value of the outcome variable.

Relationships between variables can be displayed on a graph or as a numerical score called a correlation coefficient.

types of correlation. Scatter plot. Positive negative and no correlation

  • If an increase in one variable tends to be associated with an increase in the other, then this is known as a positive correlation .
  • If an increase in one variable tends to be associated with a decrease in the other, then this is known as a negative correlation .
  • A zero correlation occurs when there is no relationship between variables.

After looking at the scattergraph, if we want to be sure that a significant relationship does exist between the two variables, a statistical test of correlation can be conducted, such as Spearman’s rho.

The test will give us a score, called a correlation coefficient. This is a value between -1 and +1, and the closer it is to +1 or -1, the stronger the relationship between the variables. The value can be positive, e.g. 0.63, or negative, e.g. -0.63.
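As a rough illustration of how such a coefficient is computed, the sketch below implements Spearman’s rho from scratch (Pearson’s correlation applied to ranks). The revision-hours data are hypothetical; in practice a statistics package such as SciPy’s `scipy.stats.spearmanr` would normally be used:

```python
from statistics import mean

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman_rho(x, y):
    """Spearman's rho: Pearson's correlation computed on the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx)
           * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den

# Hypothetical data: a perfectly monotonic relationship gives rho = 1.0.
hours_revised = [2, 4, 5, 7, 9]
exam_score = [50, 58, 60, 71, 80]
print(spearman_rho(hours_revised, exam_score))
```

Reversing one of the lists would give rho = -1.0, a perfect negative correlation.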


A correlation between variables, however, does not automatically mean that the change in one variable is the cause of the change in the values of the other variable. A correlation only shows if there is a relationship between variables.

Correlation does not prove causation, as a third variable may be involved.


Interview Methods

Interviews are commonly divided into two types: structured and unstructured.

In a structured interview, a fixed, predetermined set of questions is put to every participant in the same order and in the same way.

Responses are recorded on a questionnaire, and the researcher presets the order and wording of questions, and sometimes the range of alternative answers.

The interviewer stays within their role and maintains social distance from the interviewee.

In an unstructured interview there are no set questions; the participant can raise whatever topics they feel are relevant, and the interviewer’s questions tend to follow up on the participant’s answers.

Unstructured interviews are most useful in qualitative research to analyze attitudes and values.

Though they rarely provide a valid basis for generalization, their main advantage is that they enable the researcher to probe social actors’ subjective point of view. 

Questionnaire Method

Questionnaires can be thought of as a kind of written interview. They can be carried out face to face, by telephone, or post.

The choice of questions is important because of the need to avoid bias or ambiguity in the questions, ‘leading’ the respondent or causing offense.

  • Open questions are designed to encourage a full, meaningful answer using the subject’s own knowledge and feelings. They provide insights into feelings, opinions, and understanding. Example: “How do you feel about that situation?”
  • Closed questions can be answered with a simple “yes” or “no” or specific information, limiting the depth of response. They are useful for gathering specific facts or confirming details. Example: “Do you feel anxious in crowds?”

The questionnaire’s other practical advantages are that it is cheaper than face-to-face interviews and can be used to contact many respondents scattered over a wide area relatively quickly.

Observations

There are different types of observation methods :
  • Covert observation is where the researcher doesn’t tell the participants they are being observed until after the study is complete. There could be ethical problems of deception and consent with this particular observation method.
  • Overt observation is where a researcher tells the participants they are being observed and what they are being observed for.
  • Controlled : behavior is observed under controlled laboratory conditions (e.g., Bandura’s Bobo doll study).
  • Natural : Here, spontaneous behavior is recorded in a natural setting.
  • Participant : Here, the observer has direct contact with the group of people they are observing. The researcher becomes a member of the group they are researching.  
  • Non-participant (aka “fly on the wall”): The researcher does not have direct contact with the people being observed. The observation of participants’ behavior is from a distance.

Pilot Study

A pilot study is a small-scale preliminary study conducted in order to evaluate the feasibility of the key steps in a future, full-scale project.

A pilot study is an initial run-through of the procedures to be used in an investigation; it involves selecting a few people and trying out the study on them. It is possible to save time, and in some cases, money, by identifying any flaws in the procedures designed by the researcher.

A pilot study can help the researcher spot any ambiguities or confusion in the information given to participants, or problems with the task devised.

Sometimes the task is too hard, and the researcher may get a floor effect, because none of the participants can score at all or complete the task – all performances are low.

The opposite effect is a ceiling effect, when the task is so easy that all achieve virtually full marks or top performances and are “hitting the ceiling”.

Research Design

In cross-sectional research, a researcher compares multiple segments of the population at the same time.

Sometimes, we want to see how people change over time, as in studies of human development and lifespan. Longitudinal research is a research design in which data-gathering is administered repeatedly over an extended period of time.

In cohort studies , the participants must share a common factor or characteristic such as age, demographic, or occupation. A cohort study is a type of longitudinal study in which researchers monitor and observe a chosen population over an extended period.

Triangulation means using more than one research method to improve the study’s validity.

Reliability

Reliability is a measure of consistency: if a particular measurement is repeated and the same result is obtained, then it is described as being reliable.

  • Test-retest reliability: assessing the same person on two different occasions, which shows the extent to which the test produces the same answers.
  • Inter-observer reliability: the extent to which there is agreement between two or more observers.
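Inter-observer reliability can be quantified as raw percentage agreement, or with Cohen’s kappa, which corrects for agreement expected by chance. A minimal sketch with hypothetical category codes:

```python
# Two observers code the same ten observation intervals; the category
# labels and ratings below are hypothetical.
obs1 = ["play", "rest", "play", "feed", "rest",
        "play", "feed", "rest", "play", "rest"]
obs2 = ["play", "rest", "play", "feed", "play",
        "play", "feed", "rest", "rest", "rest"]

# Raw agreement: proportion of intervals coded identically.
agreement = sum(a == b for a, b in zip(obs1, obs2)) / len(obs1)

# Cohen's kappa: corrects raw agreement for agreement expected by chance,
# based on how often each observer uses each category.
categories = set(obs1) | set(obs2)
chance = sum((obs1.count(c) / len(obs1)) * (obs2.count(c) / len(obs2))
             for c in categories)
kappa = (agreement - chance) / (1 - chance)

print(f"agreement = {agreement:.2f}, kappa = {kappa:.2f}")
```

Here the observers agree on 8 of 10 intervals (0.80), but kappa is lower (about 0.69) because some of that agreement would occur by chance.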

Meta-Analysis

A meta-analysis statistically combines the results of multiple studies. It involves identifying an aim and then systematically searching for research studies that have addressed similar aims/hypotheses.

This is done by looking through various databases, and then decisions are made about what studies are to be included/excluded.

Strengths: Increases the conclusions’ validity, as they’re based on a wider range of studies and participants.

Weaknesses: Research designs in studies can vary, so they are not truly comparable.
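The combination step of a meta-analysis is often done by inverse-variance (fixed-effect) weighting, so that more precise studies count for more. A minimal sketch; the study labels, effect sizes, and standard errors are all hypothetical:

```python
# Fixed-effect meta-analysis sketch: each study's effect size is weighted
# by the inverse of its variance, so more precise studies count for more.
# The effect sizes (Cohen's d) and standard errors below are hypothetical.
studies = [
    ("Study A", 0.40, 0.10),  # (label, effect size, standard error)
    ("Study B", 0.25, 0.20),
    ("Study C", 0.55, 0.15),
]

weights = [1 / se ** 2 for _, _, se in studies]
pooled = sum(w * d for (_, d, _), w in zip(studies, weights)) / sum(weights)
pooled_se = (1 / sum(weights)) ** 0.5

print(f"pooled effect = {pooled:.3f} (SE = {pooled_se:.3f})")
```

Note how the pooled standard error is smaller than any single study’s, which is why combined conclusions can be more precise.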

Peer Review

A researcher submits an article to a journal. The choice of the journal may be determined by the journal’s audience or prestige.

The journal selects two or more appropriate experts (psychologists working in a similar field) to peer review the article without payment. The peer reviewers assess the methods and designs used, the originality of the findings, the validity of the original research findings, and the article’s content, structure, and language.

Feedback from the reviewers determines whether the article is accepted. The article may be accepted as it is, accepted with revisions, sent back to the author to revise and resubmit, or rejected without the possibility of resubmission.

The editor makes the final decision whether to accept or reject the research report based on the reviewers’ comments/recommendations.

Peer review is important because it prevents faulty data from entering the public domain, provides a way of checking the validity of findings and the quality of the methodology, and is used to assess the research rating of university departments.

Peer review may be an ideal, whereas in practice there are lots of problems. For example, it slows publication down and may prevent unusual, new work from being published. Some reviewers might use it as an opportunity to prevent competing researchers from publishing work.

Some people doubt whether peer review can really prevent the publication of fraudulent research.

The advent of the internet means that more research and academic comment is being published without official peer review than before, though systems are evolving online in which everyone has a chance to offer their opinions and police the quality of research.

Types of Data

  • Quantitative data is numerical data e.g. reaction time or number of mistakes. It represents how much or how long, how many there are of something. A tally of behavioral categories and closed questions in a questionnaire collect quantitative data.
  • Qualitative data is virtually any type of information that can be observed and recorded that is not numerical in nature and can be in the form of written or verbal communication. Open questions in questionnaires and accounts from observational studies collect qualitative data.
  • Primary data is first-hand data collected for the purpose of the investigation.
  • Secondary data is information that has been collected by someone other than the person who is conducting the research e.g. taken from journals, books or articles.

Validity means how well a piece of research actually measures what it sets out to, or how well it reflects the reality it claims to represent.

Validity is whether the observed effect is genuine and represents what is actually out there in the world.

  • Concurrent validity is the extent to which a psychological measure relates to an existing similar measure and obtains close results. For example, a new intelligence test compared to an established test.
  • Face validity: does the test measure what it’s supposed to measure ‘on the face of it’? This is assessed by ‘eyeballing’ the measure or by passing it to an expert to check.
  • Ecological validity is the extent to which findings from a research study can be generalized to other settings / real life.
  • Temporal validity is the extent to which findings from a research study can be generalized to other historical times.

Features of Science

  • Paradigm – A set of shared assumptions and agreed methods within a scientific discipline.
  • Paradigm shift – The result of a scientific revolution: a significant change in the dominant unifying theory within a scientific discipline.
  • Objectivity – When all sources of personal bias are minimised so as not to distort or influence the research process.
  • Empirical method – Scientific approaches that are based on the gathering of evidence through direct observation and experience.
  • Replicability – The extent to which scientific procedures and findings can be repeated by other researchers.
  • Falsifiability – The principle that a theory cannot be considered scientific unless it admits the possibility of being proved untrue.

Statistical Testing

A significant result is one where there is a low probability that chance factors were responsible for any observed difference, correlation, or association in the variables tested.

If our test is significant, we can reject our null hypothesis and accept our alternative hypothesis.

If our test is not significant, we retain our null hypothesis and reject our alternative hypothesis. A null hypothesis is a statement of no effect.

In psychology, we use p < 0.05 (as it strikes a balance between the risks of making a Type I and a Type II error), but p < 0.01 is used in tests that could cause harm, like introducing a new drug.

A type I error is when the null hypothesis is rejected when it should have been accepted (happens when a lenient significance level is used, an error of optimism).

A type II error is when the null hypothesis is accepted when it should have been rejected (happens when a stringent significance level is used, an error of pessimism).
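The link between the significance level and the Type I error rate can be illustrated by simulation: if the null hypothesis is true and we test at p < 0.05, roughly 5% (at most) of studies will still come out “significant”. A sketch using an exact binomial test of a fair coin; the numbers of trials and flips are illustrative:

```python
import math
import random

def two_tailed_binomial_p(heads, n, p0=0.5):
    """Exact two-tailed p-value for a binomial test of a fair coin:
    the total probability of every outcome at least as likely to be
    as extreme as the one observed."""
    def prob(k):
        return math.comb(n, k) * p0 ** k * (1 - p0) ** (n - k)
    observed = prob(heads)
    return sum(prob(k) for k in range(n + 1) if prob(k) <= observed + 1e-12)

random.seed(0)
alpha, trials, flips = 0.05, 2000, 100

false_positives = 0
for _ in range(trials):
    heads = sum(random.random() < 0.5 for _ in range(flips))  # null is true
    if two_tailed_binomial_p(heads, flips) < alpha:
        false_positives += 1  # Type I error: rejecting a true null

print(f"Type I error rate = {false_positives / trials:.3f}")
```

Because the binomial distribution is discrete, the realised Type I error rate sits at or just below alpha; with a stricter level such as p < 0.01, fewer true nulls would be rejected but more real effects would be missed (Type II errors).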

Ethical Issues

  • Informed consent means participants are able to make an informed judgment about whether to take part. Providing full information, however, may cause them to guess the aims of the study and change their behavior.
  • To deal with this, we can gain presumptive consent or ask participants to formally indicate their agreement to participate, but this may invalidate the purpose of the study, and it is not guaranteed that the participants would understand.
  • Deception should only be used when it is approved by an ethics committee, as it involves deliberately misleading or withholding information. Participants should be fully debriefed after the study but debriefing can’t turn the clock back.
  • All participants should be informed at the beginning that they have the right to withdraw if they ever feel distressed or uncomfortable.
  • Withdrawal can bias the sample, as those who stay may be more obedient, and some may not withdraw because they have been given incentives or feel they are spoiling the study. Researchers can offer the right to withdraw data after participation.
  • Participants should all have protection from harm . The researcher should avoid risks greater than those experienced in everyday life and they should stop the study if any harm is suspected. However, the harm may not be apparent at the time of the study.
  • Confidentiality concerns the communication of personal information. The researchers should not record any names but use numbers or false names, though full confidentiality may not be possible, as it is sometimes possible to work out who the participants were.


Research Methods: What are research methods?

  • What are research methods?
  • Searching specific databases

What are research methods?

Research methods are the strategies, processes or techniques utilized in the collection of data or evidence for analysis in order to uncover new information or create better understanding of a topic.

There are different types of research methods which use different tools for data collection.

Types of research

  • Qualitative Research
  • Quantitative Research
  • Mixed Methods Research

Qualitative Research gathers data about lived experiences, emotions or behaviours, and the meanings individuals attach to them. It assists in enabling researchers to gain a better understanding of complex concepts, social interactions or cultural phenomena. This type of research is useful in the exploration of how or why things have occurred, interpreting events and describing actions.

Quantitative Research gathers numerical data which can be ranked, measured or categorised through statistical analysis. It assists with uncovering patterns or relationships, and for making generalisations. This type of research is useful for finding out how many, how much, how often, or to what extent.

Mixed Methods Research integrates both Qualitative and Quantitative Research. It provides a holistic approach combining and analysing the statistical data with deeper contextualised insights. Using Mixed Methods also enables Triangulation, or verification, of the data from two or more sources.

Finding Mixed Methods research in the Databases 

“mixed model*” OR “mixed design*” OR “multiple method*” OR multimethod* OR triangulat*

Data collection tools

SAGE Research Methods

  • SAGE Research Methods Online : A research methods tool to help researchers gather full-text resources, design research projects, understand a particular method and write up their research. Includes access to collections of video, business cases and eBooks.

Help and Information

  • Last Updated: Apr 5, 2024 2:16 PM
  • URL: https://libguides.newcastle.edu.au/researchmethods

Pfeiffer Library

Research Methodologies

  • What are research designs?
  • What are research methodologies?

What are research methods?

  • Quantitative research methods
  • Qualitative research methods
  • Mixed method approach
  • Selecting the best research method

  • Additional Sources

Research methods differ from research methodologies: methods are the ways in which you will collect the data for your research project.  The best method for your project largely depends on your topic, the type of data you will need, and the people or items from which you will be collecting data.  The boxes below list quantitative, qualitative, and mixed research methods.

  • Closed-ended questionnaires/surveys: These questionnaires or surveys are like "multiple choice" tests, where participants must select from a list of premade answers.  Depending on the content of the question, they select the one they agree with most.  This approach is the simplest form of quantitative research because the data are easy to combine and quantify.
  • Structured interviews: These are a common research method in market research because the data can be quantified.  They are tightly structured, leaving little "wiggle room" in the interview process so that the data will not be skewed.  You can conduct structured interviews in-person, online, or over the phone (Dawson, 2019).

Constructing Questionnaires

When constructing your questions for a survey or questionnaire, there are things you can do to ensure that your questions are accurate and easy to understand (Dawson, 2019):

  • Keep the questions brief and simple.
  • Eliminate any potential bias from your questions.  Make sure they do not word things in a way that favors one perspective over another.
  • If your topic is very sensitive, you may want to ask indirect questions rather than direct ones.  This prevents participants from being intimidated and becoming unwilling to share their true responses.
  • If you are using a closed-ended question, try to offer every possible answer that a participant could give to that question.
  • Do not ask questions that assume something of the participant.  The question "How often do you exercise?" assumes that the participant exercises (when they may not), so you would want to include a question that asks if they exercise at all before asking them how often.
  • Try to keep the questionnaire as short as possible.  The longer a questionnaire takes, the more likely the participant is to abandon it or become too tired to give truthful answers.
  • Promise confidentiality to your participants at the beginning of the questionnaire.

Quantitative Research Measures

When you are considering a quantitative approach to your research, you need to identify what types of measures you will use in your study.  This will determine what type of numbers you will be using to collect your data.  There are four levels of measurement:

  • Nominal: These are numbers where the order does not matter.  They aim to identify separate information.  One example is collecting zip codes from research participants.  The order of the numbers does not matter, but the series of numbers in each zip code indicates different information (Adamson and Prion, 2013).
  • Ordinal: Also known as rankings, because the order of these numbers matters.  Items are given a specific rank according to specific criteria.  A common example of ordinal measurement is a ranking-based questionnaire, where participants are asked to rank items from least favorite to most favorite.  Another common example is a pain scale, where a patient is asked to rank their pain on a scale from 1 to 10 (Adamson and Prion, 2013).
  • Interval: This is when the data are ordered and the distance between the numbers matters to the researcher (Adamson and Prion, 2013).  The distance between each number is the same.  An example of interval data is test grades.
  • Ratio: This is when the data are ordered with a consistent distance between numbers, but also have a "zero point."  This means that there could be a measurement of zero of whatever you are measuring in your study (Adamson and Prion, 2013).  An example of ratio data is height, because the "zero point" remains constant in all measurements and the height of something could actually be zero.
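To make the distinction concrete, here is a minimal sketch in Python (the variable names and data values are invented for illustration) showing which operations are meaningful at each level:

```python
from collections import Counter

# Hypothetical data illustrating the four levels of measurement.
zip_codes = ["94720", "10001", "60601"]   # nominal: labels only; order is meaningless
pain_scores = [3, 7, 5]                   # ordinal: order matters, spacing does not
test_grades = [88, 92, 75]                # interval: equal spacing between values
heights_cm = [0.0, 172.5, 160.2]          # ratio: equal spacing plus a true zero

# Nominal data support only equality checks and counting:
counts = Counter(zip_codes)

# Ordinal (and higher) data support ranking:
ranked = sorted(pain_scores)

# Only ratio data support meaningful ratios ("twice as tall"):
ratio = heights_cm[1] / heights_cm[2]

print(counts["94720"], ranked, round(ratio, 2))
```

Note that averaging pain scores, while common in practice, technically assumes more than ordinal data licenses; the level of measurement constrains which statistics are defensible.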

Focus Groups

This is when a select group of people gathers to talk about a particular topic.  Focus groups can also be called discussion groups or group interviews (Dawson, 2019).  They are usually led by a moderator who helps guide the discussion and asks certain questions.  It is critical that the moderator gives everyone in the group a chance to speak so that no one dominates the discussion.  The data gathered from focus groups tend to be thoughts, opinions, and perspectives about an issue.

Advantages of Focus Groups

  • Only requires one meeting to get different types of responses.
  • Less researcher bias due to participants being able to speak openly.
  • Helps participants overcome insecurities or fears about a topic.
  • The researcher can also consider the impact of participant interaction.

Disadvantages of Focus Groups

  • Participants may feel uncomfortable speaking in front of an audience, especially if the topic is sensitive or controversial.
  • Since participation is voluntary, not every participant may contribute equally to the discussion.
  • Participants may impact what others say or think.
  • A researcher may feel intimidated by running a focus group on their own.
  • A researcher may need extra funds/resources to provide a safe space to host the focus group.
  • Because the data is collective, it may be difficult to determine a participant's individual thoughts about the research topic.

Observation

There are two ways to conduct research observations:

  • Direct Observation: The researcher observes a participant in an environment.  The researcher often takes notes or uses technology to gather data, such as a voice recorder or video camera.  The researcher does not interact or interfere with the participants.  This approach is often used in psychology and health studies (Dawson, 2019).
  • Participant Observation:  The researcher interacts directly with the participants to get a better understanding of the research topic.  This is a common research method when trying to understand another culture or community.  It is important to decide if you will conduct a covert (participants do not know they are part of the research) or overt (participants know the researcher is observing them) observation because it can be unethical in some situations (Dawson, 2019).

Open-Ended Questionnaires

These questionnaires are the opposite of "multiple choice" questionnaires because the answer boxes are left open for the participant to complete.  This means that participants can write short or extended answers to the questions.  Upon gathering the responses, researchers will often "quantify" the data by organizing the responses into different categories.  This can be time-consuming because the researcher needs to read all responses carefully.

Semi-structured Interviews

This is the most common type of interview where researchers aim to get specific information so they can compare it to other interview data.  This requires asking the same questions for each interview, but keeping their responses flexible.  This means including follow-up questions if a subject answers a certain way.  Interview schedules are commonly used to aid the interviewers, which list topics or questions that will be discussed at each interview (Dawson, 2019).

Theoretical Analysis

Often used for nonhuman research, theoretical analysis is a qualitative approach where the researcher applies a theoretical framework to analyze something about their topic.  A theoretical framework gives the researcher a specific "lens" through which to view the topic and think about it critically.  It also serves as context to guide the entire study.  This is a popular research method for analyzing works of literature, films, and other forms of media.  You can implement more than one theoretical framework with this method, as many theories complement one another.

Common theoretical frameworks for qualitative research are (Grant and Osanloo, 2014):

  • Behavioral theory
  • Change theory
  • Cognitive theory
  • Content analysis
  • Cross-sectional analysis
  • Developmental theory
  • Feminist theory
  • Gender theory
  • Marxist theory
  • Queer theory
  • Systems theory
  • Transformational theory

Unstructured Interviews

These are in-depth interviews where the researcher tries to understand an interviewee's perspective on a situation or issue.  They are sometimes called life history interviews.  It is important not to bombard the interviewee with too many questions so they can freely disclose their thoughts (Dawson, 2019).

  • Open-ended and closed-ended questionnaires: This approach means implementing elements of both questionnaire types into your data collection.  Participants may answer some questions with premade answers and write their own answers to other questions.  The advantage of this method is that you benefit from both types of data collection and get a broader understanding of your participants.  However, you must think carefully about how you will analyze this data to arrive at a conclusion.

Other mixed method approaches that incorporate quantitative and qualitative research methods depend heavily on the research topic.  It is strongly recommended that you collaborate with your academic advisor before finalizing a mixed method approach.

How do you determine which research method would be best for your proposal?  This heavily depends on your research objective.  According to Dawson (2019), there are several questions to ask yourself when determining the best research method for your project:

  • Are you good with numbers and mathematics?
  • Would you be interested in conducting interviews with human subjects?
  • Would you enjoy creating a questionnaire for participants to complete?
  • Do you prefer written communication or face-to-face interaction?
  • What skills or experiences do you have that might help you with your research?  Do you have any experiences from past research projects that can help with this one?
  • How much time do you have to complete the research?  Some methods take longer to collect data than others.
  • What is your budget?  Do you have adequate funding to conduct the research in the method you want?
  • How much data do you need?  Some research topics need only a small amount of data while others may need significantly larger amounts.
  • What is the purpose of your research? This can provide a good indicator as to what research method will be most appropriate.
  • Last Updated: Aug 2, 2022 2:36 PM
  • URL: https://library.tiffin.edu/researchmethodologies


Research Methods | Definition, Types, Examples

Research methods are specific procedures for collecting and analysing data. Developing your research methods is an integral part of your research design . When planning your methods, there are two key decisions you will make.

First, decide how you will collect data . Your methods depend on what type of data you need to answer your research question :

  • Qualitative vs quantitative : Will your data take the form of words or numbers?
  • Primary vs secondary : Will you collect original data yourself, or will you use data that have already been collected by someone else?
  • Descriptive vs experimental : Will you take measurements of something as it is, or will you perform an experiment?

Second, decide how you will analyse the data .

  • For quantitative data, you can use statistical analysis methods to test relationships between variables.
  • For qualitative data, you can use methods such as thematic analysis to interpret patterns and meanings in the data.

Table of contents

  • Methods for collecting data
  • Examples of data collection methods
  • Methods for analysing data
  • Examples of data analysis methods
  • Frequently asked questions about methodology

Data are the information that you collect for the purposes of answering your research question . The type of data you need depends on the aims of your research.

Qualitative vs quantitative data

Your choice of qualitative or quantitative data collection depends on the type of knowledge you want to develop.

For questions about ideas, experiences and meanings, or to study something that can’t be described numerically, collect qualitative data .

If you want to develop a more mechanistic understanding of a topic, or your research involves hypothesis testing , collect quantitative data .

You can also take a mixed methods approach, where you use both qualitative and quantitative research methods.

Primary vs secondary data

Primary data are any original information that you collect for the purposes of answering your research question (e.g. through surveys , observations and experiments ). Secondary data are information that has already been collected by other researchers (e.g. in a government census or previous scientific studies).

If you are exploring a novel research question, you’ll probably need to collect primary data. But if you want to synthesise existing knowledge, analyse historical trends, or identify patterns on a large scale, secondary data might be a better choice.

Descriptive vs experimental data

In descriptive research , you collect data about your study subject without intervening. The validity of your research will depend on your sampling method .

In experimental research , you systematically intervene in a process and measure the outcome. The validity of your research will depend on your experimental design .

To conduct an experiment, you need to be able to vary your independent variable , precisely measure your dependent variable, and control for confounding variables . If it’s practically and ethically possible, this method is the best choice for answering questions about cause and effect.


Your data analysis methods will depend on the type of data you collect and how you prepare them for analysis.

Data can often be analysed both quantitatively and qualitatively. For example, survey responses could be analysed qualitatively by studying the meanings of responses or quantitatively by studying the frequencies of responses.
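As a small sketch of that dual analysis (the responses and keyword themes are invented for illustration), the same open-ended survey answers can feed both routes:

```python
from collections import Counter

# Hypothetical open-ended survey responses about job satisfaction.
responses = [
    "I love the flexible hours",
    "The commute is too long",
    "Flexible hours keep me here",
    "Too long a commute",
]

# Quantitative route: frequency of a keyword across responses.
mentions_flexible = sum("flexible" in r.lower() for r in responses)

# Qualitative route (crude sketch): tag each response with a theme by keyword.
def theme(r):
    return "work-life balance" if "flexible" in r.lower() else "commute"

themes = Counter(theme(r) for r in responses)
print(mentions_flexible, themes)
```

In real qualitative analysis the theme assignment would involve researcher judgement rather than a keyword rule; the point is only that identical raw data can yield frequencies (quantitative) or thematic categories (qualitative).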

Qualitative analysis methods

Qualitative analysis is used to understand words, ideas, and experiences. You can use it to interpret data that were collected:

  • From open-ended survey and interview questions, literature reviews, case studies, and other sources that use text rather than numbers.
  • Using non-probability sampling methods .

Qualitative analysis tends to be quite flexible and relies on the researcher’s judgement, so you have to reflect carefully on your choices and assumptions.

Quantitative analysis methods

Quantitative analysis uses numbers and statistics to understand frequencies, averages and correlations (in descriptive studies) or cause-and-effect relationships (in experiments).

You can use quantitative analysis to interpret data that were collected either:

  • During an experiment.
  • Using probability sampling methods .

Because the data are collected and analysed in a statistically valid way, the results of quantitative analysis can be easily standardised and shared among researchers.
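A minimal sketch of the quantitative route, testing the strength of a relationship between two variables with a Pearson correlation coefficient (the paired values are invented, e.g. hours studied vs. exam score):

```python
import math

# Hypothetical paired measurements: hours studied (x) and exam score (y).
x = [1, 2, 3, 4, 5]
y = [52, 60, 63, 71, 80]

def pearson_r(xs, ys):
    """Pearson correlation coefficient, computed from its definition."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sx = math.sqrt(sum((a - mx) ** 2 for a in xs))
    sy = math.sqrt(sum((b - my) ** 2 for b in ys))
    return cov / (sx * sy)

r = pearson_r(x, y)
print(round(r, 3))  # a value close to 1.0 indicates a strong positive relationship
```

In practice a statistics library (e.g. `scipy.stats.pearsonr`) would also report a p-value for the hypothesis test; the hand-rolled function here only shows what the coefficient measures.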

Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.

Quantitative methods allow you to test a hypothesis by systematically collecting and analysing data, while qualitative methods allow you to explore ideas and experiences in depth.

In mixed methods research , you use both qualitative and quantitative data collection and analysis methods to answer your research question .

A sample is a subset of individuals from a larger population. Sampling means selecting the group that you will actually collect data from in your research.

For example, if you are researching the opinions of students in your university, you could survey a sample of 100 students.

Statistical sampling allows you to test a hypothesis about the characteristics of a population. There are various sampling methods you can use to ensure that your sample is representative of the population as a whole.
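Drawing such a sample programmatically might look like this (the population size and seed are arbitrary; this is a simple random sample without replacement):

```python
import random

# Hypothetical population of 5,000 student IDs.
population = list(range(5000))

random.seed(42)  # fixed seed so the draw is reproducible
sample = random.sample(population, k=100)  # simple random sample, no repeats

print(len(sample), len(set(sample)))
```

`random.sample` gives every member of the population an equal chance of selection, which is what makes the resulting sample defensible for inference about the whole population.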

The research methods you use depend on the type of data you need to answer your research question .

  • If you want to measure something or test a hypothesis , use quantitative methods . If you want to explore ideas, thoughts, and meanings, use qualitative methods .
  • If you want to analyse a large amount of readily available data, use secondary data. If you want data specific to your purposes with control over how they are generated, collect primary data.
  • If you want to establish cause-and-effect relationships between variables , use experimental methods. If you want to understand the characteristics of a research subject, use descriptive methods.

Methodology refers to the overarching strategy and rationale of your research project . It involves studying the methods used in your field and the theories or principles behind them, in order to develop an approach that matches your objectives.

Methods are the specific tools and procedures you use to collect and analyse data (e.g. experiments, surveys , and statistical tests ).

In shorter scientific papers, where the aim is to report the findings of a specific study, you might simply describe what you did in a methods section .

In a longer or more complex research project, such as a thesis or dissertation , you will probably include a methodology section , where you explain your approach to answering the research questions and cite relevant sources to support your choice of methods.



  • Research Process

Choosing the Right Research Methodology: A Guide for Researchers



Choosing an optimal research methodology is crucial to the success of any research project. The methodology you select will determine the type of data you collect, how you collect it, and how you analyse it. Understanding the different types of research methods available, along with their strengths and weaknesses, is thus imperative for making an informed decision.

Understanding different research methods:

There are several research methods available depending on the type of study you are conducting, i.e., whether it is laboratory-based, clinical, epidemiological, or survey-based. Some common methodologies include qualitative research, quantitative research, experimental research, survey-based research, and action research. Each method can be adopted and modified, depending on the research hypotheses and objectives.

Qualitative vs quantitative research:

When deciding on a research methodology, one of the key factors to consider is whether your research will be qualitative or quantitative. Qualitative research is used to understand people’s experiences, concepts, thoughts, or behaviours. Quantitative research, by contrast, deals with numbers, graphs, and charts, and is used to test or confirm hypotheses, assumptions, and theories.

Qualitative research methodology:

Qualitative research is often used to examine issues that are not well understood, and to gather additional insights on these topics. Qualitative research methods include open-ended survey questions, observations of behaviours described through words, and reviews of literature that has explored similar theories and ideas. These methods are used to understand how language is used in real-world situations, identify common themes or overarching ideas, and describe and interpret various texts. Data analysis for qualitative research typically includes discourse analysis, thematic analysis, and textual analysis. 

Quantitative research methodology:

The goal of quantitative research is to test hypotheses, confirm assumptions and theories, and determine cause-and-effect relationships. Quantitative research methods include experiments, close-ended survey questions, and countable and numbered observations. Data analysis for quantitative research relies heavily on statistical methods.

Analysing qualitative vs quantitative data:

The methods used for data analysis also differ for qualitative and quantitative research. As mentioned earlier, quantitative data is generally analysed using statistical methods and does not leave much room for speculation. It is more structured and follows a predetermined plan: the researcher starts with a hypothesis and uses statistical methods to test it. By contrast, methods for qualitative data analysis identify patterns and themes within the data rather than providing statistical measures of it. Qualitative analysis is an iterative process, in which the researcher moves back and forth, gauging the larger implications of the data from different perspectives and revising the analysis as required.

When to use qualitative vs quantitative research:

The choice between qualitative and quantitative research will depend on the gap that the research project aims to address, and specific objectives of the study. If the goal is to establish facts about a subject or topic, quantitative research is an appropriate choice. However, if the goal is to understand people’s experiences or perspectives, qualitative research may be more suitable. 

Conclusion:

In conclusion, an understanding of the different research methods available, their applicability, advantages, and disadvantages is essential for making an informed decision on the best methodology for your project. If you need any additional guidance on which research methodology to opt for, you can head over to Elsevier Author Services (EAS). EAS experts will guide you throughout the process and help you choose the perfect methodology for your research goals.




What Is Qualitative Research? | Methods & Examples

Published on June 19, 2020 by Pritha Bhandari. Revised on June 22, 2023.

Qualitative research involves collecting and analyzing non-numerical data (e.g., text, video, or audio) to understand concepts, opinions, or experiences. It can be used to gather in-depth insights into a problem or generate new ideas for research.

Qualitative research is the opposite of quantitative research , which involves collecting and analyzing numerical data for statistical analysis.

Qualitative research is commonly used in the humanities and social sciences, in subjects such as anthropology, sociology, education, health sciences, history, etc.

Examples of qualitative research questions:

  • How does social media shape body image in teenagers?
  • How do children and adults interpret healthy eating in the UK?
  • What factors influence employee retention in a large organization?
  • How is anxiety experienced around the world?
  • How can teachers integrate social issues into science curriculums?

Table of contents

  • Approaches to qualitative research
  • Qualitative research methods
  • Qualitative data analysis
  • Advantages of qualitative research
  • Disadvantages of qualitative research
  • Other interesting articles
  • Frequently asked questions about qualitative research

Qualitative research is used to understand how people experience the world. While there are many approaches to qualitative research, they tend to be flexible and focus on retaining rich meaning when interpreting data.

Common approaches include grounded theory, ethnography , action research , phenomenological research, and narrative research. They share some similarities, but emphasize different aims and perspectives.

Note that qualitative research is at risk for certain research biases including the Hawthorne effect , observer bias , recall bias , and social desirability bias . While not always totally avoidable, awareness of potential biases as you collect and analyze your data can prevent them from impacting your work too much.


Each of the research approaches involves using one or more data collection methods. These are some of the most common qualitative methods:

  • Observations: recording what you have seen, heard, or encountered in detailed field notes.
  • Interviews:  personally asking people questions in one-on-one conversations.
  • Focus groups: asking questions and generating discussion among a group of people.
  • Surveys : distributing questionnaires with open-ended questions.
  • Secondary research: collecting existing data in the form of texts, images, audio or video recordings, etc.
For example, to study the culture of a company, you might combine several of these methods:

  • You take field notes with observations and reflect on your own experiences of the company culture.
  • You distribute open-ended surveys to employees across all the company’s offices by email to find out if the culture varies across locations.
  • You conduct in-depth interviews with employees in your office to learn about their experiences and perspectives in greater detail.

Qualitative researchers often consider themselves “instruments” in research because all observations, interpretations and analyses are filtered through their own personal lens.

For this reason, when writing up your methodology for qualitative research, it’s important to reflect on your approach and to thoroughly explain the choices you made in collecting and analyzing the data.

Qualitative data can take the form of texts, photos, videos and audio. For example, you might be working with interview transcripts, survey responses, fieldnotes, or recordings from natural settings.

Most types of qualitative data analysis share the same five steps:

  • Prepare and organize your data. This may mean transcribing interviews or typing up fieldnotes.
  • Review and explore your data. Examine the data for patterns or repeated ideas that emerge.
  • Develop a data coding system. Based on your initial ideas, establish a set of codes that you can apply to categorize your data.
  • Assign codes to the data. For example, in qualitative survey analysis, this may mean going through each participant’s responses and tagging them with codes in a spreadsheet. As you go through your data, you can create new codes to add to your system if necessary.
  • Identify recurring themes. Link codes together into cohesive, overarching themes.
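To make steps 3-5 concrete, here is a minimal sketch in Python of applying a hand-built codebook to open-ended responses and tallying recurring codes. The responses, codes, and keywords are invented for illustration; real qualitative coding is an interpretive process, not a simple keyword match.

```python
# Minimal sketch of steps 3-5: apply a hand-built codebook to
# open-ended survey responses, then tally recurring codes.
# The codebook, keywords, and responses are invented examples.
from collections import Counter

responses = [
    "The flexible hours help me balance family and work.",
    "I often feel my manager does not listen to my ideas.",
    "Remote work saved my commute, but I miss my colleagues.",
]

# Step 3: a codebook mapping each code to keywords that signal it.
codebook = {
    "flexibility": ["flexible", "remote", "hours"],
    "communication": ["listen", "ideas", "manager"],
    "social_connection": ["colleagues", "miss", "team"],
}

# Step 4: tag each response with every code whose keywords appear.
coded = []
for text in responses:
    lower = text.lower()
    codes = [code for code, words in codebook.items()
             if any(w in lower for w in words)]
    coded.append((text, codes))

# Step 5: count how often each code recurs, a starting point
# for grouping codes into overarching themes.
theme_counts = Counter(code for _, codes in coded for code in codes)
print(theme_counts.most_common())
```

In practice this kind of tagging is usually done by hand or in dedicated software; the sketch just shows how coded data can be organized and counted.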

There are several specific approaches to analyzing qualitative data. Although these methods share similar processes, they emphasize different concepts.

Qualitative research often tries to preserve the voice and perspective of participants and can be adjusted as new research questions arise. Qualitative research is good for:

  • Flexibility

The data collection and analysis process can be adapted as new ideas or patterns emerge. They are not rigidly decided beforehand.

  • Natural settings

Data collection occurs in real-world contexts or in naturalistic ways.

  • Meaningful insights

Detailed descriptions of people’s experiences, feelings and perceptions can be used in designing, testing or improving systems or products.

  • Generation of new ideas

Open-ended responses mean that researchers can uncover novel problems or opportunities that they wouldn’t have thought of otherwise.


Researchers must consider practical and theoretical limitations in analyzing and interpreting their data. Qualitative research suffers from:

  • Unreliability

The real-world setting often makes qualitative research unreliable because of uncontrolled factors that affect the data.

  • Subjectivity

Due to the researcher’s primary role in analyzing and interpreting data, qualitative research cannot be replicated . The researcher decides what is important and what is irrelevant in data analysis, so interpretations of the same data can vary greatly.

  • Limited generalizability

Small samples are often used to gather detailed data about specific contexts. Despite rigorous analysis procedures, it is difficult to draw generalizable conclusions because the data may be biased and unrepresentative of the wider population .

  • Labor-intensive

Although software can be used to manage and record large amounts of text, data analysis often has to be checked or performed manually.

If you want to know more about statistics, methodology, or research bias, make sure to check out some of our other articles with explanations and examples.

  • Chi square goodness of fit test
  • Degrees of freedom
  • Null hypothesis
  • Discourse analysis
  • Control groups
  • Mixed methods research
  • Non-probability sampling
  • Quantitative research
  • Inclusion and exclusion criteria

Research bias

  • Rosenthal effect
  • Implicit bias
  • Cognitive bias
  • Selection bias
  • Negativity bias
  • Status quo bias

Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.

Quantitative methods allow you to systematically measure variables and test hypotheses . Qualitative methods allow you to explore concepts and experiences in more detail.

There are five common approaches to qualitative research :

  • Grounded theory involves collecting data in order to develop new theories.
  • Ethnography involves immersing yourself in a group or organization to understand its culture.
  • Narrative research involves interpreting stories to understand how people make sense of their experiences and perceptions.
  • Phenomenological research involves investigating phenomena through people’s lived experiences.
  • Action research links theory and practice in several cycles to drive innovative changes.

Data collection is the systematic process by which observations or measurements are gathered in research. It is used in many different contexts by academics, governments, businesses, and other organizations.

There are various approaches to qualitative data analysis , but they all share five steps in common:

  • Prepare and organize your data.
  • Review and explore your data.
  • Develop a data coding system.
  • Assign codes to the data.
  • Identify recurring themes.

The specifics of each step depend on the focus of the analysis. Some common approaches include textual analysis , thematic analysis , and discourse analysis .
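As a toy illustration of the textual-analysis end of this spectrum, a common first pass is to count which terms recur across a set of documents before any interpretive coding begins. The texts below are invented examples:

```python
# Toy first pass at textual analysis: count recurring words across
# a small corpus, ignoring common stop words. Texts are invented.
import re
from collections import Counter

texts = [
    "Participants described feeling anxious before exams.",
    "Several participants linked anxiety to fear of failure.",
    "Anxiety was described as worst the night before an exam.",
]

stop_words = {"the", "of", "to", "an", "as", "was", "before"}

words = []
for text in texts:
    words += [w for w in re.findall(r"[a-z]+", text.lower())
              if w not in stop_words]

# The most frequent terms hint at where to look more closely.
print(Counter(words).most_common(3))
```

Word counts alone carry no meaning; in thematic or discourse analysis they only point the researcher toward passages worth close, interpretive reading.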

Cite this Scribbr article

If you want to cite this source, you can copy and paste the citation or click the “Cite this Scribbr article” button to automatically add the citation to our free Citation Generator.

Bhandari, P. (2023, June 22). What Is Qualitative Research? | Methods & Examples. Scribbr. Retrieved April 15, 2024, from https://www.scribbr.com/methodology/qualitative-research/


5 Most Popular Research Methods in Psychology


  • Make descriptions
  • Predict outcomes
  • Test an independent variable
  • Communities
  • Individuals
  • Assist patients with psychological ailments
  • Diagnose patients
  • Understand problems

How to Use Case Study Research Method

5 Research Methods Used in Psychology

Content Analysis

Close reading, summative analysis, grounded theory, stages in grounded theory.


Categorizing

Conceptualizing

  • Consistency
  • Control group
  • Control of variables
  • Showing cause and effect

3 Main Types of Experiments

  • Field experiments
  • Lab experiments
  • Natural experiments
  • Human behavior studies
  • Human development
  • Sleep studies


Observational Study

  • Social constructs

Guidelines to Follow for Observational Study

  • Individuals must remain anonymous
  • Observations must happen in public contexts
  • There must be no expectation of privacy in the setting observed


Survey Method

Communication channels for survey data collection studies include:

  • The internet
  • Gender inequality
  • Substance abuse

Other Types of Research Methods Used in Psychology

  • Correlational research
  • Dependent variable
  • Independent variable
  • Experimental task design
  • Positive correlation (also correlational research)
  • Structured observation
  • Random sampling in experimental design
  • Statistical estimation

Final Thoughts

By BDP Staff


This concludes our article on the various research methods in psychology.

Brenda Rufener Author

Julie McCaulley Expert

Carrie Sealey-Morris Editor-in-Chief


Research Methods Guide: Research Design & Method

  • Introduction
  • Survey Research
  • Interview Research
  • Data Analysis
  • Resources & Consultation

Tutorial Videos: Research Design & Method

Research Methods (sociology-focused)

Qualitative vs. Quantitative Methods (intro)

Qualitative vs. Quantitative Methods (advanced)


FAQ: Research Design & Method

What is the difference between Research Design and Research Method?

Research design is a plan to answer your research question.  A research method is a strategy used to implement that plan.  Research design and methods are different but closely related, because good research design ensures that the data you obtain will help you answer your research question more effectively.

Which research method should I choose?

It depends on your research goal.  It depends on what subjects (and who) you want to study.  Let's say you are interested in studying what makes people happy, or why some students are more conscious about recycling on campus.  To answer these questions, you need to make a decision about how to collect your data.  Most frequently used methods include:

  • Observation / Participant Observation
  • Survey
  • Interview
  • Focus Groups
  • Experiments
  • Secondary Data Analysis / Archival Study
  • Mixed Methods (combination of some of the above)

One particular method could be better suited to your research goal than others, because the data you collect from different methods will be different in quality and quantity.   For instance, surveys are usually designed to produce relatively short answers, rather than the extensive responses expected in qualitative interviews.

What other factors should I consider when choosing one method over another?

Time for data collection and analysis is something you want to consider.  Observation and interview methods, the so-called qualitative approaches, help you collect richer information, but they take time.  Using a survey helps you collect more data quickly, yet it may lack detail.  So you will need to consider the time you have for research and the balance between the strengths and weaknesses associated with each method (e.g., qualitative vs. quantitative).

  • Last Updated: Aug 21, 2023 10:42 AM


Different Types of Research Methods

  • Mallika Rangaiah
  • Dec 22, 2021
  • Updated on: Nov 21, 2023


Contrary to what a layperson generally presumes, research is not just about determining a hypothesis and unraveling a conclusion for that hypothesis. Every research approach that we take up falls under a type of methodology, and every methodology is distinctive and intricate in its depth.

So what are these research methodologies and how do the researchers make use of them? This is what we are going to explore through this blog. Before we attempt to understand these methods, let us understand what research methodology actually means. 

What are Research Methods?

Firstly, let's understand why we undertake research. What exactly is the point of it?

Research is mainly done to gain knowledge that supports a survey or quest regarding a particular conception or theory, and to reach a firm conclusion regarding it. Research is generally an approach for gaining the knowledge required to interpret, write, delve further, and distribute data.

To ensure that the research delivers a fulfilling experience, it is essential that it is premium in its quality, and that's where Research Methods come to the rescue.

(Recommended blog - Research Market Analysis )

Types of Research Methods

An area is selected, a specific hypothesis is determined and a defined conclusion is required to be achieved. But how is this conclusion reached? What is the approach that can be taken up? As per CR Kothari’s book “Research Methodology Methods and Techniques” (The Second Revised Edition),  the basic types of Research Methods are the following : 

1. Descriptive Research
2. Analytical Research
3. Applied Research
4. Fundamental Research
5. Quantitative Research
6. Qualitative Research
7. Conceptual Research
8. Empirical Research

Descriptive Research

Descriptive Research is a form of research that incorporates surveys as well as different varieties of fact-finding investigations. This form of research is focused on describing the prevailing state of affairs as they are. Descriptive Research is also termed as Ex post facto research. 

This research form emphasises factual reporting; the researcher cannot control the involved variables and can only report the details as they took place or as they are taking place.

Researchers mainly use a descriptive research approach when the research is aimed at identifying characteristics, frequencies, or trends.

Ex post facto studies also include attempts by researchers to discover causes even when they cannot control the variables. The descriptive research methods are mainly, observations, surveys as well as case studies. 

(Speaking of variables, have you ever wondered - What are confounding variables? )

Analytical Research

Analytical Research is a form of research where the researcher has to work with the data and factual information available to them and interpret this information to undertake an acute evaluation of the data.

This form of research is often undertaken by researchers to uncover some evidence that supports their present research and which makes it more authentic. It is also undertaken for concocting fresh ideas relating to the topic on which the research is based. 

There are many methods through which this research is done, from conducting meta-analyses, literature reviews, or scientific trials to studying public opinion.

Applied Research

When a business or say, the society is faced with an issue that needs an immediate solution or resolution, Applied Research is the research type that comes to the rescue. 

We primarily make use of Applied Research when it comes to resolving the issues plaguing our daily lives, impacting our work, health or welfare. This research type is undertaken to uncover solutions for issues relating to varying sectors like education, engineering, psychology or business. 

For instance, a company might employ an applied researcher to determine the best possible approach for selecting employees who would be the best fit for specific positions in the company.

The crux of Applied Research is to figure out the solution to a certain growing practical issue. 

The three types of Applied Research are:

  • Evaluation Research: prevailing data regarding the topic is interpreted to arrive at proper decisions.
  • Research and Development: the focus is on setting up fresh products or services that meet target-market requirements.
  • Action Research: aims at offering practical solutions to certain business issues by giving them proper direction.

(Related blog - Target Marketing using AI )

Fundamental Research

This is a Research type that is primarily concerned with formulating a theory or understanding a particular natural phenomenon. Fundamental Research aims to discover information with an extensive application base, supplementing the existing concepts in a certain field or industry. 

Research on pure mathematics or research regarding generalisation of the behavior of humans are also examples of Fundamental Research. This form of research is mainly carried out in sectors like Education, Psychology and Science. 

For instance, in Psychology, fundamental research assists the individual or the company in gaining better insights regarding certain behaviors, such as deciphering how consumption of caffeine can impact the attention span of a student or how cultural stereotypes can trigger depression. 

Quantitative Research

Quantitative Research, as the name suggests, is based on the measurement of a particular amount or quantity of a particular phenomenon. It focuses on gathering and interpreting numerical data and can be adopted for discovering any averages or patterns or for making predictions.

This form of research is number-based and is one of the two main research types. It makes use of tables, data and graphs to reach a conclusion. The outcomes generated from this research are measurable and repeatable, unlike the outcomes of qualitative research. This research type is mainly adopted for scientific and field-based research.

Quantitative research generally involves a large number of people and a huge section of data and has a lot of scope for accuracy in it. 

These research methods can be adopted for approaches like descriptive, correlational or experimental research.

Descriptive research - The study variables are analyzed and a summary of them is sought.

Correlational Research - The relationship between the study variables is analyzed. 

Experimental Research - It is conducted to determine whether a cause-and-effect relationship between the variables exists. 

Quantitative research methods

  • Experiment Research - This method controls or manages independent variables to calculate the effect they have on dependent variables. 
  • Survey - Surveys involve asking questions of a certain specified number or set of people, either online, face to face or over the phone. 
  • (Systematic) observation - This method involves detecting an occurrence and monitoring it in a natural setting. 
  • Secondary research - This research focuses on making use of data which has been previously collected for other purposes, such as a national survey. 

(Related blog - Hypothesis Testing )

Qualitative Research

As the name suggests, this form of research is more concerned with the quality of a certain phenomenon; it dives into the “why” alongside the “what”. For instance, let’s consider a gender-neutral clothing store which has more women visiting it than men. 

Qualitative research would be determining why men are not visiting the store by carrying out an in-depth interview of some potential customers in this category.

This form of research is interested in getting to the bottom of the reasons for human behaviour, i.e understanding why certain actions are taken by people or why they think certain thoughts. 

Through this research the factors influencing people into behaving in a certain way or which control their preferences towards a certain thing can be interpreted.

An example of Qualitative Research would be Motivation Research . This research focuses on deciphering the rooted motives or desires through intricate methods like in depth interviews. It involves several tests like story completion or word association. 

Another example would be Opinion Research . This type of research is carried out to discover the opinion and perspective of people regarding a certain subject or phenomenon.

This is a theory based form of research and it works by describing an issue by taking into account the prior concepts, ideas and studies. The experience of the researcher plays an integral role here.

The types of Qualitative Research include the following methods:

  • Observations: In this method what the researcher sees, hears of or encounters is recorded in detail.
  • Interviews: Personally asking people questions in one-on-one conversations.
  • Focus groups: This involves asking questions and discussions among a group of people to generate conclusions from the same. 
  • Surveys: In these surveys unlike the quantitative research surveys, the questionnaires involve extensive open ended questions that require elaborate answers. 
  • Secondary research: Gathering the existing data such as images, texts or audio or video recordings. This can involve a text analysis, a research of a case study, or an In-depth interview.

Conceptual Research

This research is related to an abstract idea or a theory. It is adopted by thinkers and philosophers with the aim of developing a new concept or to re-examine the existing concepts. 

Conceptual Research is mainly defined as a methodology in which the research is conducted by observing and interpreting the information already available on a given topic. It does not involve carrying out any practical experiments. 

This methodology has often been adopted by famous thinkers like Aristotle, Copernicus, Einstein and Newton for developing fresh theories and insights regarding the workings of the world and for examining existing ones from a different perspective. 

The concepts were set up by philosophers to observe their environment and to sort, study, and summarise the information available. 

Empirical Research

This is a research method that focuses solely on aspects like observation and experience, without focusing on the theory or system. It is based on data and can yield conclusions that can be confirmed or verified through observation and experiment. Empirical Research is mainly undertaken to establish proof that certain variables affect others in a particular way. 

This kind of research can also be termed Experimental Research. In this research it is essential that all the facts are obtained firsthand, directly from the source, so that the researcher can actively carry out the actions and manipulate the concerned materials to gain the information he requires.

In this research a hypothesis is generated and then a path is undertaken to confirm or invalidate this hypothesis. The control that the researcher holds over the involved variables defines this research. The researcher can manipulate one of these variables to examine its effect.

(Recommended blog - Data Analysis )

Other Types of Research

All research types apart from the ones stated above are mainly variations of them, either in terms of research purpose or in the terms of the time that is required for accomplishing the research, or say, the research environment. 

If we take the perspective of time, research can be considered as either One-time research or Longitudinal Research. 

One-time Research: The research is restricted to a single time period. 

Longitudinal Research: The research is executed over multiple time periods. 

Research can also be set in the field or a laboratory, or be a simulation, depending on the environment in which it is based. 

We’ve also got Historical Research which makes use of historical sources such as documents and remains for examining past events and ideas. This also includes the philosophy of an individual and groups at a particular time. 

Research may be clinical or diagnostic . These kinds of research generally carry out case study or in-depth interview approaches to determine basic causal relationships. 

Research can also be Exploratory or Formalized. 

Exploratory Research: This research is more focused on establishing hypotheses than on deriving results. It focuses on understanding the prevailing issue but doesn’t really offer definitive results. 

Formalized research: This is a research that has a solid structure and which also has specific hypotheses for testing. 

We can also classify Research as conclusion-oriented and decision-oriented. 

Conclusion Oriented Research: In this form of research, the researcher can select an issue, revamp the enquiry as he continues and visualize it as per his requirements. 

Decision-oriented research: This research depends on the requirements of the decision maker and offers the researcher less freedom to conduct it as he pleases. 

The common and well-known research methods have been listed in this blog. Hopefully it will give readers, and present and future researchers, proper knowledge regarding important methods they can adopt to conduct their research.

Share Blog :

what are the 5 research methods

Be a part of our Instagram community

Trending blogs

5 Factors Influencing Consumer Behavior

Elasticity of Demand and its Types

What is PESTLE Analysis? Everything you need to know about it

An Overview of Descriptive Analysis

What is Managerial Economics? Definition, Types, Nature, Principles, and Scope

5 Factors Affecting the Price Elasticity of Demand (PED)

6 Major Branches of Artificial Intelligence (AI)

Dijkstra’s Algorithm: The Shortest Path Algorithm

Scope of Managerial Economics

Latest Comments

what are the 5 research methods

Susan Bickford

It's A Great News to Celebrate with you Viewer, I am truly living the life I have been looking for after Dr Kachi made me win my Powerball Lottery, I had been playing for a good 8years. It was a friend of mine who directed me to Dr Kachi because my friend Nancy has won the Powerball so many times and I don't know how she got the match six numbers to play and win a very big amount of money, then the last time she won the Mega Millions I told her to tell me the secret on how she win. That's when she started telling me about the powerful Dr Kachi who has been her helper. and she gave me Dr Kachi Text/Call Number:+1 (209) 893-8075 I texted the greatest spell caster Dr Kachi and I told him I wanted to win my Powerball with his spiritual rightful number and he told me I should give him 2hours to get everything done and hopefully Dr Kachi do it, and give me a winning numbers to play my ticket that make me win the prize of $223.3 Million Dollars Powerball lottery Tuesday i bought the winning ticket at the Carlie C’s IGA store in Hope Mills, that changed my life for good today, and Dr Kachi a strong spell caster and trust him when he says the results will manifest it's Truth, God bless you Dr kachi for your kind help also can Email: [email protected] or website:  https://drkachispellcaster.wixsite.com/my-site

what are the 5 research methods

peggycarter756c8d62413832a41b0

Love PINNACLE CREDIT SPECIALIST and I’m grateful for their services and the incredible knowledge base here. My credit score has gone from 490 (2/2024) to 809 across the three credit bureaus on a clean scorecard, with the help of PINNACLE CREDIT SPECIALIST I’ve been approved of loan, got my car and most especially got myself together. I strongly recommend him for your fix. Reach him via: [email protected] Or text +1 (409) 231-0041. Mention to their team that you read a good review of an expert job done for Peggy.

what are the 5 research methods

Cindy Jason

HOW TO GET YOUR EX HUSBAND BACK HELP OF DR KACHI CALL NUMBER +1 (209) 893-8075 God did it for me again with the help of Dr Kachi with his love spell to get my husband back. we divorce 3months ago and since things become so hard for me because I love my husband so much, But he was chatting on me with another woman and he always goes to party every night my husband doesn't care about me whenever he get back at night he will be beating me up with no reason, I cry every night and day to get my husband back to his normal love and affection that he give to me before. but nothing was working out for me I try my best I left him with my kids but I couldn't sleep at night without thinking about my husband, then one day I was reading a new online about our politics and I see a comment about Dr Kachi how he restored broken relationship back and marriage, i didn't believe in love spell at the first place, then i have to make further research about Dr Kachi I opened his website I can't believe what I saw a great man helping people return their lover back and being happy in relationship again. I went fast and contacted Dr Kachi to help me restore my marriage back, after I provided the required needed to cast the love spell, the next day my husband come back to me and apologies for him leaving me and the kids Dr Kachi made me the happiest woman on earth I am so happy, I do appreciate your kind help bring my husband home, you can also contact him and seek for help in break up in married Via Text Number Call: +1 (209) 893-8075 Website: https://drkachispellcaster.wixsite.com/my-site Email [email protected]

arungupta16576c733168b08642b0


15 Types of Research Methods

Research methods refer to the strategies, tools, and techniques used to gather and analyze data in a structured way in order to answer a research question or investigate a hypothesis (Hammond & Wellington, 2020).

Generally, we place research methods into two categories: quantitative and qualitative. Each has its own strengths and weaknesses, which we can summarize as:

  • Quantitative research can achieve generalizability through scrupulous statistical analysis applied to large sample sizes.
  • Qualitative research achieves deep, detailed, and nuanced accounts of specific case studies, which are not generalizable.

Some researchers, aiming to make the most of both quantitative and qualitative research, employ mixed methods, applying both types of method in one study, such as by conducting a statistical survey alongside in-depth interviews that add context to the quantitative findings.

Below, I’ll outline 15 common research methods, including pros, cons, and examples of each.

Types of Research Methods

Research methods can be broadly categorized into two types: quantitative and qualitative.

  • Quantitative methods involve systematic empirical investigation of observable phenomena via statistical, mathematical, or computational techniques (Schweigert, 2021). The strengths of this approach include its ability to produce reliable results that can be generalized to a larger population, although it can lack depth and detail.
  • Qualitative methods encompass techniques that are designed to provide a deep understanding of a complex issue, often in a specific context, through collection of non-numerical data (Tracy, 2019). This approach often provides rich, detailed insights but can be time-consuming and its findings may not be generalizable.

These can be further broken down into a range of specific research methods and designs:

Combining the two approaches above, mixed methods research integrates elements of both qualitative and quantitative research methods, providing a comprehensive understanding of the research problem. We can further break these down into:

  • Sequential Explanatory Design (QUAN→QUAL): This methodology involves conducting quantitative analysis first, then supplementing it with a qualitative study.
  • Sequential Exploratory Design (QUAL→QUAN): This methodology goes in the other direction, starting with qualitative analysis and ending with quantitative analysis.

Let’s explore some methods and designs from both quantitative and qualitative traditions, starting with qualitative research methods.

Qualitative Research Methods

Qualitative research methods allow for the exploration of phenomena in their natural settings, providing detailed, descriptive responses and insights into individuals’ experiences and perceptions (Howitt, 2019).

These methods are useful when a detailed understanding of a phenomenon is sought.

1. Ethnographic Research

Ethnographic research emerged out of anthropological research, where anthropologists would enter into a setting for a sustained period of time, getting to know a cultural group and taking detailed observations.

Ethnographers would sometimes even act as participants in the group or culture, which many scholars argue is a weakness because it is a step away from achieving objectivity (Stokes & Wall, 2017).

In fact, in its most extreme version, ethnographers even conduct research on themselves, in a fascinating methodology called autoethnography.

The purpose is to understand the culture, social structure, and the behaviors of the group under study. It is often useful when researchers seek to understand shared cultural meanings and practices in their natural settings.

However, it can be time-consuming and may reflect researcher biases due to the immersion approach.

Example of Ethnography

Liquidated: An Ethnography of Wall Street  by Karen Ho involves an anthropologist who embeds herself with Wall Street firms to study the culture of Wall Street bankers and how this culture affects the broader economy and world.

2. Phenomenological Research

Phenomenological research is a qualitative method focused on the study of individual experiences from the participant’s perspective (Tracy, 2019).

It focuses specifically on people’s experiences in relation to a specific social phenomenon ( see here for examples of social phenomena ).

This method is valuable when the goal is to understand how individuals perceive, experience, and make meaning of particular phenomena. However, because it is subjective and dependent on participants’ self-reports, findings may not be generalizable, and are highly reliant on self-reported ‘thoughts and feelings’.

Example of Phenomenological Research

A phenomenological approach to experiences with technology  by Sebnem Cilesiz represents a good starting-point for formulating a phenomenological study. With its focus on the ‘essence of experience’, this piece presents methodological, reliability, validity, and data analysis techniques that phenomenologists use to explain how people experience technology in their everyday lives.

3. Historical Research

Historical research is a qualitative method involving the examination of past events to draw conclusions about the present or make predictions about the future (Stokes & Wall, 2017).

As you might expect, it’s common in the research branches of history departments in universities.

This approach is useful in studies that seek to understand the past to interpret present events or trends. However, it relies heavily on the availability and reliability of source materials, which may be limited.

Common data sources include cultural artifacts from both material and non-material culture , which are then examined, compared, contrasted, and contextualized to test hypotheses and generate theories.

Example of Historical Research

A historical research example might be a study examining the evolution of gender roles over the last century. This research might involve the analysis of historical newspapers, advertisements, letters, and company documents, as well as sociocultural contexts.

4. Content Analysis

Content analysis is a research method that involves systematic and objective coding and interpreting of text or media to identify patterns, themes, ideologies, or biases (Schweigert, 2021).

Content analysis is useful for analyzing communication patterns, helping to reveal how texts such as newspapers, films, political speeches, and other types of ‘content’ contain narratives and biases.

However, interpretations can be very subjective, which often requires scholars to engage in practices such as cross-comparing their coding with peers or external researchers.

Content analysis can be further broken down into other specific methodologies such as semiotic analysis, multimodal analysis, and discourse analysis.
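
At its simplest, the systematic coding step can be illustrated as tallying how often each code’s indicator words appear in a text. The codebook and sentence below are hypothetical, and real content analysis uses much richer coding schemes, but the sketch shows the basic idea:

```python
import re
from collections import Counter

# Hypothetical codebook mapping each code to its indicator words.
codebook = {
    "conflict": {"war", "attack", "clash"},
    "economy": {"market", "trade", "growth"},
}

def code_text(text, codebook):
    """Tally how often each code's indicator words appear in a text."""
    words = Counter(re.findall(r"[a-z']+", text.lower()))
    return {code: sum(words[w] for w in terms) for code, terms in codebook.items()}

tallies = code_text("The market fell as the clash widened; trade and growth slowed.", codebook)
# tallies → {"conflict": 1, "economy": 3}
```

In practice, coders also record context around each match and cross-check each other’s codes, as noted above, to keep the interpretation from becoming too subjective.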

Example of Content Analysis

How is Islam Portrayed in Western Media?  by Poorebrahim and Zarei (2013) employs a type of content analysis called critical discourse analysis (common in poststructuralist and critical theory research ). This study combs through a corpus of western media texts to explore the language forms used in relation to Islam and Muslims, finding that they are overly stereotyped, which may reflect anti-Islam bias or a failure to understand the Islamic world.

5. Grounded Theory Research

Grounded theory involves developing a theory  during and after  data collection rather than beforehand.

This is in contrast to most academic research studies, which start with a hypothesis or theory and then test it through a study, often framed as a null hypothesis (disproving the theory) and an alternative hypothesis (supporting the theory).

Grounded theory is useful because it keeps an open mind to what the data might reveal. However, it can be time-consuming and requires rigorous data analysis (Tracy, 2019).

Grounded Theory Example

Developing a Leadership Identity  by Komives et al. (2005) employs a grounded theory approach to develop a thesis based on the data rather than testing a hypothesis. The researchers studied the leadership identity of 13 college students taking on leadership roles. Based on their interviews, the researchers theorized that the students’ leadership identities shifted from a hierarchical view of leadership to one that embraced leadership as a collaborative concept.

6. Action Research

Action research is an approach which aims to solve real-world problems and bring about change within a setting. The study is designed to solve a specific problem – or in other words, to take action (Patten, 2017).

This approach can involve mixed methods, but is generally qualitative because it usually involves the study of a specific case study wherein the researcher works, e.g. a teacher studying their own classroom practice to seek ways they can improve.

Action research is very common in fields like education and nursing, where practitioners identify areas for improvement and then implement a study to find paths forward.

Action Research Example

Using Digital Sandbox Gaming to Improve Creativity Within Boys’ Writing   by Ellison and Drew was a research study one of my research students completed in his own classroom under my supervision. He implemented a digital game-based approach to literacy teaching with boys and interviewed his students to see if the use of games as stimuli for storytelling helped draw them into the learning experience.

7. Natural Observational Research

Observational research can also be quantitative (see: experimental research), but in naturalistic settings for the social sciences, researchers tend to employ qualitative data collection methods like interviews and field notes to observe people in their day-to-day environments.

This approach involves the observation and detailed recording of behaviors in their natural settings (Howitt, 2019). It can provide rich, in-depth information, but the researcher’s presence might influence behavior.

While observational research has some overlaps with ethnography (especially in regard to data collection techniques), it tends not to be as sustained as ethnography, e.g. a researcher might do 5 observations, every second Monday, as opposed to being embedded in an environment.

Observational Research Example

A researcher might use qualitative observational research to study the behaviors and interactions of children at a playground. The researcher would document the behaviors observed, such as the types of games played, levels of cooperation , and instances of conflict.

8. Case Study Research

Case study research is a qualitative method that involves a deep and thorough investigation of a single individual, group, or event in order to explore facets of that phenomenon that cannot be captured using other methods (Stokes & Wall, 2017).

Case study research is especially valuable in providing contextualized insights into specific issues, facilitating the application of abstract theories to real-world situations (Patten, 2017).

However, findings from a case study may not be generalizable due to the specific context and the limited number of cases studied (Walliman, 2021).

See More: Case Study Advantages and Disadvantages

Example of a Case Study

Scholars conduct a detailed exploration of the implementation of a new teaching method within a classroom setting. The study focuses on how the teacher and students adapt to the new method, the challenges encountered, and the outcomes for student performance and engagement. While the study provides specific and detailed insights into the teaching method in that classroom, it cannot be generalized to other classrooms, as statistical significance has not been established through this qualitative approach.

Quantitative Research Methods

Quantitative research methods involve the systematic empirical investigation of observable phenomena via statistical, mathematical, or computational techniques (Pajo, 2022). The focus is on gathering numerical data and generalizing it across groups of people or to explain a particular phenomenon.

9. Experimental Research

Experimental research is a quantitative method where researchers manipulate one variable to determine its effect on another (Walliman, 2021).

This is common, for example, in high-school science labs, where students are asked to introduce a variable into a setting in order to examine its effect.

This type of research is useful in situations where researchers want to determine causal relationships between variables. However, experimental conditions may not reflect real-world conditions.

Example of Experimental Research

A researcher may conduct an experiment to determine the effects of a new educational approach on student learning outcomes. Students would be randomly assigned to either the control group (traditional teaching method) or the experimental group (new educational approach).
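
The random assignment step can be sketched in a few lines of Python. The participant names and the seed are hypothetical; the seed is fixed only to make the illustration reproducible:

```python
import random

def randomly_assign(participants, seed=42):
    """Shuffle participants and split them evenly into control and experimental groups."""
    pool = list(participants)
    random.Random(seed).shuffle(pool)  # every ordering is equally likely
    half = len(pool) // 2
    return {"control": pool[:half], "experimental": pool[half:]}

students = [f"student_{i}" for i in range(20)]
groups = randomly_assign(students)  # 10 students per group, assigned by chance alone
```

Because assignment is determined by chance rather than by any student characteristic, differences in outcomes between the two groups can be attributed to the educational approach rather than to pre-existing differences.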

10. Surveys and Questionnaires

Surveys and questionnaires are quantitative methods that involve asking research participants structured and predefined questions to collect data about their attitudes, beliefs, behaviors, or characteristics (Patten, 2017).

Surveys are beneficial for collecting data from large samples, but they depend heavily on the honesty and accuracy of respondents.

They tend to be seen as more authoritative than their qualitative counterpart, semi-structured interviews, because the data is quantifiable: a questionnaire where responses are captured on a scale from 1 to 10 allows researchers to determine and compare statistical means and variations across sub-populations in the study.
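
For instance, with ratings on a 1–10 scale, the Python standard library is enough to compare sub-populations. The offices and scores below are made up for illustration:

```python
from statistics import mean, stdev

# Hypothetical 1-10 job-satisfaction ratings, grouped by office.
ratings = {
    "london": [7, 8, 6, 9, 7],
    "tokyo": [5, 6, 5, 7, 6],
}

# Mean and spread of satisfaction for each sub-population.
summary = {
    office: {"mean": mean(scores), "stdev": stdev(scores)}
    for office, scores in ratings.items()
}
# summary["london"]["mean"] → 7.4, summary["tokyo"]["mean"] → 5.8
```

Comparisons like this are exactly what the structured, predefined questions of a survey make possible, and what open-ended interview data cannot directly provide.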

Example of a Survey Study

A company might use a survey to gather data about employee job satisfaction across its offices worldwide. Employees would be asked to rate various aspects of their job satisfaction on a Likert scale. While this method provides a broad overview, it may lack the depth of understanding possible with other methods (Stokes & Wall, 2017).

11. Longitudinal Studies

Longitudinal studies involve repeated observations of the same variables over extended periods (Howitt, 2019). These studies are valuable for tracking development and change but can be costly and time-consuming.

With multiple data points collected over extended periods, it’s possible to examine continuous changes within things like population dynamics or consumer behavior. This makes a detailed analysis of change possible.

a visual representation of a longitudinal study demonstrating that data is collected over time on one sample so researchers can examine how variables change over time

Perhaps the most relatable example of a longitudinal study is a national census, which is taken on the same day every few years, to gather comparative demographic data that can show how a nation is changing over time.

While longitudinal studies are commonly quantitative, there are qualitative examples as well, such as the famous 7 Up study from the UK, which has followed 14 individuals every seven years to explore their development over their lives.

Example of a Longitudinal Study

A national census, taken every few years, uses surveys to develop longitudinal data, which is then compared and analyzed to present accurate trends over time. Trends a census can reveal include changes in religiosity, values and attitudes on social issues, and much more.

12. Cross-Sectional Studies

Cross-sectional studies are a quantitative research method that involves analyzing data from a population at a specific point in time (Patten, 2017). They provide a snapshot of a situation but cannot determine causality.

This design is used to measure and compare the prevalence of certain characteristics or outcomes in different groups within the sampled population.

A visual representation of a cross-sectional group of people, demonstrating that the data is collected at a single point in time and you can compare groups within the sample

The major advantage of cross-sectional design is its ability to measure a wide range of variables simultaneously without needing to follow up with participants over time.

However, cross-sectional studies do have limitations. This design can only show associations or correlations between variables; it cannot establish cause-and-effect relationships, temporal sequence, or changes and trends over time.

Example of a Cross-Sectional Study

Our longitudinal study example of a national census also happens to contain cross-sectional design. One census is cross-sectional, displaying only data from one point in time. But when a census is taken once every few years, it becomes longitudinal, and so long as the data collection technique remains unchanged, identification of changes will be achievable, adding another time dimension on top of a basic cross-sectional study.

13. Correlational Research

Correlational research is a quantitative method that seeks to determine if and to what degree a relationship exists between two or more quantifiable variables (Schweigert, 2021).

This approach provides a fast and easy way to form initial hypotheses based on positive or  negative correlation trends  observed within a dataset.

While correlational research can reveal relationships between variables, it cannot establish causality.

Methods used for data analysis may include statistical correlations such as Pearson’s or Spearman’s.

Example of Correlational Research

A team of researchers is interested in studying the relationship between the amount of time students spend studying and their academic performance. They gather data from a high school, measuring the number of hours each student studies per week and their grade point averages (GPAs) at the end of the semester. Upon analyzing the data, they find a positive correlation, suggesting that students who spend more time studying tend to have higher GPAs.
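
The Pearson correlation mentioned above can be computed from first principles. The study-hours and GPA figures below are invented for illustration:

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

hours = [2, 4, 6, 8, 10]           # hypothetical weekly study hours
gpas = [2.1, 2.8, 3.0, 3.4, 3.9]   # hypothetical end-of-semester GPAs
r = pearson_r(hours, gpas)         # close to +1: a strong positive correlation
```

Spearman’s correlation is the same calculation applied to the ranks of the values rather than the raw values, which makes it robust to outliers and to relationships that are monotonic but not linear.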

14. Quasi-Experimental Design Research

Quasi-experimental design research is a quantitative research method that is similar to experimental design but lacks the element of random assignment to treatment or control.

Instead, quasi-experimental designs typically rely on certain other methods to control for extraneous variables.

The term ‘quasi-experimental’ implies that the experiment resembles a true experiment, but it is not exactly the same because it doesn’t meet all the criteria for a ‘true’ experiment, specifically in terms of control and random assignment.

Quasi-experimental design is useful when researchers want to study a causal hypothesis or relationship, but practical or ethical considerations prevent them from manipulating variables and randomly assigning participants to conditions.

Example of Quasi-Experimental Design

A researcher wants to study the impact of a new math tutoring program on student performance. However, ethical and practical constraints prevent random assignment to the “tutoring” and “no tutoring” groups. Instead, the researcher compares students who chose to receive tutoring (experimental group) to similar students who did not choose to receive tutoring (control group), controlling for other variables like grade level and previous math performance.

Related: Examples and Types of Random Assignment in Research

15. Meta-Analysis Research

Meta-analysis statistically combines the results of multiple studies on a specific topic to yield a more precise estimate of the effect size. It’s the gold standard of secondary research.

Meta-analysis is particularly useful when there are numerous studies on a topic, and there is a need to integrate the findings to draw more reliable conclusions.

Some meta-analyses can identify flaws or gaps in a corpus of research, which can make them highly influential in academic research despite their lack of primary data collection.

However, they tend only to be feasible when there is a sizable corpus of high-quality and reliable studies into a phenomenon.

Example of a Meta-Analysis

The power of feedback revisited (Wisniewski, Zierer & Hattie, 2020) is a meta-analysis that examines 435 empirical studies on the effects of feedback on student learning. The authors use a random-effects model to ascertain whether there is a clear effect size across the literature, finding that feedback tends to impact cognitive and motor skill outcomes but has less of an effect on motivational and behavioral outcomes.
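
A random-effects pooled estimate of this kind can be sketched with the DerSimonian–Laird estimator, a classic method-of-moments approach; the four effect sizes and variances below are invented for illustration:

```python
def random_effects_pooled(effects, variances):
    """DerSimonian-Laird random-effects pooled effect size estimate."""
    w = [1 / v for v in variances]  # fixed-effect inverse-variance weights
    fixed = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, effects))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)  # between-study variance
    w_star = [1 / (v + tau2) for v in variances]   # re-weight with tau2 added
    return sum(wi * y for wi, y in zip(w_star, effects)) / sum(w_star)

# Hypothetical standardized mean differences and their variances from four studies.
pooled = random_effects_pooled([0.5, 0.7, 0.4, 0.6], [0.02, 0.05, 0.03, 0.04])
```

The between-study variance term is what distinguishes a random-effects model from a fixed-effect one: when studies disagree more than sampling error alone would predict, the extra heterogeneity widens the weights and moderates the pooled estimate.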

Choosing a research method requires a lot of consideration regarding what you want to achieve, your research paradigm, and the methodology that is most valuable for what you are studying. There are many types of research methods, and I haven’t been able to present them all here. Generally, it’s recommended that you work with an experienced researcher or research supervisor to identify a suitable method for the study at hand.

Hammond, M., & Wellington, J. (2020). Research methods: The key concepts. New York: Routledge.

Howitt, D. (2019). Introduction to qualitative research methods in psychology. London: Pearson UK.

Pajo, B. (2022). Introduction to research methods: A hands-on approach. New York: Sage Publications.

Patten, M. L. (2017). Understanding research methods: An overview of the essentials. New York: Sage.

Schweigert, W. A. (2021). Research methods in psychology: A handbook. Los Angeles: Waveland Press.

Stokes, P., & Wall, T. (2017). Research methods. New York: Bloomsbury Publishing.

Tracy, S. J. (2019). Qualitative research methods: Collecting evidence, crafting analysis, communicating impact. London: John Wiley & Sons.

Walliman, N. (2021). Research methods: The basics. London: Routledge.

Chris Drew (PhD)

Dr. Chris Drew is the founder of the Helpful Professor. He holds a PhD in education and has published over 20 articles in scholarly journals. He is the former editor of the Journal of Learning Development in Higher Education.


5 Types of Qualitative Methods

Just as with quantitative methods, there are many varieties of qualitative methods.

Similar to the way you can group usability testing methods , there are also a number of ways to segment qualitative methods.

A popular and helpful categorization separates qualitative methods into five groups: ethnography, narrative, phenomenological, grounded theory, and case study. John Creswell outlines these five methods in Qualitative Inquiry and Research Design .

While the five methods generally use similar data collection techniques (observation, interviews, and reviewing text), the purpose of the study differentiates them, much as purpose distinguishes different types of usability tests . And as with classifying usability studies, the differences between the methods can be a bit blurry. Here are the five qualitative methods in more detail.

1. Ethnography

Ethnographic research is probably the most familiar and applicable type of qualitative method to UX professionals. In ethnography, you immerse yourself in the target participants’ environment to understand the goals, cultures, challenges, motivations, and themes that emerge. Ethnography has its roots in cultural anthropology where researchers immerse themselves within a culture, often for years! Rather than relying on interviews or surveys, you experience the environment first hand, and sometimes as a “participant observer.”

For example, one way of uncovering the unmet needs of customers is to “ follow them home ” and observe them as they interact with the product. You don’t come armed with any hypotheses to necessarily test; rather, you’re looking to find out how a product is used.

2. Narrative

The narrative approach weaves together a sequence of events, usually from just one or two individuals, to form a cohesive story. You conduct in-depth interviews, read documents, and look for themes; in other words, you examine how an individual’s story illustrates the larger life influences that created it. Often interviews are conducted over weeks, months, or even years, but the final narrative doesn’t need to be in chronological order. Rather, it can be presented as a story (or narrative) with themes, and can reconcile conflicting accounts and highlight tensions and challenges, which can be opportunities for innovation.

For example, a narrative approach can be an appropriate method for building a persona . While a persona should be built using a mix of methods—including segmentation analysis from surveys—in-depth interviews with individuals in an identified persona can provide the details that help describe the culture, whether it’s a person living with Multiple Sclerosis, a prospective student applying for college, or a working mom.

3. Phenomenological

When you want to describe an event, activity, or phenomenon, the aptly named phenomenological study is an appropriate qualitative method. In a phenomenological study, you use a combination of methods, such as conducting interviews, reading documents, watching videos, or visiting places and events, to understand the meaning participants place on whatever’s being examined. You rely on the participants’ own perspectives to provide insight into their motivations.

Like other qualitative methods, you don’t start with a well-formed hypothesis. In a phenomenological study, you often conduct a lot of interviews, usually between 5 and 25, to identify common themes , build a sufficient dataset, and use other participants to validate your findings.

For example, there’s been an explosion in the last five years in online courses and training. But how do students engage with these courses? While you can examine time spent and content accessed using log data, and even assess student achievement vis-à-vis in-person courses, a phenomenological study would aim to better understand the students’ experience and how it may impact comprehension of the material.

4. Grounded Theory

Whereas a phenomenological study looks to describe the essence of an activity or event, grounded theory looks to provide an explanation or theory behind the events. You use primarily interviews and existing documents to build a theory based on the data, going through a series of open and axial coding techniques to identify themes. Sample sizes are often larger (between 20 and 60) to better establish a theory. Grounded theory can help inform design decisions by better understanding how a community of users currently uses a product or performs tasks.

For example, a grounded theory study could involve understanding how software developers use portals to communicate and write code or how small retail merchants approve or decline customers for credit.

5. Case Study

Made famous by the Harvard Business School, even mainly quantitative researchers can relate to the value of the case study in explaining an organization, entity, company, or event. A case study involves a deep understanding through multiple types of data sources. Case studies can be explanatory, exploratory, or describing an event. The annual CHI conference has a peer-reviewed track dedicated to case studies.

For example, a case study of how a large multi-national company introduced UX methods into an agile development environment would be informative to many organizations.

The table below summarizes the differences between the five qualitative methods.


Our Methods

Pew Research Center is committed to meeting the highest methodological standards — and to exploring the newest frontiers of research. Learn more about the methods the Center uses to conduct objective, non-partisan research on a wide range of topics that is trusted around the world.

U.S. Surveys

International Surveys

Demographic Analysis

Data Science

Featured Methods Publications

How Public Polling Has Changed in the 21st Century

A new study found that 61% of national pollsters used different methods in 2022 than in 2016. And last year, 17% of pollsters used multiple methods to sample or interview people – up from 2% in 2016.

Public Opinion Polling Basics

By the end of our free, five-lesson course, you will know why we have polls, what the different kinds of polls are, how polling works and what you should look for in a poll.

National Public Opinion Reference Survey (NPORS)

NPORS is an annual survey of U.S. adults conducted by the Pew Research Center.

Confronting 2016 and 2020 Polling Limitations

Looking at final estimates of the outcome of the 2020 U.S. presidential race, 93% of national polls overstated the Democratic candidate’s support among voters, while nearly as many (88%) did so in 2016.

What 2020’s Election Poll Errors Tell Us About the Accuracy of Issue Polling

Given the errors in 2016 and 2020 election polling, how much should we trust polls that attempt to measure opinions on issues?

Key things to know about election polling in the United States

The real environment in which polls are conducted bears little resemblance to the idealized settings presented in textbooks.

More Methods Resources

Methods 101 Videos

Our series of brief explainer videos breaks down key topics in research methodology for non-expert audiences.

A behind-the-scenes blog about research methods at Pew Research Center.

OUR METHODS TEAMS

Sign up for our quarterly methods newsletter

About Pew Research Center Pew Research Center is a nonpartisan fact tank that informs the public about the issues, attitudes and trends shaping the world. It conducts public opinion polling, demographic research, media content analysis and other empirical social science research. Pew Research Center does not take policy positions. It is a subsidiary of The Pew Charitable Trusts .

  • View all journals
  • Explore content
  • About the journal
  • Publish with us
  • Sign up for alerts
  • Published: 12 April 2024

Pretraining a foundation model for generalizable fluorescence microscopy-based image restoration

  • Chenxi Ma,
  • Weimin Tan (ORCID: 0000-0001-7677-4772),
  • Ruian He &
  • Bo Yan (ORCID: 0000-0001-5692-3486)

Nature Methods (2024)


  • Confocal microscopy
  • Image processing
  • Super-resolution microscopy
  • Wide-field fluorescence microscopy

Fluorescence microscopy-based image restoration has received widespread attention in the life sciences and has led to significant progress, benefiting from deep learning technology. However, most current task-specific methods have limited generalizability to different fluorescence microscopy-based image restoration problems. Here, we seek to improve generalizability and explore the potential of applying a pretrained foundation model to fluorescence microscopy-based image restoration. We provide a universal fluorescence microscopy-based image restoration (UniFMIR) model to address different restoration problems, and show that UniFMIR offers higher image restoration precision, better generalization and increased versatility. Experiments on five tasks and 14 datasets covering a wide range of microscopy imaging modalities and biological samples demonstrate that the pretrained UniFMIR can effectively transfer knowledge to a specific situation via fine-tuning, uncover clear nanoscale biomolecular structures and facilitate high-quality imaging. This work has the potential to inspire and trigger new research directions in fluorescence microscopy-based image restoration.



Data availability

All training and testing data involved in the experiments come from existing literature and can be downloaded from the corresponding links provided in Supplementary Table 2 or via Zenodo at https://doi.org/10.5281/zenodo.8401470 (ref. 55 ).

Code availability

The PyTorch code of our UniFMIR, together with trained models and some example images for inference, is publicly available at https://github.com/cxm12/UNiFMIR ( https://doi.org/10.5281/zenodo.10117581 ) 56 . We also provide a live demo for UniFMIR at http://unifmir.fdudml.cn/ . Users can also access the Colab notebook at https://colab.research.google.com/github/cxm12/UNiFMIR/blob/main/UniFMIR.ipynb or follow the steps in our GitHub documentation to run the demo locally. This interactive software platform lets users freely and easily use the pretrained foundation model, and makes it easy for us to continuously train the foundation model with new data and share it with the community. Finally, we have shared all models on BioImage.IO at https://bioimage.io/#/ . Data are available via Zenodo at https://doi.org/10.5281/zenodo.10577218 , https://doi.org/10.5281/zenodo.10579778 , https://doi.org/10.5281/zenodo.10579822 , https://doi.org/10.5281/zenodo.10595428 , https://doi.org/10.5281/zenodo.10595460 , https://doi.org/10.5281/zenodo.8420081 and https://doi.org/10.5281/zenodo.8420100 (refs. 57 , 58 , 59 , 60 , 61 , 62 , 63 ). We used PyCharm for code development.

Preibisch, S. et al. Efficient Bayesian-based multiview deconvolution. Nat. Methods 11 , 645–648 (2014).


Gustafsson, N. et al. Fast live-cell conventional fluorophore nanoscopy with ImageJ through super-resolution radial fluctuations. Nat. Commun. 7 , 12471 (2016).

Arigovindan, M. et al. High-resolution restoration of 3D structures from widefield images with extreme low signal-to-noise-ratio. Proc. Natl Acad. Sci. USA 110 , 17344–17349 (2013).

Weigert, M. et al. Content-aware image restoration: pushing the limits of fluorescence microscopy. Nat. Methods 15 , 1090–1097 (2018).


Qiao, C. et al. Evaluation and development of deep neural networks for image super-resolution in optical microscopy. Nat. Methods 18 , 194–202 (2021).

Chen, J. et al. Three-dimensional residual channel attention networks denoise and sharpen fluorescence microscopy image volumes. Nat. Methods 18 , 678–687 (2021).

Wang, Z., Xie, Y. & Ji, S. Global voxel transformer networks for augmented microscopy. Nat. Mach. Intell. 3 , 161–171 (2021).


Wang, Z. et al. Real-time volumetric reconstruction of biological dynamics with light-field microscopy and deep learning. Nat. Methods 18 , 551–556 (2021).

Li, X. et al. Reinforcing neuron extraction and spike inference in calcium imaging using deep self-supervised denoising. Nat. Methods 18 , 1395–1400 (2021).

Qiao, C. et al. Rationalized deep neural network for sustained super-resolution live imaging of rapid subcellular processes. Nat. Biotechnol. 41 , 367–377 (2022).

Belthangady, C. & Royer, L. A. Applications, promises, and pitfalls of deep learning for fluorescence image reconstruction. Nat. Methods 16 , 1215–1225 (2019).

Wu, Y. & Shroff, H. Faster, sharper, and deeper: structured illumination microscopy for biological imaging. Nat. Methods 15 , 1011–1019 (2018).

Wu, Y. et al. Multiview confocal super-resolution microscopy. Nature 600 , 279–284 (2021).

Chen, R. et al. Single-frame deep-learning super-resolution microscopy for intracellular dynamics imaging. Nat. Commun. 14 , 2854 (2023).

Xu, Y. K. T. et al. Cross-modality supervised image restoration enables nanoscale tracking of synaptic plasticity in living mice. Nat. Methods 20 , 935–944 (2023).

Bommasani, R. et al. On the opportunities and risks of foundation models. Preprint at https://arxiv.org/abs/2108.07258 (2021).

Fei, N. et al. Towards artificial general intelligence via a multimodal foundation model. Nat. Commun. 13 , 3094 (2022).

Zhang, Y. et al. DialoGPT: large-scale generative pre-training for conversational response generation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations. 270–278 (2020).

Yang, Z. et al. XLNet: generalized autoregressive pretraining for language understanding. In Conference on Neural Information Processing Systems (NeurIPS) (2019).

Dai, Z. et al. CoAtNet: marrying convolution and attention for all data sizes. In Conference on Neural Information Processing Systems (NeurIPS) (2021).

Kirillov, A. et al. Segment anything. In Proceedings of the IEEE/CVF International Conference on Computer Vision , 4015–4026 (2023).

Achiam, J. et al. GPT-4 technical report. Preprint at https://arxiv.org/abs/2303.08774 (2023).

Bao, F. et al. One transformer fits all distributions in multi-modal diffusion at scale. In International Conference on Machine Learning (ICML) (2023).

Bi, K. et al. Accurate medium-range global weather forecasting with 3D neural networks. Nature 619 , 533–538 (2023).

Singhal, K. et al. Large language models encode clinical knowledge. Nature 620 , 172–180 (2023).

Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature 619 , 357–362 (2023).

Huang, Z. et al. A visual-language foundation model for pathology image analysis using medical Twitter. Nat. Med. 29 , 2307–2316 (2023).


Zhou, Y. et al. A foundation model for generalizable disease detection from retinal images. Nature 622 , 156–163 (2023).

Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 , 259–265 (2023).

Madani, A. et al. Large language models generate functional protein sequences across diverse families. Nat. Biotechnol. 41 , 1099–1106 (2023).


Theodoris, C. V. et al. Transfer learning enables predictions in network biology. Nature 618 , 616–624 (2023).

Henighan, T. et al. Scaling laws for autoregressive generative modeling. Preprint at https://arxiv.org/abs/2010.14701 (2020).

Zamir, A. et al. Taskonomy: disentangling task transfer learning. In Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI) , 3712–3722 (2019).

Liu, Z. et al. Swin transformer: hierarchical vision transformer using shifted windows. In IEEE/CVF International Conference on Computer Vision (ICCV) (2021).

Xia, B. et al. Efficient non-local contrastive attention for image super-resolution. In Association for the Advancement of Artificial Intelligence (AAAI) (2022).

Descloux, A., Grußmayer, K. S. & Radenovic, A. Parameter-free image resolution estimation based on decorrelation analysis. Nat. Methods 16 , 918–924 (2019).

Nieuwenhuizen, R. et al. Measuring image resolution in optical nanoscopy. Nat. Methods 10 , 557–562 (2013).

Culley, S. et al. Quantitative mapping and minimization of super-resolution optical imaging artifacts. Nat. Methods 15 , 263–266 (2018).

Li, X. et al. Three-dimensional structured illumination microscopy with enhanced axial resolution. Nat. Biotechnol. 41 , 1307–1319 (2023).

Spahn, C. et al. DeepBacs for multi-task bacterial image analysis using open-source deep learning approaches. Commun. Biol. 5 , 688 (2022).

Ouyang, W. et al. ShareLoc—an open platform for sharing localization microscopy data. Nat. Methods 19 , 1331–1333 (2022).

Zhang, X. C. et al. Zoom to learn, learn to zoom. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019).

Nehme, E. et al. Deep-STORM: super-resolution single-molecule microscopy by deep learning. Optica 5 , 458–464 (2018).

Guo, L. L. et al. EHR foundation models improve robustness in the presence of temporal distribution shift. Sci. Rep. 13 , 3767 (2023).

Liang, J. et al. SwinIR: image restoration using Swin transformer. In IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) , 1833–1844 (2021).

Simonyan, K. & Zisserman, A. Very deep convolutional networks for large-scale image recognition. In International Conference on Learning Representations (ICLR) (2015).

Kingma, D. & Ba, J. Adam: a method for stochastic optimization. Preprint at https://arxiv.org/abs/1412.6980 (2014).

Wang, Z. et al. Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13 , 600–612 (2004).


Abbe, E. Beiträge zur theorie des mikroskops und der mikroskopischen wahrnehmung. Archiv. f. Mikrosk. Anatomie 9 , 413–418 (1873).

Koho, S. et al. Fourier ring correlation simplifies image restoration in fluorescence microscopy. Nat. Commun. 10 , 3103 (2019).


Baskin, C. et al. UNIQ: uniform noise injection for non-uniform quantization of neural networks. ACM Trans. Comput. Syst. 37 , 1–15 (2021).

Arganda-Carreras, I. et al. Trainable Weka Segmentation: a machine learning tool for microscopy pixel classification. Bioinformatics 33 , 2424–2426 (2017).

Jacob, B. et al. Quantization and training of neural networks for efficient integer-arithmetic-only inference. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) , 2704–2713 (2018).

Ma, C., Tan, W., He, R. & Yan, B. UniFMIR: pre-training a foundation model for universal fluorescence microscopy image restoration (2023.10.03). Zenodo https://doi.org/10.5281/zenodo.8401470 (2023).

Ma, C., Tan, W., He, R. & Yan, B. UniFMIR: pre-training a foundation model for universal fluorescence microscopy image restoration (version 2023.11.13). Zenodo https://doi.org/10.5281/zenodo.10117581 (2023).

Ma, C., Tan, W., He, R. & Yan, B. UniFMIRProjectionOnFlyWing. Zenodo https://doi.org/10.5281/zenodo.10577218 (2024).

Ma, C., Tan, W., He, R. & Yan, B. UniFMIRDenoiseOnPlanaria. Zenodo https://doi.org/10.5281/zenodo.10579778 (2024).

Ma, C., Tan, W., He, R. & Yan, B. UniFMIRDenoiseOnTribolium. Zenodo https://doi.org/10.5281/zenodo.10579822 (2024).

Ma, C., Tan, W., He, R. & Yan, B. UniFMIRVolumetricReconstructionOnVCD. Zenodo https://doi.org/10.5281/zenodo.10595428 (2024).

Ma, C., Tan, W., He, R. & Yan, B. UniFMIRIsotropicReconstructionOnLiver. Zenodo https://doi.org/10.5281/zenodo.10595460 (2024).

Ma, C., Tan, W., He, R. & Yan, B. UniFMIRSuperResolutionOnMicrotubules. Zenodo https://doi.org/10.5281/zenodo.8420081 (2023).

Ma, C., Tan, W., He, R. & Yan, B. UniFMIRSuperResolutionOnFactin. Zenodo https://doi.org/10.5281/zenodo.8420100 (2023).

Download references

Acknowledgements

We gratefully acknowledge support for this work provided by the National Natural Science Foundation of China (NSFC) (grant nos. U2001209 to B.Y. and 62372117 to W.T.) and the Natural Science Foundation of Shanghai (grant no. 21ZR1406600 to W.T.).

Author information

These authors contributed equally: Chenxi Ma, Weimin Tan.

Authors and Affiliations

School of Computer Science, Shanghai Key Laboratory of Intelligent Information Processing, Fudan University, Shanghai, China

Chenxi Ma, Weimin Tan, Ruian He & Bo Yan


Contributions

B.Y. and W.T. supervised the research. C.M. and W.T. conceived of the technique. C.M. implemented the algorithm. C.M. and W.T. designed the validation experiments. C.M. trained the network and performed the validation experiments. R.H. implemented the interactive software platform and organized the codes and models. All authors had access to the study and wrote the paper.

Corresponding author

Correspondence to Bo Yan .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Peer review

Peer review information.

Nature Methods thanks Ricardo Henriques and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Rita Strack, in collaboration with the Nature Methods team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Overall architecture of the UniFMIR.

The proposed UniFMIR approach is composed of three submodules: a multihead module, a Swin transformer-based feature enhancement module, and a multitail module. The numbers of parameters (M) and calculations (GFLOPs) required for the head, feature enhancement and tail modules for different tasks are marked below the structures of the respective modules. The input sizes and output sizes of training batches for different tasks are also marked below the images.
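The head-core-tail decomposition described in this caption can be illustrated with a minimal PyTorch sketch. The module names, channel counts and the plain convolutional core below are illustrative assumptions, not the authors' implementation (whose shared core uses Swin transformer blocks):

```python
import torch
import torch.nn as nn

class UniFMIRSketch(nn.Module):
    """Toy multihead / shared-core / multitail layout, one head and tail per task."""
    def __init__(self, tasks=("sr", "denoise", "isotropic"), channels=32):
        super().__init__()
        # task-specific heads: map a 1-channel input image to shared feature space
        self.heads = nn.ModuleDict(
            {t: nn.Conv2d(1, channels, 3, padding=1) for t in tasks})
        # shared feature-enhancement core (stand-in for the Swin transformer blocks)
        self.core = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.GELU(),
            nn.Conv2d(channels, channels, 3, padding=1))
        # task-specific tails: map enhanced features back to an image
        self.tails = nn.ModuleDict(
            {t: nn.Conv2d(channels, 1, 3, padding=1) for t in tasks})

    def forward(self, x, task):
        f = self.heads[task](x)
        f = f + self.core(f)          # residual feature enhancement
        return self.tails[task](f)

x = torch.randn(1, 1, 64, 64)
model = UniFMIRSketch()
y = model(x, "denoise")
print(tuple(y.shape))  # (1, 1, 64, 64)
```

Only the head and tail are swapped per task, which is what lets the shared core transfer across restoration problems via fine-tuning.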

Extended Data Fig. 2 Network architecture of the Swin transformer-based feature enhancement module 46 .

The feature enhancement module consists of convolutional layers and a series of Swin transformer blocks (STB), each of which includes several Swin transformer layers (STL), a convolutional layer and a residual connection. The STL is composed of layer normalization operations, a multihead self-attention (MSA) mechanism and a multilayer perceptron (MLP). In the MSA mechanism, the input features are first divided into multiple small patches with a moving window operation, and then the self-attention in each patch is calculated to output features f out . The MLP is composed of two fully connected layers (FCs) and Gaussian-error linear unit (GELU) activation.
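A minimal PyTorch sketch of one such Swin transformer layer, assuming the tokens of a single window are already flattened; real Swin blocks also partition the feature map into windows and cyclically shift them, which is omitted here, and all sizes are illustrative:

```python
import torch
import torch.nn as nn

class STLSketch(nn.Module):
    """One STL: LayerNorm -> multihead self-attention -> LayerNorm -> 2-FC GELU MLP,
    each wrapped in a residual connection."""
    def __init__(self, dim=32, heads=4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(           # two fully connected layers with GELU
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, x):                   # x: (batch, tokens, dim) for one window
        h = self.norm1(x)
        h, _ = self.attn(h, h, h)           # self-attention within the window
        x = x + h                           # residual
        return x + self.mlp(self.norm2(x))  # residual

tokens = torch.randn(2, 64, 32)             # e.g. two flattened 8x8 windows
out = STLSketch()(tokens)
print(tuple(out.shape))  # (2, 64, 32)
```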

Extended Data Fig. 3 Generalization ability analysis of super-resolution on unseen modality of single-molecule localization microscopy data from the Shareloc platform 52 .

a, SR results obtained by the SOTA model (DeepSTORM 54 ), the pretrained UniFMIR model without fine-tuning, Baseline (same network structure as UniFMIR trained from scratch), and our fine-tuned UniFMIR model. The GT dSTORM images of microtubules stained with Alexa 647 in U2OS cells incubated with nocodazole and the input synthesized LR images are also shown. The PSNR/NRMSE results of the SR outputs obtained on n = 16 synthetic inputs are shown on the right. b, SR results obtained on the real-world wide-field images. The NRMSE values are depicted on the residual images under different SR results and the raw input images. The PSNR/NRMSE results on n = 9 real-world inputs are shown on the right. Box-plot elements are defined as follows: center line (median); box limits (upper and lower quartiles); whiskers (1.5× interquartile range). The line plots show the pixel intensities along the dashed lines in the corresponding images. Scale bar: 6.5 μm.
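For reference, the PSNR and NRMSE values quoted in this caption are typically computed as follows; this is a sketch, and the exact NRMSE normalization used by the paper is an assumption:

```python
import numpy as np

def psnr(gt, pred, data_range=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, data_range]."""
    mse = np.mean((gt - pred) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

def nrmse(gt, pred):
    """RMSE normalized by the root-mean-square of the ground truth."""
    return np.sqrt(np.mean((gt - pred) ** 2)) / np.sqrt(np.mean(gt ** 2))

gt = np.ones((16, 16))
pred = gt + 0.1          # uniform 0.1 error on a unit-intensity image
print(round(psnr(gt, pred), 1))   # 20.0
print(round(nrmse(gt, pred), 2))  # 0.1
```

Higher PSNR and lower NRMSE both indicate a restoration closer to the ground truth.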

Supplementary information

Supplementary information.

Supplementary Notes 1–5, Figs. 1–17 and Tables 1 and 2.

Reporting Summary

Rights and permissions.

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article.

Ma, C., Tan, W., He, R. et al. Pretraining a foundation model for generalizable fluorescence microscopy-based image restoration. Nat Methods (2024). https://doi.org/10.1038/s41592-024-02244-3

Download citation

Received : 27 July 2023

Accepted : 13 March 2024

Published : 12 April 2024

DOI : https://doi.org/10.1038/s41592-024-02244-3



J Glob Health

A systematic review of the clinical features of pneumonia in children aged 5-9 years: Implications for guidelines and research

Priya M Kevat

1 Murdoch Children’s Research Institute, Melbourne, Victoria, Australia

2 University of Melbourne, Melbourne, Victoria, Australia

3 Royal Children’s Hospital Melbourne, Melbourne, Victoria, Australia

Melinda Morpeth

Hamish Graham

Associated Data

Childhood pneumonia presents a large global burden, though most data and guidelines focus on children less than 5 years old. Less information is available about the clinical presentation of pneumonia in children 5-9 years of age. Appropriate diagnostic and treatment algorithms may differ from those applied to younger children. This systematic literature review aimed to identify clinical features of pneumonia in children aged 5-9 years, with a focus on delineation from other age groups and comparison with existing WHO guidance for pneumonia in children less than 5 years old.

We searched MEDLINE, EMBASE and PubMed databases for publications that described clinical features of pneumonia in children 5-9 years old, from any country with no date restriction in English. The quality of included studies was evaluated using a modified Effective Public Health Project Practice (EPHPP) tool. Data relating to research context, study type, clinical features of pneumonia and comparisons with children less than 5 years old were extracted. For each clinical feature of pneumonia, we described mean percentage (95% confidence interval) of participants with this finding in terms of aetiology (all cause vs Mycoplasma pneumoniae ), and method of diagnosis (radiological vs clinical).

We included 15 publications, eight addressing all-cause pneumonia and seven addressing Mycoplasma pneumoniae . Cough and fever were common in children aged 5-9 years with pneumonia. Tachypnoea was documented in around half of patients. Dyspnoea/difficulty breathing and chest indrawing were present in approximately half of all-cause pneumonia cases, with no data on indrawing in the outpatient setting. Chest and abdominal pain were documented in around one third of cases of all-cause pneumonia, based on limited numbers. In addition to markers of pneumonia severity used in children <5 years, pallor has been identified as being associated with poorer outcomes alongside comorbidities and nutritional status.

Conclusions

Quality research exploring clinical features of pneumonia, treatment and outcomes in children aged 5-9 years using consistent inclusion criteria, definitions of features and age ranges is urgently needed to better inform practice and guidelines. Based on limited data, fever and cough are common in this age group, but tachypnoea cannot be relied on for diagnosis. While waiting for better evidence, broader attention to features such as chest and abdominal pain, the role of chest radiographs for diagnosis in the absence of symptoms such as tachypnoea, and risk factors which may influence patient disposition (chest indrawing, pallor, nutritional status) warrant consideration by clinicians.

Protocol registration

PROSPERO: CRD42020213837.

Childhood pneumonia is responsible for a large mortality burden globally; however, most guidelines for low-resource settings focus on pneumonia in children less than 5 years old [ 1 , 2 ]. The focus on young children has been justified by the fact that more than 90% of childhood pneumonia deaths occur in children less than 5 years of age [ 3 ]. Yet pneumonia is also important for older children: Global Burden of Disease estimates suggest that pneumonia accounts for around 7% of deaths in children aged 5-9 years [ 3 ].

While children aged 5-9 years are generally regarded as at lower risk for pneumonia and pneumonia death, the risk may still be substantial in certain contexts or patient cohorts (for example, children with chronic health conditions or disability). Appropriate diagnostic and treatment algorithms may differ from those applied to younger children and this group has not been addressed in previous guidelines.

The aim of this review was to describe the available evidence for clinical features of pneumonia in children aged 5-9 years in community, primary care, or hospital settings, with a focus on delineation from other age groups and comparison with existing WHO guidance for pneumonia in young children.

The protocol for this study was registered on PROSPERO, the international prospective register of systematic reviews (registration number CRD42020213837). We searched MEDLINE via Ovid, EMBASE via Ovid and PubMed in August 2020 using key search terms including synonyms for pneumonia, ages 5-9 years, and clinical findings or diagnosis (example in Appendix S1 in the Online Supplementary Document ). No date restriction was applied. We did not restrict by location of study but for practical reasons we restricted the search to studies available in English language.

We included studies that contained original data on the clinical features of pneumonia among children aged 5-9 years, published in English language. We excluded case reports, small case series (<10 participants), conference abstracts, or those in which data relating to children aged 5-9 years was not meaningfully disaggregated.

PK completed initial title and abstract screening. Full-text screening was completed by three reviewers (PK, MM, AG), with each article screened by two of these reviewers (PK, MM, AG) and any conflicts resolved by the majority opinion from the third remaining reviewer (PK, MM, AG). Reference lists of included articles were searched to identify additional relevant studies missed from the search.

We extracted data from included studies with a standardised data extraction tool. Information extracted included: year of publication, study details, inclusion and exclusion criteria, pneumonia diagnostic/case definition criteria, aetiological agent(s), participant characteristics (including socioeconomic status), presence of comorbid conditions, respiratory and extra-pulmonary clinical features, chest radiograph findings, treatment received, and outcomes, with comparison to the under 5 years age group wherever possible. Data extraction was completed by two reviewers (PK, MM), with data from each article extracted by one of these reviewers (PK, MM) and the extracted information checked by the second reviewer (PK, MM). Any conflicts were resolved by the majority opinion from a third reviewer (AG).

We separated data from studies describing pneumonia of any aetiology (all-cause pneumonia) and studies describing pneumonia attributed to Mycoplasma pneumoniae , given that several studies addressed Mycoplasma pneumoniae specifically. For each clinical feature, we described the number and percentage of patients who were documented to have the feature in each study. Using aggregated data of all studies which included each clinical feature we calculated the mean percentage and 95% confidence interval according to the cause of pneumonia (all-cause and attributable to Mycoplasma pneumoniae ) and the method of diagnosis (radiological or clinical). If studies stipulated their inclusion criteria as a clinical diagnosis with or without radiological diagnosis, they were included in the studies based on clinical diagnosis for analysis (as we were unable to identify which participants had a radiograph performed). Due to the relatively weak quality of the studies identified and the variable nature of the data from the studies we did not perform any additional statistical analysis, to avoid over-interpretation of the data available.
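The per-feature aggregation described above can be sketched as follows. The percentages are illustrative, not values from the review, and the normal approximation on study-level percentages is an assumption about the exact CI formula used:

```python
import statistics

def mean_ci(percentages, z=1.96):
    """Mean of per-study percentages with a normal-approximation 95% CI."""
    m = statistics.mean(percentages)
    se = statistics.stdev(percentages) / len(percentages) ** 0.5  # standard error
    return m, m - z * se, m + z * se

# hypothetical: % of participants with cough in four studies of one aetiology group
cough = [88.0, 92.5, 79.0, 85.0]
mean, lo, hi = mean_ci(cough)
print(f"cough: {mean:.1f}% (95% CI {lo:.1f}-{hi:.1f})")  # cough: 86.1% (95% CI 80.6-91.7)
```

Note that this treats each study equally rather than weighting by sample size, consistent with the review's deliberately simple, non-meta-analytic summary.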

We used the EPHPP tool to evaluate the risk of bias in included studies [ 4 ]. This tool was modified to assess the study designs included (Table S1 in the Online Supplementary Document ). Application of the EPHPP tool required separate evaluation and consensus between two reviewers (PK, MM).

The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 statement was followed, with a checklist completed (Table S2 in the Online Supplementary Document ) [ 5 ].

A total of 2641 references were retrieved, and an additional four relevant publications were identified through reference list screening ( Figure 1 ). After duplicates were removed, 1776 references were screened, and 301 proceeded to full-text review. Two articles were excluded as the full text was unavailable, after authors were contacted twice to request them. Fifteen studies were included in qualitative synthesis after inclusion and exclusion criteria were applied.


Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flow diagram.

Study descriptions

Studies had variable methods to identify patients with pneumonia. Seven of the 15 studies included children with radiologically confirmed pneumonia (two of these requiring clinical features in addition) and eight of the 15 studies were based on clinical diagnosis with or without a radiograph. The heterogeneity in diagnostic methods was significant. For example, one study based on radiological diagnosis only included patients with obvious chest indrawing. Furthermore, of those based on clinical diagnosis, three studies included children with or without a radiograph being performed, three required clinician diagnosis alone, and two studies were of Mycoplasma pneumoniae -positive patients that described clinical features and/or chest radiograph changes consistent with pneumonia. Eight studies addressed all-cause pneumonia ( Table 1 ) whilst seven discussed pneumonia attributable to Mycoplasma pneumoniae , based on a variety of diagnostic assays ( Table 2 ). Three of the 15 studies, Macpherson et al [ 12 ], Salih et al [ 13 ] and Forgie et al [ 11 ], were from low or lower-middle income settings. Twelve studies described inpatients only, one study by Harris et al [ 9 ] was of outpatients, and two studies by Korppi et al [ 8 ] and Othman et al [ 19 ] included a combination of inpatients and outpatients.

Clinical features described in children aged 5-9 y diagnosed with pneumonia of any aetiology (all-cause pneumonia)

CAP – community acquired pneumonia, RCT – randomised controlled trial, WCC – white cell count, PFTs – pulmonary function tests, ALRI – acute lower respiratory infection, CXR – chest x-ray, y – year, mo – months

*Values significant with P  < 0.05 when the <2 years group was compared to the ≥2 years group (combined data for 2-4 years and ≥5 years).

†Corrected percentage value due to error in calculated percentage within study.

‡Tachypnoea was defined by age-specific WHO criteria: respiratory rate >50 breaths/min in infants <12 months old, >40 breaths/min in children aged 1-5 years and >30 breaths/min in children aged ≥6 years.

§Conventional therapy = amoxicillin/clavulanate if ≤5 years of age and erythromycin if >5 y of age.

‖Abnormal respiratory rate was defined as >24 breaths/min for patients ≤2 year of age and >20 breaths/min for patients >2 year of age.

¶Fever was defined as ≥100.5°F oral or ≥101°F rectal, or history in the last 24 h.

**Absolute numbers and percentages are extrapolated data.

†† P values not calculated in this study.

‡‡Weight <80% of median value using National Center for Health Statistics reference values (United States Department of Health, Education and Welfare, 1976).
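The age-specific tachypnoea cut-offs in footnote ‡ above can be encoded as a small helper. This is a sketch for illustration only, not a clinical tool, and treating the 1-5-year band as extending to the sixth birthday is an assumption:

```python
def tachypnoea_threshold(age_years):
    """WHO-style respiratory-rate cut-off (breaths/min) per footnote ‡."""
    if age_years < 1:
        return 50   # infants <12 months: >50 breaths/min
    if age_years < 6:
        return 40   # children aged 1-5 years: >40 breaths/min
    return 30       # children aged >=6 years: >30 breaths/min

def is_tachypnoeic(age_years, resp_rate):
    return resp_rate > tachypnoea_threshold(age_years)

print(is_tachypnoeic(7, 34))   # True: 34 exceeds the >=6-year cut-off of 30
print(is_tachypnoeic(3, 34))   # False: the 1-5-year cut-off is 40
```

The same respiratory rate can therefore count as tachypnoea in an older child but not a younger one, which is one reason definitions must be reported per age band.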

Clinical features described in children aged 5-9 y diagnosed with pneumonia attributable to Mycoplasma pneumoniae

ICD-10 - International Classification of Diseases 10th edition, CXR – chest x-ray, PCR – polymerase chain reaction, CAP – community acquired pneumonia, LRTI – lower respiratory tract infection, ICU – intensive care unit, VATS – video-assisted thorascopic surgery, y – years, d – days

*Pneumonia pattern characterised by WHO Standardization of Interpretation of Chest Radiographs for the diagnosis of community acquired pneumonia in children.

†Study text states that “one-half of the children had lobar pneumonia in both groups”, however study Figure 3 suggests a higher number (between 40 and 60 patients with lobar pneumonia for each of the ≤5 years and >5 years groups).

‡Data for a symptom/sign was included if able to be disaggregated from combined data.

§ P values relate to comparison of <5 years group with 5 to <10 years and 10-14 years groups.

‖Tachypnoea was defined as a respiratory rate >99th percentile for age.

¶Includes any type of rash, urticaria and Stevens-Johnson Syndrome.

**112/134 total patients had CXRs, fraction of children aged 7-15 years who had CXRs not specified.

††Absolute numbers and percentages not described.

Three of the eight studies that explored all-cause pneumonia included patients with comorbid conditions, three specifically excluded those with comorbidities, and two did not specify information about comorbidities. A significant proportion of participants aged 5-9 years in the study by Macpherson et al had comorbid disease, including malaria (28.77%), asthma (10.91%), neurological disorders (10.77%), severe malnutrition (9.48%) or HIV (8.32%) [ 12 ]. Meanwhile, 46% of children aged 5-14 years in the study by Salih et al were underweight [ 13 ], and a variety of underlying chronic comorbid conditions were described by Udomittipong et al but not disaggregated by age [ 10 ]. Within the group of studies addressing Mycoplasma pneumoniae , four included those with chronic conditions or comorbidities, two excluded children with these, and one did not specify information about comorbidities. Chronic pulmonary disease and asthma were the most frequently described pre-existing underlying diseases [ 17 , 19 , 20 ].

Most studies were of weak quality when assessed with the EPHPP tool ( Table 1 and Table 2 ). The exceptions were Macpherson et al [ 12 ] and Harris et al [ 9 ], which were assessed as moderate quality. There were seven retrospective observational studies, six studies with prospective recruitment of participants, one randomized controlled trial (RCT) and one descriptive study based on interview and questionnaire data. Describing clinical features of pneumonia was a primary objective in thirteen of the studies; two were not conducted with this as a primary aim but included clinical features of pneumonia in a description of participants. Many studies (8/15) did not specify or utilise a standardised data collection method. Although all studies included participants aged 5-9 years, study populations also included older and younger children. Three studies provided data disaggregated for the 5-9 age range exactly; the remaining twelve studies overlapped with the target population with a sufficiently close age range to be representative. In some studies, there was a paucity of disaggregated data relating to clinical features in children 5-9 years old. There were also differing definitions and terms for some clinical features between studies. Most importantly, the definition of fast breathing varied from >20 breaths per minute [ 9 ], to >40 breaths per minute [ 8 ], to a respiratory rate >99th percentile for age [ 20 ].

Study outcomes

Aggregated data regarding the proportion of older children with specific respiratory symptoms and extra-pulmonary clinical features is summarised in Table 3 .

Overall data regarding proportion of children with specific clinical features in included studies

*Includes dyspnoea/difficulty breathing/gasping/breathlessness, combined data from 4-6 and ≥7-14 age groups from Gao et al included [ 14 ].

†Includes flaring/nasal flaring.

‡Includes indrawing/recession/chest wall indrawing/chest recession/chest retraction, Forgie et al excluded from analysis as study selected for patients with indrawing [ 11 ].

§Includes all utilised definitions of tachypnoea and abnormal respiratory rate, data pertaining to respiratory rate of ≥40 breaths per minute rather than ≥50 breaths per minute included from Juvén et al [ 7 ].

‖Includes crepitations/rales/crackles/pulmonary crackles at onset, data included if able to be disaggregated from other abnormal breath sounds, combined data from 4-6 and ≥7-14 age groups from Gao et al included [ 14 ].

¶Includes wheeze/wheezes/wheezing/auscultation – wheezing, data included if able to be disaggregated from other abnormal breath sounds, fraction and percentage of children with auscultation finding rather than reported symptom included from Sondergaard et al [ 20 ].

**Includes all utilised definitions of fever, data pertaining to fever >37.5°C rather than fever >39.5°C included from Korppi et al [ 8 ].

††Includes any pallor present.

‡‡Includes inability to drink/poor appetite/refusal to eat/cannot eat or drink/feeding difficulties.

§§Data included if able to be disaggregated from other gastrointestinal symptoms.

‖‖Data included if able to be disaggregated from pain at other sites.

¶¶Includes chest pain/thoracic pain.

***Includes any type of rash, urticaria and Stevens-Johnson Syndrome.

Cough was the most common clinical feature, documented in around 90% of patients in both all-cause and Mycoplasma cohorts, whether diagnosed clinically or radiologically. Fever was also common in both cohorts but more common in Mycoplasma (91.7%, 95% confidence interval (CI) = 91.2-92.3) compared to all-cause pneumonia (74.8%, 95% CI = 73.6-76.0).

Tachypnoea was identified in around half of patients overall but less frequently in the Mycoplasma cohort (all-cause pneumonia 55.4%, 95% CI = 53.6-57.2 and Mycoplasma pneumoniae 40.1%, 95% CI = 37.9-42.3). The study of outpatients by Harris et al had the highest prevalence of tachypnoea but the lowest threshold for defining tachypnoea (>20 breaths per minute for children older than 2 years) [ 9 ]. The percentage of patients with tachypnoea was lower for patients with a radiological diagnosis (all-cause pneumonia 48.0%, 95% CI = 42.9-53.1 and Mycoplasma pneumoniae 8.5%, 95% CI = 7.7-9.2) compared to a clinical diagnosis (all-cause pneumonia 77.0% comprising 1 study with 937/1216 patients and Mycoplasma pneumoniae 50.1%, 95% CI = 48.5-51.7). Of note, less than 10% of patients with a radiological diagnosis of Mycoplasma pneumoniae had documented tachypnoea.

Dyspnoea/difficulty breathing was documented in 29.1% (95% CI = 28.2-30.8) of all-cause pneumonia patients and 23.1% (95% CI = 22.4-23.8) of Mycoplasma pneumoniae patients. In the all-cause pneumonia cohort, the proportion of patients with dyspnoea was higher in the clinical diagnosis group (43.0%, 95% CI = 42.3-43.7) than in the radiological diagnosis group (14.9%, 95% CI = 13.7-16.1). Chest indrawing was observed in approximately half of all-cause pneumonia cases, all of which were based on clinical diagnosis. There was only one small study of Mycoplasma pneumoniae patients, which documented chest indrawing in 30.0% (14/46) of patients [ 19 ]. Crackles or crepitations were variably described between studies but documented in around one half of patients overall. Wheeze or rhonchi were described in around one quarter of patients.

Chest and abdominal pain were each included in two studies of all-cause pneumonia (radiological diagnosis) and both were documented in around one third of patients. Abdominal pain was included in one small study of Mycoplasma pneumoniae patients (radiological diagnosis) and was found in 17% (11/66) of patients [ 15 ]. Headache, nausea and vomiting also occurred in around one third of patients in the all-cause pneumonia cohort, though these are non-specific symptoms that may occur in a range of illnesses. Skin manifestations were described in one study addressing Mycoplasma pneumoniae with data disaggregated by age and, in this study, were found in 25% (21/88) of children [ 20 ].
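For readers interested in how confidence intervals such as those quoted above are obtained, a Wilson score interval for a single study proportion can be sketched in a few lines. This is a generic illustration using the 11/66 abdominal pain figure reported above; the pooling method actually used in the review is not reproduced here.

```python
import math

def wilson_ci(successes: int, n: int, z: float = 1.96):
    """Wilson score 95% confidence interval for a single proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    margin = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return centre - margin, centre + margin

# Abdominal pain in the Mycoplasma cohort: 11/66 patients (~17%)
lo, hi = wilson_ci(11, 66)
print(f"17% (11/66), 95% CI = {lo:.1%}-{hi:.1%}")  # prints: 17% (11/66), 95% CI = 9.6%-27.4%
```

The Wilson interval behaves better than the simple normal approximation for small samples and proportions near 0 or 1, which is relevant given how small some of the disaggregated subgroups in this review are.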

With respect to chest radiograph findings in all studies, one study by Gao et al selected for patients with segmental/lobar Mycoplasma pneumoniae and additionally reported on the presence of pleural effusions (4%-5%) [ 14 ]. Aside from this, only a small number of study participants overall in the 5-9 year age range had disaggregated chest radiograph findings reported ( Table 4 ). Lobar changes were documented in around half of patients who had chest radiographs, but further conclusions are constrained by variable inclusion and diagnostic criteria and the limited data.

Chest radiograph findings documented in studies in children 5-9 y with pneumonia

*Data included if able to be disaggregated from other chest x-ray findings and both numerator and denominator clearly stated.

†Includes lobar consolidation/lobal consolidation/lobar infiltration and segmental/lobar pneumonia, Gao et al excluded from analysis as selected for patients with segmental/lobar pneumonia [ 14 ].

‡Includes interstitial changes/interstitial pattern.

§Includes pleural effusion/empyema, combined data from 4-6 y and ≥7-14 y groups from Gao et al included [ 14 ], data for pleural effusion rather than single case of empyema included from Sondergaard et al [ 20 ].

Outcome data for children aged 5-9 years with pneumonia were available from a single study of inpatients in Kenya, which was also the largest study in the review [ 12 ]. Macpherson et al described risk factors associated with mortality in children aged 5-14 years admitted to hospital with pneumonia [ 12 ]. Outcome information was available for 1825/1832 (99.5%) patients, of whom 145 (7.9%) died. Inpatient case fatality was higher in children aged 10-14 years compared to the 5-9 year age group (14.05% vs 6.43%, P  < 0.001). For children aged 5-10 years, risk factors for death demonstrated in multivariate analysis included the presence of severe pallor (OR = 9.89, 95% CI = 4.68-20.93, P  < 0.001), mild/moderate pallor (OR = 2.85, 95% CI = 1.35-6, P  < 0.006), reduced consciousness (OR = 6.27, 95% CI = 2.8-14.08, P  < 0.001), central cyanosis (OR = 6.35, 95% CI = 1.33-30.25, P  < 0.02), a weight-for-age Z-score of ≤-3 SD (OR = 2.99, 95% CI = 1.61-5.55, P  < 0.001) and comorbid HIV (OR = 2.49, 95% CI = 1.18-5.28, P  < 0.017). A respiratory rate >30 breaths per minute and inability to drink were associated with poor outcome, though did not reach statistical significance. Sex, presence of grunting, crackles, chest wall indrawing and comorbid malaria were not associated with mortality, and wheeze was found to be relatively protective (not statistically significant). Additional analysis demonstrated that the combination of clinical characteristics used by WHO to define severe pneumonia in children less than 5 years old was poor at discriminating those at risk of death (sensitivity = 0.56, specificity = 0.68, AUC = 0.62) in this study.
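The headline figures from this study can be checked with simple arithmetic. The sketch below recomputes the case fatality from the reported counts and illustrates how sensitivity and specificity are calculated; the 2×2 counts are hypothetical, chosen only to reproduce the reported summary values, as the underlying cross-tabulation is not given in the review.

```python
# Case fatality reported by Macpherson et al: 145 deaths among 1825
# patients with known outcomes (numbers taken from the review text).
deaths, total = 145, 1825
cfr = deaths / total
print(f"Inpatient case fatality: {cfr:.1%}")  # prints: Inpatient case fatality: 7.9%

def sens_spec(tp: int, fn: int, tn: int, fp: int):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical 2x2 counts (NOT from the study): the 145 deaths and 1680
# survivors are split so as to reproduce the reported sensitivity (0.56)
# and specificity (0.68) of the WHO severity definition.
sens, spec = sens_spec(tp=81, fn=64, tn=1142, fp=538)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```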

Regarding pneumonia severity and the need for inpatient treatment in children aged 5-9 years, there is little additional data to draw upon beyond the study by Macpherson et al [ 12 ]. Studies involving outpatients either did not describe chest indrawing or did not disaggregate data by age in combination with admission status [ 8 , 9 , 19 ]. Whilst lethargy was documented frequently, reduced consciousness as a specific sign was only described in the study by Macpherson et al [ 12 ].

Comparison with clinical features of pneumonia in younger children was made in six out of eight all-cause pneumonia studies and all seven Mycoplasma pneumoniae studies ( Table 1 and Table 2 ). In all studies which included chest and abdominal pain and compared frequency between older and younger children, both symptoms were found to be more common in older children [ 6 - 8 ]. Crocker et al found that abdominal pain was a reported symptom in all 12 cases in which pleural effusion or empyema were detected in children aged 3-16 years [ 6 ]. Comparison of chest auscultation findings between age groups demonstrated no clear trends, with some studies finding crackles and wheeze to be more common in younger children but other studies reporting greater frequency in older children [ 7 , 9 , 13 ]. Similarly, one study found that normal breath sounds were more common in children older than 5 years and another found that they were less common [ 7 , 11 ]. Inconsistent use of terms for auscultation findings between studies limited comparison. In a study of 127 children with Mycoplasma pneumoniae , Ma et al found that children less than 5 years of age were more likely to have a severe illness course, including intensive care unit admission, supplemental oxygen requirement and need for video-assisted thoracoscopic surgery (VATS) [ 15 ]. Vomiting also occurred more often in younger children with Mycoplasma pneumoniae [ 15 , 19 ]. Segmental or lobar consolidation on chest radiograph was a more common finding in older children for both all-cause pneumonia and Mycoplasma pneumoniae groups [ 13 , 16 , 18 ].

Comparative analysis of clinical features between those with and without comorbidities was not possible as data was not disaggregated for subgroups of participants with comorbidities in the 5-9 year age range in studies that included such participants.

There is a paucity of quality evidence describing clinical features of pneumonia in children aged 5-9 years. This review explored findings from 15 studies, eight addressing pneumonia of all causes and seven addressing pneumonia attributable to Mycoplasma pneumoniae . The lack of evidence highlights the urgent need for research to understand clinical features, treatment approaches and outcomes for children 5-9 years of age with pneumonia, which remains one of the leading causes of death in this age group globally [ 3 ]. However, the evidence that does exist indicates that applying existing WHO definitions of pneumonia for children under 5 years of age to this older age group is likely to lower the diagnostic yield.

Current WHO guidelines for children under 5 years old distinguish simple cough from pneumonia based on the presence or absence of tachypnoea. Among studies in this review, tachypnoea lacked a standard definition, which complicates interpretation of findings. However, only approximately half of patients in the all-cause pneumonia cohort were documented to have tachypnoea, and this was lower for Mycoplasma pneumoniae patients, notably those diagnosed radiologically. Higher proportions of children with pneumonia in clinically diagnosed groups may represent later diagnosis. Alternatively, it may reflect greater emphasis on accurate measurement and recording of respiratory rate by clinicians using clinical diagnosis. The tachypnoea data for clinical diagnosis in the all-cause pneumonia cohort are based on the Kenyan study, a cohort of sick children in a high-burden setting. Yet, even amongst these patients, around 1 in 4 did not have tachypnoea (respiratory rate >30 breaths per minute) documented on admission [ 12 ]. The measurement of respiratory rate is a skill which is often not performed well or documented correctly; the evidence indicates that it cannot be relied upon to identify pneumonia among older children with cough [ 21 ].
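For illustration, the age-specific tachypnoea thresholds cited in the reviewed studies (respiratory rate >50 breaths/min in infants under 12 months, >40 in children aged 1-5 years and >30 at 6 years and older) can be encoded as a simple rule. The function below is a hypothetical sketch, not a validated clinical tool, and boundary handling at exact ages is an assumption.

```python
def is_tachypnoeic(age_years: float, resp_rate: float) -> bool:
    """Age-specific WHO-style fast-breathing thresholds, as described in
    one of the reviewed studies; name and boundary handling are
    illustrative assumptions, not a clinical implementation."""
    if age_years < 1:
        return resp_rate > 50   # infants <12 months
    if age_years < 6:
        return resp_rate > 40   # children aged 1-5 years
    return resp_rate > 30       # children aged >=6 years

# A 7-year-old breathing at 34 breaths/min exceeds the >30 cutoff
print(is_tachypnoeic(age_years=7, resp_rate=34))  # prints: True
```

Encoding the rule this way makes the review's point concrete: the threshold a given child is judged against depends entirely on which study definition is applied, which is why standardised definitions matter for comparability.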

If tachypnoea cannot be relied on to diagnose pneumonia in older children, then the addition of other symptoms to aid diagnostic approaches should be considered. Although the study numbers are small, chest pain and abdominal pain were relatively common in children aged 5-9 years with all-cause pneumonia, whether due to their ability to report symptoms, or to the likelihood that researchers sought to identify these symptoms in older children. Chest radiographs may also have a greater role in diagnosing children with pneumonia in this age group, particularly in the setting of persistent cough and fever without other signs to confirm pneumonia (or alternative diagnoses). It should be noted that the data on chest radiograph findings in pneumonia in this age group are limited, and there is insufficient data supporting the use of radiographs to distinguish pneumonia aetiology (eg, Mycoplasma from all-cause).

Symptoms used to define severe pneumonia in children <5 years of age, such as reduced conscious state, central cyanosis and/or hypoxia (oxygen saturation <90%) and inability to eat or drink [ 1 , 2 ], still have relevance in older children in low and lower-middle income settings in terms of their risk of mortality and therefore the severity of pneumonia. Similarly, nutritional status and underlying chronic conditions (including HIV) are associated with mortality in older children and should be part of any risk stratification approach used by clinicians to determine the need for admission and treatment [ 1 , 2 ]. Pallor, whether mild, moderate or severe, was identified as being associated with a higher risk of mortality in children 5-9 years old and should also be part of a clinician’s consideration of risk and patient disposition [ 12 ]. This is consistent with recent evidence suggesting that pallor is an important marker of serious disease in younger age groups [ 22 - 24 ]. The sign of chest indrawing has been an important and evolving marker of pneumonia severity and therefore need for admission in guidelines for children under 5 years old [ 25 ]. This review identified no data on the management of chest indrawing in children aged 5-9 years in the outpatient setting. Given that chest compliance decreases with age [ 25 ], it is reasonable to suspect that chest indrawing may indicate greater severity in older children, as its presence may suggest generation of greater intrathoracic pressures to maintain ventilation. The Kenyan study in this review examined risk of death in older children with pneumonia and found no association between chest indrawing and mortality [ 12 ]. This finding, among others described above, is based on a single study in one context and should be interpreted with caution. Of note, no radiological studies of all-cause pneumonia documented the presence or absence of chest indrawing in patients, despite its potential importance in guiding treatment.

Our review identified several studies relating to Mycoplasma pneumoniae in children 5-9 years of age, mostly from high income countries. Data from these studies have been reported separately, so as not to unduly influence the data on all-cause pneumonia and to allow differences in clinical features to be considered. While Mycoplasma pneumoniae is important in pneumonia in older children, the emphasis on this organism in this review may represent bias on the part of researchers in considering it above other aetiologies. There is a clear need for more data on other potential aetiologies (eg, influenza), but particularly those relevant in the global context, such as HIV and tuberculosis.

Based on the available evidence for Mycoplasma pneumoniae , there are no respiratory clinical features that can distinguish it from pneumonia of other aetiologies in children aged 5-9 years. This is consistent with other studies that demonstrated no clinical or radiological features to identify Mycoplasma pneumoniae and guide therapeutic decisions [ 26 , 27 ]. Considering Mycoplasma pneumoniae as an aetiology and treating this possibility is therefore important, including in HIV positive children among whom it has also been shown to be common [ 28 ]. Skin symptoms may be useful in distinguishing Mycoplasma pneumoniae as a potential aetiological agent in pneumonia in older children; however, there may be bias in seeking and reporting on these symptoms in studies focused on Mycoplasma pneumoniae , and disaggregated supportive evidence was available from only one study in this review [ 20 ]. Separately, a review by Schalock and Dinulos [ 29 ] specifically addressing Mycoplasma pneumoniae -induced cutaneous disease in paediatric and adult populations, and a study by Sauteur et al [ 30 ] in paediatric patients aged 3-18 years, described skin manifestations as a feature of Mycoplasma pneumoniae , such as exanthematous skin eruptions, urticaria, erythema nodosum, Mycoplasma pneumoniae -induced rash and mucositis (MIRM) and Stevens-Johnson Syndrome. A key limitation in determining aetiology is that available diagnostic tests for Mycoplasma pneumoniae may not distinguish infection from carriage [ 31 ].

Implications for WHO pneumonia guidelines

The relatively weak quality of studies and limited evidence in this review should be kept in mind when interpreting the findings. Evidence related to risk factors for death, for example, is derived from a single study of moderate quality. Different definitions (eg, for tachypnoea), different nomenclatures (eg, crepitations) and absence of documentation of key signs (eg, chest indrawing) should be noted. Nonetheless, there are some implications to be considered for WHO guidelines while further research is conducted and evidence is generated.

Cough and fever are common clinical features in pneumonia in children aged 5-9 years. However, tachypnoea, used to define pneumonia according to WHO criteria in children <5 years of age, may not be present in older children with pneumonia. Inclusion of chest pain and abdominal pain in diagnostic approaches for older children might expand recognition of pneumonia in this age group, especially if other signs are absent. Furthermore, chest radiographs may have greater importance for diagnosis. Clear definitions of tachypnoea are required for both clinical application and to standardise future research.

Symptoms reflecting severity of pneumonia in children <5 years of age (eg, reduced conscious state, hypoxia and inability to drink) have relevance in older children in low resource settings with respect to risk of mortality, and therefore severity of pneumonia. Separate to these markers of severe disease, other patient factors such as poor nutritional status, comorbid chronic conditions and pallor are associated with poor outcomes. As a result, they should be part of the clinician’s consideration of risk of a poor outcome for children aged 5-9 years with pneumonia, and inform decision making on patient disposition.

There is minimal data on chest indrawing in children aged 5-9 years, particularly its management in outpatient settings, to guide management recommendations. Without further evidence, it may be safest to recommend admission if chest indrawing is present.

Although there are differences in the proportions of patients with clinical features between the all-cause pneumonia and Mycoplasma cohorts, these cannot be used to distinguish pneumonia of different aetiologies in children aged 5-9 years on an individual level. Guidelines should account for causative agents other than pneumococcus and antibiotic recommendations should be altered accordingly. The addition of an antibiotic to cover for Mycoplasma pneumoniae (eg, macrolide) when treating pneumonia in this age group should be strongly considered, particularly in severe cases, in children with malnutrition and/or other co-morbidities, and when deterioration occurs on alternate therapy. Skin symptoms may be useful in distinguishing Mycoplasma pneumoniae as a potential aetiological agent in pneumonia in children aged 5-9 years, though there is limited evidence available and large potential for bias.

Limitations

This review was conducted with a rigorous systematic approach, a broad search strategy to capture relevant publications and methods to minimise risk of bias. It was limited by the databases searched, the restriction of publications to the English language and the unavailability of two full-text articles. Overall, the key limitation is the breadth and depth of existing research pertaining to pneumonia in children aged 5-9 years that is available to inform decision making.

Further studies exploring clinical features of pneumonia in children aged 5-9 years are warranted to strengthen evidence and understanding of the presentation of pneumonia in this age group. Studies using consistent definitions of clinical features and age ranges would enable aggregation of data and comparison between studies and settings. A wider range of studies in outpatient and inpatient settings, which identify clinical features associated with pneumonia severity and help to define critical values of concern for key signs, eg, tachypnoea, would better identify children at risk of poor outcomes. Conversely, understanding the prevalence of features such as chest indrawing in outpatient settings would aid in guiding safe management of children in the community.

Studies describing pneumonia aetiology and associated clinical features in children aged 5-9 years are needed to better inform antimicrobial choices, or clinical scenarios in which particular antimicrobial choices should be prioritised.

Studies should also explore the presentation of pneumonia in children aged 5-9 years with comorbid chronic conditions, given that this group is likely to be at higher risk of recurrent and more severe pneumonia.

CONCLUSIONS

There is a lack of evidence describing clinical features of pneumonia in children aged 5-9 years, highlighting an urgent need for further research to guide best practice. Despite the limited quality and quantity of data, there are some findings which should be considered when deciding whether existing WHO definitions of pneumonia in children less than 5 years of age can be applied to older children. Based on limited data, fever and cough are common in this age group, but tachypnoea cannot be relied on for diagnosis. While awaiting better evidence, broader attention to features such as chest and abdominal pain, the role of chest radiographs for diagnosis in the absence of signs such as tachypnoea, and risk factors which may influence patient disposition (chest indrawing, pallor, nutritional status) warrants consideration by clinicians.

Additional material

Acknowledgments.

Full list of ARI Review group : Trevor Duke, Hamish Graham, Steve Graham, Amy Gray, Amanda Gwee, Claire von Mollendorf, Kim Mulholland, Fiona Russell (leadership group, MCRI/University of Melbourne); Maeve Hume-Nixon, Saniya Kazi, Priya Kevat, Eleanor Neal, Cattram Nguyen, Alicia Quach, Rita Reyburn, Kathleen Ryan, Patrick Walker, Chris Wilkes (lead researchers, MCRI); Poh Chua (research librarian, RCH); Yasir Bin Nisar, Jonathon Simon, Wilson Were (WHO).

Acknowledgements: We would like to acknowledge librarian, Poh Chua, at the Royal Children’s Hospital Melbourne, who assisted with formulating and conducting our literature search.

Disclaimer: The authors alone are responsible for the views expressed in this publication and they do not necessarily represent the views, decisions or policies of the World Health Organization.

Funding: Funding was provided by the World Health Organization (WHO).

Authorship contributions: HG, AG and members of the ARI Review group conceived the study and initiated the study design. PK and AG led the conduct of searches. Data extraction was led by MM and PK with input from AG. Data analysis was conducted by PK, AG, and MM. The manuscript was drafted by PK, with input from AG, MM and HG. All authors contributed to revisions and approved the final manuscript.

Competing interests: The authors completed the ICMJE Unified Competing Interest Form (available upon request from the corresponding author), and declare no conflicts of interest.


Qualitative Research – Methods, Analysis Types and Guide

Qualitative Research

Qualitative research is a type of research methodology that focuses on exploring and understanding people’s beliefs, attitudes, behaviors, and experiences through the collection and analysis of non-numerical data. It seeks to answer research questions through the examination of subjective data, such as interviews, focus groups, observations, and textual analysis.

Qualitative research aims to uncover the meaning and significance of social phenomena, and it typically involves a more flexible and iterative approach to data collection and analysis compared to quantitative research. Qualitative research is often used in fields such as sociology, anthropology, psychology, and education.

Qualitative Research Methods


Qualitative Research Methods are as follows:

One-to-One Interview

This method involves conducting an interview with a single participant to gain a detailed understanding of their experiences, attitudes, and beliefs. One-to-one interviews can be conducted in-person, over the phone, or through video conferencing. The interviewer typically uses open-ended questions to encourage the participant to share their thoughts and feelings. One-to-one interviews are useful for gaining detailed insights into individual experiences.

Focus Groups

This method involves bringing together a group of people to discuss a specific topic in a structured setting. The focus group is led by a moderator who guides the discussion and encourages participants to share their thoughts and opinions. Focus groups are useful for generating ideas and insights, exploring social norms and attitudes, and understanding group dynamics.

Ethnographic Studies

This method involves immersing oneself in a culture or community to gain a deep understanding of its norms, beliefs, and practices. Ethnographic studies typically involve long-term fieldwork and observation, as well as interviews and document analysis. Ethnographic studies are useful for understanding the cultural context of social phenomena and for gaining a holistic understanding of complex social processes.

Text Analysis

This method involves analyzing written or spoken language to identify patterns and themes. Text analysis can be quantitative or qualitative. Qualitative text analysis involves close reading and interpretation of texts to identify recurring themes, concepts, and patterns. Text analysis is useful for understanding media messages, public discourse, and cultural trends.

Case Studies

This method involves an in-depth examination of a single person, group, or event to gain an understanding of complex phenomena. Case studies typically involve a combination of data collection methods, such as interviews, observations, and document analysis, to provide a comprehensive understanding of the case. Case studies are useful for exploring unique or rare cases, and for generating hypotheses for further research.

Process of Observation

This method involves systematically observing and recording behaviors and interactions in natural settings. The observer may take notes, use audio or video recordings, or use other methods to document what they see. Process of observation is useful for understanding social interactions, cultural practices, and the context in which behaviors occur.

Record Keeping

This method involves keeping detailed records of observations, interviews, and other data collected during the research process. Record keeping is essential for ensuring the accuracy and reliability of the data, and for providing a basis for analysis and interpretation.

Surveys

This method involves collecting data from a large sample of participants through a structured questionnaire. Surveys can be conducted in person, over the phone, through mail, or online. Surveys are useful for collecting data on attitudes, beliefs, and behaviors, and for identifying patterns and trends in a population.

Qualitative data analysis is the process of turning unstructured data into meaningful insights. It involves extracting and organizing information from sources like interviews, focus groups, and surveys. The goal is to understand people’s attitudes, behaviors, and motivations.

Qualitative Research Analysis Methods

Qualitative Research analysis methods involve a systematic approach to interpreting and making sense of the data collected in qualitative research. Here are some common qualitative data analysis methods:

Thematic Analysis

This method involves identifying patterns or themes in the data that are relevant to the research question. The researcher reviews the data, identifies keywords or phrases, and groups them into categories or themes. Thematic analysis is useful for identifying patterns across multiple data sources and for generating new insights into the research topic.

Content Analysis

This method involves analyzing the content of written or spoken language to identify key themes or concepts. Content analysis can be quantitative or qualitative. Qualitative content analysis involves close reading and interpretation of texts to identify recurring themes, concepts, and patterns. Content analysis is useful for identifying patterns in media messages, public discourse, and cultural trends.
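Where content analysis takes a quantitative turn, a first pass is often a simple tally of how frequently candidate codes appear across a set of transcripts. The sketch below is purely illustrative (the codebook and excerpts are invented), and a keyword tally is no substitute for interpretive coding by a researcher.

```python
from collections import Counter

# Hypothetical codebook: code -> keywords that suggest it (invented example)
codebook = {
    "access": ["afford", "cost", "distance"],
    "trust":  ["trust", "believe", "confidence"],
}

# Invented interview excerpts standing in for real transcript data
excerpts = [
    "I could not afford the visit, and the cost kept rising.",
    "I trust the nurse, she gives me confidence.",
]

# Tally keyword occurrences per code across all excerpts
counts = Counter()
for text in excerpts:
    lowered = text.lower()
    for code, keywords in codebook.items():
        counts[code] += sum(lowered.count(k) for k in keywords)

print(counts)  # prints: Counter({'access': 2, 'trust': 2})
```

A tally like this can help a researcher decide which codes merit closer reading, but the interpretive step of qualitative content analysis remains manual.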

Discourse Analysis

This method involves analyzing language to understand how it constructs meaning and shapes social interactions. Discourse analysis can involve a variety of methods, such as conversation analysis, critical discourse analysis, and narrative analysis. Discourse analysis is useful for understanding how language shapes social interactions, cultural norms, and power relationships.

Grounded Theory Analysis

This method involves developing a theory or explanation based on the data collected. Grounded theory analysis starts with the data and uses an iterative process of coding and analysis to identify patterns and themes in the data. The theory or explanation that emerges is grounded in the data, rather than preconceived hypotheses. Grounded theory analysis is useful for understanding complex social phenomena and for generating new theoretical insights.

Narrative Analysis

This method involves analyzing the stories or narratives that participants share to gain insights into their experiences, attitudes, and beliefs. Narrative analysis can involve a variety of methods, such as structural analysis, thematic analysis, and discourse analysis. Narrative analysis is useful for understanding how individuals construct their identities, make sense of their experiences, and communicate their values and beliefs.

Phenomenological Analysis

This method involves analyzing how individuals make sense of their experiences and the meanings they attach to them. Phenomenological analysis typically involves in-depth interviews with participants to explore their experiences in detail. Phenomenological analysis is useful for understanding subjective experiences and for developing a rich understanding of human consciousness.

Comparative Analysis

This method involves comparing and contrasting data across different cases or groups to identify similarities and differences. Comparative analysis can be used to identify patterns or themes that are common across multiple cases, as well as to identify unique or distinctive features of individual cases. Comparative analysis is useful for understanding how social phenomena vary across different contexts and groups.

Applications of Qualitative Research

Qualitative research has many applications across different fields and industries. Here are some examples of how qualitative research is used:

  • Market Research: Qualitative research is often used in market research to understand consumer attitudes, behaviors, and preferences. Researchers conduct focus groups and one-on-one interviews with consumers to gather insights into their experiences and perceptions of products and services.
  • Health Care: Qualitative research is used in health care to explore patient experiences and perspectives on health and illness. Researchers conduct in-depth interviews with patients and their families to gather information on their experiences with different health care providers and treatments.
  • Education: Qualitative research is used in education to understand student experiences and to develop effective teaching strategies. Researchers conduct classroom observations and interviews with students and teachers to gather insights into classroom dynamics and instructional practices.
  • Social Work: Qualitative research is used in social work to explore social problems and to develop interventions to address them. Researchers conduct in-depth interviews with individuals and families to understand their experiences with poverty, discrimination, and other social problems.
  • Anthropology: Qualitative research is used in anthropology to understand different cultures and societies. Researchers conduct ethnographic studies and observe and interview members of different cultural groups to gain insights into their beliefs, practices, and social structures.
  • Psychology: Qualitative research is used in psychology to understand human behavior and mental processes. Researchers conduct in-depth interviews with individuals to explore their thoughts, feelings, and experiences.
  • Public Policy: Qualitative research is used in public policy to explore public attitudes and to inform policy decisions. Researchers conduct focus groups and one-on-one interviews with members of the public to gather insights into their perspectives on different policy issues.

How to Conduct Qualitative Research

Here are some general steps for conducting qualitative research:

  • Identify your research question: Qualitative research starts with a research question or set of questions that you want to explore. This question should be focused and specific, but also broad enough to allow for exploration and discovery.
  • Select your research design: There are different types of qualitative research designs, including ethnography, case study, grounded theory, and phenomenology. You should select a design that aligns with your research question and that will allow you to gather the data you need to answer your research question.
  • Recruit participants: Once you have your research question and design, you need to recruit participants. The number of participants you need will depend on your research design and the scope of your research. You can recruit participants through advertisements, social media, or through personal networks.
  • Collect data: There are different methods for collecting qualitative data, including interviews, focus groups, observation, and document analysis. You should select the method or methods that align with your research design and that will allow you to gather the data you need to answer your research question.
  • Analyze data: Once you have collected your data, you need to analyze it. This involves reviewing your data, identifying patterns and themes, and developing codes to organize your data. You can use different software programs to help you analyze your data, or you can do it manually.
  • Interpret data: Once you have analyzed your data, you need to interpret it. This involves making sense of the patterns and themes you have identified, and developing insights and conclusions that answer your research question. You should be guided by your research question and use your data to support your conclusions.
  • Communicate results: Once you have interpreted your data, you need to communicate your results. This can be done through academic papers, presentations, or reports. You should be clear and concise in your communication, and use examples and quotes from your data to support your findings.
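The "analyze data" step above can be sketched as a small codebook-and-tally pass over transcripts. The codebook and transcript snippets below are hypothetical; in practice coding is iterative and often supported by dedicated qualitative data analysis software.

```python
# A minimal sketch of applying a codebook to transcripts and tallying how
# often each code appears. Codebook and transcripts are made-up examples.
from collections import Counter

codebook = {
    "workload": ["overtime", "deadline", "exhausted"],
    "support": ["mentor", "team", "helped"],
}

transcripts = [
    "The deadline pressure left me exhausted by Friday.",
    "My mentor helped me plan the week.",
    "We worked overtime but the team helped each other.",
]

tally = Counter()
for text in transcripts:
    lowered = text.lower()
    for code, terms in codebook.items():
        # count each transcript at most once per code
        if any(term in lowered for term in terms):
            tally[code] += 1

print(tally)  # code frequencies across all transcripts
```

A frequency table like this is only a starting point for interpretation; the quotes behind each code, not the counts, carry the analytic weight.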

Examples of Qualitative Research

Here are some real-world examples of qualitative research:

  • Customer Feedback: A company may conduct qualitative research to understand the feedback and experiences of its customers. This may involve conducting focus groups or one-on-one interviews with customers to gather insights into their attitudes, behaviors, and preferences.
  • Healthcare: A healthcare provider may conduct qualitative research to explore patient experiences and perspectives on health and illness. This may involve conducting in-depth interviews with patients and their families to gather information on their experiences with different health care providers and treatments.
  • Education: An educational institution may conduct qualitative research to understand student experiences and to develop effective teaching strategies. This may involve conducting classroom observations and interviews with students and teachers to gather insights into classroom dynamics and instructional practices.
  • Social Work: A social worker may conduct qualitative research to explore social problems and to develop interventions to address them. This may involve conducting in-depth interviews with individuals and families to understand their experiences with poverty, discrimination, and other social problems.
  • Anthropology: An anthropologist may conduct qualitative research to understand different cultures and societies. This may involve conducting ethnographic studies and observing and interviewing members of different cultural groups to gain insights into their beliefs, practices, and social structures.
  • Psychology: A psychologist may conduct qualitative research to understand human behavior and mental processes. This may involve conducting in-depth interviews with individuals to explore their thoughts, feelings, and experiences.
  • Public Policy: A government agency or non-profit organization may conduct qualitative research to explore public attitudes and to inform policy decisions. This may involve conducting focus groups and one-on-one interviews with members of the public to gather insights into their perspectives on different policy issues.

Purpose of Qualitative Research

The purpose of qualitative research is to explore and understand the subjective experiences, behaviors, and perspectives of individuals or groups in a particular context. Unlike quantitative research, which focuses on numerical data and statistical analysis, qualitative research aims to provide in-depth, descriptive information that can help researchers develop insights and theories about complex social phenomena.

Qualitative research can serve multiple purposes, including:

  • Exploring new or emerging phenomena: Qualitative research can be useful for exploring new or emerging phenomena, such as new technologies or social trends. This type of research can help researchers develop a deeper understanding of these phenomena and identify potential areas for further study.
  • Understanding complex social phenomena: Qualitative research can be useful for exploring complex social phenomena, such as cultural beliefs, social norms, or political processes. This type of research can help researchers develop a more nuanced understanding of these phenomena and identify factors that may influence them.
  • Generating new theories or hypotheses: Qualitative research can be useful for generating new theories or hypotheses about social phenomena. By gathering rich, detailed data about individuals’ experiences and perspectives, researchers can develop insights that may challenge existing theories or lead to new lines of inquiry.
  • Providing context for quantitative data: Qualitative research can be useful for providing context for quantitative data. By gathering qualitative data alongside quantitative data, researchers can develop a more complete understanding of complex social phenomena and identify potential explanations for quantitative findings.

When to use Qualitative Research

Here are some situations where qualitative research may be appropriate:

  • Exploring a new area: If little is known about a particular topic, qualitative research can help to identify key issues, generate hypotheses, and develop new theories.
  • Understanding complex phenomena: Qualitative research can be used to investigate complex social, cultural, or organizational phenomena that are difficult to measure quantitatively.
  • Investigating subjective experiences: Qualitative research is particularly useful for investigating the subjective experiences of individuals or groups, such as their attitudes, beliefs, values, or emotions.
  • Conducting formative research: Qualitative research can be used in the early stages of a research project to develop research questions, identify potential research participants, and refine research methods.
  • Evaluating interventions or programs: Qualitative research can be used to evaluate the effectiveness of interventions or programs by collecting data on participants’ experiences, attitudes, and behaviors.

Characteristics of Qualitative Research

Qualitative research is characterized by several key features, including:

  • Focus on subjective experience: Qualitative research is concerned with understanding the subjective experiences, beliefs, and perspectives of individuals or groups in a particular context. Researchers aim to explore the meanings that people attach to their experiences and to understand the social and cultural factors that shape these meanings.
  • Use of open-ended questions: Qualitative research relies on open-ended questions that allow participants to provide detailed, in-depth responses. Researchers seek to elicit rich, descriptive data that can provide insights into participants’ experiences and perspectives.
  • Sampling based on purpose and diversity: Qualitative research often involves purposive sampling, in which participants are selected based on specific criteria related to the research question. Researchers may also seek to include participants with diverse experiences and perspectives to capture a range of viewpoints.
  • Data collection through multiple methods: Qualitative research typically involves the use of multiple data collection methods, such as in-depth interviews, focus groups, and observation. This allows researchers to gather rich, detailed data from multiple sources, which can provide a more complete picture of participants’ experiences and perspectives.
  • Inductive data analysis: Qualitative research relies on inductive data analysis, in which researchers develop theories and insights based on the data rather than testing pre-existing hypotheses. Researchers use coding and thematic analysis to identify patterns and themes in the data and to develop theories and explanations based on these patterns.
  • Emphasis on researcher reflexivity: Qualitative research recognizes the importance of the researcher’s role in shaping the research process and outcomes. Researchers are encouraged to reflect on their own biases and assumptions and to be transparent about their role in the research process.

Advantages of Qualitative Research

Qualitative research offers several advantages over other research methods, including:

  • Depth and detail: Qualitative research allows researchers to gather rich, detailed data that provides a deeper understanding of complex social phenomena. Through in-depth interviews, focus groups, and observation, researchers can gather detailed information about participants’ experiences and perspectives that may be missed by other research methods.
  • Flexibility: Qualitative research is a flexible approach that allows researchers to adapt their methods to the research question and context. Researchers can adjust their research methods in real time to gather more information or explore unexpected findings.
  • Contextual understanding: Qualitative research is well-suited to exploring the social and cultural context in which individuals or groups are situated. Researchers can gather information about cultural norms, social structures, and historical events that may influence participants’ experiences and perspectives.
  • Participant perspective: Qualitative research prioritizes the perspective of participants, allowing researchers to explore subjective experiences and understand the meanings that participants attach to their experiences.
  • Theory development: Qualitative research can contribute to the development of new theories and insights about complex social phenomena. By gathering rich, detailed data and using inductive data analysis, researchers can develop new theories and explanations that may challenge existing understandings.
  • Validity: Qualitative research can offer high validity by using multiple data collection methods, purposive and diverse sampling, and researcher reflexivity. This can help ensure that findings are credible and trustworthy.

Limitations of Qualitative Research

Qualitative research also has some limitations, including:

  • Subjectivity: Qualitative research relies on the subjective interpretation of researchers, which can introduce bias into the research process. The researcher’s perspective, beliefs, and experiences can influence the way data is collected, analyzed, and interpreted.
  • Limited generalizability: Qualitative research typically involves small, purposive samples that may not be representative of larger populations. This limits the generalizability of findings to other contexts or populations.
  • Time-consuming: Qualitative research can be a time-consuming process, requiring significant resources for data collection, analysis, and interpretation.
  • Resource-intensive: Qualitative research may require more resources than other research methods, including specialized training for researchers, specialized software for data analysis, and transcription services.
  • Limited reliability: Qualitative research may be less reliable than quantitative research, as it relies on the subjective interpretation of researchers. This can make it difficult to replicate findings or compare results across different studies.
  • Ethics and confidentiality: Qualitative research involves collecting sensitive information from participants, which raises ethical concerns about confidentiality and informed consent. Researchers must take care to protect the privacy and confidentiality of participants and obtain informed consent.


About the author

Muhammad Hassan

Researcher, Academic Writer, Web developer

AI makes retinal imaging 100 times faster compared to manual method

Researchers at the National Institutes of Health applied artificial intelligence (AI) to a technique that produces high-resolution images of cells in the eye. They report that with AI, imaging is 100 times faster and improves image contrast 3.5-fold. The advance, they say, will provide researchers with a better tool to evaluate age-related macular degeneration (AMD) and other retinal diseases. 

Vineeta Das, NEI Clinical and Translational Imaging Section, explains how artificial intelligence improves imaging of the eye’s light-sensing retina.

“Artificial intelligence helps overcome a key limitation of imaging cells in the retina, which is time,” said Johnny Tam, Ph.D., who leads the Clinical and Translational Imaging Section at NIH's National Eye Institute.

Tam is developing a technology called adaptive optics (AO) to improve imaging devices based on optical coherence tomography (OCT). Like ultrasound, OCT is noninvasive, quick, painless, and standard equipment in most eye clinics. 

Image of a commercially available OCT device, used to image intraocular tissues such as the light-sensing retina. Credit: Adobe Stock.

“Adaptive optics takes OCT-based imaging to the next level,” said Tam. “It’s like moving from a balcony seat to a front row seat to image the retina. With AO, we can reveal 3D retinal structures at cellular-scale resolution, enabling us to zoom in on very early signs of disease.” 

Johnny Tam, Ph.D., NEI Clinical and Translational Imaging Section.

While adding AO to OCT provides a much better view of cells, processing AO-OCT images after they’ve been captured takes much longer than OCT without AO. 

Tam’s latest work targets the retinal pigment epithelium (RPE), a layer of tissue behind the light-sensing retina that supports the metabolically active retinal neurons, including the photoreceptors. The retina lines the back of the eye and captures, processes, and converts the light that enters the front of the eye into signals that it then transmits through the optic nerve to the brain. Scientists are interested in the RPE because many diseases of the retina occur when the RPE breaks down. 

Illustration of the eye showing the location of the retina and its retinal pigment epithelium (RPE).

A top-down view of lab-grown RPE cells as seen with high-resolution microscopy. Unlike AO-OCT, which is performed in an awake person, this image was created with preserved tissue. Credit: Kapil Bharti, National Eye Institute.

Imaging RPE cells with AO-OCT comes with new challenges, including a phenomenon called speckle. Speckle interferes with AO-OCT the way clouds interfere with aerial photography. At any given moment, parts of the image may be obscured. Managing speckle is somewhat similar to managing cloud cover. Researchers repeatedly image cells over a long period of time. As time passes, the speckle shifts, which allows different parts of the cells to become visible. The scientists then undertake the laborious and time-consuming task of piecing together many images to create an image of the RPE cells that's speckle-free.
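The cloud-cover analogy can be illustrated numerically: averaging many independently speckled frames suppresses the noise roughly as 1/sqrt(N). The sketch below uses synthetic data (an idealized uniform cell layer corrupted by exponential multiplicative noise), not the study's actual imaging model.

```python
# A rough numerical sketch of the manual de-speckling idea: averaging many
# registered frames reduces random speckle. All values are illustrative.
import numpy as np

rng = np.random.default_rng(0)
truth = np.ones((64, 64))  # idealized speckle-free cell layer

def frame():
    # one capture: truth corrupted by multiplicative speckle-like noise
    return truth * rng.exponential(scale=1.0, size=truth.shape)

single = frame()
averaged = np.mean([frame() for _ in range(120)], axis=0)

err_single = np.abs(single - truth).mean()
err_avg = np.abs(averaged - truth).mean()
print(err_single > err_avg)  # averaging many frames suppresses speckle
```

The 120 frames mirror the count mentioned below for the manual method; the point of P-GAN is to recover a comparable image from a single capture instead.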

Tam and his team developed a novel AI-based method called parallel discriminator generative adversarial network (P-GAN)—a deep learning algorithm. By feeding the P-GAN network nearly 6,000 manually analyzed AO-OCT-acquired images of human RPE, each paired with its corresponding speckled original, the team trained the network to identify and recover speckle-obscured cellular features.  

When tested on new images, P-GAN successfully de-speckled the RPE images, recovering cellular details. With one image capture, it generated results comparable to the manual method, which required the acquisition and averaging of 120 images. With a variety of objective performance metrics that assess things like cell shape and structure, P-GAN outperformed other AI techniques. Vineeta Das, Ph.D., a postdoctoral fellow in the Clinical and Translational Imaging Section at NEI, estimates that P-GAN reduced imaging acquisition and processing time by about 100-fold. P-GAN also yielded greater contrast, about 3.5-fold greater than before.

(A) A top-down view of the RPE layer as seen by clinical OCT. Although the image is zoomed in to the scale of single cells, it is difficult to visualize the cells. (B) AO-OCT provides a more detailed image of the RPE layer, but the cells are obscured by speckle. (C) There is a remarkable improvement in RPE cell visualization gained by applying AI to the speckled AO-OCT image. Each dark area represents a single RPE cell. Credit: Vineeta Das, National Eye Institute.

By integrating AI with AO-OCT, Tam believes that a major obstacle for routine clinical imaging using AO-OCT has been overcome, especially for diseases that affect the RPE, which has traditionally been difficult to image.

“Our results suggest that AI can fundamentally change how images are captured,” said Tam. “Our P-GAN artificial intelligence will make AO imaging more accessible for routine clinical applications and for studies aimed at understanding the structure, function, and pathophysiology of blinding retinal diseases. Thinking about AI as a part of the overall imaging system, as opposed to a tool that is only applied after images have been captured, is a paradigm shift for the field of AI.”

More news from the NEI Clinical and Translational Imaging Section.

This press release describes a basic research finding. Basic research increases our understanding of human behavior and biology, which is foundational to advancing new and better ways to prevent, diagnose, and treat disease. Science is an unpredictable and incremental process; each research advance builds on past discoveries, often in unexpected ways. Most clinical advances would not be possible without the knowledge of fundamental basic research. To learn more about basic research, visit https://www.nih.gov/news-events/basic-research-digital-media-kit.

NEI leads the federal government’s efforts to eliminate vision loss and improve quality of life through vision research: driving innovation, fostering collaboration, expanding the vision workforce, and educating the public and key stakeholders. NEI supports basic and clinical science programs to develop sight-saving treatments and to broaden opportunities for people with vision impairment. For more information, visit https://www.nei.nih.gov.

About the National Institutes of Health (NIH): NIH, the nation’s medical research agency, includes 27 Institutes and Centers and is a component of the U.S. Department of Health and Human Services. NIH is the primary federal agency conducting and supporting basic, clinical, and translational medical research, and is investigating the causes, treatments, and cures for both common and rare diseases. For more information about NIH and its programs, visit https://www.nih.gov/. 

NIH…Turning Discovery Into Health®

  • Open access
  • Published: 11 April 2024

KEGG orthology prediction of bacterial proteins using natural language processing

  • Jing Chen,
  • Haoyu Wu &
  • Ning Wang

BMC Bioinformatics, volume 25, Article number: 146 (2024)

Background

The advent of high-throughput technologies has led to an exponential increase in uncharacterized bacterial protein sequences, surpassing the capacity of manual curation. A large number of bacterial protein sequences remain unannotated with Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology, making it necessary to use automated annotation tools. These tools are now indispensable in the biological research landscape, bridging the gap between the vastness of unannotated sequences and meaningful biological insights.

Results

In this work, we propose a novel pipeline for KEGG orthology annotation of bacterial protein sequences that uses natural language processing and deep learning. To assess the effectiveness of our pipeline, we conducted evaluations using the genomes of two randomly selected species from the KEGG database. In our evaluation, we obtain competitive results on precision, recall, and F1 score, with values of 0.948, 0.947, and 0.947, respectively.

Conclusions

Our experimental results suggest that our pipeline demonstrates performance comparable to traditional methods and excels in identifying distant relatives with low sequence identity. This demonstrates the potential of our pipeline to significantly improve the accuracy and comprehensiveness of KEGG orthology annotation, thereby advancing our understanding of functional relationships within biological systems.


Bacteria, ubiquitous microorganisms inhabiting diverse environments, play an indispensable role in shaping the biosphere and influencing human health [ 1 , 2 , 3 ]. Their sheer abundance and diversity underscore their significance in ecological processes, ranging from nutrient cycling to bioremediation [ 4 , 5 , 6 ]. Moreover, bacteria have been central to pivotal discoveries in the fields of genetics, molecular biology, and biotechnology, serving as model organisms for fundamental biological research. The functional elucidation of bacterial proteins is pivotal in unraveling the intricacies of microbial life and harnessing their potential for biotechnological applications.

With the advent of high-throughput technologies, the number of newly discovered bacterial proteins per year is increasing rapidly [ 7 ]. While this wealth of genetic information offers immense potential for elucidating the roles and functions of these proteins, annotating the functions of newly discovered sequences remains a formidable challenge. Traditional experimental methods for function annotation, whether in vitro or in vivo, are not only expensive but also time-consuming. Consequently, there is an urgent need to explore alternative, cost-effective strategies for protein function prediction. One promising method is the application of automated annotation tools, which use computational methods to predict protein functions based on sequences.

These automated annotation tools rely on databases that have been manually curated and annotated by human experts. One widely used database for gene and protein functional annotation is the KEGG database [ 8 ]. It comprises comprehensive and integrated databases of molecular pathways, networks, and genes involved in various cellular processes, including metabolism, signaling, and diseases. The KEGG orthology (KO) database is a database of molecular functions represented in terms of functional orthologs. A functional ortholog is manually defined in the context of KEGG molecular networks. The KO identifier (called K number) is defined based on the experimental characterization of genes and proteins within specific organisms. These K numbers are subsequently used to assign orthologous genes in other organisms. KO data refers to the protein sequences cataloged within the KO database, whereas non-KO data refers to protein sequences identified in the KEGG GENES database that have yet to be associated with a KO identifier. Accurate and reliable KO prediction is essential for understanding biological systems.

Several computational methods have been proposed for KO prediction, including sequence alignment and machine learning. KOBAS [ 9 , 10 , 11 ] used BLAST [ 12 ] E-value to assign K numbers. KAAS [ 13 ] employed BLAST to compute the bidirectional hit rate between query sequences and the KEGG reference databases. It defined a weighted score to assign K numbers, and these weighting factors take into account aspects such as ortholog group and sequence length, among others. BlastKOALA and GhostKOALA [ 14 ] used BLASTP and GHOSTX [ 15 ], respectively, for searching the non-redundant KEGG GENES database. KOALA (KEGG Orthology And Links Annotation) was originally developed as KEGG’s internal annotation tool for K number assignment using SSEARCH [ 16 ] computation. The scoring methodology of KOALA takes into account numerous factors. These include the Smith-Waterman (SW) score [ 17 ], the best-best flag, the degree of alignment overlap, the ratio of query to DB (DataBase) sequences, the taxonomic category, and the presence of Pfam domains. In BlastKOALA, the K number assignment is performed using the weighted sum of BLAST bit scores, where the weighting scheme is the same as the KOALA algorithm excluding the bidirectional best-hit information. In GhostKOALA, the K number assignment is simply based on the sum of GHOSTX normalized scores without considering any weighting factors. KofamKOALA [ 18 ] used profile hidden Markov models (pHMM) from machine learning to calculate similarity scores and subsequently also used the KOALA algorithm to assign K numbers.
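The weighted-sum scoring idea behind the KOALA family of tools can be sketched schematically. The hits, weights, and scores below are hypothetical, and the real KOALA weighting scheme involves many more factors (best-best flags, alignment overlap, query/DB length ratio, taxonomy, Pfam domains); this is only the core aggregation step.

```python
# A schematic sketch of K number assignment by a weighted sum of hit scores,
# in the spirit of the BlastKOALA description above. All values are made up.
from collections import defaultdict

# (ko, bit_score, weight) for a single query's top database hits
hits = [
    ("K00001", 250.0, 1.0),
    ("K00001", 180.0, 0.8),
    ("K02357", 210.0, 0.6),
]

# accumulate the weighted score per candidate KO
scores = defaultdict(float)
for ko, bit_score, weight in hits:
    scores[ko] += weight * bit_score

# assign the K number with the highest aggregate score
best_ko = max(scores, key=scores.get)
print(best_ko)
```

The interesting design question, as the comparison below notes, is not this aggregation but how the similarity scores and weights are produced in the first place.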

KOBAS, KAAS, and BlastKOALA all utilize the BLAST algorithm to calculate sequence similarity but employ distinct methods for scoring computation. KOALA differentiates itself by incorporating additional information, such as taxonomic categories and Pfam domains, which often contribute to improved results. BlastKOALA and GhostKOALA, while both based on KOALA, adopt different approaches to sequence similarity calculation. BlastKOALA utilizes BLASTP, a heuristic local alignment algorithm, which is particularly suited for annotating fully sequenced genomes. On the other hand, GhostKOALA leverages GHOSTX, which employs genome-wide sequence alignment and uses suffix arrays for efficient matching. Unlike BLASTP, GHOSTX is designed for protein-level comparisons at the genomic scale, making it ideal for conducting comprehensive genome searches and homology analysis in large-scale genome data. KofamKOALA presents a different approach compared to BlastKOALA and GhostKOALA. It employs the KOALA framework but also integrates the use of a HMM profiles database for KEGG Orthologs, known as KOfam. This method allows KofamKOALA to provide accurate functional annotations by matching query sequences using HMM profiles instead of actual sequences. An additional advantage of KofamKOALA is its speed, as the use of HMM profiles can significantly speed up the matching process. However, note that after database updates, a substantial amount of time is needed to update these HMM profiles, which could be a potential limitation. Choosing between these methods largely depends on the specific characteristics of the dataset in question and the specific constraints of the study.

However, these methods have certain limitations, as they rely on sequence similarity and may not be effective in identifying KOs with dissimilar sequences. Around one-third of identified bacterial proteins lack known homologs, thereby restricting the number of annotations that can be accurately predicted [ 19 ]. Moreover, the growing reliance on high-throughput experiments has resulted in a skewed distribution of functional protein annotations in databases, leaving a considerable number of bacterial proteins unexplored in terms of their functions [ 20 ]. In recent years, deep learning has emerged as a promising method for protein function prediction, owing to its capacity to autonomously learn complex patterns and representations from large and complex datasets.

Anfinsen proposed the famous sequence-structure-function relationship in 1973 [ 21 ], which states that the protein sequence determines its structure, and the structure determines its function. Since a protein sequence is composed of amino acids and has a hierarchical structure similar to sentences and words, NLP (Natural Language Processing) can be used to model protein sequences and predict protein functions. Compared to the earlier sequence similarity-based methods, using NLP methods with deep learning for KO prediction can discover KOs that have similar functions but dissimilar sequences. These methods primarily involve extracting features from the protein sequence, converting them into word representations (embeddings), and then classifying these representations. They can be divided into three categories: context-free models, context-sensitive models, and pre-trained large-scale protein language models. Context-free models generate a unique representation for each amino acid (AA) [ 22 , 23 ], while context-sensitive models produce representations that depend on the context in which the AA appears [ 24 , 25 ]; a single AA may therefore have different representations across different protein sequences. Pre-trained large-scale protein language models extract many biological features from protein sequences through unsupervised pre-training on a large corpus, and fine-tuning or feature extraction in downstream tasks can achieve good results [ 26 , 27 , 28 ]. In theory, embedding-based methods offer an alternative perspective for annotation, employing techniques such as clustering to overcome the limitations of homology-based methods.

In this paper, we propose a novel pipeline for the KO annotation of bacterial sequences using NLP and deep learning. First, we propose a classifier based on pre-trained large-scale protein language models to distinguish between KO and non-KO data. Subsequently, an embedding-based clustering module assigns a specific K number to each candidate sequence. Furthermore, we apply a structural alignment method, using structural similarity to ascertain the functional similarity of sequences and thus validate the assigned KOs. Our pipeline demonstrates competitive performance compared to traditional methods and notably excels at identifying distant relatives with low sequence identity. To the best of our knowledge, this study is the first to use a deep learning model incorporating NLP for computational modeling in KO prediction.

Overview of our pipeline for KO annotation

In Fig.  1 , we present a schematic overview of the proposed KO annotation pipeline. The pipeline comprises two primary parts: a classifier designed to discriminate between candidate KO sequences and non-KO sequences, and a clustering module that subsequently assigns a specific K number to each candidate KO sequence. To validate our results, we performed structural alignment between the candidate KO sequences and the known sequences in the KEGG database. To train the classifier, we used the BD (Bacterial Data) dataset, which consists of pre-processed bacterial protein sequences sourced from KEGG GENES, totaling approximately 17 million sequences. The clustering module uses the RD (Reference Data) dataset, which comprises reference genomes and addendum entries from KEGG GENES, totaling approximately 0.6 million sequences. For comprehensive information regarding the construction of both datasets, please refer to the Data collection and filtering section.

To provide a more tangible understanding of our pipeline, we present a running example. Consider an unannotated sequence whose best match is the annotated sequence ppu:PP_4955, associated with the K number K02030. The process can be broken down into the following steps:

Sequence Embedding: The unannotated sequence is first transformed into an embedding using ProtT5. This sequence embedding captures the essential features of the sequence.

KO Prediction: This sequence embedding is subsequently input into a Multilayer Perceptron (MLP) layer, which acts as our primary prediction model. The MLP layer is used to predict whether the sequence is a KO or not, determining if the process proceeds to the clustering step or terminates.

Sequence Clustering: For sequences predicted as KO, the sequence embedding is compared to the embeddings of each sequence in the RD dataset. This comparison is performed using Euclidean distance as the similarity metric.

Annotation Assignment: The sequence (in this case, ppu:PP_4955) that exhibits the smallest Euclidean distance is chosen as the best match. The annotation associated with this best match (K02030 in our example) is then assigned to the initially unannotated sequence.
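The four steps above can be sketched in miniature as follows. Here `annotate` is a hypothetical illustration, not the authors' code: the `is_ko_score` argument stands in for the output of the trained MLP classifier, and the reference embeddings are toy 3-dimensional vectors rather than real ProtT5 outputs.

```python
import math

# Toy reference embeddings standing in for the RD dataset
# (real ProtT5 embeddings are high-dimensional; 3-D vectors used for illustration).
reference = {
    "ppu:PP_4955": ([0.9, 0.1, 0.3], "K02030"),
    "eba:p2A55":   ([0.2, 0.8, 0.5], "K10010"),
}

def euclidean(x, y):
    """Euclidean distance between two embeddings."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def annotate(query_embedding, is_ko_score):
    """Assign a K number to a query embedding, or None for non-KO sequences."""
    if is_ko_score <= 0.5:  # MLP classifier output below threshold: not a KO
        return None
    best_id, (_, best_k) = min(
        reference.items(),
        key=lambda item: euclidean(query_embedding, item[1][0]),
    )
    return best_id, best_k

# A query embedding close to ppu:PP_4955 inherits its annotation K02030.
print(annotate([0.85, 0.15, 0.35], is_ko_score=0.92))
```

A query whose classifier score falls below 0.5 terminates at the KO-prediction step and receives no K number.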

figure 1

Schematic overview of our pipeline. In this study, we started by collecting KO and non-KO data from the KEGG GENES database to construct our classifier (left). Subsequently, we employed the classifier to mine protein sequences for the identification of potential KOs and used an embedding-based clustering module to assign a specific K number (middle). To validate our results, we performed structural alignment between the candidate KO sequences and the known sequences in the KEGG database (right)

Performance evaluation of classifiers

Table  1 presents a comprehensive summary of the performance metrics obtained by evaluating various classifiers. The training dataset comprised 80% of the BD database, while the remaining 20% was allocated for testing. The evaluated metrics encompass precision, recall, and the F1 score, which is the harmonic mean of precision and recall. The LSTM (Long Short-Term Memory) model was based on Veltri et al. [ 29 ], a neural network model whose core layer is an LSTM [ 30 ]. The attention model was inspired by Ma et al. [ 31 ], with the LSTM layer replaced by an attention layer [ 32 ]. Finally, we included a Text-to-Text Transfer Transformer (ProtT5) model pre-trained on a large number of protein sequences. Notably, the ProtT5 model outperforms all other classifiers across all metrics, showcasing its superior predictive capabilities for KO annotation. Given the results in Table  1 , we selected the ProtT5 model as the classifier for our study.

Performance evaluation of KO annotation tools

To validate the results, we implemented an evaluation that involved the random selection of two species, Bradyrhizobium japonicum E109 (bjp) and Paraburkholderia aromaticivorans BN5 (parb), from the KEGG organisms. A test set comprising 12,329 sequences from these species was used to evaluate the performance of each KO annotation tool. The test set had a 1.09:1 ratio of KO-assigned to non-KO sequences. Sequences from the BD dataset that were identical to those from the two species were removed, leaving the remaining sequences as the training set for the classifier. In cases where identical sequences from different species exhibited varying annotations, we retained the annotation with the K number as the final annotation.

Our clustering module still relies on the RD dataset, which does not include sequences from these two species. As for BlastKOALA, GhostKOALA, and KofamKOALA, we used the default target databases of their respective webservers. Our RD dataset is largely consistent with the dataset used by BlastKOALA, while GhostKOALA employed a dataset that is one order of magnitude larger. KofamKOALA, on the other hand, utilized 25,346 pHMMs.

The evaluation of each tool involves counting match, unmatch, missed, and added cases, alongside precision, recall, and F1 score calculations. Specifically, match refers to the number of cases where the predicted KO precisely matched the KO defined in the KEGG GENES database. Unmatch denotes cases where the predicted KO differed from the assigned KO in KEGG GENES. Missed cases represent KOs defined in KEGG GENES that were not successfully predicted by the tools. Finally, added cases indicate situations where a K number was assigned by the prediction despite no corresponding KO being defined in KEGG GENES.

In Table  2 , our pipeline achieved the best recall, with the highest number of match cases and the lowest number of missed cases, and the second-best F1 score. GhostKOALA obtained the best precision and F1 score owing to the fewest unmatch cases, and BlastKOALA had the lowest number of added cases. GhostKOALA’s precision is relatively higher because its larger dataset has the potential to improve prediction accuracy. Given the differences in datasets, the comparison between our pipeline and BlastKOALA is the most equitable; our pipeline outperforms BlastKOALA with more match cases and higher recall and F1 scores.

If the classifier is not used and a clustering threshold alone is employed to distinguish between KO and non-KO sequences, the metrics show inferior performance compared to the original pipeline, which uses only the classifier. This indicates that relying solely on a clustering threshold may not capture the complexity and nuances required for accurate KO prediction. On the other hand, when both the classifier and the clustering threshold are used simultaneously to differentiate KO and non-KO sequences, precision increases while recall decreases; the F1 score, which considers both, remains almost the same. This suggests that integrating the classifier and the clustering threshold allows for a more refined and precise classification of sequences. Note that in this study a threshold-based method was not utilized, to avoid introducing excessive hyperparameters.
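The ablation described above can be sketched as a simple gating rule. This is an illustrative assumption, not the authors' code: the 0.5 classifier cutoff comes from the text, while the distance threshold value and argument names are hypothetical.

```python
def decide(classifier_prob, nearest_distance,
           use_classifier=True, use_threshold=False, dist_threshold=1.0):
    """Hypothetical gate combining the MLP classifier and a clustering threshold.

    use_classifier only -> the paper's pipeline.
    use_threshold only  -> lower overall performance in the paper's ablation.
    both                -> higher precision, lower recall, similar F1.
    """
    if use_classifier and classifier_prob <= 0.5:
        return False  # rejected by the classifier
    if use_threshold and nearest_distance > dist_threshold:
        return False  # rejected by the embedding-distance threshold
    return True

print(decide(0.9, 0.4))                      # classifier only: accepted
print(decide(0.9, 1.5, use_threshold=True))  # both gates: rejected by distance
```

Adding the distance gate can only remove predictions, which is why precision rises and recall falls when both are enabled.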

Generalizability across different bacterial species

As a critical measure of the robustness and utility of a model is its ability to generalize across diverse datasets, we extended the evaluation to assess our pipeline’s performance across different bacterial species. Initially, Bradyrhizobium japonicum E109 (bjp) and Paraburkholderia aromaticivorans BN5 (parb) were randomly selected from the KEGG database. These species belong to the Alphaproteobacteria and Betaproteobacteria classes within the Pseudomonadota phylum, respectively. To broaden the generalizability assessment, we randomly selected a bacterial species from a different phylum that was added to the KEGG database after our initial download, ultimately choosing Borreliella finlandensis Z11 (bff) from the phylum Spirochaetota. The performance metrics of our pipeline on this additional species were congruent with our initial results, further substantiating the pipeline’s generalization potential. The performance results are listed in Table  3 .

Validating results through structural alignment

To evaluate the functional similarity in the unmatch and added cases, we conducted structural alignments between the known KO sequence and the KO sequence identified by our pipeline using the CE-CP (Combinatorial Extension for Circular Permutations) algorithm [ 33 ]. The quality of these alignments was assessed using the TM-score (Template Modeling score) [ 34 ], a score in the interval (0, 1], where 1 indicates a perfect match between two structures; a higher TM-score therefore reflects greater structural similarity. The results of these structural alignments are shown in Fig.  2 . In the unmatch cases, where the assigned K number differs from the one defined in KEGG GENES, we found that 55.2% of the sequences had a TM-score \(\ge 0.8\) , indicating a high level of structural similarity. Only 13.7% of the sequences had a TM-score <0.5, suggesting dissimilar structural domains. Similarly, in the added cases, where our pipeline assigned a K number to sequences not defined in KEGG GENES, we observed that 59% of the sequences had a TM-score \(\ge 0.8\) , while only 7% had a TM-score <0.5. Within the unmatch cases, we found that 13.7% of the sequences had different KO numbers but belonged to the same EC number, suggesting shared enzymatic functions. For example, for the sequence parb:CJU94_35085 we assign K10010, whereas KEGG assigns K02028, but they share the same EC:7.4.2.1, and the TM-score between parb:CJU94_35085 and our clustered sequence is 0.99. These findings indicate that, despite differences in the assigned K number, the functionalities of the sequences are quite similar given the high structural similarity.

figure 2

Distribution of the structural similarity metric TM-score in unmatch and added cases. These two cases represent instances where our pipeline assigned a K number while the KEGG GENES database assigned a different K number (unmatch) or did not assign a K number (added). A TM-score of \(\ge 0.5\) suggests the presence of similar structural domains, while a TM-score of \(\ge 0.8\) indicates highly similar structures, which implies potential functional similarity

Exploring recognition of distant relatives

Our pipeline achieved the highest number of match cases, prompting us to conduct further analysis. We used the Smith-Waterman algorithm [ 35 ] to compute the identity between the predicted sequences of all match cases and the clustered sequence, as shown in Fig.  3 a. Additionally, we calculated the identity of the sequences not predicted by other methods in our match cases, as shown in Fig.  3 b.

Based on the analysis of Fig.  3 , the identity distribution of sequences in our match cases mostly falls within the range of 80% or higher. However, for sequences in our match cases that were not predicted by BlastKOALA and GhostKOALA, the majority of identities are in the 60% or lower range. This indicates that our model has a stronger ability to identify distant relative proteins, even though GhostKOALA uses a dataset one order of magnitude larger than ours. KofamKOALA displays a similar overall trend to our model but identifies fewer match cases.

We provide two low-identity (<30%) sequences from our match cases as examples where other methods failed to make predictions. Sequence parb:CJU94_35185 exhibits only 21.2% identity with the clustered sequence eba:p2A55, yet they are remarkably close in the embedding space, allowing our model to recognize it. Likewise, the sequence bjp:RN69_21090 and the clustered sequence ppu:PP_4955 share only 24.3% sequence identity but are close in the embedding space.

figure 3

Identity distribution. The width of the violin plot along the X-axis corresponds to the frequency of data points. a The identity distribution of the predicted sequences of all match cases and the clustered sequence. b The identity distribution of the sequences not predicted by other methods in our match cases

Annotating bacterial proteins with KO classifications is crucial for deciphering the functional roles of these proteins within the intricate machinery of microbial organisms. The comprehensive understanding of these annotations aids in elucidating the pathways, metabolic networks, and regulatory mechanisms that govern bacterial life. Accurate KO annotations are pivotal for various downstream analyses, including comparative genomics, pathway reconstruction, and functional inference.

In this study, we present a novel pipeline for predicting KO annotations of bacterial proteins using NLP and deep learning. Our model’s performance surpasses most traditional methods, falling slightly short only in comparison to GhostKOALA. However, it is important to note that GhostKOALA operates on a dataset that is an order of magnitude larger, which may account for the difference in performance. On the other hand, BlastKOALA uses a dataset that is largely consistent with our RD dataset, and our pipeline outperforms BlastKOALA with more match cases and higher recall and F1 scores.

Comparing the three classifiers, the ProtT5 model outperforms the other two across all metrics. The ProtT5 model was pre-trained on approximately 45 million protein sequences, with the pre-training task involving learning to predict masked amino acids (tokens) within known sequences. It was then trained on our BD dataset of 17 million bacterial protein sequences, using an MLP to distinguish between KO and non-KO sequences. In contrast to the LSTM and attention models, which were trained solely on the BD dataset without pre-training, the extensive pre-training on a large dataset enabled ProtT5 to acquire a deeper understanding of the intricate language of life, contributing to its superior performance in our classification tasks.

We explored the use of both classifiers and clustering thresholds. Our findings indicate that employing classifiers, particularly those built on embeddings from pre-trained models, is more effective than relying solely on clustering thresholds. Combining the classifier and clustering thresholds allows for finer adjustments, enabling researchers to prioritize precision or recall depending on the specific needs of their analysis.

To further validate the accuracy of our predictions for the sequences in the unmatch and added cases, we conducted structural alignments. Although we did not precisely predict the matching K number, approximately 89.7% of the sequences exhibited a TM-score greater than 0.5. This suggests that these proteins share similar structural domains and likely perform analogous functions. Among the unmatch cases, 13.7% of the sequences possessed different KO assignments but shared identical EC numbers, indicating shared enzymatic function.

One of the most significant challenges in annotating bacterial proteins lies in the ability to capture functional relationships between proteins that share low sequence similarity. Traditional methods predominantly rely on sequence homology, which can overlook crucial associations, particularly among distantly related proteins. Our analysis revealed that a proportion of the KO proteins our model identified were missed by traditional methods, particularly those with low sequence similarity. This suggests that our NLP-based pipeline has the potential to uncover functional relationships that may be obscured by conventional homology-based methods.

While methods such as I-TASSER [ 36 ], which are based on protein 3D structures, may mitigate an over-reliance on sequence similarity alone, they often need significant computational resources and time. To illustrate, generating a protein structure with 384 residues using a V100 GPU card with 16GB memory can take approximately 9.2 min. This can be quite resource-intensive when dealing with large datasets. In contrast, our pipeline is far more efficient. More specifically, generating embeddings for a protein of the same length using the same GPU card takes only 0.057 s. Further, our study explores the feasibility and effectiveness of using embeddings from a pre-trained large-scale protein language model, solely based on sequence information, for functional clustering. We have also cross-validated our results using AlphaFold2, which demonstrated satisfactory performance. This approach, while being economical and efficient, also proves to be accurate, offering a viable alternative for KO prediction.

Despite the innovative approach and encouraging results achieved by our method, it is important to recognize certain limitations. Firstly, the substantial computational resources demanded by the large protein language model ProtT5 present a challenge. Specifically, the ability to process long sequences is constrained by the memory capacity of the GPU used. This requirement thus restricts the range of sequence lengths that our method can effectively handle. Furthermore, our pipeline currently focuses on sequence data. Despite its ability to yield important information, this approach might not comprehensively capture the intricate characteristics of proteins. This focus on sequences could potentially leave out important information derived from other protein characteristics, such as their three-dimensional structures or interactions within biological systems.

By effectively identifying and annotating new or unknown bacterial proteins, our pipeline contributes to an increased annotation coverage of bacterial proteins in the KEGG database, thereby expanding its application scope. Furthermore, the integration of our pipeline with NLP technologies offers a fresh perspective and methodology for future research in the KO prediction domain. It can be effectively applied to other species and extended to other protein function predictions, further amplifying its utility and impact.

This study introduces a novel NLP-based pipeline to the field of KO prediction and demonstrates its significant potential. Our pipeline excels in predicting distant relatives, providing a new solution to address the challenges faced by traditional homology-based methods.

For future research, we suggest exploring the integration of NLP-based methods with traditional methods to fully exploit their complementary advantages in KO prediction, thus improving prediction accuracy and comprehensiveness. In KEGG GENES, approximately 20% of bacterial protein sequences are longer than 600 amino acids; another direction is therefore the analysis of long-sequence Transformers, which can handle longer amino acid sequences without preprocessing steps or excessive computational resources. Finally, we are considering incorporating other features, such as KEGG pathways (molecular interaction, reaction, and relation networks) and protein structure information, to further enhance the performance of the model.

Data collection and filtering

We collected three datasets: the BD, RD, and SD (Structural Data) datasets.

The BD dataset contains both KO and non-KO data, obtained from the KEGG GENES database (downloaded in August 2022) with the species restricted to bacteria (7409 species in total in KEGG GENES). Duplicate sequences were removed. To ensure data quality, we removed sequences shorter than 100 amino acids and longer than 600 amino acids. We set these limits based on the observation that sequences shorter than 100 amino acids often yield lower true-positive rates [ 37 ], while sequences longer than 600 amino acids contain limited KO data (less than 20%). Sequences containing undefined amino acids were also removed (0.07%). The length distributions of KO and non-KO data were kept consistent (deviation <5%) to avoid length bias in the model. The data from all species were merged and split into training and testing sets with an 8:2 ratio. In cases where identical sequences from different species exhibited varying annotations, we retained the annotation with the K number as the final annotation. The final training set consisted of 7,624,360 KO sequences and 1,906,089 non-KO sequences, while the testing set consisted of 5,875,496 KO sequences and 1,468,873 non-KO sequences. The length distribution is shown in Fig.  4 .
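The filtering rules above can be sketched as follows. The 20 canonical amino-acid letters and the 100-600 length bounds come from the text; the example sequences and the helper name `keep` are invented for illustration.

```python
VALID_AA = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard amino acids

def keep(seq, min_len=100, max_len=600):
    """Apply the BD-dataset filters: length bounds and no undefined residues."""
    return min_len <= len(seq) <= max_len and set(seq) <= VALID_AA

sequences = {
    "ok":        "M" + "A" * 150,        # within [100, 600], canonical residues only
    "too_short": "MKT",                  # shorter than 100 aa
    "undefined": "M" + "A" * 150 + "X",  # contains undefined amino acid 'X'
}
filtered = {name: s for name, s in sequences.items() if keep(s)}
print(sorted(filtered))  # only the valid sequence survives
```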

figure 4

Length distribution of BD dataset

The RD dataset is a small subset of KEGG GENES containing KEGG reference genomes and individual sequences linked from PubMed records of KO entries. Reference genomes are designated for genomes with sufficient experimental data on gene/protein functions, as reflected by the number of sequence links in the PubMed reference fields of the KO database. We obtained 24,146 KOs and 623,239 reference sequences (Table  4 ).

The SD dataset contains protein structures. The structures of predicted sequences are generated using AlphaFold2 [ 38 ], while the structures of KEGG sequences are obtained from Protein Data Bank (PDB) [ 39 ] or AlphaFold Protein Structure Database (AFDB). Although AlphaFold2 is a predictive model, it has been shown to achieve atomic-level precision that is comparable to experimental protein structure resolution [ 38 ]. Therefore, structures generated by AlphaFold2 are considered to have high confidence.

We trained three models to distinguish KOs from non-KOs. The LSTM and attention models were trained without pre-training, while ProtT5 [ 28 ] was pre-trained on a large corpus of biological sequences.

The LSTM model originates from the research of Veltri et al. [ 29 ], while the attention model is based on the earlier work of Ma et al. [ 31 ]. First, we converted the protein sequences into fixed-size vectors by representing the 20 basic amino acids as numerical values from 1 to 20; sequences shorter than 600 amino acids were padded with 0. The resulting vector was expanded to 128 dimensions by an embedding layer and fed into a 1D convolutional layer with 64 filters and a 1D max pooling layer. Next, an LSTM layer with 100 units was applied, followed by a final classification layer employing a sigmoid function (Fig.  5 a). The attention model simply replaces the LSTM layer with an attention layer, keeping the rest of the network unchanged (Fig.  5 b).

ProtT5 (ProtT5-XL-U50) is trained on a large corpus of protein sequences. This allows it to learn representations that are particularly well-suited for protein-related tasks, such as predicting protein structure, function, and interactions. By feeding protein sequences into the model and extracting the last hidden layer representations generated by the model, we can obtain high-quality, low-dimensional representations of proteins that can be used as input to downstream models [ 40 ]. For our downstream model, we used an MLP architecture consisting of two fully connected layers with a hidden size of 100. The final classification was performed using the sigmoid activation function (Fig.  5 c).
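As an illustration of the MLP head, the forward pass can be written out directly. This is a toy sketch, not the trained model: the paper uses a hidden size of 100 on ProtT5 embeddings, whereas the dimensions and random weights below are placeholders.

```python
import math
import random

random.seed(0)  # reproducible placeholder weights

def linear(x, w, b):
    """Fully connected layer: y = Wx + b."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]

def relu(x):
    return [max(0.0, v) for v in x]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def mlp_head(embedding, w1, b1, w2, b2):
    """Two fully connected layers followed by a sigmoid, as in the ProtT5 classifier."""
    hidden = relu(linear(embedding, w1, b1))
    logit = linear(hidden, w2, b2)[0]
    return sigmoid(logit)  # probability that the sequence is a KO

# Toy dimensions: 4-D "embedding" -> 3 hidden units -> 1 output.
emb = [0.2, -0.1, 0.4, 0.3]
w1 = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
b1 = [0.0, 0.0, 0.0]
w2 = [[random.uniform(-1, 1) for _ in range(3)]]
b2 = [0.0]

p = mlp_head(emb, w1, b1, w2, b2)
print(0.0 < p < 1.0)  # sigmoid output is a valid probability
```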

figure 5

Classifier architecture. a The LSTM model architecture. The protein sequences were converted into fixed-size vectors and subsequently passed through an embedding layer with a dimension of 128. This was followed by a 1D convolutional layer comprising 64 filters and a subsequent 1D max pooling layer. Next, an LSTM layer with 100 units was implemented, followed by a final classification layer that employed a sigmoid function. b The attention model architecture. The attention model replaced the LSTM layer of the LSTM model with an attention layer, while the remaining modules were unchanged. c The ProtT5 model architecture. The protein sequences were initially fed into the ProtT5 layer, followed by an MLP layer comprising two fully connected layers with a hidden size of 100. As in the LSTM and attention models, the final step used a sigmoid function for classification

Binary cross-entropy loss, Adam optimizer [ 41 ], and the ReLU activation function were selected for all models. To prevent overfitting, we reserved 20% of the training dataset as the validation dataset, which was employed to implement the early-stop strategy. The strategy halted the model’s training when its performance began to decline, and the best-performing model on the validation dataset was saved as the final model. The final classification layer produced a scalar value between 0 and 1, with values greater than 0.5 classified as KO.
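The loss and decision rule described above can be made concrete. This is a generic binary cross-entropy with the paper's 0.5 decision threshold, not the authors' training code, and the example labels and probabilities are invented.

```python
import math

def bce(y_true, y_prob, eps=1e-12):
    """Binary cross-entropy averaged over examples (eps guards against log(0))."""
    return -sum(
        y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
        for y, p in zip(y_true, y_prob)
    ) / len(y_true)

def classify(p, threshold=0.5):
    """Scalar outputs greater than 0.5 are classified as KO."""
    return "KO" if p > threshold else "non-KO"

labels = [1, 0, 1]
probs = [0.9, 0.2, 0.6]
loss = bce(labels, probs)
print(round(loss, 4), [classify(p) for p in probs])
```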

To evaluate and compare the three classifiers, we used three evaluation metrics: \(\text {precision}^*\) , \(\text {recall}^*\) , and \(\text {F1}^*\) . To distinguish the calculation formulas for precision and recall of the classifier from the entire pipeline, an asterisk (*) is added as a superscript here. \(\text {Precision}^*\) measures the proportion of true positives out of all predicted positives, while \(\text {recall}^*\) measures the proportion of true positives out of all actual positives. Since \(\text {precision}^*\) and \(\text {recall}^*\) can sometimes conflict with each other, a common way to combine them is through the \(\text {F1}^*\) score, which is the harmonic mean of \(\text {precision}^*\) and \(\text {recall}^*\) . The \(\text {F1}^*\) score provides a balanced measure of model performance that takes both \(\text {precision}^*\) and \(\text {recall}^*\) into account, and is therefore often used as an overall indicator of a model’s classification ability. The definition of the formula is as follows:

\[ \text {precision}^* = \frac{TP}{TP + FP}, \qquad \text {recall}^* = \frac{TP}{TP + FN}, \qquad \text {F1}^* = \frac{2 \cdot \text {precision}^* \cdot \text {recall}^*}{\text {precision}^* + \text {recall}^*} \]

where TP (True Positive) represents the number of real positive cases where the model correctly predicted a positive result, FP (False Positive) represents the number of real negative cases where the model incorrectly predicted a positive result, and FN (False Negative) represents the number of real positive cases where the model incorrectly predicted a negative result.
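These three classifier metrics follow directly from the confusion counts; the counts in the example below are invented for illustration.

```python
def classifier_metrics(tp, fp, fn):
    """precision*, recall*, and F1* from true/false positives and false negatives."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1

# Example: 80 true positives, 10 false positives, 20 false negatives.
p, r, f1 = classifier_metrics(tp=80, fp=10, fn=20)
print(round(p, 3), round(r, 3), round(f1, 3))
```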

The process of clustering predicted KOs with known KOs from the RD dataset by similar function begins with the conversion of protein sequences into embeddings using ProtT5. The Euclidean distance (Eq. ( 4 )) is then calculated between the embeddings of the predicted sequences and those of the known KOs. The best match is selected by smallest Euclidean distance, and the annotation of the best match is assigned to the predicted sequence. In cases where there are several top matches with different annotated K numbers, our pipeline is designed to report all such matches. While multiple equidistant top matches are theoretically possible, the likelihood is extremely low due to the high dimensionality and complexity of protein embeddings, and we did not encounter such cases in our experiments.

\[ d(x, y) = \sqrt{\sum _{i=1}^{n} (x_i - y_i)^2} \]

where \(x=(x_1,\ldots ,x_n)\) and \(y=(y_1,\ldots ,y_n)\) are n-dimensional embeddings of two protein sequences.

We selected the ProtT5 model to convert the protein sequence into embeddings due to its superior performance, as observed in the experimental results of ProtTrans [ 28 ] and our classifier experiment. Among the models evaluated, ProtT5 exhibited the most comprehensive and effective performance, making it the preferred choice for generating embeddings from protein sequences in our study.

For a comprehensive evaluation, we use precision, recall, and F1 score. While precision and recall bear similarities to those used in classification, there exist subtle distinctions. Specific details can be found in Eqs. ( 5 ) and ( 6 ).

Match refers to the number of cases where the predicted KO precisely matched the KO defined in the KEGG GENES database. Unmatch denotes cases where the predicted KO differed from the assigned KO in KEGG GENES. Missed cases represented KOs defined in KEGG GENES that were not successfully predicted by the tools. Added cases indicated situations where a K number was assigned by the prediction despite no corresponding KO being defined in KEGG GENES.
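Since Eqs. ( 5 ) and ( 6 ) are not reproduced here, the sketch below uses one plausible instantiation of pipeline-level precision and recall over the four case counts; the exact formulas should be treated as an assumption rather than the paper's definition, and the counts are invented.

```python
def pipeline_metrics(match, unmatch, missed, added):
    """Hypothetical pipeline-level metrics over the four case counts.

    Assumption: matches count as correct predictions; unmatch and added
    count against precision, unmatch and missed count against recall.
    """
    precision = match / (match + unmatch + added)
    recall = match / (match + unmatch + missed)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

p, r, f1 = pipeline_metrics(match=5000, unmatch=300, missed=400, added=200)
print(round(p, 3), round(r, 3), round(f1, 3))
```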

Structural alignment

The predicted sequences are subjected to structural modeling using AlphaFold2, a model renowned for its accuracy in protein structure prediction. These predicted structures are then compared to the structures of clustered KO sequences, included in the SD dataset, using the CE-CP algorithm (Fig.  6 ). The CE-CP algorithm facilitates the comparison of circularly permuted proteins, enabling a comprehensive analysis of the structural similarities between the predicted sequences and the clustered KO sequences. We employed AlphaFold v2.3.2 with the following parameters: model type: alphafold2_ptm, number relax: 0, template mode: pdb70, msa mode: mmseqs2_uniref_env, pair mode: unpaired_paired, num recycles: 20, recycle early stop tolerance: tol = 0.5, max msa: auto, num seeds: 1, use dropout: False. For the CE-CP algorithm, the parameters were: maximum gap size: 30, gap opening penalty: 5, gap extension penalty: 0.5, fragment size: 8, RMSD (Root Mean Square Deviation) threshold: 3, maximum RMSD: 99, and min CP block length: 5.

Fig. 6 Structural alignment. The structure of the predicted sequence was generated using the AlphaFold2 model, while the structure of the clustered KO sequence was sourced from the PDB or AFDB databases. Structural alignment was conducted using the CE-CP algorithm

For evaluating the structural comparison, the TM-score is used as the assessment metric. The TM-score quantifies the agreement between matched residues of the target and template proteins, normalizing their distances by a length-dependent scale and averaging over the length of the target protein. It is given in Eq. ( 8 ):

$$\text{TM-score}=\frac{1}{L_\text{target}}\sum _{i=1}^{L_\text{common}}\frac{1}{1+\left( d_i/d_0(L_\text{target})\right) ^2} \qquad (8)$$

where \(L_\text{target}\) is the length of the amino acid sequence of the target protein, and \(L_\text{common}\) is the number of residues that appear in both the template and target structures. \(d_i\) is the distance between the i th pair of residues in the template and target structures, and \(d_0(L_\text{target})=1.24\sqrt[3]{L_\text{target}-15}-1.8\) is a distance scale that normalizes distances.
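Given per-residue distances from a structural alignment, the score can be computed directly. A minimal sketch, assuming the distances \(d_i\) (in angstroms) for the \(L_\text{common}\) aligned residue pairs are already available:

```python
def tm_score(distances, l_target):
    """TM-score from aligned residue-pair distances.

    distances - d_i (angstroms) for each of the L_common aligned pairs
    l_target  - length of the target protein's amino acid sequence
    """
    # Length-dependent distance scale d0(L_target)
    d0 = 1.24 * (l_target - 15) ** (1.0 / 3.0) - 1.8
    # Sum over aligned pairs, normalized by the target length
    s = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return s / l_target
```

A perfect superposition of a fully aligned structure (all \(d_i = 0\), \(L_\text{common} = L_\text{target}\)) yields a TM-score of 1, and shorter or poorer alignments score lower.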

Data availability

All code employed in this study is publicly available on GitHub ( https://github.com/wuhaoyu3/KO-Identification ). Publicly available datasets were analyzed in this study. These datasets were collected from the KEGG database ( https://www.kegg.jp/ ), the PDB database ( https://www.rcsb.org/ ), and the AFDB database ( https://alphafold.ebi.ac.uk/ ).

Abbreviations

  • KEGG: Kyoto Encyclopedia of Genes and Genomes
  • KO: KEGG Orthology
  • NLP: Natural language processing
  • CE-CP: Combinatorial extension for circular permutations
  • TM-score: Template modeling score
  • PDB: Protein Data Bank
  • AFDB: AlphaFold Protein Structure Database
  • BD: Bacterial data
  • RD: Reference data
  • SD: Structural data
  • MLP: Multilayer perceptron
  • HMM: Hidden Markov model
  • LSTM: Long short-term memory
  • RMSD: Root mean square deviation
  • TP: True positives
  • FP: False positives
  • FN: False negatives

References

Rigauts C, Aizawa J, Taylor SL, Rogers GB, Govaerts M, Cos P, et al. Rothia mucilaginosa is an anti-inflammatory bacterium in the respiratory tract of patients with chronic lung disease. Eur Respir J. 2022;59(5):2101293.

von Mutius E. The microbial environment and its influence on asthma prevention in early life. J Allergy Clin Immunol. 2016;137(3):680–9.

Das S, Bernasconi E, Koutsokera A, Wurlod DA, Tripathi V, Bonilla-Rosso G, et al. A prevalent and culturable microbiota links ecological balance to clinical stability of the human lung after transplantation. Nat Commun. 2021;12(1):2126.

Liao H, Liu C, Ai C, Gao T, Yang Q, Yu Z, et al. Mesophilic and thermophilic viruses are associated with nutrient cycling during hyperthermophilic composting. ISME J. 2023;17(6):916–30.

Muriel-Millán L, Millán-López S, Pardo-López L. Biotechnological applications of marine bacteria in bioremediation of environments polluted with hydrocarbons and plastics. Appl Microbiol Biotechnol. 2021;105(19):7171–85.

Zhang Z, Fu Q, Xiao C, Ding M, Liang D, Li H, et al. Impact of Paenarthrobacter ureafaciens ZF1 on the soil enzyme activity and microbial community during the bioremediation of atrazine-contaminated soils. BMC Microbiol. 2022;22(1):1–12.

Haft DH, DiCuccio M, Badretdin A, Brover V, Chetvernin V, O’Neill K, et al. RefSeq: an update on prokaryotic genome annotation and curation. Nucleic Acids Res. 2018;46(D1):D851–60.

Kanehisa M, Furumichi M, Sato Y, Kawashima M, Ishiguro-Watanabe M. KEGG for taxonomy-based analysis of pathways and genomes. Nucleic Acids Res. 2023;51(D1):D587–92.

Wu J, Mao X, Cai T, Luo J, Wei L. KOBAS server: a web-based platform for automated annotation and pathway identification. Nucleic Acids Res. 2006;34(suppl_2):W720–4.

Xie C, Mao X, Huang J, Ding Y, Wu J, Dong S, et al. KOBAS 2.0: a web server for annotation and identification of enriched pathways and diseases. Nucleic Acids Res. 2011;39(suppl_2):W316–22.

Bu D, Luo H, Huo P, Wang Z, Zhang S, He Z, et al. KOBAS-i: intelligent prioritization and exploratory visualization of biological functions for gene enrichment analysis. Nucleic Acids Res. 2021;49(W1):W317–25.

Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25(17):3389–402.

Moriya Y, Itoh M, Okuda S, Yoshizawa AC, Kanehisa M. KAAS: an automatic genome annotation and pathway reconstruction server. Nucleic Acids Res. 2007;35(suppl_2):W182–5.

Kanehisa M, Sato Y, Morishima K. BlastKOALA and GhostKOALA: KEGG tools for functional characterization of genome and metagenome sequences. J Mol Biol. 2016;428(4):726–31.

Suzuki S, Kakuta M, Ishida T, Akiyama Y. GHOSTX: an improved sequence homology search algorithm using a query suffix array and a database suffix array. PLoS ONE. 2014;9(8): e103833.

Pearson WR. Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms. Genomics. 1991;11(3):635–50.

Smith TF, Waterman MS, et al. Identification of common molecular subsequences. J Mol Biol. 1981;147(1):195–7.

Aramaki T, Blanc-Mathieu R, Endo H, Ohkubo K, Kanehisa M, Goto S, et al. KofamKOALA: KEGG Ortholog assignment based on profile HMM and adaptive score threshold. Bioinformatics. 2020;36(7):2251–2.

Antczak M, Michaelis M, Wass MN. Environmental conditions shape the nature of a minimal bacterial genome. Nat Commun. 2019;10(1):3100.

Schnoes AM, Ream DC, Thorman AW, Babbitt PC, Friedberg I. Biases in the experimental annotations of protein function and their effect on our understanding of protein function space. PLoS Comput Biol. 2013;9(5): e1003063.

Anfinsen CB. Principles that govern the folding of protein chains. Science. 1973;181(4096):223–30.

Naamati G, Askenazi M, Linial M. ClanTox: a classifier of short animal toxins. Nucleic Acids Res. 2009;37(suppl_2):W363–8.

Asgari E, Mofrad MR. Continuous distributed representation of biological sequences for deep proteomics and genomics. PLoS ONE. 2015;10(11): e0141287.

Heinzinger M, Elnaggar A, Wang Y, Dallago C, Nechaev D, Matthes F, et al. Modeling aspects of the language of life through transfer-learning protein sequences. BMC Bioinform. 2019;20(1):1–17.

Strodthoff N, Wagner P, Wenzel M, Samek W. UDSMProt: universal deep sequence models for protein classification. Bioinformatics. 2020;36(8):2401–9.

Rives A, Meier J, Sercu T, Goyal S, Lin Z, Liu J, et al. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proc Natl Acad Sci. 2021;118(15): e2016239118.

Lin Z, Akin H, Rao R, Hie B, Zhu Z, Lu W, et al. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science. 2023;379(6637):1123–30.

Elnaggar A, Heinzinger M, Dallago C, Rehawi G, Wang Y, Jones L, et al. Prottrans: toward understanding the language of life through self-supervised learning. IEEE Trans Pattern Anal Mach Intell. 2021;44(10):7112–27.

Veltri D, Kamath U, Shehu A. Deep learning improves antimicrobial peptide recognition. Bioinformatics. 2018;34(16):2740–7.

Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput. 1997;9(8):1735–80.

Ma Y, Guo Z, Xia B, Zhang Y, Liu X, Yu Y, et al. Identification of antimicrobial peptides from the human gut microbiome using deep learning. Nat Biotechnol. 2022;40(6):921–31.

Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, et al. Attention is all you need. Adv Neural Inf Process Syst. 2017;30.

Bliven SE, Bourne PE, Prlić A. Detection of circular permutations within protein structures using CE-CP. Bioinformatics. 2015;31(8):1316–8.

Zhang Y, Skolnick J. Scoring function for automated assessment of protein structure template quality. Proteins: Struct Funct Bioinform. 2004;57(4):702–10.

Smith TF, Waterman MS, et al. Identification of common molecular subsequences. J Mol Biol. 1981;147(1):195–7.

Yang J, Zhang Y. Protein structure and function prediction using I-TASSER. Curr Protoc Bioinform. 2015;52(1):5–8.

Hyatt D, LoCascio PF, Hauser LJ, Uberbacher EC. Gene and translation initiation site prediction in metagenomic sequences. Bioinformatics. 2012;28(17):2223–30.

Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021;596(7873):583–9.

Burley SK, Bhikadiya C, Bi C, Bittrich S, Chen L, Crichlow GV, et al. RCSB protein data bank: powerful new tools for exploring 3D structures of biological macromolecules for basic and applied research and education in fundamental biology, biomedicine, biotechnology, bioengineering and energy sciences. Nucleic Acids Res. 2021;49(D1):D437–51.

Ofer D, Brandes N, Linial M. The language of proteins: NLP, machine learning and protein sequences. Comput Struct Biotechnol J. 2021;19:1750–8.

Kingma D, Ba J. Adam: A method for stochastic optimization. In: International Conference on Learning Representations (ICLR). San Diego, CA, USA; 2015.

Acknowledgements

The authors thank Professor Feng Zhang and Engineer Chun Luo for stimulating and useful discussions.

Funding

This research was funded by the Fundamental Research Funds for the Central Universities (Grant No. JUSRP123035).

Author information

Authors and Affiliations

School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, China

Jing Chen, Haoyu Wu & Ning Wang

Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computing Intelligence, Jiangnan University, Wuxi, China

Contributions

HYW developed the classifier, cluster module, pre-processing, model training, and evaluation. HYW wrote the main manuscript text and prepared all the figures and tables. JC and NW coordinated the study and proofread the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Ning Wang .

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and permissions

About this article

Cite this article

Chen, J., Wu, H. & Wang, N. KEGG orthology prediction of bacterial proteins using natural language processing. BMC Bioinformatics 25 , 146 (2024). https://doi.org/10.1186/s12859-024-05766-x

Download citation

Received : 09 October 2023

Accepted : 03 April 2024

Published : 11 April 2024

DOI : https://doi.org/10.1186/s12859-024-05766-x

Keywords

  • Protein function prediction
  • Protein language model
  • Deep learning

IMAGES

  1. Types of Research Methodology: Uses, Types & Benefits

    what are the 5 research methods

  2. Different Types of Research

    what are the 5 research methods

  3. Research Methods

    what are the 5 research methods

  4. Research Methods

    what are the 5 research methods

  5. Types of Research Archives

    what are the 5 research methods

  6. Types of Research Methodology: Uses, Types & Benefits

    what are the 5 research methods

VIDEO

  1. Understanding Research Processes and Practices

  2. Research Methodology by Ranjit Kumar: Designed with students, for students

  3. PSY512 Gender Studies || Lecture # 5 || Research Methods for Gender Issues

  4. Film Industry Dadagiri 🔥 Dharmendra #bollywood #viral #viralvideo

  5. Psychology Experiments

  6. 5. Research Methods

COMMENTS

  1. Research Methods

    Research methods are specific procedures for collecting and analyzing data. Developing your research methods is an integral part of your research design. When planning your methods, there are two key decisions you will make. First, decide how you will collect data. Your methods depend on what type of data you need to answer your research question:

  3. Research Methods

    Quantitative research methods are used to collect and analyze numerical data. This type of research is useful when the objective is to test a hypothesis, determine cause-and-effect relationships, and measure the prevalence of certain phenomena. Quantitative research methods include surveys, experiments, and secondary data analysis.

  4. 2.2 Research Methods

    Recall the 6 Steps of the Scientific Method. Differentiate between four kinds of research methods: surveys, field research, experiments, and secondary data analysis. Explain the appropriateness of specific research approaches for specific topics. Sociologists examine the social world, see a problem or interesting pattern, and set out to study it.

  5. Types of Research Methods (With Best Practices and Examples)

    Research methods are processes used to collect data. You can use this data to analyze current methods or procedures and to find additional information on a topic. Professionals use research methods while studying medicine, human behavior and other scholarly topics. There are two main categories of research methods: qualitative research methods ...

  6. Research Methods In Psychology

    Research methods in psychology are systematic procedures used to observe, describe, predict, and explain behavior and mental processes. They include experiments, surveys, case studies, and naturalistic observations, ensuring data collection is objective and reliable to understand and explain psychological phenomena.

  7. Research Methods: What are research methods?

    What are research methods? Research methods are the strategies, processes, or techniques used to collect data or evidence for analysis in order to uncover new information or create a better understanding of a topic. There are different types of research methods, which use different tools for data collection.

  8. What are research methods?

    Research methods are different from research methodologies because they are the ways in which you will collect the data for your research project. The best method for your project largely depends on your topic, the type of data you will need, and the people or items from which you will be collecting data. The following boxes below contain a ...

  9. Research Methods

    You can also take a mixed methods approach, where you use both qualitative and quantitative research methods. Primary vs secondary data. Primary data are any original information that you collect for the purposes of answering your research question (e.g. through surveys, observations and experiments). Secondary data are information that has already been collected by other researchers (e.g. in ...

  10. Choosing the Right Research Methodology: A Guide

    Choosing an optimal research methodology is crucial for the success of any research project. The methodology you select will determine the type of data you collect, how you collect it, and how you analyse it. Understanding the different types of research methods available along with their strengths and weaknesses, is thus imperative to make an ...

  11. What Is Qualitative Research?

    Qualitative research methods. Each of the research approaches involves using one or more data collection methods. These are some of the most common qualitative methods: Observations: recording what you have seen, heard, or encountered in detailed field notes. Interviews: personally asking people questions in one-on-one conversations. Focus groups: asking questions and generating discussion among ...

  12. 5 Most Popular Research Methods in Psychology

    The truth is there are many. But the main types of research methods used in psychology are quantitative and qualitative. Quantitative research involves using data to: Make descriptions. Predict outcomes. Test an independent variable. And qualitative research uses qualitative data collection from: Speech. Text.

  13. PDF Comparing the Five Approaches

    All five approaches have in common the general process of research that begins with a research problem and proceeds to the questions, the data, the data analysis and interpretations, and the research report. Qualitative researchers have found it helpful to see at this point an overall sketch for each of the five approaches. From these sketches

  14. Research Methodology

    The research methodology is an important section of any research paper or thesis, as it describes the methods and procedures that will be used to conduct the research. It should include details about the research design, data collection methods, data analysis techniques, and any ethical considerations.

  15. (PDF) Understanding research methods: An overview of the essentials

    Abstract. A perennial bestseller since 1997, this updated tenth edition of Understanding Research Methods provides a detailed overview of all the important concepts traditionally covered in a ...

  16. Research Methods Guide: Research Design & Method

    Most frequently used methods include: Observation / Participant Observation. Surveys. Interviews. Focus Groups. Experiments. Secondary Data Analysis / Archival Study. Mixed Methods (combination of some of the above) One particular method could be better suited to your research goal than others, because the data you collect from different ...

  17. Different Types of Research Methods

    Descriptive research - The study variables are analyzed and a summary of them is sought. Correlational research - The relationship between the study variables is analyzed. Experimental research - It is designed to determine whether a cause-and-effect relationship between the variables exists. Quantitative research methods.

  18. 15 Types of Research Methods (2024)

    These methods are useful when a detailed understanding of a phenomenon is sought. 1. Ethnographic Research. Ethnographic research emerged out of anthropological research, where anthropologists would enter into a setting for a sustained period of time, getting to know a cultural group and taking detailed observations.

  19. Research Techniques

    Some common methods of research techniques are: Quantitative research: This is a research method that focuses on collecting and analyzing numerical data to establish patterns, relationships, and cause-and-effect relationships. Examples of quantitative research techniques are surveys, experiments, and statistical analysis.

  20. 5 Types of Qualitative Methods

    A popular and helpful categorization separates qualitative methods into five groups: ethnography, narrative, phenomenological, grounded theory, and case study. John Creswell outlines these five methods in Qualitative Inquiry and Research Design. While the five methods generally use similar data collection techniques (observation, interviews, and ...

  21. Our Methods

    Pew Research Center is committed to meeting the highest methodological standards — and to exploring the newest frontiers of research. Learn more about the methods the Center uses to conduct objective, non-partisan research on a wide range of topics that is trusted around the world.

  26. Qualitative Research

    Qualitative Research. Qualitative research is a type of research methodology that focuses on exploring and understanding people's beliefs, attitudes, behaviors, and experiences through the collection and analysis of non-numerical data. It seeks to answer research questions through the examination of subjective data, such as interviews, focus ...