Evaluating Research – Process, Examples and Methods

Evaluating Research

Evaluating Research refers to the process of assessing the quality, credibility, and relevance of a research study or project. This involves examining the methods, data, and results of the research in order to determine its validity, reliability, and usefulness. Evaluating research can be done by both experts and non-experts in the field, and involves critical thinking, analysis, and interpretation of the research findings.

Research Evaluation Process

The process of evaluating research typically involves the following steps:

Identify the Research Question

The first step in evaluating research is to identify the research question or problem that the study is addressing. This will help you to determine whether the study is relevant to your needs.

Assess the Study Design

The study design refers to the methodology used to conduct the research. You should assess whether the study design is appropriate for the research question and whether it is likely to produce reliable and valid results.

Evaluate the Sample

The sample refers to the group of participants or subjects who are included in the study. You should evaluate whether the sample size is adequate and whether the participants are representative of the population under study.

Review the Data Collection Methods

You should review the data collection methods used in the study to ensure that they are valid and reliable. This includes assessing both the measures (such as questionnaires or instruments) and the procedures used to collect the data.

Examine the Statistical Analysis

Statistical analysis refers to the methods used to analyze the data. You should examine whether the statistical analysis is appropriate for the research question and whether it is likely to produce valid and reliable results.

Assess the Conclusions

You should evaluate whether the data support the conclusions drawn from the study and whether they are relevant to the research question.

Consider the Limitations

Finally, you should consider the limitations of the study, including any potential biases or confounding factors that may have influenced the results.

Evaluating Research Methods

Common methods for evaluating research are as follows:

  • Peer review: Peer review is a process where experts in the field review a study before it is published. This helps ensure that the study is accurate, valid, and relevant to the field.
  • Critical appraisal: Critical appraisal involves systematically evaluating a study based on specific criteria. This helps assess the quality of the study and the reliability of the findings.
  • Replication: Replication involves repeating a study to test the validity and reliability of the findings. This can help identify any errors or biases in the original study.
  • Meta-analysis: Meta-analysis is a statistical method that combines the results of multiple studies to provide a more comprehensive understanding of a particular topic. This can help identify patterns or inconsistencies across studies.
  • Consultation with experts: Consulting with experts in the field can provide valuable insights into the quality and relevance of a study. Experts can also help identify potential limitations or biases in the study.
  • Review of funding sources: Examining the funding sources of a study can help identify any potential conflicts of interest or biases that may have influenced the study design or interpretation of results.
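
To make the meta-analysis item above more concrete, here is a minimal sketch of how results from several studies can be pooled. It assumes a fixed-effect (inverse-variance) model, which is only one of several pooling approaches. The function name `fixed_effect_pool` and the effect sizes and standard errors are hypothetical illustrations, not values from any real study.

```python
import math


def fixed_effect_pool(effects, std_errors):
    """Pool study effects using inverse-variance (fixed-effect) weights.

    Returns the pooled effect, its standard error, and Cochran's Q,
    a simple statistic describing inconsistency across studies.
    """
    if not effects or len(effects) != len(std_errors):
        raise ValueError("effects and std_errors must be non-empty and of equal length")

    # More precise studies (smaller standard error) receive larger weights.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)

    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)

    # Cochran's Q: weighted squared deviations of each study from the pooled effect.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    return pooled, pooled_se, q


# Hypothetical effect sizes and standard errors from three studies.
effects = [0.30, 0.10, 0.25]
std_errors = [0.10, 0.20, 0.10]

pooled, pooled_se, q = fixed_effect_pool(effects, std_errors)
low, high = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se
print(f"pooled effect = {pooled:.3f}, 95% CI = ({low:.3f}, {high:.3f}), Q = {q:.3f}")
```

A large Q relative to the number of studies would suggest the studies disagree more than chance alone explains, in which case a random-effects model may be more appropriate than this sketch.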

Example of Evaluating Research

A sample evaluation of a research study, for students:

Title of the Study: The Effects of Social Media Use on Mental Health among College Students

Sample Size: 500 college students

Sampling Technique: Convenience sampling

  • Sample Size: A sample of 500 college students is moderate in size, but size alone does not make a sample representative of the college student population. A larger sample would improve the precision of the estimates, while a random sampling technique would be needed to improve representativeness.
  • Sampling Technique: Convenience sampling is a non-probability sampling technique, which means that the sample may not be representative of the population. This technique may introduce selection bias, since participants are recruited because they are easy to reach, and are often self-selected, rather than drawn at random from the entire college student population. Therefore, the results of this study may not be generalizable to other populations.
  • Participant Characteristics: The study does not provide any information about the demographic characteristics of the participants, such as age, gender, race, or socioeconomic status. This information is important because social media use and mental health may vary among different demographic groups.
  • Data Collection Method: The study used a self-administered survey to collect data. Self-administered surveys may be subject to response bias and may not accurately reflect participants’ actual behaviors and experiences.
  • Data Analysis: The study used descriptive statistics and regression analysis to analyze the data. Descriptive statistics provide a summary of the data, while regression analysis is used to examine the relationship between two or more variables. However, the study did not provide information about the statistical significance of the results or the effect sizes.

Overall, while the study provides some insights into the relationship between social media use and mental health among college students, the use of a convenience sampling technique and the lack of information about participant characteristics limit the generalizability of the findings. In addition, the use of self-administered surveys may introduce bias into the study, and the lack of information about the statistical significance of the results limits the interpretation of the findings.
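
The point about sample size versus sampling technique can be illustrated with a short calculation. The sketch below computes the approximate 95% margin of error for an estimated proportion, assuming simple random sampling and the worst-case proportion of 0.5. The function name `margin_of_error` is a hypothetical helper, and the formula does not apply to a convenience sample, which is exactly why the sampling technique matters more than the raw number of participants.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion.

    Valid only under simple random sampling (an assumption that a
    convenience sample does not satisfy). z=1.96 corresponds to a
    95% confidence level; p=0.5 gives the widest (worst-case) margin.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    return z * math.sqrt(p * (1.0 - p) / n)


# Compare the precision of a few sample sizes, including the example's n = 500.
for n in (100, 500, 2000):
    print(f"n = {n}: +/- {margin_of_error(n) * 100:.1f} percentage points")
```

Under these assumptions, a random sample of 500 would give a margin of roughly 4.4 percentage points. Quadrupling the sample to 2000 only halves that margin, and no sample size corrects for bias introduced by non-random selection.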

Note: The example above is only a sample for students. Do not copy and paste it directly into your assignment; do your own research for academic purposes.

Applications of Evaluating Research

Here are some of the applications of evaluating research:

  • Identifying reliable sources: By evaluating research, researchers, students, and other professionals can identify the most reliable sources of information to use in their work. They can determine the quality of research studies, including the methodology, sample size, data analysis, and conclusions.
  • Validating findings: Evaluating research can help to validate findings from previous studies. By examining the methodology and results of a study, researchers can determine if the findings are reliable and if they can be used to inform future research.
  • Identifying knowledge gaps: Evaluating research can also help to identify gaps in current knowledge. By examining the existing literature on a topic, researchers can determine areas where more research is needed, and they can design studies to address these gaps.
  • Improving research quality: Evaluating research can help to improve the quality of future research. By examining the strengths and weaknesses of previous studies, researchers can design better studies and avoid common pitfalls.
  • Informing policy and decision-making: Evaluating research is crucial in informing policy and decision-making in many fields. By examining the evidence base for a particular issue, policymakers can make informed decisions that are supported by the best available evidence.
  • Enhancing education: Evaluating research is essential in enhancing education. Educators can use research findings to improve teaching methods, curriculum development, and student outcomes.

Purpose of Evaluating Research

Here are some of the key purposes of evaluating research:

  • Determine the reliability and validity of research findings: By evaluating research, researchers can determine the quality of the study design, data collection, and analysis. They can determine whether the findings are reliable, valid, and generalizable to other populations.
  • Identify the strengths and weaknesses of research studies: Evaluating research helps to identify the strengths and weaknesses of research studies, including potential biases, confounding factors, and limitations. This information can help researchers to design better studies in the future.
  • Inform evidence-based decision-making: Evaluating research is crucial in informing evidence-based decision-making in many fields, including healthcare, education, and public policy. Policymakers, educators, and clinicians rely on research evidence to make informed decisions.
  • Identify research gaps: By evaluating research, researchers can identify gaps in the existing literature and design studies to address these gaps. This process can help to advance knowledge and improve the quality of research in a particular field.
  • Ensure research ethics and integrity: Evaluating research helps to ensure that research studies are conducted ethically and with integrity. Researchers must adhere to ethical guidelines to protect the welfare and rights of study participants and to maintain the trust of the public.

Characteristics to Evaluate in Research

The characteristics to evaluate in a research study are as follows:

  • Research question/hypothesis: A good research question or hypothesis should be clear, concise, and well-defined. It should address a significant problem or issue in the field and be grounded in relevant theory or prior research.
  • Study design: The research design should be appropriate for answering the research question and be clearly described in the study. The study design should also minimize bias and confounding variables.
  • Sampling: The sample should be representative of the population of interest and the sampling method should be appropriate for the research question and study design.
  • Data collection: The data collection methods should be reliable and valid, and the data should be accurately recorded and analyzed.
  • Results: The results should be presented clearly and accurately, and the statistical analysis should be appropriate for the research question and study design.
  • Interpretation of results: The interpretation of the results should be based on the data and not influenced by personal biases or preconceptions.
  • Generalizability: The study findings should be generalizable to the population of interest and relevant to other settings or contexts.
  • Contribution to the field: The study should make a significant contribution to the field and advance our understanding of the research question or issue.

Advantages of Evaluating Research

Evaluating research has several advantages, including:

  • Ensuring accuracy and validity: By evaluating research, we can ensure that the research is accurate, valid, and reliable. This ensures that the findings are trustworthy and can be used to inform decision-making.
  • Identifying gaps in knowledge: Evaluating research can help identify gaps in knowledge and areas where further research is needed. This can guide future research and help build a stronger evidence base.
  • Promoting critical thinking: Evaluating research requires critical thinking skills, which can be applied in other areas of life. By evaluating research, individuals can develop their critical thinking skills and become more discerning consumers of information.
  • Improving the quality of research: Evaluating research can help improve the quality of research by identifying areas where improvements can be made. This can lead to more rigorous research methods and better-quality research.
  • Informing decision-making: By evaluating research, we can make informed decisions based on the evidence. This is particularly important in fields such as medicine and public health, where decisions can have significant consequences.
  • Advancing the field: Evaluating research can help advance the field by identifying new research questions and areas of inquiry. This can lead to the development of new theories and the refinement of existing ones.

Limitations of Evaluating Research

The limitations of evaluating research are as follows:

  • Time-consuming: Evaluating research can be time-consuming, particularly if the study is complex or requires specialized knowledge. This can be a barrier for individuals who are not experts in the field or who have limited time.
  • Subjectivity: Evaluating research can be subjective, as different individuals may have different interpretations of the same study. This can lead to inconsistencies in the evaluation process and make it difficult to compare studies.
  • Limited generalizability: The findings of a study may not be generalizable to other populations or contexts. This limits the usefulness of the study and may make it difficult to apply the findings to other settings.
  • Publication bias: Research that does not find significant results may be less likely to be published, which can create a bias in the published literature. This can limit the amount of information available for evaluation.
  • Lack of transparency: Some studies may not provide enough detail about their methods or results, making it difficult to evaluate their quality or validity.
  • Funding bias: Research funded by particular organizations or industries may be biased towards the interests of the funder. This can influence the study design, methods, and interpretation of results.

About the author

Muhammad Hassan

Researcher, Academic Writer, Web developer

Evaluation of research proposals by peer review panels: broader panels for broader assessments?

Rebecca Abma-Schouten, Joey Gijbels, Wendy Reijmerink, Ingeborg Meijer, Evaluation of research proposals by peer review panels: broader panels for broader assessments?, Science and Public Policy, Volume 50, Issue 4, August 2023, Pages 619–632, https://doi.org/10.1093/scipol/scad009

Panel peer review is widely used to decide which research proposals receive funding. Through this exploratory observational study at two large biomedical and health research funders in the Netherlands, we gain insight into how scientific quality and societal relevance are discussed in panel meetings. We explore, in ten review panel meetings of biomedical and health funding programmes, how panel composition and formal assessment criteria affect the arguments used. We observe that more scientific arguments are used than arguments related to societal relevance and expected impact. Also, more diverse panels result in a wider range of arguments, largely for the benefit of arguments related to societal relevance and impact. We discuss how funders can contribute to the quality of peer review by creating a shared conceptual framework that better defines research quality and societal relevance. We also contribute to a further understanding of the role of diverse peer review panels.

Scientific biomedical and health research is often supported by project or programme grants from public funding agencies such as governmental research funders and charities. Research funders primarily rely on peer review, often a combination of independent written review and discussion in a peer review panel, to inform their funding decisions. Peer review panels have the difficult task of integrating and balancing the various assessment criteria to select and rank the eligible proposals. With the increasing emphasis on societal benefit and being responsive to societal needs, the assessment of research proposals ought to include broader assessment criteria, including both scientific quality and societal relevance, and a broader perspective on relevant peers. This results in new practices of including non-scientific peers in review panels ( Del Carmen Calatrava Moreno et al. 2019 ; Den Oudendammer et al. 2019 ; Van den Brink et al. 2016 ). Relevant peers, in the context of biomedical and health research, include, for example, health-care professionals, (healthcare) policymakers, and patients as the (end-)users of research.

Currently, in scientific and grey literature, much attention is paid to what legitimate criteria are and to deficiencies in the peer review process, for example, focusing on the role of chance and the difficulty of assessing interdisciplinary or ‘blue sky’ research ( Langfeldt 2006 ; Roumbanis 2021a ). Our research primarily builds upon the work of Lamont (2009) , Huutoniemi (2012) , and Kolarz et al. (2016) . Their work articulates how the discourse in peer review panels can be understood by giving insight into disciplinary assessment cultures and social dynamics, as well as how panel members define and value concepts such as scientific excellence, interdisciplinarity, and societal impact. At the same time, there is little empirical work on what actually is discussed in peer review meetings and to what extent this is related to the specific objectives of the research funding programme. Such observational work is especially lacking in the biomedical and health domain.

The aim of our exploratory study is to learn what arguments panel members use in a review meeting when assessing research proposals in biomedical and health research programmes. We explore how arguments used in peer review panels are affected by (1) the formal assessment criteria and (2) the inclusion of non-scientific peers in review panels, also called (end-)users of research, societal stakeholders, or societal actors. We add to the existing literature by focusing on the actual arguments used in peer review assessment in practice.

To this end, we observed ten panel meetings across eight biomedical and health research programmes at two large research funders in the Netherlands: the governmental research funder The Netherlands Organisation for Health Research and Development (ZonMw) and the charitable research funder the Dutch Heart Foundation (DHF). Our first research question focuses on what arguments panel members use when assessing research proposals in a review meeting. The second examines to what extent these arguments correspond with the formal criteria on scientific quality and societal impact creation, as described in the programme brochure and assessment form. The third question focuses on how arguments used differ between panel members with different perspectives.

2.1 Relation between science and society

To understand the dual focus of scientific quality and societal relevance in research funding, a theoretical understanding and a practical operationalisation of the relation between science and society are needed. The conceptualisation of this relationship affects both who are perceived as relevant peers in the review process and the criteria by which research proposals are assessed.

The relationship between science and society is neither constant over time nor static, and it remains much debated. Scientific knowledge can have a huge impact on societies, either intended or unintended. Vice versa, the social environment and structure in which science takes place influence the rate of development, the topics of interest, and the content of science. However, the second part of this inter-relatedness between science and society generally receives less attention (Merton 1968; Weingart 1999).

From a historical perspective, scientific and technological progress contributed to the view that science was valuable on its own account and that science and the scientist stood independent of society. While this protected science from unwarranted political influence, societal disengagement with science resulted in less authority by science and debate about its contribution to society. This interdependence and mutual influence contributed to a modern view of science in which knowledge development is valued both on its own merit and for its impact on, and interaction with, society. As such, societal factors and problems are important drivers for scientific research. This warrants that the relation and boundaries between science, society, and politics need to be organised and constantly reinforced and reiterated ( Merton 1968 ; Shapin 2008 ; Weingart 1999 ).

Glerup and Horst (2014) conceptualise the value of science to society and the role of society in science in four rationalities that reflect different justifications for their relation and thus also for who is responsible for (assessing) the societal value of science. The rationalities are arranged along two axes: one is related to the internal or external regulation of science and the other is related to either the process or the outcome of science as the object of steering. The first two rationalities of Reflexivity and Demarcation focus on internal regulation in the scientific community. Reflexivity focuses on the outcome. Central is that science, and thus, scientists should learn from societal problems and provide solutions. Demarcation focuses on the process: science should continuously question its own motives and methods. The latter two rationalities of Contribution and Integration focus on external regulation. The core of the outcome-oriented Contribution rationality is that scientists do not necessarily see themselves as ‘working for the public good’. Science should thus be regulated by society to ensure that outcomes are useful. The central idea of the process-oriented Integration rationality is that societal actors should be involved in science in order to influence the direction of research.
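
The two-axis arrangement described above can be summarised as a small lookup structure. The sketch below is only an illustrative encoding of the four rationalities as the text describes them. The names `Regulation`, `Focus`, `RATIONALITIES`, and `rationality` are hypothetical and do not come from Glerup and Horst (2014).

```python
from enum import Enum


class Regulation(Enum):
    """First axis: who regulates science."""
    INTERNAL = "internal"   # regulation within the scientific community
    EXTERNAL = "external"   # regulation by society


class Focus(Enum):
    """Second axis: what is the object of steering."""
    PROCESS = "process"
    OUTCOME = "outcome"


# The four rationalities, keyed by their position on the two axes.
RATIONALITIES = {
    (Regulation.INTERNAL, Focus.OUTCOME): "Reflexivity",
    (Regulation.INTERNAL, Focus.PROCESS): "Demarcation",
    (Regulation.EXTERNAL, Focus.OUTCOME): "Contribution",
    (Regulation.EXTERNAL, Focus.PROCESS): "Integration",
}


def rationality(regulation, focus):
    """Return the rationality located at the given pair of axis values."""
    return RATIONALITIES[(regulation, focus)]


for (regulation, focus), name in RATIONALITIES.items():
    print(f"{name}: {regulation.value} regulation, {focus.value}-oriented")
```

As the source notes later, these rationalities are not mutually exclusive in practice, so the lookup is a reading aid rather than a classification scheme.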

Research funders can be seen as external or societal regulators of science. They can focus on organising the process of science, Integration, or on scientific outcomes that function as solutions for societal challenges, Contribution. In the Contribution perspective, a funder could enhance outside (societal) involvement in science to ensure that scientists take responsibility to deliver results that are needed and used by society. From Integration follows that actors from science and society need to work together in order to produce the best results. In this perspective, there is a lack of integration between science and society and more collaboration and dialogue are needed to develop a new kind of integrative responsibility ( Glerup and Horst 2014 ). This argues for the inclusion of other types of evaluators in research assessment. In reality, these rationalities are not mutually exclusive and also not strictly separated. As a consequence, multiple rationalities can be recognised in the reasoning of scientists and in the policies of research funders today.

2.2 Criteria for research quality and societal relevance

The rationalities of Glerup and Horst have consequences for which language is used to discuss societal relevance and impact in research proposals. Even though the main ingredients are quite similar, as a consequence of the coexisting rationalities in science, societal aspects can be defined and operationalised in different ways ( Alla et al. 2017 ). In the definition of societal impact by Reed, emphasis is placed on the outcome : the contribution to society. It includes the significance for society, the size of potential impact, and the reach , the number of people or organisations benefiting from the expected outcomes ( Reed et al. 2021 ). Other models and definitions focus more on the process of science and its interaction with society. Spaapen and Van Drooge introduced productive interactions in the assessment of societal impact, highlighting a direct contact between researchers and other actors. A key idea is that the interaction in different domains leads to impact in different domains ( Meijer 2012 ; Spaapen and Van Drooge 2011 ). Definitions that focus on the process often refer to societal impact as (1) something that can take place in distinguishable societal domains, (2) something that needs to be actively pursued, and (3) something that requires interactions with societal stakeholders (or users of research) ( Hughes and Kitson 2012 ; Spaapen and Van Drooge 2011 ).

Glerup and Horst show that process and outcome-oriented aspects can be combined in the operationalisation of criteria for assessing research proposals on societal aspects. Also, the funders participating in this study include the outcome—the value created in different domains—and the process—productive interactions with stakeholders—in their formal assessment criteria for societal relevance and impact. Different labels are used for these criteria, such as societal relevance , societal quality , and societal impact ( Abma-Schouten 2017 ; Reijmerink and Oortwijn 2017 ). In this paper, we use societal relevance or societal relevance and impact .

Scientific quality in research assessment frequently refers to all aspects and activities in the study that contribute to the validity and reliability of the research results and that contribute to the integrity and quality of the research process itself. The criteria commonly include the relevance of the proposal for the funding programme, the scientific relevance, originality, innovativeness, methodology, and feasibility ( Abdoul et al. 2012 ). Several studies demonstrated that quality is seen as not only a rich concept but also a complex concept in which excellence and innovativeness, methodological aspects, engagement of stakeholders, multidisciplinary collaboration, and societal relevance all play a role ( Geurts 2016 ; Roumbanis 2019 ; Scholten et al. 2018 ). Another study showed a comprehensive definition of ‘good’ science, which includes creativity, reproducibility, perseverance, intellectual courage, and personal integrity. It demonstrated that ‘good’ science involves not only scientific excellence but also personal values and ethics, and engagement with society ( Van den Brink et al. 2016 ). Noticeable in these studies is the connection made between societal relevance and scientific quality.

In summary, the criteria for scientific quality and societal relevance are conceptualised in different ways, and perspectives on the role of societal value creation and the involvement of societal actors vary strongly. Research funders hence have to pay attention to the meaning of the criteria for the panel members they recruit to help them, and navigate and negotiate how the criteria are applied in assessing research proposals. To be able to do so, more insight is needed in which elements of scientific quality and societal relevance are discussed in practice by peer review panels.

2.3 Role of funders and societal actors in peer review

National governments and charities are important funders of biomedical and health research. How this funding is distributed varies per country. Project funding is frequently allocated based on research programming by specialised public funding organisations, such as the Dutch Research Council in the Netherlands and ZonMw for health research. The DHF, the second largest private non-profit research funder in the Netherlands, provides project funding ( Private Non-Profit Financiering 2020 ). Funders, as so-called boundary organisations, can act as key intermediaries between government, science, and society ( Jasanoff 2011 ). Their responsibility is to develop effective research policies connecting societal demands and scientific ‘supply’. This includes setting up and executing fair and balanced assessment procedures ( Sarewitz and Pielke 2007 ). Herein, the role of societal stakeholders is receiving increasing attention ( Benedictus et al. 2016 ; De Rijcke et al. 2016 ; Dijstelbloem et al. 2013 ; Scholten et al. 2018 ).

All charitable health research funders in the Netherlands have, in the last decade, included patients at different stages of the funding process, including in assessing research proposals (Den Oudendammer et al. 2019). To facilitate research funders in involving patients in assessing research proposals, the federation of Dutch patient organisations set up an independent reviewer panel with (at-risk) patients and direct caregivers (Patiëntenfederatie Nederland, n.d.). Other foundations have set up societal advisory panels including a wider range of societal actors than patients alone. The Committee Societal Quality (CSQ) of the DHF includes, for example, (at-risk) patients and a wide range of cardiovascular health-care professionals who are not active as academic researchers. This model is also applied by the Diabetes Foundation and the Princess Beatrix Muscle Foundation in the Netherlands (Diabetesfonds, n.d.; Prinses Beatrix Spierfonds, n.d.).

In 2014, the Lancet presented a series of five papers about biomedical and health research known as the ‘increasing value, reducing waste’ series ( Macleod et al. 2014 ). The authors addressed several issues as well as potential solutions that funders can implement. They highlight, among others, the importance of improving the societal relevance of the research questions and including the burden of disease in research assessment in order to increase the value of biomedical and health science for society. A better understanding of and an increasing role of users of research are also part of the described solutions ( Chalmers et al. 2014 ; Van den Brink et al. 2016 ). This is also in line with the recommendations of the 2013 Declaration on Research Assessment (DORA) ( DORA 2013 ). These recommendations influence the way in which research funders operationalise their criteria in research assessment, how they balance the judgement of scientific and societal aspects, and how they involve societal stakeholders in peer review.

2.4 Panel peer review of research proposals

To assess research proposals, funders rely on the services of peer experts to review the thousands or perhaps millions of research proposals seeking funding each year. While often associated with scholarly publishing, peer review also includes the ex ante assessment of research grant and fellowship applications ( Abdoul et al. 2012 ). Peer review of proposals often includes a written assessment of a proposal by an anonymous peer and a peer review panel meeting to select the proposals eligible for funding. Peer review is an established component of professional academic practice, is deeply embedded in the research culture, and essentially consists of experts in a given domain appraising the professional performance, creativity, and/or quality of scientific work produced by others in their field of competence ( Demicheli and Di Pietrantonj 2007 ). The history of peer review as the default approach for scientific evaluation and accountability is, however, relatively young. While the term was unheard of in the 1960s, by 1970, it had become the standard. Since that time, peer review has become increasingly diverse and formalised, resulting in more public accountability ( Reinhart and Schendzielorz 2021 ).

While many studies have been conducted concerning peer review in scholarly publishing, peer review in grant allocation processes has been less discussed ( Demicheli and Di Pietrantonj 2007 ). The most extensive work on this topic has been conducted by Lamont (2009). Lamont studied peer review panels in five American research funding organisations, including direct observation of three panels. Other examples include Roumbanis’s ethnographic observations of ten review panels at the Swedish Research Council in natural and engineering sciences ( Roumbanis 2017 , 2021a ). Also, Huutoniemi was able to study, but not observe, four panels on environmental studies and social sciences of the Academy of Finland ( Huutoniemi 2012 ). Additionally, Van Arensbergen and Van den Besselaar (2012) analysed peer review through interviews and by analysing the scores and outcomes at different stages of the peer review process in a talent funding programme. Particularly interesting is the study by Luo and colleagues of 164 written panel review reports, which showed that the reviews from panels that included non-scientific peers described broader and more concrete impact topics. Mixed panels also more often connected research processes and characteristics of applicants with impact creation ( Luo et al. 2021 ).

While these studies primarily focused on peer review panels in other disciplinary domains or are based on interviews or reports instead of direct observations, we believe that many of the findings are relevant to the functioning of panels in the context of biomedical and health research. From this literature, we learn to have realistic expectations of peer review. It is inherently difficult to predict in advance which research projects will provide the most important findings or breakthroughs ( Lee et al. 2013 ; Pier et al. 2018 ; Roumbanis 2021a , 2021b ). At the same time, these limitations may not substantiate the replacement of peer review by another assessment approach ( Wessely 1998 ). Many topics addressed in the literature are inter-related and relevant to our study, such as disciplinary differences and interdisciplinarity, social dynamics and their consequences for consistency and bias, and suggestions to improve panel peer review ( Lamont and Huutoniemi 2011 ; Lee et al. 2013 ; Pier et al. 2018 ; Roumbanis 2021a , b ; Wessely 1998 ).

Different scientific disciplines show different preferences and beliefs about how to build knowledge and thus have different perceptions of excellence. However, panellists are willing to respect and acknowledge other standards of excellence ( Lamont 2009 ). Evaluation cultures also differ between scientific fields. Science, technology, engineering, and mathematics panels might, in comparison with panellists from social sciences and humanities, be more concerned with the consistency of the assessment across panels and therefore with clear definitions and uses of assessment criteria ( Lamont and Huutoniemi 2011 ). However, much is still to learn about how panellists’ cognitive affiliations with particular disciplines unfold in the evaluation process. Therefore, the assessment of interdisciplinary research is much more complex than just improving the criteria or procedure because less explicit repertoires would also need to change ( Huutoniemi 2012 ).

Social dynamics play a role as panellists may differ in their motivation to engage in allocation processes, which could create bias ( Lee et al. 2013 ). Placing emphasis on meeting established standards or thoroughness in peer review may promote uncontroversial and safe projects, especially in a situation where strong competition puts pressure on experts to reach a consensus ( Langfeldt 2001 , 2006 ). Personal interest and cognitive similarity may also contribute to conservative bias, which could negatively affect controversial or frontier science ( Luukkonen 2012 ; Roumbanis 2021a ; Travis and Collins 1991 ). Central to this part of the literature is that panel conclusions are the outcome of, and are influenced by, the group interaction ( Van Arensbergen et al. 2014a ). Differences in, for example, the status and expertise of the panel members can play an important role in group dynamics. Insights from social psychology on group dynamics can help in understanding and avoiding bias in peer review panels ( Olbrecht and Bornmann 2010 ). For example, group performance research shows that more diverse groups with complementary skills make better group decisions than homogenous groups. Yet, heterogeneity can also increase conflict within the group ( Forsyth 1999 ). Therefore, it is important to pay attention to power dynamics and maintain team spirit and good communication ( Van Arensbergen et al. 2014a ), especially in meetings that include both scientific and non-scientific peers.

The literature also provides funders with starting points to improve the peer review process. For example, the explicitness of review procedures positively influences the decision-making processes ( Langfeldt 2001 ). Strategic voting and decision-making appear to be less frequent in panels that rate than in panels that rank proposals. Also, an advisory instead of a decisional role may improve the quality of the panel assessment ( Lamont and Huutoniemi 2011 ).

Despite different disciplinary evaluative cultures, formal procedures, and criteria, panel members with different backgrounds develop shared customary rules of deliberation that facilitate agreement and help avoid situations of conflict ( Huutoniemi 2012 ; Lamont 2009 ). This is a necessary prerequisite for opening up peer review panels to include non-academic experts. When doing so, it is important to realise that panel review is a social, emotional, and interactional process. It is therefore important to also take these non-cognitive aspects into account when studying cognitive aspects ( Lamont and Guetzkow 2016 ), as we do in this study.

In summary, what we learn from the literature is that (1) the specific criteria to operationalise scientific quality and societal relevance of research are important, (2) the rationalities from Glerup and Horst predict that not everyone values societal aspects and involves non-scientists in peer review to the same extent and in the same way, (3) this may affect the way peer review panels discuss these aspects, and (4) peer review is a challenging group process that could accommodate other rationalities in order to prevent bias towards specific scientific criteria. To disentangle these aspects, we have carried out an observational study of a diverse range of peer review panel sessions using a fixed set of criteria focusing on scientific quality and societal relevance.

3.1 Research assessment at ZonMw and the DHF

The peer review approach and the criteria used by both the DHF and ZonMw are largely comparable. Funding programmes at both organisations start with a brochure describing the purposes, goals, and conditions for research applications, as well as the assessment procedure and criteria. Both organisations apply a two-stage process. In the first phase, reviewers are asked to write a peer review. In the second phase, a panel reviews the application based on the advice of the written reviews and the applicants’ rebuttal. The panels advise the board on eligible proposals for funding including a ranking of these proposals.

There are also differences between the two organisations. At ZonMw, the criteria for societal relevance and quality are operationalised in the ZonMw Framework Fostering Responsible Research Practices ( Reijmerink and Oortwijn 2017 ). This contributes to a common operationalisation of both quality and societal relevance on the level of individual funding programmes. Important elements in the criteria for societal relevance are, for instance, stakeholder participation, (applying) holistic health concepts, and the added value of knowledge in practice, policy, and education. The framework was developed to optimise the funding process from the perspective of knowledge utilisation and includes concepts like productive interactions and Open Science. It is part of the ZonMw Impact Assessment Framework aimed at guiding the planning, monitoring, and evaluation of funding programmes ( Reijmerink et al. 2020 ). At ZonMw, interdisciplinary panels are set up specifically for each funding programme. Panels are interdisciplinary in nature with academics of a wide range of disciplines and often include non-academic peers, like policymakers, health-care professionals, and patients.

At the DHF, the criteria for scientific quality and societal relevance, at the DHF called societal impact , find their origin in the strategy report of the advisory committee CardioVascular Research Netherlands ( Reneman et al. 2010 ). This report forms the basis of the DHF research policy focusing on scientific and societal impact by creating national collaborations in thematic, interdisciplinary research programmes (the so-called consortia) connecting preclinical and clinical expertise into one concerted effort. An International Scientific Advisory Committee (ISAC) was established to assess these thematic consortia. This panel consists of international scientists, primarily with expertise in the broad cardiovascular research field. The DHF criteria for societal impact were redeveloped in 2013 in collaboration with their CSQ. This panel assesses and advises on the societal aspects of proposed studies. The societal impact criteria include the relevance of the health-care problem, the expected contribution to a solution, attention to the next step in science and towards implementation in practice, and the involvement of and interaction with (end-)users of research (R.Y. Abma-Schouten and I.M. Meijer, unpublished data). Peer review panels for consortium funding are generally composed of members of the ISAC, members of the CSQ, and ad hoc panel members relevant to the specific programme. CSQ members often have a pre-meeting before the final panel meetings to prepare and empower CSQ representatives participating in the peer review panel.

3.2 Selection of funding programmes

To compare and evaluate observations between the two organisations, we selected funding programmes that were relatively comparable in scope and aims. The criteria were (1) a translational and/or clinical objective and (2) the selection procedure consisted of review panels that were responsible for the (final) relevance and quality assessment of grant applications. In total, we selected eight programmes: four at each organisation. At the DHF, two programmes were chosen in which the CSQ did not participate to better disentangle the role of the panel composition. For each programme, we observed the selection process varying from one session on one day (taking 2–8 h) to multiple sessions over several days. Ten sessions were observed in total, of which eight were final peer review panel meetings and two were CSQ meetings preparing for the panel meeting.

After management approval for the study in both organisations, we asked programme managers and panel chairpersons of the programmes that were selected for their consent for observation; none refused participation. Panel members were, in a passive consent procedure, informed about the planned observation and anonymous analyses.

To ensure the independence of this evaluation, the selection of the grant programmes, and peer review panels observed, was at the discretion of the project team of this study. The observations and supervision of the analyses were performed by the senior author not affiliated with the funders.

3.3 Observation matrix

Given the lack of a common operationalisation for scientific quality and societal relevance, we decided to use an observation matrix with a fixed set of detailed aspects as a gold standard to score the brochures, the assessment forms, and the arguments used in panel meetings. The matrix used for the observations of the review panels was based upon and adapted from a ‘grant committee observation matrix’ developed by Van Arensbergen. The original matrix informed a literature review on the selection of talent through peer review and the social dynamics in grant review committees ( van Arensbergen et al. 2014b ). The matrix includes four categories of aspects that operationalise societal relevance, scientific quality, committee, and applicant (see  Table 1 ). The aspects of scientific quality and societal relevance were adapted to fit the operationalisation of scientific quality and societal relevance of the organisations involved. The aspects concerning societal relevance were derived from the CSQ criteria, and the aspects concerning scientific quality were based on the scientific criteria of the first panel observed. The four argument types related to the panel were kept as they were. This committee-related category reflects statements that are related to the personal experience or preference of a panel member and can be seen as signals for bias. This category also includes statements that compare a project with another project without further substantiation. The three applicant-related arguments in the original observation matrix were extended with a fourth on social skills in communication with society. We added health technology assessment (HTA) because one programme specifically focused on this aspect. We tested our version of the observation matrix in pilot observations.

Table 1. Aspects included in the observation matrix and examples of arguments.

3.4 Observations

Data were primarily collected through observations. Our observations of review panel meetings were non-participatory: the observer and goal of the observation were introduced at the start of the meeting, without further interactions during the meeting. To aid in the processing of observations, some meetings were audiotaped (sound only). Presentations or responses of applicants were not noted and were not part of the analysis. The observer made notes on the ongoing discussion and scored the arguments while listening. One meeting was not attended in person and only observed and scored by listening to the audiotape recording. Because this made identification of the panel members unreliable, this panel meeting was excluded from the analysis of the third research question on how arguments used differ between panel members with different perspectives.

3.5 Grant programmes and the assessment criteria

We gathered and analysed all brochures and assessment forms used by the review panels in order to answer our second research question on the correspondence of arguments used with the formal criteria. Several programmes consisted of multiple grant calls: in that case, the specific call brochure was gathered and analysed, not the overall programme brochure. Additional documentation (e.g. instructional presentations at the start of the panel meeting) was not included in the document analysis. All included documents were marked using the aforementioned observation matrix. The panel-related arguments were not used because this category reflects the personal arguments of panel members that are not part of brochures or instructions. To avoid potential differences in scoring methods, two of the authors each independently scored half of the documents, and each half was afterwards checked and validated by the other author. Differences were discussed until a consensus was reached.

3.6 Panel composition

In order to answer the third research question, background information on panel members was collected. We categorised the panel members into five common types of panel members: scientific, clinical scientific, health-care professional/clinical, patient, and policy. First, a list of all panel members was composed including their scientific and professional backgrounds and affiliations. The theoretical notion that reviewers represent different types of users of research and therefore potential impact domains (academic, social, economic, and cultural) was leading in the categorisation ( Meijer 2012 ; Spaapen and Van Drooge 2011 ). Because clinical researchers play a dual role in both advancing research as a fellow academic and as a user of the research output in health-care practice, we divided the academic members into two categories of non-clinical and clinical researchers. Multiple types of professional actors participated in each review panel. These were divided into two groups for the analysis: health-care professionals (without current academic activity) and policymakers in the health-care sector. No representatives of the private sector participated in the observed review panels. From the public domain, (at-risk) patients and patient representatives were part of several review panels. Only publicly available information was used to classify the panel members. Members were assigned to one category only: categorisation took place based on the specific role and expertise for which they were appointed to the panel.

In two of the four DHF programmes, the assessment procedure included the CSQ. In these two programmes, representatives of this CSQ participated in the scientific panel to articulate the findings of the CSQ meeting during the final assessment meeting. Two grant programmes were assessed by a review panel with solely (clinical) scientific members.

3.7 Analysis

Data were processed using ATLAS.ti 8 and Microsoft Excel 2010 to produce descriptive statistics. All observed arguments were coded and given a randomised identification code for the panel member using that particular argument. The number of times an argument type was observed was used as an indicator for the relative importance of that argument in the appraisal of proposals. With this approach, a practical and reproducible method for research funders to evaluate the effect of policy changes on peer review was developed. If codes or notes were unclear, post-observation validation of codes was carried out based on observation matrix notes. Arguments that were noted by the observer but could not be matched with an existing code were first coded as a ‘non-existing’ code, and these were resolved by listening back to the audiotapes. Arguments that could not be assigned to a panel member were assigned a ‘missing panel member’ code. A total of 4.7 per cent of all codes were assigned a ‘missing panel member’ code.
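As an illustration only, the counting step described above could be sketched in Python as follows. The records, the member codes, and the function names are hypothetical; they are not taken from the ATLAS.ti or Excel workflow used in the study, and `None` is merely an assumed stand-in for the ‘missing panel member’ code.

```python
from collections import Counter

# Hypothetical coded observations: (argument_type, panel_member_code).
# None is an assumed stand-in for the 'missing panel member' code.
coded_arguments = [
    ("feasibility", "PM-07"),
    ("feasibility", "PM-02"),
    ("contribution to a solution", "PM-11"),
    ("plan of work", None),
    ("feasibility", "PM-07"),
]


def tally_by_type(records):
    """Count how often each argument type was observed."""
    return Counter(arg_type for arg_type, _ in records)


def missing_member_share(records):
    """Fraction of coded arguments without an identifiable panel member."""
    if not records:
        return 0.0
    missing = sum(1 for _, member in records if member is None)
    return missing / len(records)


counts = tally_by_type(coded_arguments)
share = missing_member_share(coded_arguments)
```

In this sketch the frequency per argument type is the only indicator of relative importance, which mirrors the proxy described in the paragraph above and carries the same caveats.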

After the analyses, two meetings were held to reflect on the results: one with the CSQ and the other with the programme coordinators of both organisations. The goal of these meetings was to improve our interpretation of the findings, disseminate the results derived from this project, and identify topics for further analyses or future studies.

3.8 Limitations

Our study focuses on studying the final phase of the peer review process of research applications in a real-life setting. Our design, a non-participant observation of peer review panels, also introduced several challenges ( Liu and Maitlis 2010 ).

First, the independent review phase or pre-application phase was not part of our study. We therefore could not assess to what extent attention to certain aspects of scientific quality or societal relevance and impact in the review phase influenced the topics discussed during the meeting.

Second, the most important challenge of overt non-participant observations is the observer effect: the danger of causing reactivity in those under study. We believe that the consequences of this effect on our conclusions were limited because panellists are used to external observers in the meetings of these two funders. The observer briefly explained the goal of the study during the introductory round of the panel in general terms. The observer sat as unobtrusively as possible and avoided reactivity to discussions. Similar to previous observations of panels, we experienced that the fact that an observer was present faded into the background during a meeting ( Roumbanis 2021a ). However, a limited observer effect can never be entirely excluded.

Third, our design to only score the arguments raised, and not the responses of the applicant, or information on the content of the proposals, has its positives and negatives. With this approach, we could assure the anonymity of the grant procedures reviewed, the applicants and proposals, panels, and individual panellists. This was an important condition for the funders involved. We took the frequency of arguments used as a proxy for the relative importance of that argument in decision-making, which undeniably also has its caveats. Our data collection approach limits more in-depth reflection on which arguments were decisive in decision-making and on group dynamics during the interaction with the applicants as non-verbal and non-content-related comments were not captured in this study.

Fourth, despite this being one of the largest observational studies on the peer review assessment of grant applications with the observation of ten panels in eight grant programmes, many variables might explain differences in arguments used within and beyond our view. Examples of ‘confounding’ variables are the many variations in panel composition, the differences in objectives of the programmes, and the range of the funding programmes. Our study should therefore be seen as exploratory and thus warrants caution in drawing conclusions.

4.1 Overview of observational data

The grant programmes included in this study reflected a broad range of biomedical and health funding programmes, ranging from fellowship grants to translational research and applied health research. All formal documents available to the applicants and to the review panel were retrieved for both ZonMw and the DHF. In total, eighteen documents corresponding to the eight grant programmes were studied. The number of proposals assessed per programme varied from three to thirty-three. The duration of the panel meetings varied between 2 h and two consecutive days. Together, this resulted in a large spread in the number of total arguments used in an individual meeting and in a grant programme as a whole. In the shortest meeting, 49 arguments were observed versus 254 in the longest, with a mean of 126 arguments per meeting and on average 15 arguments per proposal.

We found consistency between how criteria were operationalised in the grant programme’s brochures and in the assessment forms of the review panels overall. At the same time, because the number of elements included in the observation matrix is limited, there was a considerable diversity in the arguments that fall within each aspect (see examples in  Table 1 ). Some of these differences could possibly be explained by differences in language used and the level of detail in the observation matrix, the brochure, and the panel’s instructions. This was especially the case in the applicant-related aspects in which the observation matrix was more detailed than the text in the brochure and assessment forms.

In interpreting our findings, it is important to take into account that, even though our data were largely complete and the observation matrix matched well with the description of the criteria in the brochures and assessment forms, there was a large diversity in the type and number of arguments used and in the number of proposals assessed in the grant programmes included in our study.

4.2 Wide range of arguments used by panels: scientific arguments used most

For our first research question, we explored the number and type of arguments used in the panel meetings. Figure 1 provides an overview of the arguments used. Scientific quality was discussed most. The number of times the feasibility of the aims was discussed clearly stands out in comparison to all other arguments. Also, the match between the science and the problem studied and the plan of work were frequently discussed aspects of scientific quality. International competitiveness of the proposal was discussed the least of all five scientific arguments.

The number of arguments used in panel meetings.

Attention was paid to societal relevance and impact in the panel meetings of both organisations. Yet, the language used differed somewhat between organisations. The contribution to a solution and the next step in science were the most often used societal arguments. At ZonMw, the impact of the health-care problem studied and the activities towards partners were less frequently discussed than the other three societal arguments. At the DHF, the five societal arguments were used equally often.

With the exception of the fellowship programme meeting, applicant-related arguments were not often used. The fellowship panel used arguments related to the applicant and to scientific quality about equally often. Committee-related arguments were also rarely used in the majority of the eight grant programmes observed. In three out of the ten panel meetings, one or two arguments were observed that were related to personal experience with the applicant or their direct network. In seven out of ten meetings, statements were observed that were unasserted or were explicitly announced as reflecting a personal preference. The frequency varied between one and seven statements (sixteen in total), which is low in comparison to the other arguments used (see  Fig. 1 for examples).

4.3 Use of arguments varied strongly per panel meeting

The balance in the use of scientific and societal arguments varied strongly per grant programme, panel, and organisation. At ZonMw, two meetings had approximately an equal balance in societal and scientific arguments. In the other two meetings, scientific arguments were used twice to four times as often as societal arguments. At the DHF, three types of panels were observed. Different patterns in the relative use of societal and scientific arguments were observed for each of these panel types. In the two CSQ-only meetings the societal arguments were used approximately twice as often as scientific arguments. In the two meetings of the scientific panels, societal arguments were infrequently used (between zero and four times per argument category). In the combined societal and scientific panel meetings, the use of societal and scientific arguments was more balanced.
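The balance described above can be expressed as a simple ratio per meeting. The following Python sketch is illustrative only: the meeting labels and counts are invented, and the function name is an assumption rather than part of the study's method.

```python
def scientific_to_societal_ratio(scientific, societal):
    """Return the scientific/societal argument ratio.

    Returns None when no societal arguments were observed,
    because the ratio is undefined in that case.
    """
    if societal == 0:
        return None
    return scientific / societal


# Hypothetical counts per meeting, invented for illustration only.
meetings = {
    "meeting A": {"scientific": 60, "societal": 58},
    "meeting B": {"scientific": 90, "societal": 30},
    "meeting C": {"scientific": 40, "societal": 0},
}

ratios = {
    name: scientific_to_societal_ratio(c["scientific"], c["societal"])
    for name, c in meetings.items()
}
```

A ratio close to 1 would correspond to an approximately equal balance, and a ratio between 2 and 4 to scientific arguments being used twice to four times as often as societal arguments.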

4.4 Match of arguments used by panels with the assessment criteria

In order to answer our second research question, we looked into the relation of the arguments used with the formal criteria. We observed that a broader range of arguments was often used in comparison to how the criteria were described in the brochure and assessment instruction. However, arguments related to aspects that were consistently included in the brochure and instruction seemed to be discussed more frequently than in programmes where those aspects were not consistently included or were not included at all. Although the match of the science with the health-care problem and the background and reputation of the applicant were not always made explicit in the brochure or instructions, they were discussed in many panel meetings. Supplementary Fig. S1 provides a visualisation of how arguments used differ between the programmes in which those aspects were, or were not, consistently included in the brochure and instruction forms.

4.5 Two-thirds of the assessment was driven by scientific panel members

To answer our third question, we looked into the differences in arguments used between panel members representing a scientific, clinical scientific, professional, policy, or patient perspective. In each research programme, the majority of panellists had a scientific background ( n  = 35), thirty-four members had a clinical scientific background, twenty had a health professional/clinical background, eight members represented a policy perspective, and fifteen represented a patient perspective. From the total number of arguments (1,097), two-thirds were made by members with a scientific or clinical scientific perspective. Members with a scientific background engaged most actively in the discussion with a mean of twelve arguments per member. Similarly, clinical scientists and health-care professionals participated with a mean of nine arguments, and members with a policy and patient perspective put forward the least number of arguments on average, namely, seven and eight. Figure 2 provides a complete overview of the total and mean number of arguments used by the different disciplines in the various panels.

The total and mean number of arguments displayed per subgroup of panel members.
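The reported share of arguments made by scientific and clinical scientific members can be checked roughly from the numbers given in the text. The Python sketch below multiplies the reported group sizes by the reported mean arguments per member. Because those means are rounded, and because some arguments carried a ‘missing panel member’ code, the reconstructed totals are approximations only and are not expected to add up exactly to 1,097. The variable names are assumptions made for this illustration.

```python
# Group sizes and (rounded) mean arguments per member, as reported in the text.
members = {
    "scientific": 35,
    "clinical scientific": 34,
    "health-care professional": 20,
    "policy": 8,
    "patient": 15,
}
mean_args = {
    "scientific": 12,
    "clinical scientific": 9,
    "health-care professional": 9,
    "policy": 7,
    "patient": 8,
}
TOTAL_ARGUMENTS = 1097  # total number of arguments reported in the text

# Approximate totals per group; rounding of the means makes these estimates.
approx_totals = {group: members[group] * mean_args[group] for group in members}

# Estimated share of all arguments made by (clinical) scientific members.
scientific_share = (
    approx_totals["scientific"] + approx_totals["clinical scientific"]
) / TOTAL_ARGUMENTS
```

Under these assumptions the estimated share comes out at roughly two-thirds, in line with the statement in the text.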

4.6 Diverse use of arguments by panellists, but background matters

In meetings of both organisations, we observed a diverse use of arguments by the panel members. Yet, the use of arguments varied depending on the background of the panel member (see  Fig. 3 ). Those with a scientific and clinical scientific perspective used primarily scientific arguments. As could be expected, health-care professionals and patients used societal arguments more often.

The use of arguments differentiated by panel member background.

Further breakdown of arguments across backgrounds showed clear differences in the use of scientific arguments between the different disciplines of panellists. Scientists and clinical scientists discussed the feasibility of the aims more than twice as often as their second most often uttered element of scientific quality, which was the match between the science and the problem studied. Patients and members with a policy or health professional background put forward fewer but more varied scientific arguments.

Patients and health-care professionals accounted for approximately half of the societal arguments used, despite being a much smaller part of the panel’s overall composition. In other words, members with a scientific perspective were less likely to use societal arguments. The relevance of the health-care problem studied, activities towards partners, and arguments related to participation and diversity were not used often by this group. Patients often used arguments related to patient participation and diversity and activities towards partners, although the frequency of the use of the latter differed per organisation.

The majority of the applicant-related arguments were put forward by scientists, including clinical scientists. Committee-related arguments were very rare and are therefore not differentiated by panel member background, except comments related to a comparison with other applications. These arguments were mainly put forward by panel members with a scientific background. HTA-related arguments were often used by panel members with a scientific perspective. Panel members with other perspectives used this argument scarcely (see Supplementary Figs S2–S4 for the visual presentation of the differences between panel members on all aspects included in the matrix).

5.1 Explanations for arguments used in panels

Our observations show that most arguments for scientific quality were often used. However, except for feasibility, the frequency of arguments used varied strongly between the meetings and between the individual proposals that were discussed. The fact that most arguments were not consistently used is not surprising given the results from previous studies that showed heterogeneity in grant application assessments and low consistency in comments and scores by independent reviewers ( Abdoul et al. 2012 ; Pier et al. 2018 ). In an analysis of written assessments on nine observed dimensions, no dimension was used in more than 45 per cent of the reviews ( Hartmann and Neidhardt 1990 ).

There are several possible explanations for this heterogeneity. Roumbanis (2021a) described how being responsive to the different challenges in the proposals and to the points of attention arising from the written assessments influenced discussion in panels. Also when a disagreement arises, more time is spent on discussion ( Roumbanis 2021a ). One could infer that unambiguous, and thus not debated, aspects might remain largely undetected in our study. We believe, however, that the main points relevant to the assessment will not remain entirely unmentioned, because most panels in our study started the discussion with a short summary of the proposal, the written assessment, and the rebuttal. Lamont (2009) , however, points out that opening statements serve more goals than merely decision-making. They can also increase the credibility of the panellist, showing their comprehension and balanced assessment of an application. We can therefore not entirely disentangle whether the arguments observed most were also found to be most important or decisive or those were simply the topics that led to most disagreement.

An interesting difference with Roumbanis’ study was the available discussion time per proposal. In our study, most panels handled a limited number of proposals, allowing for longer discussions in comparison with the often 2-min time frame that Roumbanis (2021b) described, potentially contributing to a wider range of arguments being discussed. Limited time per proposal might also limit the number of panellists contributing to the discussion per proposal (De Bont 2014).

5.2 Reducing heterogeneity by improving operationalisation and the consistent use of assessment criteria

We found that the language used for the operationalisation of the assessment criteria in programme brochures and in the observation matrix was much more detailed than in the instruction for the panel, which was often very concise. The exercise also illustrated that many terms were used interchangeably.

This was especially true for the applicant-related aspects. Several panels discussed how talent should be assessed. This confusion is understandable when considering the changing values in research and its assessment (Moher et al. 2018) and the fact that the instruction of the funders was very concise. For example, it was not made explicit whether the individual or the team should be assessed. Van Arensbergen et al. (2014b) described how in grant allocation processes, talent is generally assessed using limited characteristics. More objective and quantifiable outputs often prevailed at the expense of recognising and rewarding a broad variety of skills and traits combining professional, social, and individual capital (DORA 2013).

In addition, committee-related arguments, like personal experiences with the applicant or their institute, were rarely used in our study. Comparisons between proposals were sometimes made without further argumentation, mainly by scientific panel members. This was especially pronounced in one (fellowship) grant programme with a high number of proposals. In this programme, the panel meeting concentrated on quickly comparing the quality of the applicants and of the proposals based on the reviewer’s judgement, instead of a more in-depth discussion of the different aspects of the proposals. Because the review phase was not part of this study, the question of which aspects were used for the assessment of the proposals in this panel remains partially unanswered. However, weighing and comparing proposals on different aspects and with different inputs is a core element of scientific peer review, both in the review of papers and in the review of grants (Hirschauer 2010). The large role of scientific panel members in comparing proposals is therefore not surprising.

One could anticipate that more consistent language in operationalising the criteria may lead to more clarity for both applicants and panellists and to more consistency in the assessment of research proposals. The trend in our observations was that arguments were used less when the related criteria were not, or were inconsistently, included in the brochure and panel instruction. It remains, however, challenging to disentangle the influence of the formal definitions of criteria on the arguments used. Previous studies also encountered difficulties in studying the role of the formal instruction in peer review but concluded that this role is relatively limited (Langfeldt 2001; Reinhart 2010).

The lack of a clear operationalisation of criteria can contribute to heterogeneity in peer review, as many scholars found that assessors differ in their conceptualisation of good science and in the importance they attach to various aspects of research quality and societal relevance (Abdoul et al. 2012; Geurts 2016; Scholten et al. 2018; Van den Brink et al. 2016). The large variation and absence of a gold standard in the interpretation of scientific quality and societal relevance affect the consistency of peer review. As a consequence, it is challenging to systematically evaluate and improve peer review in order to fund the research that contributes most to science and society. To contribute to responsible research and innovation, it is, therefore, important that funders invest in a more consistent and conscientious peer review process (Curry et al. 2020; DORA 2013).

A common conceptualisation of scientific quality and societal relevance and impact could improve the alignment between views on good scientific conduct, programmes’ objectives, and the peer review in practice. Such a conceptualisation could contribute to more transparency and quality in the assessment of research. By involving panel members from all relevant backgrounds, including the research community, health-care professionals, and societal actors, in a better operationalisation of criteria, more inclusive views of good science can be implemented more systematically in the peer review assessment of research proposals. The ZonMw Framework Fostering Responsible Research Practices is an example of an initiative aiming to support standardisation and integration (Reijmerink et al. 2020).

Given the lack of a common definition or conceptualisation of scientific quality and societal relevance, an important decision in our study was to use a fixed set of detailed aspects of two important criteria as a gold standard to score the brochures, the panel instructions, and the arguments used by the panels. This approach proved helpful in disentangling the different components of scientific quality and societal relevance. Having said that, it is important not to oversimplify the causes of heterogeneity in peer review because these substantive arguments are not independent of non-cognitive, emotional, or social aspects (Lamont and Guetzkow 2016; Reinhart 2010).

5.3 Do more diverse panels contribute to a broader use of arguments?

Both funders participating in our study have an explicit public mission that requires sufficient attention to societal aspects in assessment processes. In reality, as observed in several panels, the main focus of peer review meetings is on scientific arguments. In addition to the possible explanations given earlier, the composition of the panel might play a role in explaining arguments used in panel meetings. Our results have shown that health-care professionals and patients bring in more societal arguments than scientists, including those who are also clinicians. It is, however, not that simple. In the more diverse panels, panel members, regardless of their backgrounds, used more societal arguments than in the less diverse panels.

Observing ten panel meetings was sufficient to explore differences in arguments used by panel members with different backgrounds. The pattern of (primarily) scientific arguments being raised by panels with mainly scientific members is not surprising. After all, it is their main task to assess the scientific content of grant proposals, and this task fits their competencies. As such, one could argue, depending on how one justifies the relationship between science and society, that health-care professionals and patients might be better suited to assess the value for potential users of research results. Scientific panel members and clinical scientists in our study used fewer arguments that reflect on opening up and connecting science directly to others who can bring it further (be it industry, health-care professionals, or other stakeholders). Patients filled this gap, since these two types of arguments were the ones they put forward most often. Making an active connection with society apparently needs a broader, more diverse panel for scientists to direct their attention to more societal arguments. Our observations show that the presence of patients and health-care professionals in a panel seemed to increase the attention that all panel members, including scientists, paid to arguments beyond the scientific ones. This conclusion is congruent with the observation that there was a more equal balance in the use of societal and scientific arguments in the scientific panels in which the CSQ participated. This illustrates that opening up peer review panels to non-scientific members creates an opportunity to focus on both the contribution and the integrative rationality (Glerup and Horst 2014) or, in other words, to allow productive interactions between scientific and non-scientific actors. This corresponds with previous research that suggests that with regard to societal aspects, reviews from mixed panels were broader and richer (Luo et al. 2021).
In panels with non-scientific experts, more emphasis was placed on the role of the proposed research process to increase the likelihood of societal impact over the causal importance of scientific excellence for broader impacts. This is in line with the findings that panels with more disciplinary diversity, in range and also by including generalist experts, applied more versatile styles to reach consensus and paid more attention to relevance and pragmatic value (Huutoniemi 2012).

Our observations further illustrate that patients and health-care professionals were less vocal in panels than (clinical) scientists and were in the minority. This could reflect their social role and lower perceived authority in the panel. Several guides are available for funders to stimulate the equal participation of patients in science. These guides are also applicable to their involvement in peer review panels. Measures to be taken include support and training to help prepare patients for their participation in deliberations with renowned scientists, and explicitly addressing power differences (De Wit et al. 2016). Panel chairs and programme officers have to set and supervise the conditions for the functioning of both the individual panel members and the panel as a whole (Lamont 2009).

5.4 Suggestions for future studies

In future studies, it is important to further disentangle the role of the operationalisation and appraisal of assessment criteria in reducing heterogeneity in the arguments used by panels. More controlled experimental settings are a valuable addition to the current mainly observational methodologies applied to disentangle some of the cognitive and social factors that influence the functioning and argumentation of peer review panels. Reusing data from the panel observations and the data on the written reports could also provide a starting point for a bottom-up approach to create a more consistent and shared conceptualisation and operationalisation of assessment criteria.

To further understand the effects of opening up review panels to non-scientific peers, it is valuable to compare the role of diversity and interdisciplinarity in solely scientific panels versus panels that also include non-scientific experts.

In future studies, differences between domains and types of research should also be addressed. We hypothesise that biomedical and health research is perhaps more suited for the inclusion of non-scientific peers in panels than other research domains. For example, it is valuable to better understand how potentially relevant users can be well enough identified in other research fields and to what extent non-academics can contribute to assessing the possible value of, especially early or blue sky, research.

The goal of our study was to explore in practice which arguments regarding the main criteria of scientific quality and societal relevance were used by peer review panels of biomedical and health research funding programmes. We showed that there is a wide diversity in the number and range of arguments used, but three main scientific aspects were discussed most frequently: is it a feasible approach, does the science match the problem, and is the work plan scientifically sound? Nevertheless, these scientific aspects were accompanied by a significant amount of discussion of societal aspects, of which the contribution to a solution is the most prominent. In comparison with scientific panellists, non-scientific panellists, such as health-care professionals, policymakers, and patients, often use a wider range of arguments and other societal arguments. Even more striking was that, even though non-scientific peers were often outnumbered and less vocal in panels, scientists also used a wider range of arguments when non-scientific peers were present.

It is relevant that two health research funders collaborated in the current study to reflect on and improve peer review in research funding. There are few studies published that describe live observations of peer review panel meetings. Many studies focus on alternatives for peer review or reflect on the outcomes of the peer review process, instead of reflecting on the practice and improvement of peer review assessment of grant proposals. Privacy and confidentiality concerns of funders also contribute to the lack of information on the functioning of peer review panels. In this study, both organisations were willing to participate because of their interest in research funding policies in relation to enhancing the societal value and impact of science. The study provided them with practical suggestions, for example, on how to improve the alignment in language used in programme brochures and instructions of review panels, and contributed to valuable knowledge exchanges between organisations. We hope that this publication stimulates more research funders to evaluate their peer review approach in research funding and share their insights.

For a long time, research funders relied solely on scientists for designing and executing peer review of research proposals, thereby delegating responsibility for the process. Although review panels have a discretionary authority, it is important that funders set and supervise the process and the conditions. We argue that one of these conditions should be the diversification of peer review panels and opening up panels for non-scientific peers.

Supplementary material is available at Science and Public Policy online.

Details of the data and information on how to request access are available from the first author.

Joey Gijbels and Wendy Reijmerink are employed by ZonMw. Rebecca Abma-Schouten is employed by the Dutch Heart Foundation and as external PhD candidate affiliated with the Centre for Science and Technology Studies, Leiden University.

A special thanks to the panel chairs and programme officers of ZonMw and the DHF for their willingness to participate in this project. We thank Diny Stekelenburg, an internship student at ZonMw, for her contributions to the project. Our sincerest gratitude to Prof. Paul Wouters, Sarah Coombs, and Michiel van der Vaart for proofreading and their valuable feedback. Finally, we thank the editors and anonymous reviewers of Science and Public Policy for their thorough and insightful reviews and recommendations. Their contributions are recognisable in the final version of this paper.

Abdoul H., Perrey C., Amiel P., et al. (2012) ‘Peer Review of Grant Applications: Criteria Used and Qualitative Study of Reviewer Practices’, PLoS One, 7: 1–15.

Abma-Schouten R. Y. (2017) ‘Maatschappelijke Kwaliteit van Onderzoeksvoorstellen’, Dutch Heart Foundation.

Alla K., Hall W. D., Whiteford H. A., et al. (2017) ‘How Do We Define the Policy Impact of Public Health Research? A Systematic Review’, Health Research Policy and Systems, 15: 84.

Benedictus R., Miedema F., and Ferguson M. W. J. (2016) ‘Fewer Numbers, Better Science’, Nature, 538: 453–4.

Chalmers I., Bracken M. B., Djulbegovic B., et al. (2014) ‘How to Increase Value and Reduce Waste When Research Priorities Are Set’, The Lancet, 383: 156–65.

Curry S., De Rijcke S., Hatch A., et al. (2020) ‘The Changing Role of Funders in Responsible Research Assessment: Progress, Obstacles and the Way Ahead’, RoRI Working Paper No. 3, London: Research on Research Institute (RoRI).

De Bont A. (2014) ‘Beoordelen Bekeken. Reflecties op het Werk van Een Programmacommissie van ZonMw’, ZonMw.

De Rijcke S., Wouters P. F., Rushforth A. D., et al. (2016) ‘Evaluation Practices and Effects of Indicator Use—a Literature Review’, Research Evaluation, 25: 161–9.

De Wit A. M., Bloemkolk D., Teunissen T., et al. (2016) ‘Voorwaarden voor Succesvolle Betrokkenheid van Patiënten/cliënten bij Medisch Wetenschappelijk Onderzoek’, Tijdschrift voor Sociale Gezondheidszorg, 94: 91–100.

Del Carmen Calatrava Moreno M., Warta K., Arnold E., et al. (2019) Science Europe Study on Research Assessment Practices. Technopolis Group Austria.

Demicheli V. and Di Pietrantonj C. (2007) ‘Peer Review for Improving the Quality of Grant Applications’, Cochrane Database of Systematic Reviews, 2: MR000003.

Den Oudendammer W. M., Noordhoek J., Abma-Schouten R. Y., et al. (2019) ‘Patient Participation in Research Funding: An Overview of When, Why and How Amongst Dutch Health Funds’, Research Involvement and Engagement, 5.

Diabetesfonds (n.d.) Maatschappelijke Adviesraad <https://www.diabetesfonds.nl/over-ons/maatschappelijke-adviesraad> accessed 18 Sept 2022.

Dijstelbloem H., Huisman F., Miedema F., et al. (2013) ‘Science in Transition Position Paper: Waarom de Wetenschap Niet Werkt Zoals het Moet, En Wat Daar aan te Doen Is’, Utrecht: Science in Transition.

Forsyth D. R. (1999) Group Dynamics, 3rd edn. Belmont: Wadsworth Publishing Company.

Geurts J. (2016) ‘Wat Goed Is, Herken Je Meteen’, NRC Handelsblad <https://www.nrc.nl/nieuws/2016/10/28/wat-goed-is-herken-je-meteen-4975248-a1529050> accessed 6 Mar 2022.

Glerup C. and Horst M. (2014) ‘Mapping “Social Responsibility” in Science’, Journal of Responsible Innovation, 1: 31–50.

Hartmann I. and Neidhardt F. (1990) ‘Peer Review at the Deutsche Forschungsgemeinschaft’, Scientometrics, 19: 419–25.

Hirschauer S. (2010) ‘Editorial Judgments: A Praxeology of “Voting” in Peer Review’, Social Studies of Science, 40: 71–103.

Hughes A. and Kitson M. (2012) ‘Pathways to Impact and the Strategic Role of Universities: New Evidence on the Breadth and Depth of University Knowledge Exchange in the UK and the Factors Constraining Its Development’, Cambridge Journal of Economics, 36: 723–50.

Huutoniemi K. (2012) ‘Communicating and Compromising on Disciplinary Expertise in the Peer Review of Research Proposals’, Social Studies of Science, 42: 897–921.

Jasanoff S. (2011) ‘Constitutional Moments in Governing Science and Technology’, Science and Engineering Ethics, 17: 621–38.

Kolarz P., Arnold E., Farla K., et al. (2016) Evaluation of the ESRC Transformative Research Scheme. Brighton: Technopolis Group.

Lamont M. (2009) How Professors Think: Inside the Curious World of Academic Judgment. Cambridge: Harvard University Press.

Lamont M. and Guetzkow J. (2016) ‘How Quality Is Recognized by Peer Review Panels: The Case of the Humanities’, in M. Ochsner, S. E. Hug, and H.-D. Daniel (eds) Research Assessment in the Humanities, pp. 31–41. Cham: Springer International Publishing.

Lamont M. and Huutoniemi K. (2011) ‘Comparing Customary Rules of Fairness: Evaluative Practices in Various Types of Peer Review Panels’, in C. Camic, N. Gross, and M. Lamont (eds) Social Knowledge in the Making, pp. 209–32. Chicago: The University of Chicago Press.

Langfeldt L. (2001) ‘The Decision-making Constraints and Processes of Grant Peer Review, and Their Effects on the Review Outcome’, Social Studies of Science, 31: 820–41.

——— (2006) ‘The Policy Challenges of Peer Review: Managing Bias, Conflict of Interests and Interdisciplinary Assessments’, Research Evaluation, 15: 31–41.

Lee C. J., Sugimoto C. R., Zhang G., et al. (2013) ‘Bias in Peer Review’, Journal of the American Society for Information Science and Technology, 64: 2–17.

Liu F. and Maitlis S. (2010) ‘Nonparticipant Observation’, in A. J. Mills, G. Durepos, and E. Wiebe (eds) Encyclopedia of Case Study Research, pp. 609–11. Los Angeles: SAGE.

Luo J., Ma L., and Shankar K. (2021) ‘Does the Inclusion of Non-academic Reviewers Make Any Difference for Grant Impact Panels?’, Science & Public Policy, 48: 763–75.

Luukkonen T. (2012) ‘Conservatism and Risk-taking in Peer Review: Emerging ERC Practices’, Research Evaluation, 21: 48–60.

Macleod M. R., Michie S., Roberts I., et al. (2014) ‘Biomedical Research: Increasing Value, Reducing Waste’, The Lancet, 383: 101–4.

Meijer I. M. (2012) ‘Societal Returns of Scientific Research. How Can We Measure It?’, Leiden: Center for Science and Technology Studies, Leiden University.

Merton R. K. (1968) Social Theory and Social Structure, Enlarged edn. [Nachdr.]. New York: The Free Press.

Moher D., Naudet F., Cristea I. A., et al. (2018) ‘Assessing Scientists for Hiring, Promotion, And Tenure’, PLoS Biology, 16: e2004089.

Olbrecht M. and Bornmann L. (2010) ‘Panel Peer Review of Grant Applications: What Do We Know from Research in Social Psychology on Judgment and Decision-making in Groups?’, Research Evaluation, 19: 293–304.

Patiëntenfederatie Nederland (n.d.) Ervaringsdeskundigen Referentenpanel <https://www.patientenfederatie.nl/zet-je-ervaring-in/lid-worden-van-ons-referentenpanel> accessed 18 Sept 2022.

Pier E. L., M. B., Filut A., et al. (2018) ‘Low Agreement among Reviewers Evaluating the Same NIH Grant Applications’, Proceedings of the National Academy of Sciences, 115: 2952–7.

Prinses Beatrix Spierfonds (n.d.) Gebruikerscommissie <https://www.spierfonds.nl/wie-wij-zijn/gebruikerscommissie> accessed 18 Sept 2022.

(2020) Private Non-profit Financiering van Onderzoek in Nederland <https://www.rathenau.nl/nl/wetenschap-cijfers/geld/wat-geeft-nederland-uit-aan-rd/private-non-profit-financiering-van#:∼:text=R%26D%20in%20Nederland%20wordt%20gefinancierd,aan%20wetenschappelijk%20onderzoek%20in%20Nederland> accessed 6 Mar 2022.

Reneman R. S., Breimer M. L., Simoons J., et al. (2010) ‘De toekomst van het cardiovasculaire onderzoek in Nederland. Sturing op synergie en impact’, Den Haag: Nederlandse Hartstichting.

Reed M. S., Ferré M., Marin-Ortega J., et al. (2021) ‘Evaluating Impact from Research: A Methodological Framework’, Research Policy, 50: 104147.

Reijmerink W. and Oortwijn W. (2017) ‘Bevorderen van Verantwoorde Onderzoekspraktijken Door ZonMw’, Beleidsonderzoek Online. accessed 6 Mar 2022.

Reijmerink W., Vianen G., Bink M., et al. (2020) ‘Ensuring Value in Health Research by Funders’ Implementation of EQUATOR Reporting Guidelines: The Case of ZonMw’, Berlin: REWARD|EQUATOR.

Reinhart M. (2010) ‘Peer Review Practices: A Content Analysis of External Reviews in Science Funding’, Research Evaluation, 19: 317–31.

Reinhart M. and Schendzielorz C. (2021) Trends in Peer Review. SocArXiv. <https://osf.io/preprints/socarxiv/nzsp5> accessed 29 Aug 2022.

Roumbanis L. (2017) ‘Academic Judgments under Uncertainty: A Study of Collective Anchoring Effects in Swedish Research Council Panel Groups’, Social Studies of Science, 47: 95–116.

——— (2021a) ‘Disagreement and Agonistic Chance in Peer Review’, Science, Technology & Human Values, 47: 1302–33.

——— (2021b) ‘The Oracles of Science: On Grant Peer Review and Competitive Funding’, Social Science Information, 60: 356–62.

(2019) ‘Ruimte voor ieders talent (Position Paper)’, Den Haag: VSNU, NFU, KNAW, NWO en ZonMw. <https://www.universiteitenvannederland.nl/recognitionandrewards/wp-content/uploads/2019/11/Position-paper-Ruimte-voor-ieders-talent.pdf>.

(2013) San Francisco Declaration on Research Assessment. The Declaration. <https://sfdora.org> accessed 2 Jan 2022.

Sarewitz D. and Pielke R. A. Jr. (2007) ‘The Neglected Heart of Science Policy: Reconciling Supply of and Demand for Science’, Environmental Science & Policy, 10: 5–16.

Scholten W., Van Drooge L., and Diederen P. (2018) Excellent Is Niet Gewoon. Dertig Jaar Focus op Excellentie in het Nederlandse Wetenschapsbeleid. The Hague: Rathenau Instituut.

Shapin S. (2008) The Scientific Life: A Moral History of a Late Modern Vocation. Chicago: University of Chicago Press.

Spaapen J. and Van Drooge L. (2011) ‘Introducing “Productive Interactions” in Social Impact Assessment’, Research Evaluation, 20: 211–8.

Travis G. D. L. and Collins H. M. (1991) ‘New Light on Old Boys: Cognitive and Institutional Particularism in the Peer Review System’, Science, Technology & Human Values, 16: 322–41.

Van Arensbergen P. and Van den Besselaar P. (2012) ‘The Selection of Scientific Talent in the Allocation of Research Grants’, Higher Education Policy, 25: 381–405.

Van Arensbergen P., Van der Weijden I., and Van den Besselaar P. V. D. (2014a) ‘The Selection of Talent as a Group Process: A Literature Review on the Social Dynamics of Decision Making in Grant Panels’, Research Evaluation, 23: 298–311.

——— (2014b) ‘Different Views on Scholarly Talent: What Are the Talents We Are Looking for in Science?’, Research Evaluation, 23: 273–84.

Van den Brink G., Scholten W., and Jansen T., eds (2016) Goed Werk voor Academici. Culemborg: Stichting Beroepseer.

Weingart P. (1999) ‘Scientific Expertise and Political Accountability: Paradoxes of Science in Politics’, Science & Public Policy, 26: 151–61.

Wessely S. (1998) ‘Peer Review of Grant Applications: What Do We Know?’, The Lancet, 352: 301–5.

  • Online ISSN 1471-5430
  • Print ISSN 0302-3427
  • Copyright © 2024 Oxford University Press

Research Evaluation

  • First Online: 23 June 2020

  • Carlo Ghezzi

  • The original version of this chapter was revised. A correction to this chapter can be found at https://doi.org/10.1007/978-3-030-45157-8_7

This chapter is about research evaluation. Evaluation is quintessential to research. It is traditionally performed through qualitative expert judgement. The chapter presents the main evaluation activities in which researchers can be engaged. It also introduces the current efforts towards devising quantitative research evaluation based on bibliometric indicators and critically discusses their limitations, along with their possible (limited and careful) use.

Change history

19 October 2021

The original version of the chapter was inadvertently published with an error. The chapter has now been corrected.

Notice that the taxonomy presented in Box 5.1 does not cover all kinds of scientific papers. As an example, it does not cover survey papers, which normally are not submitted to a conference.

Private institutions and industry may follow different schemes.

Author information

Authors and affiliations.

Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Milano, Italy

Carlo Ghezzi

Corresponding author

Correspondence to Carlo Ghezzi.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Ghezzi, C. (2020). Research Evaluation. In: Being a Researcher. Springer, Cham. https://doi.org/10.1007/978-3-030-45157-8_5

DOI: https://doi.org/10.1007/978-3-030-45157-8_5

Published: 23 June 2020

Publisher Name: Springer, Cham

Print ISBN: 978-3-030-45156-1

Online ISBN: 978-3-030-45157-8

eBook Packages: Computer Science, Computer Science (R0)

How to Write a Research Proposal | Examples & Templates

Published on 30 October 2022 by Shona McCombes and Tegan George. Revised on 13 June 2023.

Structure of a research proposal

A research proposal describes what you will investigate, why it’s important, and how you will conduct your research.

The format of a research proposal varies between fields, but most proposals will contain at least these elements:

  • Title page
  • Introduction
  • Literature review
  • Research design
  • Reference list

While the sections may vary, the overall objective is always the same. A research proposal serves as a blueprint and guide for your research plan, helping you get organised and feel confident in the path forward you choose to take.

Table of contents

  • Research proposal purpose
  • Research proposal examples
  • Research design and methods
  • Contribution to knowledge
  • Research schedule
  • Frequently asked questions

Academics often have to write research proposals to get funding for their projects. As a student, you might have to write a research proposal as part of a grad school application, or prior to starting your thesis or dissertation.

In addition to helping you figure out what your research can look like, a proposal can also serve to demonstrate why your project is worth pursuing to a funder, educational institution, or supervisor.

Research proposal length

The length of a research proposal can vary quite a bit. A bachelor’s or master’s thesis proposal can be just a few pages, while proposals for PhD dissertations or research funding are usually much longer and more detailed. Your supervisor can help you determine the best length for your work.

One trick to get started is to think of your proposal’s structure as a shorter version of your thesis or dissertation, only without the results, conclusion and discussion sections.


Writing a research proposal can be quite challenging, but a good starting point could be to look at some examples. We’ve included a few for you below.

  • Example research proposal #1: ‘A Conceptual Framework for Scheduling Constraint Management’
  • Example research proposal #2: ‘Medical Students as Mediators of Change in Tobacco Use’

Like your dissertation or thesis, the proposal will usually have a title page that includes:

  • The proposed title of your project
  • Your supervisor’s name
  • Your institution and department

The first part of your proposal is the initial pitch for your project. Make sure it succinctly explains what you want to do and why.

Your introduction should:

  • Introduce your topic
  • Give necessary background and context
  • Outline your problem statement and research questions

To guide your introduction, include information about:

  • Who could have an interest in the topic (e.g., scientists, policymakers)
  • How much is already known about the topic
  • What is missing from this current knowledge
  • What new insights your research will contribute
  • Why you believe this research is worth doing

As you get started, it’s important to demonstrate that you’re familiar with the most important research on your topic. A strong literature review shows your reader that your project has a solid foundation in existing knowledge or theory. It also shows that you’re not simply repeating what other people have already done or said, but rather using existing research as a jumping-off point for your own.

In this section, share exactly how your project will contribute to ongoing conversations in the field by:

  • Comparing and contrasting the main theories, methods, and debates
  • Examining the strengths and weaknesses of different approaches
  • Explaining how you will build on, challenge, or synthesise prior scholarship

Following the literature review, restate your main objectives. This brings the focus back to your own project. Next, your research design or methodology section will describe your overall approach, and the practical steps you will take to answer your research questions.

To finish your proposal on a strong note, explore the potential implications of your research for your field. Emphasise again what you aim to contribute and why it matters.

For example, your results might have implications for:

  • Improving best practices
  • Informing policymaking decisions
  • Strengthening a theory or model
  • Challenging popular or scientific beliefs
  • Creating a basis for future research

Last but not least, your research proposal must include correct citations for every source you have used, compiled in a reference list. To create citations quickly and easily, you can use our free APA citation generator.

Some institutions or funders require a detailed timeline of the project, asking you to forecast what you will do at each stage and how long it may take. A timeline is not always required, so be sure to check the requirements of your project.


If you are applying for research funding, chances are you will have to include a detailed budget. This shows your estimates of how much each part of your project will cost.

Make sure to check what type of costs the funding body will agree to cover. For each item, include:

  • Cost : exactly how much money do you need?
  • Justification : why is this cost necessary to complete the research?
  • Source : how did you calculate the amount?

To determine your budget, think about:

  • Travel costs : do you need to go somewhere to collect your data? How will you get there, and how much time will you need? What will you do there (e.g., interviews, archival research)?
  • Materials : do you need access to any tools or technologies?
  • Help : do you need to hire any research assistants for the project? What will they do, and how much will you pay them?

Once you’ve decided on your research objectives, you need to explain them in your paper, at the end of your problem statement.

Keep your research objectives clear and concise, and use appropriate verbs to accurately convey the work that you will carry out for each one.

For example: ‘I will compare …’

A research aim is a broad statement indicating the general purpose of your research project. It should appear in your introduction at the end of your problem statement, before your research objectives.

Research objectives are more specific than your research aim. They indicate the specific ways you’ll address the overarching aim.

A PhD, which is short for philosophiae doctor (doctor of philosophy in Latin), is the highest university degree that can be obtained. In a PhD, students spend 3–5 years writing a dissertation, which aims to make a significant, original contribution to current knowledge.

A PhD is intended to prepare students for a career as a researcher, whether that be in academia, the public sector, or the private sector.

A master’s is a 1- or 2-year graduate degree that can prepare you for a variety of careers.

All master’s degrees involve graduate-level coursework. Some are research-intensive and intended to prepare students for further study in a PhD; these usually require their students to write a master’s thesis. Others focus on professional training for a specific career.

Critical thinking refers to the ability to evaluate information and to be aware of biases or assumptions, including your own.

Like information literacy, it involves evaluating arguments, identifying and solving problems in an objective and systematic way, and clearly communicating your ideas.

Cite this Scribbr article


McCombes, S. & George, T. (2023, June 13). How to Write a Research Proposal | Examples & Templates. Scribbr. Retrieved 2 April 2024, from https://www.scribbr.co.uk/the-research-process/research-proposal-explained/


Readex Research

Evaluating Research Proposals

Comparing proposals “apples-to-apples” is crucial to establishing which one will best meet your needs. Consider these ideas to help you focus on the details that contribute to a successful survey.

Make sure the proposal responds to your objectives.

The proposal process begins well before you ask any research firm for a quote. The process really begins with the discussions you and your team have about objectives. What are your goals? What are the decisions you want to make when the project is done and you have data in hand?

Once you have a solid vision of the survey, it’s time to start talking with potential partners. Throughout your conversations, take note: Do the various firms ask you specific questions about your objectives, the group of people you’d like to survey, and your ultimate goals? Do they, indeed, ask about decisions that you wish to make? Details regarding your specific needs should always be front and center during the conversations.

Sampling plan.

When reviewing the sampling plan, make sure the proposal mentions sample size, response rate estimates, number of responses, and maximum sampling error. If you’re unsure of the impact these figures have on the quality of your results, ask the researcher. They should be able to explain them in terms you can understand.
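To see how these figures relate to one another, here is a minimal Python sketch. It is an illustration only, not any research firm's method: it assumes simple random sampling from a large population, a 95% confidence level (z = 1.96), and the worst-case proportion of 0.5, which is a common way to state a "maximum" sampling error. The sample size and response rate are hypothetical.

```python
import math


def expected_responses(sample_size: int, response_rate: float) -> int:
    """Estimate the number of completed surveys from the sample size and response rate."""
    return round(sample_size * response_rate)


def max_sampling_error(responses: int, z: float = 1.96) -> float:
    """Worst-case margin of error for a proportion (p = 0.5).

    Assumes simple random sampling and a population much larger than the sample.
    """
    if responses <= 0:
        raise ValueError("responses must be positive")
    return z * math.sqrt(0.25 / responses)


# Hypothetical proposal figures, for illustration only.
sample_size = 1000
response_rate = 0.40

n = expected_responses(sample_size, response_rate)
moe = max_sampling_error(n)
print(f"{n} responses, maximum sampling error of about {moe * 100:.1f} percentage points")
```

Because the error shrinks with the square root of the number of responses, halving it requires roughly four times as many responses. That trade-off is worth raising when a proposal quotes a sample size.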


The quantity and types of information sought from respondents will impact cost. Quantity encompasses the number of questions and number of variables to process. Type refers to how the questions will be processed, the data entry involved and whether all or just some data will be cleaned.

No evaluation is complete until you know the approximate number and types of questions planned for the survey. The number of open-ended questions should be included as well because open-ended questions that capture verbatim responses can impact the response rate and possibly the price of your survey, especially if done by mail.

In addition, make sure the proposal clearly indicates who will develop the questionnaire content. Also, determine if it includes enough collaboration time to be sufficiently customized to meet your particular needs.

Data collection approach.

For online surveys, pay attention to the data collection series and who is responsible for sending survey invitations. Multiple emails to sample members can encourage response. The invitation process should also be sensitive to data privacy requirements such as those set out in GDPR. Proposals for mailed surveys should clearly outline the data collection series and each component of the survey kit.

Data processing.

Any proposal you receive should highlight the steps the research company will take to make sure that the data is accurate and representative. Depending on the type of survey, checking logic, consistency, and outliers can take a significant amount of time. You must have some process noted to identify inconsistent answers for surveys that collect a significant amount of numerical data (salary survey, market studies, budget planning). Finally, some percentage of mailed surveys need to be verified for data entry accuracy.
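As a rough illustration of what consistency and outlier checks can look like for numerical survey data, the Python sketch below flags suspicious records for manual review. The field names (`base_salary`, `total_comp`), the sample records, and the choice of Tukey's interquartile-range rule are all assumptions made for this example. A research firm may well use different checks.

```python
import statistics


def flag_outliers(values, k=1.5):
    """Return indices of values outside Tukey's fences (Q1 - k*IQR, Q3 + k*IQR)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]


def flag_inconsistent(records):
    """Return indices of records where base salary exceeds total compensation."""
    return [i for i, r in enumerate(records) if r["base_salary"] > r["total_comp"]]


# Hypothetical salary-survey responses. The last salary looks like a
# data-entry slip (an extra zero), and the second record is internally inconsistent.
records = [
    {"base_salary": 52000, "total_comp": 56000},
    {"base_salary": 55000, "total_comp": 50000},
    {"base_salary": 58000, "total_comp": 61000},
    {"base_salary": 60000, "total_comp": 64000},
    {"base_salary": 61000, "total_comp": 66000},
    {"base_salary": 63000, "total_comp": 70000},
    {"base_salary": 65000, "total_comp": 71000},
    {"base_salary": 640000, "total_comp": 645000},
]

salaries = [r["base_salary"] for r in records]
outlier_rows = flag_outliers(salaries)
inconsistent_rows = flag_inconsistent(records)
print("Outlier rows:", outlier_rows)
print("Inconsistent rows:", inconsistent_rows)
```

Flagged rows are candidates for follow-up, not automatic deletions. A genuine but unusual answer should stay in the data.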

A straightforward analysis of survey data can meet many objectives. In other cases, a multivariate statistical analysis will provide deeper insights to achieve your objectives, making results easier to use. If your objectives include learning about separate segments of your circulation, crosstabulations should be specified.
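A crosstabulation simply counts answers separately for each segment. The Python sketch below shows the idea with made-up data. The segment names, the question, and the `crosstab` helper are hypothetical and exist only for this illustration.

```python
from collections import Counter


def crosstab(rows, row_key, col_key):
    """Count respondents for every (row value, column value) pair."""
    return Counter((r[row_key], r[col_key]) for r in rows)


# Hypothetical responses: subscriber segment versus the answer to one question.
responses = [
    {"segment": "print", "renew": "yes"},
    {"segment": "print", "renew": "no"},
    {"segment": "digital", "renew": "yes"},
    {"segment": "digital", "renew": "yes"},
    {"segment": "print", "renew": "yes"},
]

table = crosstab(responses, "segment", "renew")
for segment in sorted({r["segment"] for r in responses}):
    yes = table[(segment, "yes")]
    no = table[(segment, "no")]
    print(f"{segment:8} yes={yes} no={no}")
```

Each segment in a crosstab has fewer responses than the survey as a whole, so its sampling error is larger than the overall figure quoted in the proposal.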


A variety of reporting options exist for a survey. These include but are not limited to data tables, a summary of the results, in-depth analysis, and graphed presentations. As a result, you need to understand exactly what you’ll receive following your survey and in what format.

No surprises!

Make sure the proposal covers all the bases: what you need to do and provide, what the firm will do, when they will do it, and how much it will cost. There should be no surprises in what you need to supply. No “you need how much letterhead and envelopes?” a week before your survey is scheduled to mail. Review the price carefully and understand what it includes and doesn’t include. As with many things in life, you usually get what you pay for.


Grad Coach

How To Write A Research Proposal

A Straightforward How-To Guide (With Examples)

By: Derek Jansen (MBA) | Reviewed By: Dr. Eunice Rautenbach | August 2019 (Updated April 2023)

Writing up a strong research proposal for a dissertation or thesis is much like a marriage proposal. It’s a task that calls on you to win somebody over and persuade them that what you’re planning is a great idea. An idea they’re happy to say ‘yes’ to. This means that your dissertation proposal needs to be persuasive, attractive and well-planned. In this post, I’ll show you how to write a winning dissertation proposal, from scratch.

Before you start:

  • Understand exactly what a research proposal is
  • Ask yourself these 4 questions

The 5 essential ingredients:

  • The title/topic
  • The introduction chapter
  • The scope/delimitations
  • Preliminary literature review
  • Design/methodology
  • Practical considerations and risks

What Is A Research Proposal?

The research proposal is literally that: a written document that communicates what you propose to research, in a concise format. It’s where you put all that stuff that’s spinning around in your head down on to paper, in a logical, convincing fashion.

Convincing is the keyword here, as your research proposal needs to convince the assessor that your research is clearly articulated (i.e., a clear research question), worth doing (i.e., is unique and valuable enough to justify the effort), and doable within the restrictions you’ll face (time limits, budget, skill limits, etc.). If your proposal does not address these three criteria, your research won’t be approved, no matter how “exciting” the research idea might be.


How do I know I’m ready?

Before starting the writing process, you need to ask yourself 4 important questions. If you can’t answer them succinctly and confidently, you’re not ready – you need to go back and think more deeply about your dissertation topic.

You should be able to answer the following 4 questions before starting your dissertation or thesis research proposal:

  • WHAT is my main research question? (the topic)
  • WHO cares and why is this important? (the justification)
  • WHAT data would I need to answer this question, and how will I analyse it? (the research design)
  • HOW will I manage the completion of this research, within the given timelines? (project and risk management)

If you can’t answer these questions clearly and concisely, you’re not yet ready to write your research proposal – revisit our post on choosing a topic.

If you can, that’s great – it’s time to start writing up your dissertation proposal. Next, I’ll discuss what needs to go into your research proposal, and how to structure it all into an intuitive, convincing document with a linear narrative.

The 5 Essential Ingredients

Research proposals can vary in style between institutions and disciplines, but here I’ll share with you a handy 5-section structure you can use. These 5 sections directly address the core questions we spoke about earlier, ensuring that you present a convincing proposal. If your institution already provides a proposal template, there will likely be substantial overlap with this, so you’ll still get value from reading on.

For each section discussed below, make sure you use headers and sub-headers (ideally, numbered headers) to help the reader navigate through your document, and to support them when they need to revisit a previous section. Don’t just present an endless wall of text, paragraph after paragraph after paragraph…

Top Tip: Use MS Word Styles to format headings. This will allow you to be clear about whether a sub-heading is level 2, 3, or 4. Additionally, you can view your document in ‘outline view’ which will show you only your headings. This makes it much easier to check your structure, shift things around and make decisions about where a section needs to sit. You can also generate a 100% accurate table of contents using Word’s automatic functionality.


Ingredient #1 – Topic/Title Header

Your research proposal’s title should be your main research question in its simplest form, possibly with a sub-heading providing basic details on the specifics of the study. For example:

“Compliance with equality legislation in the charity sector: a study of the ‘reasonable adjustments’ made in three London care homes”

As you can see, this title provides a clear indication of what the research is about, in broad terms. It paints a high-level picture for the first-time reader, which gives them a taste of what to expect. Always aim for a clear, concise title. Don’t feel the need to capture every detail of your research in your title – your proposal will fill in the gaps.


Ingredient #2 – Introduction

In this section of your research proposal, you’ll expand on what you’ve communicated in the title, by providing a few paragraphs which offer more detail about your research topic. Importantly, the focus here is the topic – what will you research and why is that worth researching? This is not the place to discuss methodology, practicalities, etc. – you’ll do that later.

You should cover the following:

  • An overview of the broad area you’ll be researching – introduce the reader to key concepts and language
  • An explanation of the specific (narrower) area you’ll be focusing on, and why you’ll be focusing there
  • Your research aims and objectives
  • Your research question(s) and sub-questions (if applicable)

Importantly, you should aim to use short sentences and plain language – don’t babble on with extensive jargon, acronyms and complex language. Assume that the reader is an intelligent layman – not a subject area specialist (even if they are). Remember that the best writing is writing that can be easily understood and digested. Keep it simple.

The introduction section serves to expand on the research topic – what will you study and why is that worth dedicating time and effort to?

Note that some universities may want some extra bits and pieces in your introduction section. For example, personal development objectives, a structural outline, etc. Check your brief to see if there are any other details they expect in your proposal, and make sure you find a place for these.

Ingredient #3 – Scope

Next, you’ll need to specify what the scope of your research will be – this is also known as the delimitations. In other words, you need to make it clear what you will be covering and, more importantly, what you won’t be covering in your research. Simply put, this is about ring-fencing your research topic so that you have a laser-sharp focus.

All too often, students feel the need to go broad and try to address as many issues as possible, in the interest of producing comprehensive research. Whilst this is admirable, it’s a mistake. By tightly refining your scope, you’ll enable yourself to go deep with your research, which is what you need to earn good marks. If your scope is too broad, you’re likely to end up with superficial research (which won’t earn marks), so don’t be afraid to narrow things down.

Ingredient #4 – Literature Review

In this section of your research proposal, you need to provide a (relatively) brief discussion of the existing literature. Naturally, this will not be as comprehensive as the literature review in your actual dissertation, but it will lay the foundation for that. In fact, if you put in the effort at this stage, you’ll make your life a lot easier when it’s time to write your actual literature review chapter.

There are a few things you need to achieve in this section:

  • Demonstrate that you’ve done your reading and are familiar with the current state of the research in your topic area.
  • Show that there’s a clear gap for your specific research – i.e., show that your topic is sufficiently unique and will add value to the existing research.
  • Show how the existing research has shaped your thinking regarding research design. For example, you might use scales or questionnaires from previous studies.

When you write up your literature review, keep these three objectives front of mind, especially number two (revealing the gap in the literature), so that your literature review has a clear purpose and direction. Everything you write should be contributing towards one (or more) of these objectives in some way. If it doesn’t, you need to ask yourself whether it’s truly needed.

Top Tip: Don’t fall into the trap of just describing the main pieces of literature, for example, “A says this, B says that, C also says that…” and so on. Merely describing the literature provides no value. Instead, you need to synthesise it, and use it to address the three objectives above.


Ingredient #5 – Research Methodology

Now that you’ve clearly explained both your intended research topic (in the introduction) and the existing research it will draw on (in the literature review section), it’s time to get practical and explain exactly how you’ll be carrying out your own research. In other words, your research methodology.

In this section, you’ll need to answer two critical questions:

  • How will you design your research? I.e., what research methodology will you adopt, what will your sample be, how will you collect data, etc.
  • Why have you chosen this design? I.e., why does this approach suit your specific research aims, objectives and questions?

In other words, this is not just about explaining WHAT you’ll be doing, it’s also about explaining WHY. In fact, the justification is the most important part, because that justification is how you demonstrate a good understanding of research design (which is what assessors want to see).

Some essential design choices you need to cover in your research proposal include:

  • Your intended research philosophy (e.g., positivism, interpretivism or pragmatism)
  • What methodological approach you’ll be taking (e.g., qualitative, quantitative or mixed)
  • The details of your sample (e.g., sample size, who they are, who they represent, etc.)
  • What data you plan to collect (i.e., data about what, in what form?)
  • How you plan to collect it (e.g., surveys, interviews, focus groups, etc.)
  • How you plan to analyse it (e.g., regression analysis, thematic analysis, etc.)
  • Ethical adherence (i.e., does this research satisfy all ethical requirements of your institution, or does it need further approval?)

This list is not exhaustive – these are just some core attributes of research design. Check with your institution what level of detail they expect. The “research onion” by Saunders et al. (2009) provides a good summary of the various design choices you ultimately need to make.

Don’t forget the practicalities…

In addition to the technical aspects, you will need to address the practical side of the project. In other words, you need to explain what resources you’ll need (e.g., time, money, access to equipment or software, etc.) and how you intend to secure these resources. You need to show that your project is feasible, so any “make or break” type resources need to already be secured. The success or failure of your project cannot depend on some resource which you’re not yet sure you have access to.

Another part of the practicalities discussion is project and risk management. In other words, you need to show that you have a clear project plan to tackle your research with. Some key questions to address:

  • What are the timelines for each phase of your project?
  • Are the time allocations reasonable?
  • What happens if something takes longer than anticipated (risk management)?
  • What happens if you don’t get the response rate you expect?

A good way to demonstrate that you’ve thought this through is to include a Gantt chart and a risk register (in the appendix if word count is a problem). With these two tools, you can show that you’ve got a clear, feasible plan, and you’ve thought about and accounted for the potential risks.


Tip – Be honest about the potential difficulties – but show that you are anticipating solutions and workarounds. This is much more impressive to an assessor than an unrealistically optimistic proposal which does not anticipate any challenges whatsoever.

Final Touches: Read And Simplify

The final step is to edit and proofread your proposal – very carefully. It sounds obvious, but all too often poor editing and proofreading ruin a good proposal. Nothing is more off-putting for an assessor than a poorly edited, typo-strewn document. It sends the message that you either do not pay attention to detail, or just don’t care. Neither of these is a good message. Put the effort into editing and proofreading your proposal (or pay someone to do it for you) – it will pay dividends.

When you’re editing, watch out for ‘academese’. Many students can speak simply, passionately and clearly about their dissertation topic – but become incomprehensible the moment they turn the laptop on. You are not required to write in any kind of special, formal, complex language when you write academic work. Sure, there may be technical terms, jargon specific to your discipline, shorthand terms and so on. But, apart from those, keep your written language very close to natural spoken language – just as you would speak in the classroom. Imagine that you are explaining your project plans to your classmates or a family member. Remember, write for the intelligent layman, not the subject matter experts. Plain-language, concise writing is what wins hearts and minds – and marks!

Let’s Recap: Research Proposal 101

And there you have it – how to write your dissertation or thesis research proposal, from the title page to the final proof. Here’s a quick recap of the key takeaways:

  • The purpose of the research proposal is to convince – therefore, you need to make a clear, concise argument of why your research is both worth doing and doable.
  • Make sure you can answer the critical what, who, and how questions of your research before you put pen to paper.
  • Title – provides the first taste of your research, in broad terms
  • Introduction – explains what you’ll be researching in more detail
  • Scope – explains the boundaries of your research
  • Literature review – explains how your research fits into the existing research and why it’s unique and valuable
  • Research methodology – explains and justifies how you will carry out your own research

Hopefully, this post has helped you better understand how to write up a winning research proposal. If you enjoyed it, be sure to check out the rest of the Grad Coach Blog. If your university doesn’t provide any template for your proposal, you might want to try out our free research proposal template.



Mazwakhe Mkhulisi

Thank you so much for the valuable insight that you have given, especially on the research proposal. That is what I have managed to cover. I still need to go back to the other parts as I got disturbed while still listening to Derek’s audio on you-tube. I am inspired. I will definitely continue with Grad-coach guidance on You-tube.

Derek Jansen

Thanks for the kind words :). All the best with your proposal.


First of all, thanks a lot for making such a wonderful presentation. The video was really useful and gave me a very clear insight of how a research proposal has to be written. I shall try implementing these ideas in my RP.

Once again, I thank you for this content.

Bonginkosi Mshengu

I found reading your outline on writing research proposal very beneficial. I wish there was a way of submitting my draft proposal to you guys for critiquing before I submit to the institution.

Hi Bonginkosi

Thank you for the kind words. Yes, we do provide a review service. The best starting point is to have a chat with one of our coaches here: https://gradcoach.com/book/new/ .

Erick Omondi

Hello team GRADCOACH, may God bless you so much. I was totally green in research. Am so happy for your free superb tutorials and resources. Once again thank you so much Derek and his team.

You’re welcome, Erick. Good luck with your research proposal 🙂


thank you for the information. its precise and on point.

Nighat Nighat Ahsan

Really a remarkable piece of writing and great source of guidance for the researchers. GOD BLESS YOU for your guidance. Regards

Delfina Celeste Danca Rangel

Thanks so much for your guidance. It is easy and comprehensive the way you explain the steps for a winning research proposal.

Desiré Forku

Thank you guys so much for the rich post. I enjoyed and learn from every word in it. My problem now is how to get into your platform wherein I can always seek help on things related to my research work ? Secondly, I wish to find out if there is a way I can send my tentative proposal to you guys for examination before I take to my supervisor Once again thanks very much for the insights

Thanks for your kind words, Desire.

If you are based in a country where Grad Coach’s paid services are available, you can book a consultation by clicking the “Book” button in the top right.

Best of luck with your studies.


May God bless you team for the wonderful work you are doing,

If I have a topic, Can I submit it to you so that you can draft a proposal for me?? As I am expecting to go for masters degree in the near future.

Thanks for your comment. We definitely cannot draft a proposal for you, as that would constitute academic misconduct. The proposal needs to be your own work. We can coach you through the process, but it needs to be your own work and your own writing.

Best of luck with your research!

kenate Akuma

I found a lot of many essential concepts from your material. it is real a road map to write a research proposal. so thanks a lot. If there is any update material on your hand on MBA please forward to me.

Ahmed Khalil

GradCoach is a professional website that presents support and helps for MBA student like me through the useful online information on the page and with my 1-on-1 online coaching with the amazing and professional PhD Kerryen.

Thank you Kerryen so much for the support and help 🙂

I really recommend dealing with such a reliable services provider like Gradcoah and a coach like Kerryen.


Hi, Am happy for your service and effort to help students and researchers, Please, i have been given an assignment on research for strategic development, the task one is to formulate a research proposal to support the strategic development of a business area, my issue here is how to go about it, especially the topic or title and introduction. Please, i would like to know if you could help me and how much is the charge.

Marcos A. López Figueroa

This content is practical, valuable, and just great!

Thank you very much!

Eric Rwigamba

Hi Derek, Thank you for the valuable presentation. It is very helpful especially for beginners like me. I am just starting my PhD.


This is quite instructive and research proposal made simple. Can I have a research proposal template?

Mathew Yokie Musa

Great! Thanks for rescuing me, because I had no former knowledge in this topic. But with this piece of information, I am now secured. Thank you once more.

Chulekazi Bula

I enjoyed listening to your video on how to write a proposal. I think I will be able to write a winning proposal with your advice. I wish you were to be my supervisor.

Mohammad Ajmal Shirzad

Dear Derek Jansen,

Thank you for your great content. I couldn’t learn these topics in MBA, but now I learned from GradCoach. Really appreciate your efforts….

From Afghanistan!

Mulugeta Yilma

I have got very essential inputs for startup of my dissertation proposal. Well organized properly communicated with video presentation. Thank you for the presentation.

Siphesihle Macu

Wow, this is absolutely amazing guys. Thank you so much for the fruitful presentation, you’ve made my research much easier.


this helps me a lot. thank you all so much for impacting in us. may god richly bless you all

June Pretzer

How I wish I’d learn about Grad Coach earlier. I’ve been stumbling around writing and rewriting! Now I have concise clear directions on how to put this thing together. Thank you!


Fantastic!! Thank You for this very concise yet comprehensive guidance.

Fikiru Bekele

Even if I am poor in English I would like to thank you very much.


11.2 Steps in Developing a Research Proposal

Learning Objectives

  • Identify the steps in developing a research proposal.
  • Choose a topic and formulate a research question and working thesis.
  • Develop a research proposal.

Writing a good research paper takes time, thought, and effort. Although this assignment is challenging, it is manageable. Focusing on one step at a time will help you develop a thoughtful, informative, well-supported research paper.

Your first step is to choose a topic and then to develop research questions, a working thesis, and a written research proposal. Set aside adequate time for this part of the process. Fully exploring ideas will help you build a solid foundation for your paper.

Choosing a Topic

When you choose a topic for a research paper, you are making a major commitment. Your choice will help determine whether you enjoy the lengthy process of research and writing—and whether your final paper fulfills the assignment requirements. If you choose your topic hastily, you may later find it difficult to work with your topic. By taking your time and choosing carefully, you can ensure that this assignment is not only challenging but also rewarding.

Writers understand the importance of choosing a topic that fulfills the assignment requirements and fits the assignment’s purpose and audience. (For more information about purpose and audience, see Chapter 6 “Writing Paragraphs: Separating Ideas and Shaping Content”.) Choosing a topic that interests you is also crucial. Your instructor may provide a list of suggested topics or ask that you develop a topic on your own. In either case, try to identify topics that genuinely interest you.

After identifying potential topic ideas, you will need to evaluate your ideas and choose one topic to pursue. Will you be able to find enough information about the topic? Can you develop a paper about this topic that presents and supports your original ideas? Is the topic too broad or too narrow for the scope of the assignment? If so, can you modify it so it is more manageable? You will ask these questions during this preliminary phase of the research process.

Identifying Potential Topics

Sometimes, your instructor may provide a list of suggested topics. If so, you may benefit from identifying several possibilities before committing to one idea. It is important to know how to narrow down your ideas into a concise, manageable thesis. You may also use the list as a starting point to help you identify additional, related topics. Discussing your ideas with your instructor will help ensure that you choose a manageable topic that fits the requirements of the assignment.

In this chapter, you will follow a writer named Jorge, who is studying health care administration, as he prepares a research paper. You will also plan, research, and draft your own research paper.

Jorge was assigned to write a research paper on health and the media for an introductory course in health care. Although a general topic was selected for the students, Jorge had to decide which specific issues interested him. He brainstormed a list of possibilities.

If you are writing a research paper for a specialized course, look back through your notes and course activities. Identify reading assignments and class discussions that especially engaged you. Doing so can help you identify topics to pursue.

  • Health Maintenance Organizations (HMOs) in the news
  • Sexual education programs
  • Hollywood and eating disorders
  • Americans’ access to public health information
  • Media portrayal of health care reform bill
  • Depictions of drugs on television
  • The effect of the Internet on mental health
  • Popularized diets (such as low-carbohydrate diets)
  • Fear of pandemics (bird flu, H1N1, SARS)
  • Electronic entertainment and obesity
  • Advertisements for prescription drugs
  • Public education and disease prevention

Set a timer for five minutes. Use brainstorming or idea mapping to create a list of topics you would be interested in researching for a paper about the influence of the Internet on social networking. Do you closely follow the media coverage of a particular website, such as Twitter? Would you like to learn more about a certain industry, such as online dating? Which social networking sites do you and your friends use? List as many ideas related to this topic as you can.

Narrowing Your Topic

Once you have a list of potential topics, you will need to choose one as the focus of your essay. You will also need to narrow your topic. Most writers find that the topics they listed during brainstorming or idea mapping are broad—too broad for the scope of the assignment. Working with an overly broad topic, such as sexual education programs or popularized diets, can be frustrating and overwhelming. Each topic has so many facets that it would be impossible to cover them all in a college research paper. However, more specific choices, such as the pros and cons of sexual education in kids’ television programs or the physical effects of the South Beach diet, are specific enough to write about without being too narrow to sustain an entire research paper.

A good research paper provides focused, in-depth information and analysis. If your topic is too broad, you will find it difficult to do more than skim the surface when you research it and write about it. Narrowing your focus is essential to making your topic manageable. To narrow your focus, explore your topic in writing, conduct preliminary research, and discuss both the topic and the research with others.

Exploring Your Topic in Writing

“How am I supposed to narrow my topic when I haven’t even begun researching yet?” In fact, you may already know more than you realize. Review your list and identify your top two or three topics. Set aside some time to explore each one through freewriting. (For more information about freewriting, see Chapter 8 “The Writing Process: How Do I Begin?”.) Simply taking the time to focus on your topic may yield fresh angles.

Jorge knew that he was especially interested in the topic of diet fads, but he also knew that it was much too broad for his assignment. He used freewriting to explore his thoughts so he could narrow his topic. Read Jorge’s ideas.

Conducting Preliminary Research

Another way writers may focus a topic is to conduct preliminary research. Like freewriting, exploratory reading can help you identify interesting angles. Surfing the web and browsing through newspaper and magazine articles are good ways to start. Find out what people are saying about your topic on blogs and online discussion groups. Discussing your topic with others can also inspire you. Talk about your ideas with your classmates, your friends, or your instructor.

Jorge’s freewriting exercise helped him realize that the assigned topic of health and the media intersected with a few of his interests—diet, nutrition, and obesity. Preliminary online research and discussions with his classmates strengthened his impression that many people are confused or misled by media coverage of these subjects.

Jorge decided to focus his paper on a topic that had garnered a great deal of media attention—low-carbohydrate diets. He wanted to find out whether low-carbohydrate diets were as effective as their proponents claimed.

Writing at Work

At work, you may need to research a topic quickly to find general information. This information can be useful in understanding trends in a given industry or generating competition. For example, a company may research a competitor’s prices and use the information when pricing their own product. You may find it useful to skim a variety of reliable sources and take notes on your findings.

The reliability of online sources varies greatly. In this exploratory phase of your research, you do not need to evaluate sources as closely as you will later. However, use common sense as you refine your paper topic. If you read a fascinating blog comment that gives you a new idea for your paper, be sure to check out other, more reliable sources as well to make sure the idea is worth pursuing.

Review the list of topics you created in Note 11.18 “Exercise 1” and identify two or three topics you would like to explore further. For each of these topics, spend five to ten minutes writing about the topic without stopping. Then review your writing to identify possible areas of focus.

Set aside time to conduct preliminary research about your potential topics. Then choose a topic to pursue for your research paper.


Please share your topic list with a classmate. Select one or two topics on his or her list that you would like to learn more about and return it to him or her. Discuss why you found the topics interesting, and learn which of your topics your classmate selected and why.

A Plan for Research

Your freewriting and preliminary research have helped you choose a focused, manageable topic for your research paper. To work with your topic successfully, you will need to determine what exactly you want to learn about it—and later, what you want to say about it. Before you begin conducting in-depth research, you will further define your focus by developing a research question, a working thesis, and a research proposal.

Formulating a Research Question

In forming a research question, you are setting a goal for your research. Your main research question should be substantial enough to form the guiding principle of your paper—but focused enough to guide your research. A strong research question requires you not only to find information but also to put together different pieces of information, interpret and analyze them, and figure out what you think. As you consider potential research questions, ask yourself whether they would be too hard or too easy to answer.

To determine your research question, review the freewriting you completed earlier. Skim through books, articles, and websites and list the questions you have. (You may wish to use the 5WH strategy to help you formulate questions. See Chapter 8 “The Writing Process: How Do I Begin?” for more information about 5WH questions.) Include simple, factual questions and more complex questions that would require analysis and interpretation. Determine your main question—the primary focus of your paper—and several subquestions that you will need to research to answer your main question.

Here are the research questions Jorge will use to focus his research. Notice that his main research question has no obvious, straightforward answer. Jorge will need to research his subquestions, which address narrower topics, to answer his main question.

Using the topic you selected in Note 11.24 “Exercise 2”, write your main research question and four to five subquestions. Check that your main research question is appropriately complex for your assignment.

Constructing a Working Thesis

A working thesis concisely states a writer’s initial answer to the main research question. It does not merely state a fact or present a subjective opinion. Instead, it expresses a debatable idea or claim that you hope to prove through additional research. Your working thesis is called a working thesis for a reason—it is subject to change. As you learn more about your topic, you may change your thinking in light of your research findings. Let your working thesis serve as a guide to your research, but do not be afraid to modify it based on what you learn.

Jorge began his research with a strong point of view based on his preliminary writing and research. Read his working thesis statement, which presents the point he will argue. Notice how it states Jorge’s tentative answer to his research question.

One way to determine your working thesis is to consider how you would complete sentences such as “I believe” or “My opinion is.” However, keep in mind that academic writing generally does not use first-person pronouns. These statements are useful starting points, but formal research papers use an objective voice.

Write a working thesis statement that presents your preliminary answer to the research question you wrote in Note 11.27 “Exercise 3”. Check that your working thesis statement presents an idea or claim that could be supported or refuted by evidence from research.

Creating a Research Proposal

A research proposal is a brief document—no more than one typed page—that summarizes the preliminary work you have completed. Your purpose in writing it is to formalize your plan for research and present it to your instructor for feedback. In your research proposal, you will present your main research question, related subquestions, and working thesis. You will also briefly discuss the value of researching this topic and indicate how you plan to gather information.

When Jorge began drafting his research proposal, he realized that he had already created most of the pieces he needed. However, he knew he also had to explain how his research would be relevant to other future health care professionals. In addition, he wanted to form a general plan for doing the research and identifying potentially useful sources. Read Jorge’s research proposal.


Before you begin a new project at work, you may have to develop a project summary document that states the purpose of the project, explains why it would be a wise use of company resources, and briefly outlines the steps involved in completing the project. This type of document is similar to a research proposal. Both documents define and limit a project, explain its value, discuss how to proceed, and identify what resources you will use.

Writing Your Own Research Proposal

Now you may write your own research proposal, if you have not done so already. Follow the guidelines provided in this lesson.

Key Takeaways

  • Developing a research proposal involves the following preliminary steps: identifying potential ideas, choosing ideas to explore further, choosing and narrowing a topic, formulating a research question, and developing a working thesis.
  • A good topic for a research paper interests the writer and fulfills the requirements of the assignment.
  • Defining and narrowing a topic helps writers conduct focused, in-depth research.
  • Writers conduct preliminary research to identify possible topics and research questions and to develop a working thesis.
  • A good research question interests readers, is neither too broad nor too narrow, and has no obvious answer.
  • A good working thesis expresses a debatable idea or claim that can be supported with evidence from research.
  • Writers create a research proposal to present their topic, main research question, subquestions, and working thesis to an instructor for approval or feedback.

Writing for Success Copyright © 2015 by University of Minnesota is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.

Organizing Your Social Sciences Research Assignments

Writing a Research Proposal

The goal of a research proposal is twofold: to present and justify the need to study a research problem and to present the practical ways in which the proposed study should be conducted. The design elements and procedures for conducting research are governed by standards of the predominant discipline in which the problem resides; therefore, the guidelines for research proposals are more exacting and less formal than a general project proposal. Research proposals contain extensive literature reviews. They must provide persuasive evidence that a need exists for the proposed study. In addition to providing a rationale, a proposal describes a detailed methodology for conducting the research, consistent with the requirements of the professional or academic field, and a statement on the anticipated outcomes and benefits derived from the study's completion.

Krathwohl, David R. How to Prepare a Dissertation Proposal: Suggestions for Students in Education and the Social and Behavioral Sciences. Syracuse, NY: Syracuse University Press, 2005.

How to Approach Writing a Research Proposal

Your professor may assign the task of writing a research proposal for the following reasons:

  • Develop your skills in thinking about and designing a comprehensive research study;
  • Learn how to conduct a comprehensive review of the literature to determine that the research problem has not been adequately addressed or has been answered ineffectively and, in so doing, become better at locating pertinent scholarship related to your topic;
  • Improve your general research and writing skills;
  • Practice identifying the logical steps that must be taken to accomplish one's research goals;
  • Critically review, examine, and consider the use of different methods for gathering and analyzing data related to the research problem; and,
  • Nurture a sense of inquisitiveness within yourself and help you see yourself as an active participant in the process of conducting scholarly research.

A proposal should contain all the key elements involved in designing a completed research study, with sufficient information that allows readers to assess the validity and usefulness of your proposed study. The only elements missing from a research proposal are the findings of the study and your analysis of those findings. Finally, an effective proposal is judged on the quality of your writing and, therefore, it is important that your proposal is coherent, clear, and compelling.

Regardless of the research problem you are investigating and the methodology you choose, all research proposals must address the following questions:

  • What do you plan to accomplish? Be clear and succinct in defining the research problem and what it is you are proposing to investigate.
  • Why do you want to do the research? In addition to detailing your research design, you also must conduct a thorough review of the literature and provide convincing evidence that it is a topic worthy of in-depth study. A successful research proposal must answer the "So What?" question.
  • How are you going to conduct the research? Be sure that what you propose is doable. If you're having difficulty formulating a research problem to propose investigating, go here for strategies in developing a problem to study.

Common Mistakes to Avoid

  • Failure to be concise. A research proposal must be focused and not be "all over the map" or diverge into unrelated tangents without a clear sense of purpose.
  • Failure to cite landmark works in your literature review. Proposals should be grounded in foundational research that provides a basis for understanding the development and scope of the topic and its relevance.
  • Failure to delimit the contextual scope of your research [e.g., time, place, people, etc.]. As with any research paper, your proposed study must inform the reader how and in what ways the study will frame the problem.
  • Failure to develop a coherent and persuasive argument for the proposed research. This is critical. In many workplace settings, the research proposal is a formal document intended to argue for why a study should be funded.
  • Sloppy or imprecise writing, or poor grammar. Although a research proposal does not represent a completed research study, there is still an expectation that it is well-written and follows the style and rules of good academic writing.
  • Too much detail on minor issues, but not enough detail on major issues. Your proposal should focus on only a few key research questions in order to support the argument that the research needs to be conducted. Minor issues, even if valid, can be mentioned but they should not dominate the overall narrative.

Procter, Margaret. The Academic Proposal.  The Lab Report. University College Writing Centre. University of Toronto; Sanford, Keith. Information for Students: Writing a Research Proposal. Baylor University; Wong, Paul T. P. How to Write a Research Proposal. International Network on Personal Meaning. Trinity Western University; Writing Academic Proposals: Conferences, Articles, and Books. The Writing Lab and The OWL. Purdue University; Writing a Research Proposal. University Library. University of Illinois at Urbana-Champaign.

Structure and Writing Style

Beginning the Proposal Process

As with writing most college-level academic papers, research proposals are generally organized the same way throughout most social science disciplines. The text of a proposal generally varies in length between ten and thirty-five pages, followed by the list of references. However, before you begin, read the assignment carefully and, if anything seems unclear, ask your professor whether there are any specific requirements for organizing and writing the proposal.

A good place to begin is to ask yourself a series of questions:

  • What do I want to study?
  • Why is the topic important?
  • How is it significant within the subject areas covered in my class?
  • What problems will it help solve?
  • How does it build upon [and hopefully go beyond] research already conducted on the topic?
  • What exactly should I plan to do, and can I get it done in the time available?

In general, a compelling research proposal should document your knowledge of the topic and demonstrate your enthusiasm for conducting the study. Approach it with the intention of leaving your readers feeling like, "Wow, that's an exciting idea and I can’t wait to see how it turns out!"

Most proposals should include the following sections:

I.  Introduction

In the real world of higher education, a research proposal is most often written by scholars seeking grant funding for a research project or it's the first step in getting approval to write a doctoral dissertation. Even if this is just a course assignment, treat your introduction as the initial pitch of an idea based on a thorough examination of the significance of a research problem. After reading the introduction, your readers should not only have an understanding of what you want to do, but they should also be able to gain a sense of your passion for the topic and to be excited about the study's possible outcomes. Note that most proposals do not include an abstract [summary] before the introduction.

Think about your introduction as a narrative written in two to four paragraphs that succinctly answers the following four questions:

  • What is the central research problem?
  • What is the topic of study related to that research problem?
  • What methods should be used to analyze the research problem?
  • Why is this important research, what is its significance, and why should someone reading the proposal care about the outcomes of the proposed study? This is the "So What?" question.

II.  Background and Significance

This is where you explain the scope and context of your proposal and describe in detail why it's important. It can be melded into your introduction or you can create a separate section to help with the organization and narrative flow of your proposal. Approach writing this section with the thought that you can’t assume your readers will know as much about the research problem as you do. Note that this section is not an essay going over everything you have learned about the topic; instead, you must choose what is most relevant in explaining the aims of your research.

To that end, while there are no prescribed rules for establishing the significance of your proposed study, you should attempt to address some or all of the following:

  • State the research problem and give a more detailed explanation about the purpose of the study than what you stated in the introduction. This is particularly important if the problem is complex or multifaceted.
  • Present the rationale of your proposed study and clearly indicate why it is worth doing; be sure to answer the "So What?" question [i.e., why should anyone care?].
  • Describe the major issues or problems examined by your research. This can be in the form of questions to be addressed. Be sure to note how your proposed study builds on previous assumptions about the research problem.
  • Explain the methods you plan to use for conducting your research. Clearly identify the key sources you intend to use and explain how they will contribute to your analysis of the topic.
  • Describe the boundaries of your proposed research in order to provide a clear focus. Where appropriate, state not only what you plan to study, but what aspects of the research problem will be excluded from the study.
  • If necessary, provide definitions of key concepts, theories, or terms.

III.  Literature Review

Connected to the background and significance of your study is a section of your proposal devoted to a more deliberate review and synthesis of prior studies related to the research problem under investigation. The purpose here is to place your project within the larger whole of what is currently being explored while, at the same time, demonstrating to your readers that your work is original and innovative. Think about what questions other researchers have asked, what methodological approaches they have used, and how you understand their findings and, when stated, their recommendations. Also pay attention to any suggestions for further research.

Since a literature review is information dense, it is crucial that this section is intelligently structured to enable a reader to grasp the key arguments underpinning your proposed study in relation to the arguments put forth by other researchers. A good strategy is to break the literature into "conceptual categories" [themes] rather than systematically or chronologically describing groups of materials one at a time. Note that conceptual categories generally reveal themselves after you have read most of the pertinent literature on your topic so adding new categories is an on-going process of discovery as you review more studies. How do you know you've covered the key conceptual categories underlying the research literature? Generally, you can have confidence that all of the significant conceptual categories have been identified if you start to see repetition in the conclusions or recommendations that are being made.

NOTE: Do not shy away from challenging the conclusions made in prior research as a basis for supporting the need for your proposal. Assess what you believe is missing and state how previous research has failed to adequately examine the issue that your study addresses. Highlighting the problematic conclusions strengthens your proposal. For more information on writing literature reviews, GO HERE .

To help frame your proposal's review of prior research, consider the "five C’s" of writing a literature review:

  • Cite , so as to keep the primary focus on the literature pertinent to your research problem.
  • Compare the various arguments, theories, methodologies, and findings expressed in the literature: what do the authors agree on? Who applies similar approaches to analyzing the research problem?
  • Contrast the various arguments, themes, methodologies, approaches, and controversies expressed in the literature: what are the major areas of disagreement, controversy, or debate among scholars?
  • Critique the literature: Which arguments are more persuasive, and why? Which approaches, findings, and methodologies seem most reliable, valid, or appropriate, and why? Pay attention to the verbs you use to describe what an author says/does [e.g., asserts, demonstrates, argues, etc.].
  • Connect the literature to your own area of research and investigation: how does your own work draw upon, depart from, synthesize, or add a new perspective to what has been said in the literature?

IV.  Research Design and Methods

This section must be well-written and logically organized because you are not actually doing the research yet, and your reader must have confidence that you have a plan worth pursuing. The reader will never have a study outcome from which to evaluate whether your methodological choices were the correct ones. Thus, the objective here is to convince the reader that your overall research design and proposed methods of analysis will correctly address the problem and that the methods will provide the means to effectively interpret the potential results. Your design and methods should be unmistakably tied to the specific aims of your study.

Describe the overall research design by building upon and drawing examples from your review of the literature. Consider not only methods that other researchers have used, but methods of data gathering that have not been used but perhaps could be. Be specific about the methodological approaches you plan to undertake to obtain information, the techniques you would use to analyze the data, and the tests of external validity to which you commit yourself [i.e., the trustworthiness by which you can generalize from your study to other people, places, events, and/or periods of time].

When describing the methods you will use, be sure to cover the following:

  • Specify the research process you will undertake and the way you will interpret the results obtained in relation to the research problem. Don't just describe what you intend to achieve from applying the methods you choose, but state how you will spend your time while applying these methods [e.g., coding text from interviews to find statements about the need to change school curriculum; running a regression to determine if there is a relationship between campaign advertising on social media sites and election outcomes in Europe].
  • Keep in mind that the methodology is not just a list of tasks; it is a deliberate argument as to why techniques for gathering information add up to the best way to investigate the research problem. This is an important point because the mere listing of tasks to be performed does not demonstrate that, collectively, they effectively address the research problem. Be sure you clearly explain this.
  • Anticipate and acknowledge any potential barriers and pitfalls in carrying out your research design and explain how you plan to address them. No method applied to research in the social and behavioral sciences is perfect, so you need to describe where you believe challenges may exist in obtaining data or accessing information. It's always better to acknowledge this than to have it brought up by your professor!

V.  Preliminary Suppositions and Implications

Not having to conduct the study and analyze the results does not mean you can skip talking about the analytical process and potential implications. The purpose of this section is to argue how and in what ways you believe your research will refine, revise, or extend existing knowledge in the subject area under investigation. Depending on the aims and objectives of your study, describe how the anticipated results will impact future scholarly research, theory, practice, forms of interventions, or policy making. Note that such discussions may have either substantive [a potential new policy], theoretical [a potential new understanding], or methodological [a potential new way of analyzing] significance. When thinking about the potential implications of your study, ask the following questions:

  • What might the results mean with regard to challenging the theoretical framework and underlying assumptions that support the study?
  • What suggestions for subsequent research could arise from the potential outcomes of the study?
  • What will the results mean to practitioners in the natural settings of their workplace, organization, or community?
  • Will the results influence programs, methods, and/or forms of intervention?
  • How might the results contribute to the solution of social, economic, or other types of problems?
  • Will the results influence policy decisions?
  • In what way do individuals or groups benefit should your study be pursued?
  • What will be improved or changed as a result of the proposed research?
  • How will the results of the study be implemented and what innovations or transformative insights could emerge from the process of implementation?

NOTE: This section should not delve into idle speculation or opinion, or be formulated on the basis of unclear evidence. The purpose is to reflect upon gaps or understudied areas of the current literature and describe how your proposed research contributes to a new understanding of the research problem should the study be implemented as designed.

ANOTHER NOTE: This section is also where you describe any potential limitations to your proposed study. While it is impossible to highlight all potential limitations because the study has yet to be conducted, you still must tell the reader where and in what form impediments may arise and how you plan to address them.

VI.  Conclusion

The conclusion reiterates the importance or significance of your proposal and provides a brief summary of the entire study. This section should be only one or two paragraphs long, emphasizing why the research problem is worth investigating, why your research study is unique, and how it should advance existing knowledge.

Someone reading this section should come away with an understanding of:

  • Why the study should be done;
  • The specific purpose of the study and the research questions it attempts to answer;
  • Why the research design and methods used were chosen over other options;
  • The potential implications emerging from your proposed study of the research problem; and
  • A sense of how your study fits within the broader scholarship about the research problem.

VII.  Citations

As with any scholarly research paper, you must cite the sources you used. In a standard research proposal, this section can take two forms, so consult with your professor about which one is preferred.

  • References -- a list of only the sources you actually used in creating your proposal.
  • Bibliography -- a list of everything you used in creating your proposal, along with additional citations to any key sources relevant to understanding the research problem.

In either case, this section should testify to the fact that you did enough preparatory work to ensure the project will complement and not just duplicate the efforts of other researchers. It demonstrates to the reader that you have a thorough understanding of prior research on the topic.

Most proposal formats have you start a new page and use the heading "References" or "Bibliography" centered at the top of the page. Cited works should always use a standard format that follows the writing style advised by the discipline of your course [e.g., education=APA; history=Chicago] or that is preferred by your professor. This section normally does not count towards the total page length of your research proposal.


What Is a Research Methodology? | Steps & Tips

Published on August 25, 2022 by Shona McCombes and Tegan George. Revised on November 20, 2023.

Your research methodology discusses and explains the data collection and analysis methods you used in your research. A key part of your thesis, dissertation, or research paper, the methodology chapter explains what you did and how you did it, allowing readers to evaluate the reliability and validity of your research and your dissertation topic.

It should include:

  • The type of research you conducted
  • How you collected and analyzed your data
  • Any tools or materials you used in the research
  • How you mitigated or avoided research biases
  • Why you chose these methods

Your methodology section should generally be written in the past tense. Academic style guides in your field may provide detailed guidelines on what to include for different types of studies, and your citation style might also provide guidelines for your methodology section (e.g., an APA Style methods section).

Table of contents

  • How to write a research methodology
  • Why is a methods section important?
  • Step 1: Explain your methodological approach
  • Step 2: Describe your data collection methods
  • Step 3: Describe your analysis method
  • Step 4: Evaluate and justify the methodological choices you made
  • Tips for writing a strong methodology chapter
  • Other interesting articles
  • Frequently asked questions about methodology

Your methods section is your opportunity to share how you conducted your research and why you chose the methods you chose. It’s also the place to show that your research was rigorously conducted and can be replicated.

It gives your research legitimacy and situates it within your field, and also gives your readers a place to refer to if they have any questions or critiques in other sections.

You can start by introducing your overall approach to your research. You have two options here.

Option 1: Start with your “what”

What research problem or question did you investigate?

  • Aim to describe the characteristics of something?
  • Explore an under-researched topic?
  • Establish a causal relationship?

And what type of data did you need to achieve this aim?

  • Quantitative data , qualitative data , or a mix of both?
  • Primary data collected yourself, or secondary data collected by someone else?
  • Experimental data gathered by controlling and manipulating variables, or descriptive data gathered via observations?

Option 2: Start with your “why”

Depending on your discipline, you can also start with a discussion of the rationale and assumptions underpinning your methodology. In other words, why did you choose these methods for your study?

  • Why is this the best way to answer your research question?
  • Is this a standard methodology in your field, or does it require justification?
  • Were there any ethical considerations involved in your choices?
  • What are the criteria for validity and reliability in this type of research? How did you prevent bias from affecting your data?

Once you have introduced your reader to your methodological approach, you should share full details about your data collection methods.

Quantitative methods

For your results to be considered generalizable, you should describe your quantitative research methods in enough detail for another researcher to replicate your study.

Here, explain how you operationalized your concepts and measured your variables. Discuss your sampling method or inclusion and exclusion criteria, as well as any tools, procedures, and materials you used to gather your data.

Surveys: Describe where, when, and how the survey was conducted.

  • How did you design the questionnaire?
  • What form did your questions take (e.g., multiple choice, Likert scale)?
  • Were your surveys conducted in-person or virtually?
  • What sampling method did you use to select participants?
  • What was your sample size and response rate?

Experiments: Share full details of the tools, techniques, and procedures you used to conduct your experiment.

  • How did you design the experiment?
  • How did you recruit participants?
  • How did you manipulate and measure the variables?
  • What tools did you use?

Existing data: Explain how you gathered and selected the material (such as datasets or archival data) that you used in your analysis.

  • Where did you source the material?
  • How was the data originally produced?
  • What criteria did you use to select material (e.g., date range)?

The survey consisted of 5 multiple-choice questions and 10 questions measured on a 7-point Likert scale.

The goal was to collect survey responses from 350 customers visiting the fitness apparel company’s brick-and-mortar location in Boston on July 4–8, 2022, between 11:00 and 15:00.

Here, a customer was defined as a person who had purchased a product from the company on the day they took the survey. Participants were given 5 minutes to fill in the survey anonymously. In total, 408 customers responded, but not all surveys were fully completed. Due to this, 371 survey results were included in the analysis.
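The figures reported in the survey example above can be sanity-checked with a few lines of arithmetic. The following is only an illustrative sketch: the variable names are assumptions, and the numbers are the ones given in the example. Note that a true response rate cannot be computed here, because the example does not state how many customers were asked to take part.

```python
# Figures reported in the survey example above.
target_responses = 350     # planned number of responses
total_responses = 408      # customers who responded
complete_responses = 371   # fully completed surveys kept for analysis

# Surveys dropped because they were not fully completed.
excluded = total_responses - complete_responses

# Share of returned surveys that were usable in the analysis.
completion_rate = complete_responses / total_responses

# Whether the usable surveys still meet the original goal.
target_met = complete_responses >= target_responses

print(f"Excluded surveys: {excluded}")
print(f"Completion rate: {completion_rate:.1%}")
print(f"Target met: {target_met}")
```

Reporting the number of excluded surveys alongside the final sample size lets readers judge whether the exclusions could have biased the results.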

  • Information bias
  • Omitted variable bias
  • Regression to the mean
  • Survivorship bias
  • Undercoverage bias
  • Sampling bias

Qualitative methods

In qualitative research, methods are often more flexible and subjective. For this reason, it’s crucial to robustly explain the methodology choices you made.

Be sure to discuss the criteria you used to select your data, the context in which your research was conducted, and the role you played in collecting your data (e.g., were you an active participant, or a passive observer?).

Interviews or focus groups: Describe where, when, and how the interviews were conducted.

  • How did you find and select participants?
  • How many participants took part?
  • What form did the interviews take (structured, semi-structured, or unstructured)?
  • How long were the interviews?
  • How were they recorded?

Participant observation: Describe where, when, and how you conducted the observation or ethnography.

  • What group or community did you observe? How long did you spend there?
  • How did you gain access to this group? What role did you play in the community?
  • How long did you spend conducting the research? Where was it located?
  • How did you record your data (e.g., audiovisual recordings, note-taking)?

Existing data: Explain how you selected case study materials for your analysis.

  • What type of materials did you analyze?
  • How did you select them?

In order to gain better insight into possibilities for future improvement of the fitness store’s product range, semi-structured interviews were conducted with 8 returning customers.

Here, a returning customer was defined as someone who usually bought products at least twice a week from the store.

Surveys were used to select participants. Interviews were conducted in a small office next to the cash register and lasted approximately 20 minutes each. Answers were recorded by note-taking, and seven interviews were also filmed with consent. One interviewee preferred not to be filmed.

  • The Hawthorne effect
  • Observer bias
  • The placebo effect
  • Response bias and Nonresponse bias
  • The Pygmalion effect
  • Recall bias
  • Social desirability bias
  • Self-selection bias

Mixed methods

Mixed methods research combines quantitative and qualitative approaches. If a standalone quantitative or qualitative study is insufficient to answer your research question, mixed methods may be a good fit for you.

Mixed methods are less common than standalone analyses, largely because they require a great deal of effort to pull off successfully. If you choose to pursue mixed methods, it’s especially important to robustly justify your methods.

Next, you should indicate how you processed and analyzed your data. Avoid going into too much detail: you should not start introducing or discussing any of your results at this stage.

In quantitative research, your analysis will be based on numbers. In your methods section, you can include:

  • How you prepared the data before analyzing it (e.g., checking for missing data, removing outliers, transforming variables)
  • Which software you used (e.g., SPSS, Stata or R)
  • Which statistical tests you used (e.g., two-tailed t test, simple linear regression)

In qualitative research, your analysis will be based on language, images, and observations (often involving some form of textual analysis).

Specific methods might include:

  • Content analysis: Categorizing and discussing the meaning of words, phrases and sentences
  • Thematic analysis: Coding and closely examining the data to identify broad themes and patterns
  • Discourse analysis: Studying communication and meaning in relation to their social context
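As a rough illustration of the categorizing step in content analysis, the sketch below assigns interview excerpts to categories using a keyword codebook and counts how often each category occurs. The codebook, the excerpts, and the `code_response` helper are all hypothetical; real qualitative coding relies on researcher judgment and is rarely reducible to keyword matching.

```python
from collections import Counter
import re

# Hypothetical codebook: category -> keywords that signal it.
codebook = {
    "price": {"expensive", "cheap", "price", "cost"},
    "quality": {"durable", "quality", "comfortable"},
    "range": {"sizes", "colors", "selection"},
}

# Hypothetical interview excerpts.
responses = [
    "The leggings are comfortable but too expensive.",
    "I wish there were more sizes and colors.",
    "Great quality, and the price is fair.",
]

def code_response(text):
    """Return the set of categories whose keywords appear in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return {cat for cat, keywords in codebook.items() if words & keywords}

# Count in how many responses each category appears.
category_counts = Counter()
for response in responses:
    category_counts.update(code_response(response))

for category, count in category_counts.most_common():
    print(f"{category}: {count}")
```

Documenting the codebook itself in the methods section is what allows readers to assess how categories were defined and applied.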

Mixed methods combine the above two research methods, integrating both qualitative and quantitative approaches into one coherent analytical process.

Above all, your methodology section should clearly make the case for why you chose the methods you did. This is especially true if you did not take the most standard approach to your topic. In this case, discuss why other methods were not suitable for your objectives, and show how this approach contributes new knowledge or understanding.

In any case, it should be overwhelmingly clear to your reader that you set yourself up for success in terms of your methodology’s design. Show how your methods should lead to results that are valid and reliable, while leaving the analysis of the meaning, importance, and relevance of your results for your discussion section.

  • Quantitative: Lab-based experiments cannot always accurately simulate real-life situations and behaviors, but they are effective for testing causal relationships between variables.
  • Qualitative: Unstructured interviews usually produce results that cannot be generalized beyond the sample group, but they provide a more in-depth understanding of participants’ perceptions, motivations, and emotions.
  • Mixed methods: Despite issues systematically comparing differing types of data, a solely quantitative study would not sufficiently incorporate the lived experience of each participant, while a solely qualitative study would be insufficiently generalizable.

Remember that your aim is not just to describe your methods, but to show how and why you applied them. Again, it’s critical to demonstrate that your research was rigorously conducted and can be replicated.

1. Focus on your objectives and research questions

The methodology section should clearly show why your methods suit your objectives and convince the reader that you chose the best possible approach to answering your problem statement and research questions.

2. Cite relevant sources

Your methodology can be strengthened by referencing existing research in your field. This can help you to:

  • Show that you followed established practice for your type of research
  • Discuss how you decided on your approach by evaluating existing research
  • Present a novel methodological approach to address a gap in the literature

3. Write for your audience

Consider how much information you need to give, and avoid getting too lengthy. If you are using methods that are standard for your discipline, you probably don’t need to give a lot of background or justification.

Regardless, your methodology should be a clear, well-structured text that makes an argument for your approach, not just a list of technical details and procedures.

Methodology refers to the overarching strategy and rationale of your research project. It involves studying the methods used in your field and the theories or principles behind them, in order to develop an approach that matches your objectives.

Methods are the specific tools and procedures you use to collect and analyze data (for example, experiments, surveys, and statistical tests).

In shorter scientific papers, where the aim is to report the findings of a specific study, you might simply describe what you did in a methods section.

In a longer or more complex research project, such as a thesis or dissertation, you will probably include a methodology section, where you explain your approach to answering the research questions and cite relevant sources to support your choice of methods.

In a scientific paper, the methodology always comes after the introduction and before the results, discussion and conclusion. The same basic structure also applies to a thesis, dissertation, or research proposal.

Depending on the length and type of document, you might also include a literature review or theoretical framework before the methodology.

Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.

Quantitative methods allow you to systematically measure variables and test hypotheses. Qualitative methods allow you to explore concepts and experiences in more detail.

Reliability and validity are both about how well a method measures something:

  • Reliability refers to the consistency of a measure (whether the results can be reproduced under the same conditions).
  • Validity refers to the accuracy of a measure (whether the results really do represent what they are supposed to measure).

If you are doing experimental research, you also have to consider the internal and external validity of your experiment.

A sample is a subset of individuals from a larger population. Sampling means selecting the group that you will actually collect data from in your research. For example, if you are researching the opinions of students in your university, you could survey a sample of 100 students.

In statistics, sampling allows you to test a hypothesis about the characteristics of a population.
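The student example above can be sketched as a simple random sample drawn without replacement. This is only one of several sampling methods, and the details here are assumptions: the population size of 12,000, the ID format, and the seed are invented for illustration.

```python
import random

# Hypothetical sampling frame: one ID per enrolled student.
population = [f"student_{i:05d}" for i in range(1, 12001)]

# A fixed seed makes the draw reproducible, which helps replication.
rng = random.Random(42)

# Simple random sample of 100 students, drawn without replacement.
sample = rng.sample(population, k=100)

print(f"Sampled {len(sample)} of {len(population)} students")
```

Recording the sampling frame, the sampling method, and the seed (where one is used) gives readers what they need to evaluate how representative the sample is.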

Cite this Scribbr article

McCombes, S. & George, T. (2023, November 20). What Is a Research Methodology? | Steps & Tips. Scribbr. Retrieved April 4, 2024, from https://www.scribbr.com/dissertation/methodology/

Cornell University

Formal Review of Research Proposals

When is Formal Review Required?

Student & Campus Life research projects that will use substantial resources of the Cornell community must be formally reviewed by the committee before they can be initiated. At a minimum, this includes research that draws participants from a major institutional database, for example, those maintained by the University Registrar; Office of the Dean of Students; Fraternity, Sorority and Independent Living; and Class Councils. Regardless of how potential participants are to be identified, research that meets the following criteria will also require formal review by the committee:

  • Involves more than 100 participants for a quantitative data collection method (e.g., survey research) or 25 participants for a qualitative data collection method (e.g., focus groups or interviews);
  • Is broader in scope than program evaluation (e.g., asks about more than just program-based experiences or includes individuals who did not participate in the target program or event); and
  • Will require a substantial amount of participants’ time (e.g., protocols that will take more than 10 or 15 minutes to complete, or longitudinal research designs).

Conversely, research projects that are very limited in scope, and research that is conducted exclusively for program evaluation purposes (i.e., research that examines the program-related experiences of students who participate in a specific program or event) will generally be exempt from formal review by the committee.
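The review criteria above can be read as a simple decision rule. The sketch below is one possible interpretation, not the committee's actual procedure: the function name and parameters are hypothetical, the three bulleted criteria are treated as jointly required because the list joins them with "and", and the time threshold is a parameter because the text gives "10 or 15 minutes" rather than a single figure.

```python
def requires_formal_review(
    uses_institutional_database: bool,
    method: str,                       # "quantitative" or "qualitative"
    participants: int,
    broader_than_program_evaluation: bool,
    minutes_per_participant: float,
    longitudinal: bool = False,
    time_threshold_minutes: float = 10,  # assumption: text says "10 or 15"
) -> bool:
    """Return True if a project appears to need formal committee review."""
    # Drawing participants from a major institutional database always qualifies.
    if uses_institutional_database:
        return True

    # Participant thresholds differ by data collection method.
    size_limit = 100 if method == "quantitative" else 25
    large_sample = participants > size_limit

    # Long protocols or longitudinal designs count as substantial time.
    substantial_time = (
        longitudinal or minutes_per_participant > time_threshold_minutes
    )

    return large_sample and broader_than_program_evaluation and substantial_time


# A 20-minute campus-wide survey of 500 students would need review.
print(requires_formal_review(False, "quantitative", 500, True, 20))

# A short evaluation survey of 40 program participants would not.
print(requires_formal_review(False, "quantitative", 40, False, 5))
```

Borderline cases are best referred to the committee, since the written criteria leave room for judgment.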

Submitting a Proposal for Formal Review

The committee meets monthly during the fall, winter and spring semesters to formally review research proposals and conduct related business. At least eight weeks before the anticipated launch date of the project, researchers should submit a SCLRG research proposal form to Leslie Meyerhoff or Marne Einarson. The proposal form asks for information about the purpose and proposed design of the study, as well as draft versions of data collection instruments. Samples of completed research proposals are also available.

The following criteria will be used by the committee to evaluate research proposals:

  • Importance: Does the research address an important issue at Cornell? Will it provide useful information for academic planning or providing services to Cornell students?
  • Content and Design: Does the proposed methodology fit the research question(s)? Are the questions well-constructed and easily understood? Is the instrument of reasonable length? Have the questions been pretested?
  • Population and Sampling Methodology: Who is the target population? Is the sampling methodology appropriate to the research question(s)? Has the same student cohort and/or sample been used in other recent research? Could a smaller sample be drawn to achieve the same objective? How will the researcher(s) gain access to the proposed participants?
  • Timing: Does the proposed timing of the research overlap with or follow closely upon other research directed toward the same population? When were data on this issue last collected at Cornell? Is the data collection period scheduled at a time when students are likely to respond?
  • Data Management and Dissemination: Who will have access to the data? What are the provisions for secure storage of the data? Can data from this research be linked to other data sets? What is the plan for analyzing the data and disseminating the results? How will research results contribute to better decision making? How will research results be shared more broadly?
  • Resources: What resources will be required to conduct this research (e.g., instrument design, Web application development, mail and/or e-mail services, data entry and analysis)? From where will these resources be obtained?
  • Overall Impact: What will be the impact of the study? Are there any conceivable negative impacts on the University? Will the study overburden respondents? Overall, do the expected benefits of the study appear to outweigh the costs?

Based on their evaluation of the research proposal, the committee may decide to:

  • Approve the project as submitted
  • Approve the project with recommendations for changes that must be adopted before the project can be initiated
  • Require revisions and re-submission of the project before approval is granted
  • Reject the project (e.g., the potential benefits of the data do not justify the costs of collection; the research design has weaknesses that cannot be rectified)

IRB Approval

If research results will not be used exclusively for internal purposes (e.g., they will be presented or published beyond Cornell; or used for an undergraduate honors thesis, master’s thesis or doctoral dissertation), researchers may also be required to obtain approval from Cornell’s Institutional Review Board for Human Participants (IRB). IRB approval should be sought after the proposal has been reviewed by the SAS Research Group. The committee should subsequently be informed of the decision of the IRB.

  • Indian J Anaesth
  • v.60(9); 2016 Sep

How to write a research proposal?

Department of Anaesthesiology, Bangalore Medical College and Research Institute, Bengaluru, Karnataka, India

Devika Rani Duggappa

Writing the proposal of a research work in the present era is a challenging task due to the constantly evolving trends in the qualitative research design and the need to incorporate medical advances into the methodology. The proposal is a detailed plan or ‘blueprint’ for the intended study, and once it is completed, the research project should flow smoothly. Even today, many of the proposals at post-graduate evaluation committees and application proposals for funding are substandard. A search was conducted with keywords such as research proposal, writing proposal and qualitative using search engines, namely, PubMed and Google Scholar, and an attempt has been made to provide broad guidelines for writing a scientifically appropriate research proposal.


A clean, well-thought-out proposal forms the backbone for the research itself and hence becomes the most important step in the process of conducting research.[1] The objective of preparing a research proposal is to obtain approvals from various committees, including the ethics committee (details under the ‘Research methodology II’ section [Table 1] in this issue of IJA), and to request grants. However, there are very few universally accepted guidelines for preparation of a good quality research proposal. A search was performed with keywords such as research proposal, funding, qualitative and writing proposals using search engines, namely, PubMed, Google Scholar and Scopus.

Five ‘C’s while writing a literature review



A proposal needs to show how your work fits into what is already known about the topic and what new paradigm it will add to the literature, while specifying the question that the research will answer, establishing its significance, and the implications of the answer.[ 2 ] The proposal must be capable of convincing the evaluation committee about the credibility, achievability, practicality and reproducibility (repeatability) of the research design.[ 3 ] Four categories of audience with different expectations may be present in the evaluation committees, namely academic colleagues, policy-makers, practitioners and lay audiences who evaluate the research proposal. Tips for preparation of a good research proposal include: ‘be practical, be persuasive, make broader links, aim for crystal clarity and plan before you write’. A researcher must be balanced, with a realistic understanding of what can be achieved. Being persuasive implies that the researcher must be able to convince other researchers, research funding agencies, educational institutions and supervisors that the research is worth approving. The aim of the researcher should be clearly stated in simple language that describes the research in a way that non-specialists can comprehend, without the use of jargon. The proposal must not only demonstrate that it is based on an intelligent understanding of the existing literature but also show that the writer has thought about the time needed to conduct each stage of the research.[ 4 , 5 ]


The contents or formats of a research proposal vary depending on the requirements of evaluation committee and are generally provided by the evaluation committee or the institution.

In general, a cover page should contain the (i) title of the proposal, (ii) name and affiliation of the researcher (principal investigator) and co-investigators, (iii) institutional affiliation (degree of the investigator and the name of the institution where the study will be performed), (iv) contact details such as phone numbers and e-mail IDs and (v) lines for signatures of investigators.

The main contents of the proposal may be presented under the following headings: (i) introduction, (ii) review of literature, (iii) aims and objectives, (iv) research design and methods, (v) ethical considerations, (vi) budget, (vii) appendices and (viii) citations.[ 4 ]


The introduction is also sometimes termed ‘need for study’ or ‘abstract’. It is an initial pitch of an idea; it sets the scene and puts the research in context.[ 6 ] The introduction should be designed to create interest in the reader about the topic and proposal. It should convey to the reader what you want to do, what necessitates the study and your passion for the topic.[ 7 ] Some questions that can be used to assess the significance of the study are: (i) Who has an interest in the domain of inquiry? (ii) What do we already know about the topic? (iii) What has not been answered adequately in previous research and practice? (iv) How will this research add to knowledge, practice and policy in this area? Some evaluation committees expect the last two questions to be elaborated under a separate heading of ‘background and significance’.[ 8 ] The introduction should also contain the hypothesis behind the research design. If a hypothesis cannot be constructed, the line of inquiry to be used in the research must be indicated.

Review of literature

The review of literature refers to all sources of scientific evidence pertaining to the topic of interest. In the present era of digitalisation and easy accessibility, there is an enormous amount of relevant data available, making it a challenge for the researcher to include all of it in his/her review.[ 9 ] It is crucial to structure this section intelligently so that the reader can grasp the argument related to your study in relation to that of other researchers, while still demonstrating to your readers that your work is original and innovative. It is preferable to summarise each article in a paragraph, highlighting the details pertinent to the topic of interest. The progression of review can move from the more general to the more focused studies, or a historical progression can be used to develop the story, without making it exhaustive.[ 1 ] Literature should include supporting data, disagreements and controversies. Five ‘C’s may be kept in mind while writing a literature review[ 10 ] [ Table 1 ].

Aims and objectives

The research purpose (or goal or aim) gives a broad indication of what the researcher wishes to achieve in the research. The hypothesis to be tested can be the aim of the study. The objectives related to parameters or tools used to achieve the aim are generally categorised as primary and secondary objectives.

Research design and method

The objective here is to convince the reader that the overall research design and methods of analysis will correctly address the research problem and to impress upon the reader that the methodology/sources chosen are appropriate for the specific topic. It should be unmistakably tied to the specific aims of your study.

In this section, the methods and sources used to conduct the research must be discussed, including specific references to sites, databases, key texts or authors that will be indispensable to the project. There should be specific mention of the methodological approaches to be undertaken to gather information, the techniques to be used to analyse it and the tests of external validity to which the researcher is committed.[ 10 , 11 ]

The components of this section include the following:[ 4 ]

Population and sample

Population refers to all the elements (individuals, objects or substances) that meet certain criteria for inclusion in a given universe,[ 12 ] and sample refers to a subset of the population which meets the inclusion criteria for enrolment into the study. The inclusion and exclusion criteria should be clearly defined. The details pertaining to sample size are discussed in the article “Sample size calculation: Basic principles” published in this issue of IJA.

Data collection

The researcher is expected to give a detailed account of the methodology adopted for collection of data, including the time frame required for the research. The methodology should be tested for its validity and should ensure that, in pursuit of achieving the results, the participant's life is not jeopardised. The author should anticipate and acknowledge any potential barriers and pitfalls in carrying out the research design and explain plans to address them, thereby avoiding lacunae due to incomplete data collection. If the researcher is planning to acquire data through interviews or questionnaires, a copy of the questions used should be attached as an annexure to the proposal.

Rigor (soundness of the research)

This addresses the strength of the research with respect to its neutrality, consistency and applicability. Rigor must be reflected throughout the proposal.

Neutrality refers to the robustness of a research method against bias. The author should convey the measures taken to avoid bias, viz. blinding and randomisation, in an elaborate way, thus ensuring that the result obtained from the adopted method is not influenced by confounding variables and that any residual variation is attributable to chance alone.
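The randomisation mentioned above can be illustrated with a short sketch. The source names only "randomisation"; permuted-block randomisation is used here as one common variant, and the function name, arm labels, block size and seed are all assumptions made for illustration rather than anything prescribed by the article.

```python
import random

def block_randomise(participant_ids, arms=("A", "B"), block_size=4, seed=2016):
    """Allocate participants to study arms in shuffled blocks.

    Hypothetical helper: within every complete block each arm appears
    equally often, so group sizes stay balanced as enrolment proceeds.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)  # fixed seed so the allocation list can be reproduced
    allocation = {}
    block = []
    for pid in participant_ids:
        if not block:
            # Start a new block holding each arm equally often, in random order.
            block = list(arms) * (block_size // len(arms))
            rng.shuffle(block)
        allocation[pid] = block.pop()
    return allocation

participants = [f"P{i:02d}" for i in range(1, 9)]  # eight illustrative IDs
allocation = block_randomise(participants)
counts = {arm: list(allocation.values()).count(arm) for arm in ("A", "B")}
print(allocation)
print(counts)
```

With eight participants and a block size of four, both blocks are complete, so each arm receives four participants. In a real proposal the allocation list would normally be generated and concealed by someone not involved in enrolment.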


Consistency considers whether the findings would be consistent if the inquiry were replicated with the same participants and in a similar context. This can be achieved by adopting standard and universally accepted methods and scales.


Applicability refers to the degree to which the findings can be applied to different contexts and groups.[ 13 ]

Data analysis

This section deals with the reduction and reconstruction of data and its analysis including sample size calculation. The researcher is expected to explain the steps adopted for coding and sorting the data obtained. The tests to be used to analyse the data for robustness and significance should be clearly stated. The author should also name the statistician and the software to be used for data analysis, and state their contribution to data analysis and sample size calculation.[ 9 ]

Ethical considerations

Medical research introduces special moral and ethical problems that are not usually encountered by other researchers during data collection, and hence, the researcher should take special care in ensuring that ethical standards are met. Ethical considerations refer to the protection of the participants' rights (right to self-determination, right to privacy, right to autonomy and confidentiality, right to fair treatment and right to protection from discomfort and harm), obtaining informed consent and the institutional review process (ethical approval). The researcher needs to provide adequate information on each of these aspects.

Informed consent needs to be obtained from the participants (details discussed in further chapters), as well as the research site and the relevant authorities.

When the researcher prepares a research budget, he/she should predict and cost all aspects of the research and then add an additional allowance for unpredictable disasters, delays and rising costs. All items in the budget should be justified.
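The budgeting advice above amounts to simple arithmetic: cost every item, then add an allowance on top. The sketch below shows that calculation; the item names, amounts and the 10% contingency rate are invented for illustration and are not recommendations from the article.

```python
def total_budget(items, contingency_percent):
    """Return (subtotal, contingency, total) for a dict of itemised costs.

    Hypothetical helper: the contingency is a flat percentage of the
    itemised subtotal, set aside for delays and rising costs.
    """
    subtotal = sum(items.values())
    contingency = subtotal * contingency_percent / 100
    return subtotal, contingency, subtotal + contingency

# Illustrative figures only; every item should be justified in the proposal.
items = {
    "personnel": 50000,
    "consumables": 12000,
    "statistical_support": 8000,
}
subtotal, contingency, total = total_budget(items, contingency_percent=10)
print(f"Subtotal: {subtotal}, contingency: {contingency}, total: {total}")
```

Keeping the allowance as a separate line, rather than padding individual items, makes it easier for a committee to see which costs are justified estimates and which are a buffer.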

Appendices are documents that support the proposal and application. The appendices will be specific for each proposal but documents that are usually required include informed consent form, supporting documents, questionnaires, measurement tools and patient information of the study in layman's language.

As with any scholarly research paper, you must cite the sources you used in composing your proposal. Although the terms ‘references’ and ‘bibliography’ differ in meaning, they are often used interchangeably. This section lists all references cited in the research proposal.

Successful, qualitative research proposals should communicate the researcher's knowledge of the field and method and convey the emergent nature of the qualitative design. The proposal should follow a discernible logic from the introduction to presentation of the appendices.

Financial support and sponsorship

Conflicts of interest.

There are no conflicts of interest.

  • Open access
  • Published: 02 April 2024

Large language models could change the future of behavioral healthcare: a proposal for responsible development and evaluation

  • Elizabeth C. Stade 1 , 2 , 3 ,
  • Shannon Wiltsey Stirman 1 , 2 ,
  • Lyle H. Ungar 4 ,
  • Cody L. Boland 1 ,
  • H. Andrew Schwartz 5 ,
  • David B. Yaden 6 ,
  • João Sedoc 7 ,
  • Robert J. DeRubeis 8 ,
  • Robb Willer 9 &
  • Johannes C. Eichstaedt 3  

npj Mental Health Research volume 3, Article number: 12 (2024)

  • Psychiatric disorders

Large language models (LLMs) such as OpenAI’s GPT-4 (which powers ChatGPT) and Google’s Gemini, built on artificial intelligence, hold immense potential to support, augment, or even eventually automate psychotherapy. Enthusiasm about such applications is mounting in the field as well as industry. These developments promise to address insufficient mental healthcare system capacity and scale individual access to personalized treatments. However, clinical psychology is an uncommonly high stakes application domain for AI systems, as responsible and evidence-based therapy requires nuanced expertise. This paper provides a roadmap for the ambitious yet responsible application of clinical LLMs in psychotherapy. First, a technical overview of clinical LLMs is presented. Second, the stages of integration of LLMs into psychotherapy are discussed while highlighting parallels to the development of autonomous vehicle technology. Third, potential applications of LLMs in clinical care, training, and research are discussed, highlighting areas of risk given the complex nature of psychotherapy. Fourth, recommendations for the responsible development and evaluation of clinical LLMs are provided, which include centering clinical science, involving robust interdisciplinary collaboration, and attending to issues like assessment, risk detection, transparency, and bias. Lastly, a vision is outlined for how LLMs might enable a new generation of studies of evidence-based interventions at scale, and how these studies may challenge assumptions about psychotherapy.


Large language models (LLMs), built on artificial intelligence (AI) – such as OpenAI’s GPT-4 (which powers ChatGPT) and Google’s Gemini – are breakthrough technologies that can read, summarize, and generate text. LLMs have a wide range of abilities, including serving as conversational agents (chatbots), generating essays and stories, translating between languages, writing code, and diagnosing illness 1 . With these capacities, LLMs are influencing many fields, including education, media, software engineering, art, and medicine. They have started to be applied in the realm of behavioral healthcare, and consumers are already attempting to use LLMs for quasi-therapeutic purposes 2 .

Applications incorporating older forms of AI, including natural language processing (NLP) technology, have existed for decades 3 . For example, machine learning and NLP have been used to detect suicide risk 4 , identify the assignment of homework in psychotherapy sessions 5 , and identify patient emotions within psychotherapy 6 . Current applications of LLMs in the behavioral health field are far more nascent – they include tailoring an LLM to help peer counselors increase their expressions of empathy, which has been deployed with clients both in academic and commercial settings 2 , 7 . As another example, LLM applications have been used to identify therapists’ and clients’ behaviors in a motivational interviewing framework 8 , 9 .

Similarly, while algorithmic intelligence with NLP has been deployed in patient-facing behavioral health contexts, LLMs have not yet been heavily employed in these domains. For example, mental health chatbots Woebot and Tessa, which target depression and eating pathology respectively 10 , 11 , are rule-based and do not use LLMs (i.e., the application’s content is human-generated, and the chatbot responds based on predefined rules or decision trees 12 ). However, these and other existing chatbots frequently struggle to understand and respond to unanticipated user responses 10 , 13 , which likely contributes to their low engagement and high dropout rates 14 , 15 . LLMs may hold promise to fill some of these gaps, given their ability to flexibly generate human-like and context-dependent responses. A small number of patient-facing applications incorporating LLMs have been tested, including a research-based application to generate dialog for therapeutic counseling 16 , 17 , and an industry-based mental-health chatbot, Youper, which uses a mix of rule-based and generative AI 18 .
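To make the contrast concrete, the sketch below shows what "responding based on predefined rules" can look like in its simplest form, and why unanticipated input is a problem. It is a toy illustration under stated assumptions: the keywords, replies and function name are invented, and it does not represent how Woebot, Tessa or any other product is implemented.

```python
# Human-written content keyed by trigger keywords (all hypothetical).
RULES = {
    "sleep": "It sounds like sleep has been difficult. Would you like a tip on sleep routines?",
    "sad": "I'm sorry you're feeling low. Can you tell me what happened today?",
}
FALLBACK = "Sorry, I didn't understand that. Could you rephrase?"

def rule_based_reply(message):
    """Return the first scripted reply whose keyword appears in the message."""
    text = message.lower()
    for keyword, reply in RULES.items():
        if keyword in text:
            return reply
    # Anything the rule authors did not anticipate lands here.
    return FALLBACK

anticipated = rule_based_reply("I can't sleep at night")
unanticipated = rule_based_reply("My landlord raised the rent again")
print(anticipated)
print(unanticipated)
```

The second message carries obvious emotional content, but because no rule mentions it the system can only ask the user to rephrase, which is the kind of failure the paragraph above links to low engagement.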

These early applications demonstrate the potential of LLMs in psychotherapy – as their use becomes more widespread, they will change many aspects of psychotherapy care delivery. However, despite the promise they may hold for this purpose, caution is warranted given the complex nature of psychopathology and psychotherapy. Psychotherapy delivery is an unusually complex, high-stakes domain vis-à-vis other LLM use cases. For example, in the productivity realm, with an “LLM co-pilot” summarizing meeting notes, the stakes are failing to maximize efficiency or helpfulness; in behavioral healthcare, the stakes may include improperly handling the risk of suicide or homicide.

While there are other applications of artificial intelligence that may involve high-stakes or life-or-death decisions (e.g., self-driving cars), prediction and mitigation of risk in the case of psychotherapy is very nuanced, involving complex case conceptualization, the consideration of social and cultural contexts, and addressing unpredictable human behavior. Poor outcomes or ethical transgressions from clinical LLMs could harm individuals and may also be disproportionately publicized (as has occurred with other AI failures 19 ), which may damage public trust in the field of behavioral healthcare.

Therefore, developers of clinical LLMs need to act with special caution to prevent such consequences. Developing responsible clinical LLMs will be a challenging coordination problem, primarily because the technological developers who are typically responsible for product design and development lack clinical sensitivity and experience. Thus, behavioral health experts will need to play a critical role in guiding development and speaking to the potential limitations, ethical considerations, and risks of these applications.

Presented below is a discussion on the future of LLMs in behavioral healthcare from the perspective of both behavioral health providers and technologists. A brief overview of the technology underlying clinical LLMs is provided both to educate clinical providers and to set the stage for further discussion regarding recommendations for development. The discussion then outlines various applications of LLMs to psychotherapy and provides a proposal for the cautious, phased development and evaluation of LLM-based applications for psychotherapy.

Overview of clinical LLMs

Clinical LLMs could take a wide variety of forms, spanning everything from brief interventions or circumscribed tools to augment therapy, to chatbots designed to provide psychotherapy in an autonomous manner. These applications could be patient-facing (e.g., providing psychoeducation to the patient), therapist-facing (e.g., offering options for interventions from which the therapist could select), trainee-facing (e.g., offering feedback on qualities of the trainee’s performance), or supervisor/consultant facing (e.g., summarizing supervisees’ therapy sessions in a high-level manner).

How language models work

Language models, or computational models of the probability of sequences of words, have existed for quite some time. The mathematical formulations date back to early work 20 , and original use cases focused on compressing communication 21 and speech recognition 22 , 23 , 24 . Language modeling became a mainstay for choosing among candidate phrases in speech recognition and automatic translation systems, but until recently, using such models for generating natural language found little success beyond abstract poetry 24 .
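As a rough illustration of "a computational model of the probability of sequences of words", the sketch below estimates a bigram model from a three-sentence toy corpus and scores two word sequences. The corpus, function names and boundary tokens are assumptions chosen for illustration; production language models are far larger and use different estimation methods.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def sequence_probability(counts, sentence):
    """Multiply the estimated P(next word | previous word) along the sentence."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    prob = 1.0
    for prev, nxt in zip(tokens, tokens[1:]):
        total = sum(counts[prev].values())
        if total == 0:
            return 0.0  # the previous word never appeared in training
        prob *= counts[prev][nxt] / total
    return prob

corpus = ["i feel tired", "i feel anxious", "i sleep well"]  # toy data
model = train_bigram(corpus)
p_seen = sequence_probability(model, "i feel tired")
p_scrambled = sequence_probability(model, "tired feel i")
print(p_seen, p_scrambled)
```

The model assigns a probability of one third to the sentence it has seen and zero to the scrambled word order. Ranking candidate phrases in this way is the role language models played in the speech recognition and translation systems described above.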

Large language models

The advent of large language models, enabled by a combination of the deep learning technique transformers 25 and increases in computing power, has opened new possibilities 26 . These models are first trained on massive amounts of data 27 , 28 using “unsupervised” learning in which the model’s task is to predict a given word in a sequence of words. The models can then be tailored to a specific task using methods, including prompting with examples or fine-tuning, some of which use no or small amounts of task-specific data (see Fig. 1 ) 28 , 29 . LLMs hold promise for clinical applications because they can parse human language and generate human-like responses, classify/score (i.e., annotate) text, and flexibly adopt conversational styles representative of different theoretical orientations.
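"Prompting with examples" can be sketched without calling any model: the task-specific data is simply written into the text the model is asked to continue. The snippet below assembles such a prompt for a text-annotation task. The instruction wording, example statements, label names and function name are all hypothetical, and no particular vendor API is assumed.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble an instruction, worked examples and a new item into one prompt."""
    lines = [instruction, ""]
    for text, label in examples:
        lines.append(f"Statement: {text}")
        lines.append(f"Label: {label}")
        lines.append("")
    # The final item is left unlabeled so the model's continuation is the answer.
    lines.append(f"Statement: {query}")
    lines.append("Label:")
    return "\n".join(lines)

# Illustrative examples only; real labels would come from clinical experts.
examples = [
    ("I failed the exam, so I will never succeed at anything.", "overgeneralization"),
    ("She didn't text back, so she must hate me.", "mind reading"),
]
prompt = build_few_shot_prompt(
    "Name the thinking pattern shown in each sentence.",
    examples,
    "If I make one mistake at work, I will be fired.",
)
print(prompt)
```

Because the examples live in the prompt rather than in the model's weights, this approach needs only a handful of labeled items, which is why it counts among the methods that use small amounts of task-specific data.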

Fig. 1 (figure designed using image components from Flaticon.com).

LLMs and psychotherapy skills

For certain use cases, LLMs show a promising ability to conduct tasks or skills needed for psychotherapy, such as conducting assessment, providing psychoeducation, or demonstrating interventions (see Fig. 2 ). Yet to date, clinical LLM products and prototypes have not demonstrated anywhere near the level of sophistication required to take the place of psychotherapy. For example, while an LLM can generate an alternative belief in the style of CBT, it remains to be seen whether it can engage in the type of turn-based, Socratic questioning that would be expected to produce cognitive change. This more generally highlights the gap that likely exists between simulating therapy skills and implementing them effectively to alleviate patient suffering. Given that psychotherapy transcripts are likely poorly represented in the training data for LLMs, and that privacy and ethical concerns make such representation challenging, prompt engineering may ultimately be the most appropriate approach for shaping LLM behavior in this manner.

Fig. 2 (figure designed using image components from Flaticon.com).

Clinical LLMs: stages of integration

The integration of LLMs into psychotherapy could be articulated as occurring along a continuum of stages spanning from assistive AI to fully autonomous AI (see Fig. 3 and Table 1 ). This continuum can be illustrated by models of AI integration in other fields, such as those used in the autonomous vehicle industry. For example, at one end of this continuum is the assistive AI (“machine in the loop”) stage, wherein the vehicle system has no ability to complete the primary tasks – acceleration, braking, and steering – on its own, but provides momentary assistance (e.g., automatic emergency braking, lane departure warning) to increase driving quality or decrease burden on the driver. In the collaborative AI (“human in the loop”) stage, the vehicle system aids in the primary tasks, but requires human oversight (e.g., adaptive cruise control, lane keeping assistance). Finally, in fully autonomous AI, vehicles are self-driving and do not require human oversight. The stages of LLM integration into psychotherapy and their related functionalities are described below.

Fig. 3

Stage 1: assistive LLMs

At the first stage in LLM integration, AI will be used as a tool to assist clinical providers and researchers with tasks that can easily be “offloaded” to AI assistants (Table 1 ; first row). As this is a preliminary step in integration, relevant tasks will be low-level, concrete, and circumscribed, such that they present a low level of risk. Examples of tasks could include assisting with collecting information for patient intakes or assessment, providing basic psychoeducation to patients, suggesting text edits for providers engaging in text-based care, and summarizing patient worksheets. Administratively, systems at this stage could also assist with clinical documentation by drafting session notes.

Stage 2: collaborative LLMs

Further along the continuum, AI systems will take the lead by providing or suggesting options for treatment planning and much of the therapy content, which humans will use their professional judgement to select from or tailor. For example, in the context of a text- or instant-message delivered structured psychotherapeutic intervention, the LLM might generate messages containing session content and assignments, which the therapist would review and adapt as needed before sending (Table 1 ; second row). A more advanced use of AI within the collaborative stage may entail an LLM providing a structured intervention in a semi-independent manner (e.g., as a chatbot), with a provider monitoring the discussion and stepping in to take control of the conversation as needed. The collaborative LLM stage has parallels to “guided self-help” approaches 30 .
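The review-before-sending workflow described above can be sketched as a small gate between a drafting step and delivery. Everything in the snippet is a stand-in: the drafting function is a stub rather than a real LLM call, the review function imitates a therapist's decision with a simple rule, and all names are assumptions made for illustration.

```python
from typing import Callable, Optional

def collaborative_send(
    generate: Callable[[str], str],
    review: Callable[[str], Optional[str]],
    context: str,
) -> Optional[str]:
    """Draft a message, pass it through human review, and return what may be sent.

    The reviewer returns the text to send (possibly edited) or None to block it,
    so nothing reaches the patient without passing through the review step.
    """
    draft = generate(context)
    return review(draft)

def stub_llm(context):
    # Placeholder for a model call; returns a fixed-format draft.
    return f"Draft for session on {context}: please complete your activity schedule."

def therapist_review(draft):
    # Placeholder for human judgement: block off-topic drafts, soften the wording.
    if "activity schedule" not in draft:
        return None
    return draft.replace("please", "when you have time, please")

message = collaborative_send(stub_llm, therapist_review, "behavioral activation")
blocked = collaborative_send(lambda context: "Unrelated text", therapist_review, "sleep")
print(message)
print(blocked)
```

The design choice worth noting is that the review function sits on the only path to delivery. In the more advanced variant described above, the same gate would be replaced by monitoring with the ability to take over the conversation.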

Stage 3: fully autonomous LLMs

In the fully autonomous stage, AIs will achieve the greatest degree of scope and autonomy wherein a clinical LLM would perform a full range of clinical skills and interventions in an integrated manner without direct provider oversight (Table 1 ; third row). For example, an application at this stage might theoretically conduct a comprehensive assessment, select an appropriate intervention, and deliver a full course of therapy with no human intervention. In addition to clinical content, applications in this stage could integrate with the electronic health record to complete clinical documentation and report writing, schedule appointments and process billing. Fully autonomous applications offer the most scalable treatment method 30 .

Progression across the stages

Progression across the stages may not be linear; human oversight will be required to ensure that applications at greater stages of integration are safe for real world deployment. As different forms of psychopathology and their accompanying interventions vary in complexity, certain types of interventions will be simpler than others to develop as LLM applications. Interventions that are more concrete and standardized may be easier for models to deliver (and may be available sooner), such as circumscribed behavior change interventions (e.g., activity scheduling), as opposed to applications which include skills that are abstract in nature or emphasize cognitive change (e.g., Socratic questioning). Similarly, when it comes to full therapy protocols, LLM applications for interventions that are highly structured, behavioral, and protocolized (e.g., CBT for insomnia [CBT-I] or exposure therapy for specific phobia) may be available sooner than applications delivering highly flexible or personalized interventions (for example 31 ).

In theory, the final stage in the integration of LLMs into psychotherapy is fully autonomous delivery of psychotherapy which does not require human intervention or monitoring. However, it remains to be seen whether fully autonomous AI systems will reach a point at which they have been evaluated to be safe for deployment by the behavioral health community. Specific concerns include how well these systems are able to carry out case conceptualization on individuals with complex, highly comorbid symptom presentations, including accounting for current and past suicidality, substance use, safety concerns, medical comorbidities, and life circumstances and events (such as court dates and upcoming medical procedures). Similarly, it is unclear whether these systems will prove sufficiently adept at engaging patients over time 32 or accounting for and addressing contextual nuances in treatment (e.g., using exposure to treat a patient experiencing PTSD-related fear of leaving the house, who also lives in a neighborhood with high rates of crime). Furthermore, several skills which may be viewed as central to clinical work currently fall outside the purview of LLM systems, such as interpreting nonverbal behavior (e.g., fidgeting, eye-rolling), appropriately challenging a patient, addressing alliance ruptures, and making decisions about termination. Technological advances, including the approaching advent of multimodal language models that integrate text, images, video, and audio, may eventually begin to fill these gaps.

Beyond technical limitations, it remains to be decided whether complete automation is an appropriate end goal for behavioral healthcare, due to safety, legal, philosophical, and ethical concerns 33 . While some evidence indicates that humans can develop a therapeutic alliance with chatbots 34 , the long-term viability of such alliance building, and whether or not it produces undesirable downstream effects (e.g., altering an individual’s existing relationships or social skills) remains to be seen. Others have documented potentially harmful behavior of LLM chatbots, such as narcissistic tendencies 35 and expressed concerns about the potential for their undue influence on humans in addition to articulating societal risks associated with LLMs more generally 36 , 37 . The field will also need to grapple with questions of accountability and liability in the case of a fully autonomous clinical LLM application causing damage (e.g., identifying the responsible party in an incident of malpractice 38 ). For these and other reasons, some have argued against the implementation of fully autonomous systems in behavioral healthcare and healthcare more broadly 39 , 40 . Taken together, these issues and concerns may suggest that in the short and medium term, assistive or collaborative AI applications will be more appropriate for the provision of behavioral healthcare.

Applications of clinical LLMs

Given the vast nature of behavioral healthcare, there are seemingly endless applications of LLMs. Outlined below are some of the currently existing, imminently feasible, and potential long-term applications of clinical LLMs. Here we focus our discussion on applications directly related to the provision of, training in, and research on psychotherapy. As such, several important aspects of behavioral healthcare, such as initial symptom detection, psychological assessment and brief interventions (e.g., crisis counseling) are not explicitly discussed herein.

Imminent applications

Automating clinical administration tasks.

At the most basic level, LLMs have the potential to automate several time-consuming tasks associated with providing psychotherapy (Table 2 , first row). In addition to using session transcripts to summarize the session for the provider, there is potential for such models to integrate within electronic health records to aid with clinical documentation and conducting chart reviews. Clinical LLMs could also produce a handout for the patient that provides a personalized overview of the session, skills learned and assigned homework or between-session material.

Measuring treatment fidelity

A clinical LLM application could automate measurement of therapist fidelity to evidence-based practices (EBPs; Table 2 , second row), which can include measuring adherence to the treatment as designed, competence in delivering a specific therapy skill, treatment differentiation (whether multiple treatments being compared actually differ from one another), and treatment receipt (patient comprehension of, engagement with, and adherence to the therapy content) 41 , 42 . Measuring fidelity is crucial to the development, testing, dissemination, and implementation of EBPs, yet can be resource intensive and difficult to do reliably. In the future, clinical LLMs could computationally derive adherence and competence ratings, aiding research efforts and reducing therapist drift 43 . Traditional machine-learning models are already being used to assess fidelity to specific modalities 44 and other important constructs like counseling skills 45 and alliance 46 . Given their improved ability to consider context, LLMs will likely increase the accuracy with which these constructs are assessed.

Offering feedback on therapy worksheets and homework

LLM applications could also be developed to deliver real-time feedback and support on patients’ between-session homework assignments (Table 2 , third row). For example, an LLM tailored to assist a patient to complete a CBT worksheet might provide clarification or aid in problem solving if the patient experiences difficulty (e.g., the patient was completing a thought log and having trouble differentiating between the thought and the emotion). This could help to “bridge the gap” between sessions and expedite patient skill development. Early evidence outside the AI realm 47 points to increasing worksheet competence as a fruitful clinical target.

Automating aspects of supervision and training

LLMs could be used to provide feedback on psychotherapy or peer support sessions, especially for clinicians with less training and experience (i.e., peer counselors, lay health workers, psychotherapy trainees). For example, an LLM might be used to offer corrections and suggestions to the dialog of peer counselors (Table 2, fourth row). This application has parallels to “task sharing,” a method used in the global mental health field by which nonprofessionals provide mental health care with oversight from specialist workers to expand access to mental health services 48 . Some of this work is already underway; for example, as described above, LLMs are being used to support peer counselors 7 .

LLMs could also support supervision for psychotherapists learning new treatments (Table 2 , fifth row). Gold-standard methods of reviewing trainees’ work, like live observation or review of recorded sessions 49 , are time-consuming. LLMs could analyze entire therapy sessions and identify areas of improvement, offering a scalable approach for supervisors or consultants to review.

Potential long-term applications

It is important to note that many of the potential applications listed below are theoretical and have yet to be developed, let alone thoroughly evaluated. Furthermore, we use the term “clinical LLM” in recognition of the fact that when and under what circumstances the work of an LLM could be called psychotherapy is evolving and depends on how psychotherapy is defined.

Fully autonomous clinical care

As previously described, the final stage of clinical LLM development could involve an LLM that can independently conduct comprehensive behavioral healthcare. This could involve all aspects related to traditional care including conducting assessment, presenting feedback, selecting an appropriate intervention and delivering a course of therapy to the patient. This course of treatment could be delivered in ways consistent with current models of psychotherapy wherein a patient engages with a “chatbot” weekly for a prescribed amount of time, or in more flexible or alternative formats. LLMs used in this manner would ideally be trained using standardized assessment approaches and manualized therapy protocols that have large bodies of evidence.

Decision aid for existing evidence-based practices

Even without full automation, clinical LLMs could be used as a tool to guide a provider on the best course of treatment for a given patient by optimizing the delivery of existing EBPs and therapeutic techniques. In practice, this may look like an LLM that can analyze transcripts from therapy sessions and offer a provider guidance on therapeutic skills, approaches or language, either in real time, or at the end of the therapy session. Furthermore, the LLM could integrate current evidence on the tailoring of specific EBPs to the condition being treated, and to demographic or cultural factors and comorbid conditions. Developing tailored clinical LLM “advisors” based on EBPs could both enhance fidelity to treatment and maximize the possibility of patients achieving clinical improvement in light of updated clinical evidence.

Development of new therapeutic techniques and EBPs

To this point, we have discussed how LLMs could be applied to current approaches to psychotherapy using extant evidence. However, LLMs and other computational methods could greatly enhance the detection and development of new therapeutic skills and EBPs. Historically, EBPs have been developed using human-derived insights and then evaluated through years of clinical trial research. While EBPs are effective, effect sizes for psychotherapy are typically small 50 , 51 and significant proportions of patients do not respond 52 . There is a great need for more effective treatments, particularly for individuals with complex presentations or comorbid conditions. However, the traditional approach to developing and testing therapeutic interventions is slow, contributing to significant time lags in translational research 53 , and fails to deliver insights at the level of the individual.

Data-driven approaches hold the promise of revealing patterns that are not yet realized by clinicians, thus generating new approaches to psychotherapy; machine learning is already being used, for example, to predict behavioral health treatment outcomes 54 . With their ability to parse and summarize natural language, LLMs could add to existing data-driven approaches. For example, an LLM could be provided with a large historical dataset containing psychotherapy transcripts of different therapeutic orientations, outcome measures and sociodemographic information, and tasked with detecting therapeutic behaviors and techniques associated with objective outcomes (e.g., reduction in depressive symptoms). Using such a process might make it possible for an LLM to yield fine-grained insights about what makes existing therapeutic techniques work best (e.g., Which components of existing EBPs are the most potent? Are there therapist or patient characteristics that moderate the efficacy of intervention X? How does the ordering of interventions affect outcomes?) or even to isolate previously unidentified therapeutic techniques associated with improved clinical outcomes. By identifying what happens in therapy in such a fine-grained manner, LLMs could also play a role in revealing mechanisms of change, which is important for improving existing treatments and facilitating real-world implementation 55 .

However, to realize this possibility, and make sure that LLM-based advances can be integrated and vetted by the clinical community, it is necessary to steer away from the development of “black box,” LLM-identified interventions with low explainability (e.g., interpretability 56 ). To guard against interventions with low interpretability, work to finetune LLMs to improve patient outcomes could include inspectable representations of the techniques employed by the LLM. Clinicians could examine these representations and situate them in the broader psychotherapy literature, which would involve comparing them to existing psychotherapy techniques and theories. Such an approach could speed up the identification of novel mechanisms while guarding against the identification of “novel” interventions which overlap with existing techniques or constructs (thus avoiding the jangle fallacy, the erroneous assumption that two constructs with different names are necessarily distinct 57 ).

In the long run, by combining this information, it might even be possible for an LLM to “reverse-engineer” a new EBP, freed from the constraints of traditional therapeutic protocols and instead maximizing on the delivery of the constituent components shown to produce patient change (in a manner akin to modular approaches, wherein an individualized treatment plan is crafted for each patient by curating and sequencing treatment modules from an extensive menu of all available options based on the unique patient’s presentation 31 ). Eventually, a self-learning clinical LLM might deliver a broad range of psychotherapeutic interventions while measuring patient outcomes and adapting its approach on the fly in response to changes in the patient (or lack thereof).

Toward a precision medicine approach to psychotherapy

Current approaches to psychotherapy often are unable to provide guidance on the best approach to treatment when an individual has a complex presentation, which is often the rule rather than being the exception. For example, providers are likely to have greatly differing treatment plans for a patient with concurrent PTSD, substance use, chronic pain, and significant interpersonal difficulties. Models that use a data-driven approach (rather than a provider’s educated guess) to address an individual’s presenting concern alongside their comorbidities, sociodemographic factors, history, and responses to the current treatment, may ultimately offer the best chance at maximizing patient benefit. While there have been some advances in precision medicine approaches in behavioral healthcare 54 , 58 , these efforts are in their infancy and limited by sample sizes 59 .

The potential applications of clinical LLMs we have outlined above may come together to facilitate a personalized approach to behavioral healthcare, analogous to that of precision medicine. Through optimizing existing EBPs, identifying new therapeutic approaches, and better understanding mechanisms of change, LLMs (and their future descendants) may provide behavioral healthcare with an enhanced ability to identify what works best for whom and under what circumstances.

Recommendations for responsible development and evaluation of clinical LLMs

Focus first on evidence-based practices

In the immediate future, clinical LLM applications will have the greatest chance of creating meaningful clinical impact if developed based on EBPs or a “common elements” approach (i.e., evidence-based procedures shared across treatments) 60 . Evidence-based treatments and techniques have been identified for specific psychopathologies (e.g., major depressive disorder, posttraumatic stress disorder), stressors (e.g., bereavement, job loss, divorce), and populations (e.g., LGBTQ individuals, older adults) 55 , 61 , 62 . Without an initial focus on EBPs, clinical LLM applications may fail to reflect current knowledge and may even produce harm 63 . Only once LLMs have been fully trained on EBPs can the field start to consider using LLMs in a data-driven manner, such as those outlined in the previous section on potential long-term applications.

Focus next on improvement (engagement is not enough)

Others have highlighted the importance of promoting engagement with digital mental health applications 15 , which is important for achieving an adequate “dose” of the therapeutic intervention. LLM applications hold the promise of improving engagement and retention through their ability to respond to free text, extract key concepts, and address patients’ unique context and concerns during interventions in a timely manner. However, engagement alone is not an appropriate outcome on which to train an LLM, because engagement is not expected to be sufficient for producing change. A focus on such metrics for clinical LLMs will risk losing sight of the primary goals: clinical improvement (e.g., reductions in symptoms or impairment, increases in well-being and functioning) and prevention of risks and adverse events. It will behoove the field to be wary of attempts to optimize clinical LLMs on outcomes that have an explicit relationship with a company’s profit (e.g., length of time using the application). An LLM that optimizes only for engagement (akin to YouTube recommendations) could have high rates of user retention without employing meaningful clinical interventions to reduce suffering and improve quality of life. Previous research has suggested that this may be happening with non-LLM digital mental health interventions. For instance, exposure is a technique with strong support for treating anxiety, yet it is rarely included in popular smartphone applications for anxiety 64 , perhaps because developers fear that the technique will not appeal to users, or have concerns about exposures going poorly or increasing anxiety in the short term, which may prompt concerns about legal exposure.

Commit to rigorous yet commonsense evaluation

An evaluation approach for clinical LLMs that hierarchically prioritizes risk and safety, followed by feasibility, acceptability, and effectiveness, would be in line with existing recommendations for the evaluation of digital mental health smartphone apps 65 . The first level of evaluation could involve a demonstration that a clinical LLM produces no harm or very minimal harm that is outweighed by its benefits, similar to FDA phase I drug tests. Key risk and safety related constructs include measures of suicidality, non-suicidal self harm, and risk of harm to others.

Next, rigorous examinations of clinical LLM applications will be needed to provide empirical evidence of their utility, using head-to-head comparisons with standard treatments. Key constructs to be assessed in these empirical tests are feasibility and acceptability to the patient and the therapist as well as treatment outcomes (e.g., symptoms, impairment, clinical status, rates of relapse). Other relevant considerations include patients’ user experience with the application, measures of therapist efficiency and burnout, and cost.

Lastly, we note that given the possible benefits of clinical LLMs (including expanding access to care), it will be important for the field to adopt a commonsense approach to evaluation. While rigorous evaluation is important, the comparison conditions on which these evaluations are based should reflect real-world risk and efficacy rates, and perhaps employ a graded hierarchy with which to classify risk and error (e.g., missing a mention of suicidality is unacceptable, whereas getting a patient’s partner’s name wrong is nonideal but tolerable), rather than holding clinical LLM applications to a standard of perfection which humans do not achieve. Furthermore, developers will need to strike the appropriate balance in prioritizing constructs in a manner expected to be most clinically beneficial. For example, if exposure therapy is indicated for the patient, but the patient does not find this approach acceptable, the clinical LLM could recommend the intervention, prioritizing effectiveness, before offering second-line interventions which may be more acceptable.
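One way to make a graded hierarchy of risk and error concrete is to tag each observed error with a severity tier and report rates per tier, rather than a single pooled error rate. The following is a minimal sketch under stated assumptions: the three tier names, the intermediate tier and its example, and the `summarize_errors` helper are hypothetical and not drawn from any published evaluation framework.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical severity tiers (an assumption made for illustration)."""
    TOLERABLE = 1      # e.g., getting a patient's partner's name wrong
    SERIOUS = 2        # assumed middle tier, e.g., an inaccurate session summary
    UNACCEPTABLE = 3   # e.g., missing a mention of suicidality


@dataclass
class ErrorRecord:
    description: str
    severity: Severity


def summarize_errors(errors, n_interactions):
    """Return per-tier error rates and a flag for any unacceptable error."""
    counts = {tier: 0 for tier in Severity}
    for err in errors:
        counts[err.severity] += 1
    rates = {tier.name: counts[tier] / n_interactions for tier in Severity}
    return rates, counts[Severity.UNACCEPTABLE] > 0


# Illustrative, made-up error log from 100 reviewed interactions.
errors = [
    ErrorRecord("wrong partner name", Severity.TOLERABLE),
    ErrorRecord("missed mention of suicidality", Severity.UNACCEPTABLE),
]
rates, has_unacceptable = summarize_errors(errors, n_interactions=100)
```

Reporting tiers separately keeps a rare but unacceptable error from being averaged away by many tolerable ones, and the same tiers could be applied to human comparison conditions.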

Involve interdisciplinary collaboration

Interdisciplinary collaboration between clinical scientists, engineers, and technologists will be crucial in the development of clinical LLMs. While it is plausible that engineers and technologists could use available therapeutic manuals to develop clinical LLMs without the expertise of a behavioral health expert, this is ill-advised. Manuals are only a first step towards learning a specific intervention, as they do not provide guidance on how the intervention can be applied to specific individuals or presentations, or how to handle specific issues or concerns that may arise through the course of treatment.

Clinicians and clinician-scientists have expertise that bears on these issues, as well as many other aspects of the clinical LLM development process. Their involvement could include a) testing new applications to identify limitations and risks and optimize their integration into clinical practice, b) improving the ability of applications to adequately address the complexity of psychological phenomena, c) ensuring that applications are developed and implemented in an ethical manner, and d) testing and ensuring that applications don’t have iatrogenic effects, such as reinforcing behaviors that perpetuate psychopathology or distress.

Behavioral health experts could also provide guidance on how best to finetune or tailor models, including addressing the question of whether and how real patient data should be used for these purposes. For example, most proximately, behavioral health experts might assist in prompt engineering, or the designing and testing of a series of prompts which provide the LLM framing and context for delivering a specific type of treatment or clinical skill (e.g., “Use cognitive restructuring to help the patient evaluate and reappraise negative thoughts in depression”), or a desired clinical task, such as evaluating therapy sessions for fidelity (e.g., “Analyze this psychotherapy transcript and select sections in which the therapist demonstrated the particularly skillful use of CBT skills, and sections in which the therapist’s delivery of CBT skills could be improved”). Similarly, in few-shot learning, behavioral health experts could be involved in crafting example exchanges which are added to prompts. For example, treatment modality experts might generate examples of clinical skills (e.g., high-quality examples of using cognitive restructuring to address depression) or of a clinical task (e.g., examples of both high- and low-quality delivery of CBT skills). For fine-tuning, in which a large, labeled dataset is used to train the LLM, and reinforcement learning from human feedback (RLHF), in which a human-labeled dataset is used to train a smaller model which is then used for LLM “self-training,” behavioral health experts could build and curate (and ensure informed patient consent for use of) appropriate datasets (e.g., a dataset containing psychotherapy transcripts rated for fidelity to an evidence-based psychotherapy). The expertise that behavioral health experts could draw on to generate instructive examples and curate high-quality datasets holds particular value in light of recent evidence that quality of data trumps quantity of data for training well-performing models 66 .
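To illustrate how expert-written examples are added to a prompt in few-shot learning, the sketch below assembles an instruction, a handful of rated examples, and a new excerpt into a single prompt string. It is only a sketch: the `build_few_shot_prompt` helper, the prompt layout, and the example exchanges are hypothetical placeholders written for illustration, not clinical material, and no particular LLM API is assumed.

```python
def build_few_shot_prompt(instruction, examples, transcript):
    """Combine an instruction, expert-rated examples, and a new excerpt."""
    parts = [instruction.strip(), ""]
    for i, (excerpt, rating) in enumerate(examples, start=1):
        parts.append(f"Example {i}")
        parts.append(f"Transcript excerpt: {excerpt}")
        parts.append(f"Expert rating: {rating}")
        parts.append("")
    parts.append("Now rate the following excerpt.")
    parts.append(f"Transcript excerpt: {transcript}")
    parts.append("Expert rating:")
    return "\n".join(parts)


# Placeholder examples that a treatment-modality expert would write and vet.
examples = [
    ("Therapist: What evidence supports that thought?", "high-quality"),
    ("Therapist: Just stop thinking that way.", "low-quality"),
]
prompt = build_few_shot_prompt(
    "Rate the therapist's delivery of cognitive restructuring.",
    examples,
    "Therapist: Is there another way to look at what happened?",
)
```

The resulting string would then be sent to whichever model is being evaluated; the value of expert involvement lies in the quality of the examples, not in the assembly code.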

In the service of facilitating interdisciplinary collaboration, it would benefit clinical scientists to seek out a working knowledge about LLMs, while it would benefit technologists to develop a working knowledge of therapy in general and EBPs in particular. Dedicated venues that bring together behavioral health experts and clinical psychologists for interdisciplinary collaboration and communication will aid in these efforts. Historically, venues of this type have included psychology-focused workshops at NLP conferences (e.g., the Workshop on Computational Linguistics and Clinical Psychology [CLPsych], held at the Annual Conference of the North American Chapter of the Association for Computational Linguistics [NAACL]) and technology-focused conferences or workgroups hosted by psychological organizations (e.g., APA’s Technology, Mind & Society conference; Association for Behavioral and Cognitive Therapies’ [ABCT] Technology and Behavior Change special interest group). This work has also been done at nonprofits centered on technological tools for mental health (e.g., the Society for Digital Mental Health). Beyond these venues, it may be fruitful to develop a gathering that brings together technologists, clinical scientists, and industry partners with a dedicated focus on AI/LLMs, which could routinely publish on its efforts, akin to the efforts of the World Health Organization’s Infodemic Management Conference, which has employed this approach to address misinformation 67 . Finally, given the numerous applications of AI to behavioral health, it is conceivable that a new “computational behavioral health” subfield could emerge, offering specialized training that would bridge the gap between these two domains.

Focus on trust and usability for clinicians and patients

It is important to engage therapists, policymakers, end-users, and experts in human-computer interactions to understand and improve levels of trust that will be necessary for successful and effective implementation. With respect to applications of AI to augment supervision and support for psychotherapy, therapists have expressed concern about privacy, the ability to detect subtle non-verbal cues and cultural responsiveness, and the impact on therapist confidence, but they also see benefits for training and professional growth 68 . Other research suggests that while therapists believe AI can increase access to care, allow individuals to disclose embarrassing information more comfortably, and continuously refine therapeutic techniques 69 , they have concerns about privacy and the formation of a strong therapeutic bond with machine-based therapeutic interventions 70 . Involvement of individuals who will be referring their patients and using LLMs in their own practice will be essential to developing solutions they can trust and implement, and to make sure these solutions have the features that support trust and usability (simple interfaces, accurate summaries of AI-patient interactions, etc.).

Regarding how much patients will trust the AI systems, following the stages we outlined in Fig. 3, initial AI-patient interactions will continue to be supervised by clinicians, and the therapeutic bond between the clinician and the patient will continue to be the primary relationship. During this stage, it is important that clinicians talk to the patients about their experience with the LLMs, and that the field as a whole begins to accumulate an understanding and data on how acceptable interfacing with LLMs is for what kind of patient for what kind of clinical use case, and how clinicians can scaffold the patient-LLM relationship. This data will be critical for developing collaborative LLM applications that have more autonomy, and for ensuring that the transition from assistive to collaborative stage applications is not associated with large unforeseen risk. For example, in the case of CBT for insomnia, once an assistive AI system has been iterated on to reliably collect information about patients’ sleep patterns, it is more conceivable that it could be evolved into a collaborative AI system that does a comprehensive insomnia assessment (i.e., it also collects and interprets data on patients’ clinically significant distress, impairment of functioning, and ruling out of sleep-wake disorders, like narcolepsy) 71 .

Design criteria for effective clinical LLMs

Below, we propose an initial set of desirable design qualities for clinical LLMs.

Detect risk of harm

Accurate risk detection and mandated reporting are crucial aspects that clinical LLMs must prioritize, particularly in the identification of suicidal/homicidal ideation, child/elder abuse, and intimate partner violence. Algorithms for detecting risks are under development 4 . One threat to risk detection is that current LLMs have limited context windows, meaning they only “remember” a limited amount of user input. Functionally, this means a clinical LLM application could “forget” crucial details about a patient, which could impact safety (e.g., an application “forgetting” that the patient owns firearms would threaten its ability to properly assess and intervene around suicide risk). However, context windows have been rapidly expanding with each subsequent model release, so this issue may not be a problem for long. In addition, it is already possible to augment the memory of LLMs with “vector databases,” which would have the added benefit of retaining inspectable learnings and summaries across clinical encounters 72 .
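The “forgetting” problem and the retrieval-based remedy can be shown with a toy example. In the sketch below, a fixed token budget drops an early safety-relevant message from the context window, while a simple memory store retrieves it on demand. Everything here is an assumption made for illustration: whitespace-style tokenization stands in for a real tokenizer, word-count vectors with cosine similarity stand in for learned embeddings, `ToyMemory` stands in for a vector database, and the messages are invented.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Crude word tokenizer; real systems use model-specific tokenizers."""
    return re.findall(r"[a-z']+", text.lower())


def truncate_to_window(messages, max_tokens):
    """Keep only the most recent messages that fit in the token budget."""
    kept, used = [], 0
    for msg in reversed(messages):
        n = len(tokenize(msg))
        if used + n > max_tokens:
            break
        kept.append(msg)
        used += n
    return list(reversed(kept))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyMemory:
    """Stores every message as a word-count vector for later retrieval."""

    def __init__(self):
        self.items = []

    def add(self, text):
        self.items.append((text, Counter(tokenize(text))))

    def search(self, query, k=1):
        q = Counter(tokenize(query))
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


# Invented messages; the first contains the safety-relevant detail.
messages = [
    "I keep firearms in a safe at home.",
    "Work has been stressful this week.",
    "I slept badly again last night.",
]
window = truncate_to_window(messages, max_tokens=12)  # early message is dropped

memory = ToyMemory()
for msg in messages:
    memory.add(msg)
recalled = memory.search("Does the patient own firearms?")
```

Because stored items remain plain text, a clinician could inspect exactly what was retained and retrieved, which is the inspectability benefit noted above.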

In the future, and especially given much larger context windows, clinical LLMs could prompt clinicians with ethical guidelines, legal requirements (e.g., the Tarasoff rule, which requires clinicians to warn intended victims when a patient presents a serious threat of violence), or evidence-based methods for decreasing risk (e.g., safety planning 73 ), or even provide interventions targeting risk directly to patients. This type of risk monitoring and intervention could be particularly useful in supplementing existing healthcare systems during gaps in clinician coverage like nights and weekends 4 .

Be “healthy”

There is growing concern that AI chat systems can demonstrate undesirable behaviors, including expressions akin to depression or narcissism 35 , 74 . Such poorly understood, undesirable behaviors risk harming already vulnerable patients or interfering with their ability to benefit from treatment. Clinical LLM applications will need training, monitoring, auditing, and guardrails to prevent the expression of undesirable behaviors and maintain healthy interactions with users. These efforts will need to be continually evaluated and updated to prevent or address the emergence of new undesirable or clinically contraindicated behavior.

Aid in psychodiagnostic assessment

Clinical LLMs ought to integrate psychodiagnostic assessment and diagnosis, facilitating intervention selection and outcome monitoring 75 . Recent developments show promise for LLMs in the assessment realm 76 . Down the line, LLMs could be used for diagnostic interviewing (e.g., Structured Clinical Interview for the DSM-5 77 ) using chatbots or voice interfaces. Prioritizing assessment enhances diagnostic accuracy and ensures appropriate intervention, reducing the risk of harmful interventions 63 .

Be responsive and flexible

Given the frequency with which ambivalence and poor patient engagement arise in clinical encounters, clinical LLMs which use evidence-based and patient-centered methods for handling these issues (e.g., motivational enhancement techniques, shared decision making), and have options for second-line interventions for patients not interested in gold-standard treatments, will have the best chance of success.

Stop when not helping or confident

Psychologists are ethically obligated to cease treatment and offer appropriate referrals to the patient if the current course of treatment has not helped or likely will not help. Clinical LLMs can abide by this ethical standard by drawing on integrated assessment (discussed above) to assess the appropriateness of the given intervention and detect cases that need more specialized or intensive intervention.

Be fair, inclusive, and free from bias

As has been written about extensively, LLMs may perpetuate bias, including racism, sexism, and homophobia, given that they are trained on existing text 36 . These biases can contribute to both error disparities – where models are less accurate for particular groups – and outcome disparities – where models tend to over-capture demographic information 78 – which would in turn contribute to the disparities in mental health status and care already experienced by minoritized groups 79 . The integration of bias countermeasures into clinical LLM applications could serve to prevent this 78 , 80 .
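An error disparity of the kind described above can be checked by comparing model accuracy across groups. The sketch below is a minimal illustration under stated assumptions: the `accuracy_by_group` and `max_accuracy_gap` helpers, the group names, and the records are all hypothetical, and a real audit would use validated labels, adequate sample sizes, and uncertainty estimates.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """records: iterable of (group, predicted_label, true_label) tuples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, predicted, actual in records:
        total[group] += 1
        correct[group] += int(predicted == actual)
    return {group: correct[group] / total[group] for group in total}


def max_accuracy_gap(per_group):
    """Largest difference in accuracy between any two groups."""
    values = list(per_group.values())
    return max(values) - min(values)


# Synthetic records, invented purely to exercise the functions.
records = [
    ("group_a", "risk", "risk"),
    ("group_a", "no_risk", "no_risk"),
    ("group_a", "risk", "risk"),
    ("group_a", "no_risk", "no_risk"),
    ("group_b", "risk", "risk"),
    ("group_b", "no_risk", "risk"),
    ("group_b", "no_risk", "no_risk"),
    ("group_b", "risk", "no_risk"),
]
per_group = accuracy_by_group(records)
gap = max_accuracy_gap(per_group)
```

A large gap would flag a group for which the model is less accurate and where countermeasures may be needed before deployment.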

Be empathetic–to an extent

Clinical LLMs will likely need to demonstrate empathy and build the therapeutic alliance in order to engage patients. Other skills used by therapists include humor, irreverence, and gentle methods of challenging the patient. Incorporating these into clinical LLMs might be beneficial, as appropriate human likeness may facilitate engagement and interaction with AI 81 . However, this needs to be balanced against associated risks, mentioned above, of incorporating human likeness in systems 36 . Whether and how much human likeness is necessary for a psychological intervention remains a question for future empirical work.

Be transparent about being AIs

Mental illness and mental health care are already stigmatized, and the application of LLMs without transparent consent can erode patient/consumer trust, which reduces trust in the behavioral health profession more generally. Some mental health startups have already faced criticism for employing generative AI in applications without disclosing this information to the end user 2 . As laid out in the White House Blueprint for an AI Bill of Rights, AI applications should be explicitly (and perhaps repeatedly/consistently) labeled as such to allow patients and consumers to “know that an automated system is being used and understand how and why it contributes to outcomes that impact them” 82 .

Unintended consequences may change the clinical profession

The development of clinical LLM applications could lead to unintended consequences, such as changes to the structure of and compensation for mental health services. AI may permit increased staffing by non-professionals or paraprofessionals, causing professional clinicians to supervise large numbers of non-professionals or even semi-autonomous LLM systems. This could reduce clinicians’ direct patient contact and perhaps increase their exposure to challenging or complicated cases not suitable for the LLM, which may lead to burnout and make clinical jobs less attractive. To address this, research could determine the appropriate number of cases for a clinician to oversee safely and guidelines could be published to disseminate these findings. The 24-hour availability of LLM-based intervention may also change consumer expectations of psychotherapy in a way that is at odds with many of the norms of psychotherapy practice (e.g., waiting for a session to discuss stressors, limited or emergency-only contact between sessions).

LLMs could pave the way for a next generation of clinical science

Beyond the imminent applications described in this paper, it is worth considering how the long-term applications of clinical LLMs might also facilitate significant advances in clinical care and clinical science.

Clinical practice

In terms of their effects on therapeutic interventions themselves, clinical LLMs might promote advances in the field by allowing for the pooling of data on what works with the most difficult cases, perhaps through the use of practice research networks 83 . At the level of health systems, they could expedite the implementation and translation of research findings into clinical practice by suggesting therapeutic strategies to psychotherapists, for instance, promoting strategies that enhance inhibitory learning during exposure therapy 84 . Lastly, clinical LLMs could increase access to care if LLM-based psychotherapy chatbots are offered as low intensity, low-cost options in stepped-care models, similar to the existing provision of computerized CBT and guided self-help 85 .

As the utilization of clinical LLMs expands, there may be a shift towards psychologists and other behavioral health experts operating at the top of their degree. Presently, a significant amount of clinician time is consumed by administrative tasks, chart review, and documentation. The shifting of responsibilities afforded by the automation of certain aspects of psychotherapy by clinical LLMs could allow clinicians to pursue leadership roles, contribute to the development, evaluation, and implementation of LLM-based care, lead policy efforts, or simply devote more time to direct patient care.

Clinical science

By facilitating supervision, consultation, and fidelity measurement, LLMs could expedite psychotherapist training and increase the capacity of study supervisors, thus making psychotherapy research less expensive and more efficient.

In a world in which fully autonomous LLM applications screen and assess patients, deliver high-fidelity, protocolized psychotherapy, and collect outcome measurements, psychotherapy clinical trials would be limited largely by the number of willing participants eligible for the study, rather than by the resources required to screen, assess, treat, and follow these participants. This could open the door to unprecedentedly large-N clinical trials. This would allow for well-powered, sophisticated dismantling studies to support the search for mechanisms of change in psychotherapy, which are currently only possible using individual participant level meta-analysis (for example, see ref. 86 ). Ultimately, such insights into causal mechanisms of change in psychotherapy could help to refine these treatments and potentially improve their efficacy.

Finally, the emergence of LLM treatment modalities will challenge (or confirm) fundamental assumptions about psychotherapy. Does therapeutic (human) alliance account for a majority of the variance in patient change? To what extent can an alliance be formed with a technological agent? Is lasting and meaningful therapeutic change only possible through working with a human therapist? LLMs hold the promise of empirical answers to these questions.

In summary, large language models hold promise for supporting, augmenting, or even in some cases replacing human-led psychotherapy, which may improve the quality, accessibility, consistency, and scalability of therapeutic interventions and clinical science research. However, LLMs are advancing quickly and will soon be deployed in the clinical domain, with little oversight or understanding of harms that they may produce. While cautious optimism about clinical LLM applications is warranted, it is also crucial for psychologists to approach the integration of LLMs into psychotherapy with caution and to educate the public about the potential risks and limitations of using these technologies for therapeutic purposes. Furthermore, clinical psychologists ought to actively engage with the technologists building these solutions. As the field of AI continues to evolve, it is essential that researchers and clinicians closely monitor the use of LLMs in psychotherapy and advocate for responsible and ethical use to protect the wellbeing of patients.

Data availability

Data sharing is not applicable to this article, as no datasets were generated or analyzed during the current study.

References

Bubeck, S. et al. Sparks of artificial general intelligence: Early experiments with GPT-4. Preprint at http://arxiv.org/abs/2303.12712 (2023).

Broderick, R. People are using AI for therapy, whether the tech is ready for it or not. Fast Company (2023).

Weizenbaum, J. ELIZA—a computer program for the study of natural language communication between man and machine. Commun. ACM 9, 36–45 (1966).

Bantilan, N., Malgaroli, M., Ray, B. & Hull, T. D. Just in time crisis response: Suicide alert system for telemedicine psychotherapy settings. Psychother. Res. 31, 289–299 (2021).

Peretz, G., Taylor, C. B., Ruzek, J. I., Jefroykin, S. & Sadeh-Sharvit, S. Machine learning model to predict assignment of therapy homework in behavioral treatments: Algorithm development and validation. JMIR Form. Res. 7, e45156 (2023).

Tanana, M. J. et al. How do you feel? Using natural language processing to automatically rate emotion in psychotherapy. Behav. Res. Methods 53, 2069–2082 (2021).

Sharma, A., Lin, I. W., Miner, A. S., Atkins, D. C. & Althoff, T. Human–AI collaboration enables more empathic conversations in text-based peer-to-peer mental health support. Nat. Mach. Intell. 5, 46–57 (2023).

Chen, Z., Flemotomos, N., Imel, Z. E., Atkins, D. C. & Narayanan, S. Leveraging open data and task augmentation to automated behavioral coding of psychotherapy conversations in low-resource scenarios. Preprint at https://doi.org/10.48550/arXiv.2210.14254 (2022).

Shah, R. S. et al. Modeling motivational interviewing strategies on an online peer-to-peer counseling platform. Proc. ACM Hum.-Comput. Interact. 6, 1–24 (2022).

Chan, W. W. et al. The challenges in designing a prevention chatbot for eating disorders: Observational study. JMIR Form. Res. 6, e28003 (2022).

Darcy, A. Why generative AI is not yet ready for mental healthcare. Woebot Health https://woebothealth.com/why-generative-ai-is-not-yet-ready-for-mental-healthcare/ (2023).

Abd-Alrazaq, A. A. et al. An overview of the features of chatbots in mental health: A scoping review. Int. J. Med. Inf. 132, 103978 (2019).

Lim, S. M., Shiau, C. W. C., Cheng, L. J. & Lau, Y. Chatbot-delivered psychotherapy for adults with depressive and anxiety symptoms: A systematic review and meta-regression. Behav. Ther. 53, 334–347 (2022).

Baumel, A., Muench, F., Edan, S. & Kane, J. M. Objective user engagement with mental health apps: Systematic search and panel-based usage analysis. J. Med. Internet Res. 21, e14567 (2019).

Torous, J., Nicholas, J., Larsen, M. E., Firth, J. & Christensen, H. Clinical review of user engagement with mental health smartphone apps: Evidence, theory and improvements. Evid. Based Ment. Health 21, 116–119 (2018).

Das, A. et al. Conversational bots for psychotherapy: A study of generative transformer models using domain-specific dialogues. in Proceedings of the 21st Workshop on Biomedical Language Processing 285–297 (Association for Computational Linguistics, 2022). https://doi.org/10.18653/v1/2022.bionlp-1.27 .

Liu, H. Towards automated psychotherapy via language modeling. Preprint at http://arxiv.org/abs/2104.10661 (2021).

Hamilton, J. Why generative AI (LLM) is ready for mental healthcare. LinkedIn https://www.linkedin.com/pulse/why-generative-ai-chatgpt-ready-mental-healthcare-jose-hamilton-md/ (2023).

Shariff, A., Bonnefon, J.-F. & Rahwan, I. Psychological roadblocks to the adoption of self-driving vehicles. Nat. Hum. Behav. 1, 694–696 (2017).

Markov, A. A. Essai d’une recherche statistique sur le texte du roman “Eugene Onegin” illustrant la liaison des epreuve en chain (‘Example of a statistical investigation of the text of “Eugene Onegin” illustrating the dependence between samples in chain’). Izvistia Imperatorskoi Akad. Nauk Bull. L’Academie Imp. Sci. St-Petersbourg 7, 153–162 (1913).

Shannon, C. E. A mathematical theory of communication. Bell Syst. Tech. J. 27, 379–423 (1948).

Baker, J. K. Stochastic modeling for automatic speech understanding. in Speech recognition: invited papers presented at the 1974 IEEE symposium (ed. Reddy, D. R.) (Academic Press, 1975).

Jelinek, F. Continuous speech recognition by statistical methods. Proc. IEEE 64, 532–556 (1976).

Jurafsky, D. & Martin, J. H. N-gram language models. in Speech and language processing: An introduction to natural language processing, computational linguistics, and speech recognition (Pearson Prentice Hall, 2009).

Vaswani, A. et al. Attention is all you need. 31st Conf. Neural Inf. Process. Syst. (2017).

Bommasani, R. et al. On the opportunities and risks of foundation models. Preprint at http://arxiv.org/abs/2108.07258 (2022).

Gao, L. et al. The Pile: An 800GB dataset of diverse text for language modeling. Preprint at http://arxiv.org/abs/2101.00027 (2020).

Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. Preprint at http://arxiv.org/abs/1810.04805 (2019).

Kojima, T., Gu, S. S., Reid, M., Matsuo, Y. & Iwasawa, Y. Large language models are zero-shot reasoners. Preprint at http://arxiv.org/abs/2205.11916 (2023).

Fairburn, C. G. & Patel, V. The impact of digital technology on psychological treatments and their dissemination. Behav. Res. Ther. 88, 19–25 (2017).

Fisher, A. J. et al. Open trial of a personalized modular treatment for mood and anxiety. Behav. Res. Ther. 116, 69–79 (2019).

Fan, X. et al. Utilization of self-diagnosis health chatbots in real-world settings: Case study. J. Med. Internet Res. 23, e19928 (2021).

Coghlan, S. et al. To chat or bot to chat: Ethical issues with using chatbots in mental health. Digit. Health 9, 1–11 (2023).

Beatty, C., Malik, T., Meheli, S. & Sinha, C. Evaluating the therapeutic alliance with a free-text CBT conversational agent (Wysa): A mixed-methods study. Front. Digit. Health 4, 847991 (2022).

Lin, B., Bouneffouf, D., Cecchi, G. & Varshney, K. R. Towards healthy AI: Large language models need therapists too. Preprint at http://arxiv.org/abs/2304.00416 (2023).

Weidinger, L. et al. Ethical and social risks of harm from language models. Preprint at http://arxiv.org/abs/2112.04359 (2021).

Bender, E. M., Gebru, T., McMillan-Major, A. & Shmitchell, S. On the dangers of stochastic parrots: Can language models be too big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency 610–623 (ACM, 2021). https://doi.org/10.1145/3442188.3445922 .

Chamberlain, J. The risk-based approach of the European Union’s proposed artificial intelligence regulation: Some comments from a tort law perspective. Eur. J. Risk Regul. 14, 1–13 (2023).

Norden, J. G. & Shah, N. R. What AI in health care can learn from the long road to autonomous vehicles. NEJM Catal. Innov. Care Deliv. https://doi.org/10.1056/CAT.21.0458 (2022).

Sedlakova, J. & Trachsel, M. Conversational artificial intelligence in psychotherapy: A new therapeutic tool or agent? Am. J. Bioeth. 23, 4–13 (2023).

Gearing, R. E. et al. Major ingredients of fidelity: A review and scientific guide to improving quality of intervention research implementation. Clin. Psychol. Rev. 31, 79–88 (2011).

Wiltsey Stirman, S. Implementing evidence-based mental-health treatments: Attending to training, fidelity, adaptation, and context. Curr. Dir. Psychol. Sci. 31, 436–442 (2022).

Waller, G. Evidence-based treatment and therapist drift. Behav. Res. Ther. 47, 119–127 (2009).

Flemotomos, N. et al. “Am I a good therapist?” Automated evaluation of psychotherapy skills using speech and language technologies. CoRR, Abs, 2102 (10.3758) (2021).

Zhang, X. et al. You never know what you are going to get: Large-scale assessment of therapists’ supportive counseling skill use. Psychotherapy https://doi.org/10.1037/pst0000460 (2022).

Goldberg, S. B. et al. Machine learning and natural language processing in psychotherapy research: Alliance as example use case. J. Couns. Psychol. 67, 438–448 (2020).

Wiltsey Stirman, S. et al. A novel approach to the assessment of fidelity to a cognitive behavioral therapy for PTSD using clinical worksheets: A proof of concept with cognitive processing therapy. Behav. Ther. 52, 656–672 (2021).

Raviola, G., Naslund, J. A., Smith, S. L. & Patel, V. Innovative models in mental health delivery systems: Task sharing care with non-specialist providers to close the mental health treatment gap. Curr. Psychiatry Rep. 21, 44 (2019).

American Psychological Association. Guidelines for clinical supervision in health service psychology. Am. Psychol. 70, 33–46 (2015).

Cook, S. C., Schwartz, A. C. & Kaslow, N. J. Evidence-based psychotherapy: Advantages and challenges. Neurotherapeutics 14, 537–545 (2017).

Leichsenring, F., Steinert, C., Rabung, S. & Ioannidis, J. P. A. The efficacy of psychotherapies and pharmacotherapies for mental disorders in adults: An umbrella review and meta‐analytic evaluation of recent meta‐analyses. World Psych. 21, 133–145 (2022).

Cuijpers, P., van Straten, A., Andersson, G. & van Oppen, P. Psychotherapy for depression in adults: A meta-analysis of comparative outcome studies. J. Consult. Clin. Psychol. 76, 909–922 (2008).

Morris, Z. S., Wooding, S. & Grant, J. The answer is 17 years, what is the question: Understanding time lags in translational research. J. R. Soc. Med. 104, 510–520 (2011).

Chekroud, A. M. et al. The promise of machine learning in predicting treatment outcomes in psychiatry. World Psych. 20, 154–170 (2021).

Kazdin, A. E. Mediators and mechanisms of change in psychotherapy research. Annu. Rev. Clin. Psychol. 3, 1–27 (2007).

Angelov, P. P., Soares, E. A., Jiang, R., Arnold, N. I. & Atkinson, P. M. Explainable artificial intelligence: An analytical review. WIREs Data Min. Knowl. Discov. 11 (2021).

Kelley, T. L. Interpretation of Educational Measurements (World Book, 1927).

van Bronswijk, S. C. et al. Precision medicine for long-term depression outcomes using the Personalized Advantage Index approach: Cognitive therapy or interpersonal psychotherapy? Psychol. Med. 51, 279–289 (2021).

Scala, J. J., Ganz, A. B. & Snyder, M. P. Precision medicine approaches to mental health care. Physiology 38, 82–98 (2023).

Chorpita, B. F., Daleiden, E. L. & Weisz, J. R. Identifying and selecting the common elements of evidence based interventions: A distillation and matching model. Ment. Health Serv. Res. 7, 5–20 (2005).

Chambless, D. L. & Hollon, S. D. Defining empirically supported therapies. J. Consult. Clin. Psychol. 66, 7–18 (1998).

Tolin, D. F., McKay, D., Forman, E. M., Klonsky, E. D. & Thombs, B. D. Empirically supported treatment: Recommendations for a new model. Clin. Psychol. Sci. Pract. 22, 317–338 (2015).

Lilienfeld, S. O. Psychological treatments that cause harm. Perspect. Psychol. Sci. 2, 53–70 (2007).

Wasil, A. R., Venturo-Conerly, K. E., Shingleton, R. M. & Weisz, J. R. A review of popular smartphone apps for depression and anxiety: Assessing the inclusion of evidence-based content. Behav. Res. Ther. 123, 103498 (2019).

Torous, J. B. et al. A hierarchical framework for evaluation and informed decision making regarding smartphone apps for clinical care. Psychiatr. Serv. 69, 498–500 (2018).

Gunasekar, S. et al. Textbooks are all you need. Preprint at http://arxiv.org/abs/2306.11644 (2023).

Wilhelm, E. et al. Measuring the burden of infodemics: Summary of the methods and results of the Fifth WHO Infodemic Management Conference. JMIR Infodemiology 3, e44207 (2023).

Creed, T. A. et al. Knowledge and attitudes toward an artificial intelligence-based fidelity measurement in community cognitive behavioral therapy supervision. Adm. Policy Ment. Health Ment. Health Serv. Res. 49, 343–356 (2022).

Aktan, M. E., Turhan, Z. & Dolu, İ. Attitudes and perspectives towards the preferences for artificial intelligence in psychotherapy. Comput. Hum. Behav. 133, 107273 (2022).

Prescott, J. & Hanley, T. Therapists’ attitudes towards the use of AI in therapeutic practice: considering the therapeutic alliance. Ment. Health Soc. Incl. 27, 177–185 (2023).

American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders (2013).

Yogatama, D., De Masson d’Autume, C. & Kong, L. Adaptive semiparametric language models. Trans. Assoc. Comput. Linguist. 9, 362–373 (2021).

Stanley, B. & Brown, G. K. Safety planning intervention: A brief intervention to mitigate suicide risk. Cogn. Behav. Pract. 19, 256–264 (2012).

Behzadan, V., Munir, A. & Yampolskiy, R. V. A psychopathological approach to safety engineering in AI and AGI. Preprint at http://arxiv.org/abs/1805.08915 (2018).

Lambert, M. J. & Harmon, K. L. The merits of implementing routine outcome monitoring in clinical practice. Clin. Psychol. Sci. Pract. 25 (2018).

Kjell, O. N. E., Kjell, K. & Schwartz, H. A. AI-based large language models are ready to transform psychological health assessment. Preprint at https://doi.org/10.31234/osf.io/yfd8g (2023).

First, M. B., Williams, J. B. W., Karg, R. S. & Spitzer, R. L. SCID-5-CV: Structured Clinical Interview for DSM-5 Disorders: Clinician Version (American Psychiatric Association Publishing, 2016).

Shah, D. S., Schwartz, H. A. & Hovy, D. Predictive biases in natural language processing models: A conceptual framework and overview. in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 5248–5264 (Association for Computational Linguistics, 2020). https://doi.org/10.18653/v1/2020.acl-main.468 .

Adams, L. M. & Miller, A. B. Mechanisms of mental-health disparities among minoritized groups: How well are the top journals in clinical psychology representing this work? Clin. Psychol. Sci. 10, 387–416 (2022).

Viswanath, H. & Zhang, T. FairPy: A toolkit for evaluation of social biases and their mitigation in large language models. Preprint at http://arxiv.org/abs/2302.05508 (2023).

von Zitzewitz, J., Boesch, P. M., Wolf, P. & Riener, R. Quantifying the human likeness of a humanoid robot. Int. J. Soc. Robot. 5, 263–276 (2013).

White House Office of Science and Technology Policy. Blueprint for an AI bill of rights. (2022).

Parry, G., Castonguay, L. G., Borkovec, T. D. & Wolf, A. W. Practice research networks and psychological services research in the UK and USA. in Developing and Delivering Practice-Based Evidence (eds. Barkham, M., Hardy, G. E. & Mellor-Clark, J.) 311–325 (Wiley-Blackwell, 2010). https://doi.org/10.1002/9780470687994.ch12 .

Craske, M. G., Treanor, M., Conway, C. C., Zbozinek, T. & Vervliet, B. Maximizing exposure therapy: An inhibitory learning approach. Behav. Res. Ther. 58, 10–23 (2014).

Delgadillo, J. et al. Stratified care vs stepped care for depression: A cluster randomized clinical trial. JAMA Psychiatry 79, 101 (2022).

Furukawa, T. A. et al. Dismantling, optimising, and personalising internet cognitive behavioural therapy for depression: A systematic review and component network meta-analysis using individual participant data. Lancet Psychiatry 8, 500–511 (2021).

Acknowledgements

This work was supported by the National Institute of Mental Health under award numbers R01-MH125702 (PI: H.A.S.) and RF1-MH128785 (PI: S.W.S.), and by the Institute for Human-Centered A.I. at Stanford University to J.C.E. The authors are grateful to Adam S. Miner and Victor Gomes, who provided critical feedback on an earlier version of this manuscript.

Author information

Authors and affiliations

Dissemination and Training Division, National Center for PTSD, VA Palo Alto Health Care System, Palo Alto, CA, USA

Elizabeth C. Stade, Shannon Wiltsey Stirman & Cody L. Boland

Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, USA

Elizabeth C. Stade & Shannon Wiltsey Stirman

Institute for Human-Centered Artificial Intelligence & Department of Psychology, Stanford University, Stanford, CA, USA

Elizabeth C. Stade & Johannes C. Eichstaedt

Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA

Lyle H. Ungar

Department of Computer Science, Stony Brook University, Stony Brook, NY, USA

H. Andrew Schwartz

Department of Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, MD, USA

David B. Yaden

Department of Technology, Operations, and Statistics, New York University, New York, NY, USA

Department of Psychology, University of Pennsylvania, Philadelphia, PA, USA

Robert J. DeRubeis

Department of Sociology, Stanford University, Stanford, CA, USA

Robb Willer

Contributions

E.C.S., S.W.S., C.L.B., and J.C.E. wrote the main manuscript text. E.C.S., L.H.U., and J.C.E. prepared the figures. All authors reviewed the manuscript.

Corresponding authors

Correspondence to Elizabeth C. Stade or Johannes C. Eichstaedt.

Ethics declarations

Competing interests

The authors declare the following competing interests: receiving consultation fees from Jimini Health (E.C.S., L.H.U., H.A.S., and J.C.E.).

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article

Stade, E.C., Stirman, S.W., Ungar, L.H. et al. Large language models could change the future of behavioral healthcare: a proposal for responsible development and evaluation. npj Mental Health Res 3 , 12 (2024). https://doi.org/10.1038/s44184-024-00056-z

Received: 24 July 2023

Accepted: 30 January 2024

Published: 02 April 2024

DOI: https://doi.org/10.1038/s44184-024-00056-z
