Supplement to Critical Thinking

How can one assess, for purposes of instruction or research, the degree to which a person possesses the dispositions, skills and knowledge of a critical thinker?

In psychometrics, assessment instruments are judged according to their validity and reliability.

Roughly speaking, an instrument is valid if it measures accurately what it purports to measure, given standard conditions. More precisely, the degree of validity is “the degree to which evidence and theory support the interpretations of test scores for proposed uses of tests” (American Educational Research Association 2014: 11). In other words, a test is not valid or invalid in itself. Rather, validity is a property of an interpretation of a given score on a given test for a specified use. Determining the degree of validity of such an interpretation requires collection and integration of the relevant evidence, which may be based on test content, test takers’ response processes, a test’s internal structure, relationship of test scores to other variables, and consequences of the interpretation (American Educational Research Association 2014: 13–21). Criterion-related evidence consists of correlations between scores on the test and performance on another test of the same construct; its weight depends on how well supported is the assumption that the other test can be used as a criterion. Content-related evidence is evidence that the test covers the full range of abilities that it claims to test. Construct-related evidence is evidence that a correct answer reflects good performance of the kind being measured and an incorrect answer reflects poor performance.

An instrument is reliable if it consistently produces the same result, whether across different forms of the same test (parallel-forms reliability), across different items (internal consistency), across different administrations to the same person (test-retest reliability), or across ratings of the same answer by different people (inter-rater reliability). Internal consistency should be expected only if the instrument purports to measure a single undifferentiated construct, and thus should not be expected of a test that measures a suite of critical thinking dispositions or critical thinking abilities, assuming that some people are better in some of the respects measured than in others (for example, very willing to inquire but rather closed-minded). Otherwise, reliability is a necessary but not a sufficient condition of validity; a standard example of a reliable instrument that is not valid is a bathroom scale that consistently under-reports a person’s weight.
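
To make these reliability notions concrete, the following sketch (not part of the SEP text) computes two of them on invented data: test-retest reliability as a Pearson correlation between two administrations of the same test, and internal consistency as Cronbach's alpha over item-level scores.

```python
"""Illustrative reliability computations (invented data, not from the SEP text).

Test-retest reliability: Pearson correlation between two administrations
of the same test to the same people.
Internal consistency: Cronbach's alpha over item-level scores.
"""
from statistics import mean, variance, stdev

def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (stdev(x) * stdev(y))

def cronbach_alpha(item_scores):
    """item_scores: list of per-person lists, one score per item."""
    k = len(item_scores[0])                      # number of items
    item_vars = [variance([p[i] for p in item_scores]) for i in range(k)]
    total_var = variance([sum(p) for p in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical data: five test takers, two administrations of a 4-item test.
first = [[3, 4, 3, 5], [2, 2, 3, 2], [5, 4, 5, 4], [3, 3, 2, 3], [4, 5, 4, 4]]
second_totals = [14, 10, 19, 11, 18]
first_totals = [sum(p) for p in first]

print("test-retest r:", round(pearson(first_totals, second_totals), 2))
print("Cronbach's alpha:", round(cronbach_alpha(first), 2))
```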

Assessing dispositions is difficult if one uses a multiple-choice format with known adverse consequences of a low score. It is pretty easy to tell what answer to the question “How open-minded are you?” will get the highest score and to give that answer, even if one knows that the answer is incorrect. If an item probes less directly for a critical thinking disposition, for example by asking how often the test taker pays close attention to views with which the test taker disagrees, the answer may differ from reality because of self-deception or simple lack of awareness of one’s personal thinking style, and its interpretation is problematic, even if factor analysis enables one to identify a distinct factor measured by a group of questions that includes this one (Ennis 1996). Nevertheless, Facione, Sánchez, and Facione (1994) used this approach to develop the California Critical Thinking Dispositions Inventory (CCTDI). They began with 225 statements expressive of a disposition towards or away from critical thinking (using the long list of dispositions in Facione 1990a), validated the statements with talk-aloud and conversational strategies in focus groups to determine whether people in the target population understood the items in the way intended, administered a pilot version of the test with 150 items, and eliminated items that failed to discriminate among test takers or were inversely correlated with overall results or added little refinement to overall scores (Facione 2000). They used item analysis and factor analysis to group the measured dispositions into seven broad constructs: open-mindedness, analyticity, cognitive maturity, truth-seeking, systematicity, inquisitiveness, and self-confidence (Facione, Sánchez, and Facione 1994). The resulting test consists of 75 agree-disagree statements and takes 20 minutes to administer. A repeated disturbing finding is that North American students taking the test tend to score low on the truth-seeking sub-scale (on which a low score results from agreeing to such statements as the following: “To get people to agree with me I would give any reason that worked”. “Everyone always argues from their own self-interest, including me”. “If there are four reasons in favor and one against, I’ll go with the four”.) Development of the CCTDI made it possible to test whether good critical thinking abilities and good critical thinking dispositions go together, in which case it might be enough to teach one without the other. Facione (2000) reports that administration of the CCTDI and the California Critical Thinking Skills Test (CCTST) to almost 8,000 post-secondary students in the United States revealed a statistically significant but weak correlation between total scores on the two tests, and also between paired sub-scores from the two tests. The implication is that both abilities and dispositions need to be taught, that one cannot expect improvement in one to bring with it improvement in the other.
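
The item-screening step described above, dropping items that fail to discriminate or that correlate inversely with overall results, is conventionally done with corrected item-total correlations. The sketch below illustrates that technique on invented data with an arbitrary threshold; it is not Facione's actual procedure.

```python
"""Illustrative item screening of the kind described above (not Facione's
actual procedure or data): keep agree-disagree items whose corrected
item-total correlation is positive and above a chosen threshold."""
from statistics import mean, stdev

def pearson(x, y):
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return num / ((len(x) - 1) * stdev(x) * stdev(y))

def screen_items(responses, threshold=0.2):
    """responses: per-person lists of item scores (e.g., 1-6 Likert).
    Returns indices of items to keep, judged by corrected item-total
    correlation (item vs. total of the remaining items)."""
    n_items = len(responses[0])
    keep = []
    for i in range(n_items):
        item = [p[i] for p in responses]
        rest = [sum(p) - p[i] for p in responses]   # total excluding item i
        if pearson(item, rest) >= threshold:
            keep.append(i)
    return keep

# Hypothetical pilot data: six respondents, four items; item 3 runs against the rest.
pilot = [
    [5, 4, 5, 2],
    [2, 2, 1, 5],
    [4, 5, 4, 2],
    [1, 2, 2, 5],
    [5, 5, 4, 1],
    [3, 3, 3, 3],
]
print("items retained:", screen_items(pilot))
```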

A more direct way of assessing critical thinking dispositions would be to see what people do when put in a situation where the dispositions would reveal themselves. Ennis (1996) reports promising initial work with guided open-ended opportunities to give evidence of dispositions, but no standardized test seems to have emerged from this work. There are however standardized aspect-specific tests of critical thinking dispositions. The Critical Problem Solving Scale (Berman et al. 2001: 518) takes as a measure of the disposition to suspend judgment the number of distinct good aspects attributed to an option judged to be the worst among those generated by the test taker. Stanovich, West and Toplak (2011: 800–810) list tests developed by cognitive psychologists of the following dispositions: resistance to miserly information processing, resistance to myside thinking, absence of irrelevant context effects in decision-making, actively open-minded thinking, valuing reason and truth, tendency to seek information, objective reasoning style, tendency to seek consistency, sense of self-efficacy, prudent discounting of the future, self-control skills, and emotional regulation.

It is easier to measure critical thinking skills or abilities than to measure dispositions. The following currently available standardized tests purport to measure them: the Watson-Glaser Critical Thinking Appraisal (Watson & Glaser 1980a, 1980b, 1994), the Cornell Critical Thinking Tests Level X and Level Z (Ennis & Millman 1971; Ennis, Millman, & Tomko 1985, 2005), the Ennis-Weir Critical Thinking Essay Test (Ennis & Weir 1985), the California Critical Thinking Skills Test (Facione 1990b, 1992), the Halpern Critical Thinking Assessment (Halpern 2016), the Critical Thinking Assessment Test (Center for Assessment & Improvement of Learning 2017), the Collegiate Learning Assessment (Council for Aid to Education 2017), the HEIghten Critical Thinking Assessment (https://territorium.com/heighten/), and a suite of critical thinking assessments for different groups and purposes offered by Insight Assessment (https://www.insightassessment.com/products). The Critical Thinking Assessment Test (CAT) is unique among them in being designed for use by college faculty to help them improve their development of students’ critical thinking skills (Haynes et al. 2015; Haynes & Stein 2021). Also, for some years the United Kingdom body OCR (Oxford Cambridge and RSA Examinations) awarded AS and A Level certificates in critical thinking on the basis of an examination (OCR 2011). Many of these standardized tests have received scholarly evaluations at the hands of, among others, Ennis (1958), McPeck (1981), Norris and Ennis (1989), Fisher and Scriven (1997), Possin (2008, 2013a, 2013b, 2013c, 2014, 2020) and Hatcher and Possin (2021). Their evaluations provide a useful set of criteria that such tests ideally should meet, as does the description by Ennis (1984) of problems in testing for competence in critical thinking: the soundness of multiple-choice items, the clarity and soundness of instructions to test takers, the information and mental processing used in selecting an answer to a multiple-choice item, the role of background beliefs and ideological commitments in selecting an answer to a multiple-choice item, the tenability of a test’s underlying conception of critical thinking and its component abilities, the set of abilities that the test manual claims are covered by the test, the extent to which the test actually covers these abilities, the appropriateness of the weighting given to various abilities in the scoring system, the accuracy and intellectual honesty of the test manual, the interest of the test to the target population of test takers, the scope for guessing, the scope for choosing a keyed answer by being test-wise, precautions against cheating in the administration of the test, clarity and soundness of materials for training essay graders, inter-rater reliability in grading essays, and clarity and soundness of advance guidance to test takers on what is required in an essay. Rear (2019) has challenged the use of standardized tests of critical thinking as a way to measure educational outcomes, on the grounds that they (1) fail to take into account disputes about conceptions of critical thinking, (2) are not completely valid or reliable, and (3) fail to evaluate skills used in real academic tasks. He proposes instead assessments based on discipline-specific content.

There are also aspect-specific standardized tests of critical thinking abilities. Stanovich, West and Toplak (2011: 800–810) list tests of probabilistic reasoning, insights into qualitative decision theory, knowledge of scientific reasoning, knowledge of rules of logical consistency and validity, and economic thinking. They also list instruments that probe for irrational thinking, such as superstitious thinking, belief in the superiority of intuition, over-reliance on folk wisdom and folk psychology, belief in “special” expertise, financial misconceptions, overestimation of one’s introspective powers, dysfunctional beliefs, and a notion of self that encourages egocentric processing. They regard these tests along with the previously mentioned tests of critical thinking dispositions as the building blocks for a comprehensive test of rationality, whose development (they write) may be logistically difficult and would require millions of dollars.

A superb example of assessment of an aspect of critical thinking ability is the Test on Appraising Observations (Norris & King 1983, 1985, 1990a, 1990b), which was designed for classroom administration to senior high school students. The test focuses entirely on the ability to appraise observation statements and in particular on the ability to determine in a specified context which of two statements there is more reason to believe. According to the test manual (Norris & King 1985, 1990b), a person’s score on the multiple-choice version of the test, which is the number of items that are answered correctly, can justifiably be given either a criterion-referenced or a norm-referenced interpretation.

On a criterion-referenced interpretation, those who do well on the test have a firm grasp of the principles for appraising observation statements, and those who do poorly have a weak grasp of them. This interpretation can be justified by the content of the test and the way it was developed, which incorporated a method of controlling for background beliefs articulated and defended by Norris (1985). Norris and King synthesized from judicial practice, psychological research and common-sense psychology 31 principles for appraising observation statements, in the form of empirical generalizations about tendencies, such as the principle that observation statements tend to be more believable than inferences based on them (Norris & King 1984). They constructed items in which exactly one of the 31 principles determined which of two statements was more believable. Using a carefully constructed protocol, they interviewed about 100 students who responded to these items in order to determine the thinking that led them to choose the answers they did (Norris & King 1984). In several iterations of the test, they adjusted items so that selection of the correct answer generally reflected good thinking and selection of an incorrect answer reflected poor thinking. Thus they have good evidence that good performance on the test is due to good thinking about observation statements and that poor performance is due to poor thinking about observation statements. Collectively, the 50 items on the final version of the test require application of 29 of the 31 principles for appraising observation statements, with 13 principles tested by one item, 12 by two items, three by three items, and one by four items. Thus there is comprehensive coverage of the principles for appraising observation statements. Fisher and Scriven (1997: 135–136) judge the items to be well worked and sound, with one exception. The test is clearly written at a grade 6 reading level, meaning that poor performance cannot be attributed to difficulties in reading comprehension by the intended adolescent test takers. The stories that frame the items are realistic, and are engaging enough to stimulate test takers’ interest. Thus the most plausible explanation of a given score on the test is that it reflects roughly the degree to which the test taker can apply principles for appraising observations in real situations. In other words, there is good justification of the proposed interpretation that those who do well on the test have a firm grasp of the principles for appraising observation statements and those who do poorly have a weak grasp of them.

To get norms for performance on the test, Norris and King arranged for seven groups of high school students in different types of communities and with different levels of academic ability to take the test. The test manual includes percentiles, means, and standard deviations for each of these seven groups. These norms allow teachers to compare the performance of their class on the test to that of a similar group of students.
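
As an illustration of how such norms support a norm-referenced interpretation, the following sketch converts a raw score on a 50-item test into a percentile rank and a z-score relative to a comparison group. The norm data are invented and are not the published Norris and King norms.

```python
"""Illustrative norm-referenced interpretation (invented norm data, not the
actual Norris & King norms): express a raw score on a 50-item test as a
percentile rank and a z-score relative to a chosen comparison group."""
from statistics import mean, stdev

# Hypothetical norm group: raw scores (number correct out of 50).
norm_group_scores = [28, 31, 33, 34, 36, 36, 37, 38, 39, 40, 41, 41, 43, 45, 47]

def percentile_rank(score, norm_scores):
    """Percentage of the norm group scoring at or below the given score."""
    return 100 * sum(s <= score for s in norm_scores) / len(norm_scores)

def z_score(score, norm_scores):
    return (score - mean(norm_scores)) / stdev(norm_scores)

student_score = 41
print(f"percentile rank: {percentile_rank(student_score, norm_group_scores):.0f}")
print(f"z-score: {z_score(student_score, norm_group_scores):.2f}")
```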

Copyright © 2022 by David Hitchcock <hitchckd@mcmaster.ca>



Critical Thinking Testing and Assessment

The purpose of assessment in instruction is improvement. The purpose of assessing instruction for critical thinking is improving the teaching of discipline-based thinking (historical, biological, sociological, mathematical, etc.). It is to improve students’ abilities to think their way through content using disciplined skill in reasoning. The more particular we can be about what we want students to learn about critical thinking, the better we can devise instruction with that particular end in view.


The Foundation for Critical Thinking offers assessment instruments which share in the same general goal: to enable educators to gather evidence relevant to determining the extent to which instruction is teaching students to think critically (in the process of learning content). To this end, the Fellows of the Foundation recommend:

that academic institutions and units establish an oversight committee for critical thinking, and

that this oversight committee utilize a combination of assessment instruments (the more the better) to generate incentives for faculty, by providing them with as much evidence as feasible of the actual state of instruction for critical thinking.

The following instruments are available to generate evidence relevant to critical thinking teaching and learning:

Course Evaluation Form: Provides evidence of whether, and to what extent, students perceive faculty as fostering critical thinking in instruction (course by course). Machine-scoreable.

Online Critical Thinking Basic Concepts Test: Provides evidence of whether, and to what extent, students understand the fundamental concepts embedded in critical thinking (and hence tests student readiness to think critically). Machine-scoreable.

Critical Thinking Reading and Writing Test: Provides evidence of whether, and to what extent, students can read closely and write substantively (and hence tests students' abilities to read and write critically). Short-answer.

International Critical Thinking Essay Test: Provides evidence of whether, and to what extent, students are able to analyze and assess excerpts from textbooks or professional writing. Short-answer.

Commission Study Protocol for Interviewing Faculty Regarding Critical Thinking: Provides evidence of whether, and to what extent, critical thinking is being taught at a college or university. Can be adapted for high school. Based on the California Commission Study. Short-answer.

Protocol for Interviewing Faculty Regarding Critical Thinking: Provides evidence of whether, and to what extent, critical thinking is being taught at a college or university. Can be adapted for high school. Short-answer.

Protocol for Interviewing Students Regarding Critical Thinking: Provides evidence of whether, and to what extent, students are learning to think critically at a college or university. Can be adapted for high school. Short-answer.

Criteria for Critical Thinking Assignments: Can be used by faculty in designing classroom assignments, or by administrators in assessing the extent to which faculty are fostering critical thinking.

Rubrics for Assessing Student Reasoning Abilities: A useful tool in assessing the extent to which students are reasoning well through course content.

All of the above assessment instruments can be used as part of pre- and post-assessment strategies to gauge development over various time periods.
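
As a minimal sketch of one way pre- and post-assessment results might be summarized (invented scores, not a Foundation for Critical Thinking procedure), the example below reports the mean gain and a standardized effect size.

```python
"""A minimal sketch (invented scores, not a Foundation for Critical Thinking
procedure) of summarizing pre/post results: mean gain and a standardized
effect size (mean gain divided by the standard deviation of the gains)."""
from statistics import mean, stdev

pre  = [12, 15, 14, 10, 18, 13, 16, 11]   # scores at the start of a course
post = [15, 17, 14, 13, 20, 16, 18, 14]   # same students at the end

gains = [b - a for a, b in zip(pre, post)]
print("mean gain:", round(mean(gains), 2))
print("effect size (gain / SD of gains):", round(mean(gains) / stdev(gains), 2))
```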

Consequential Validity

All of the above assessment instruments, when used appropriately and graded accurately, should lead to a high degree of consequential validity. In other words, the use of the instruments should cause teachers to teach in such a way as to foster critical thinking in their various subjects. In this light, for students to perform well on the various instruments, teachers will need to design instruction so that students can perform well on them. Students cannot become skilled in critical thinking without learning (first) the concepts and principles that underlie critical thinking and (second) applying them in a variety of forms of thinking: historical thinking, sociological thinking, biological thinking, etc. Students cannot become skilled in analyzing and assessing reasoning without practicing it. However, when they have routine practice in paraphrasing, summarizing, analyzing, and assessing, they will develop skills of mind requisite to the art of thinking well within any subject or discipline, not to mention thinking well within the various domains of human life.



Why Schools Need to Change: Yes, We Can Define, Teach, and Assess Critical Thinking Skills


Jeff Heyck-Williams (He, His, Him), Director of the Two Rivers Learning Institute in Washington, DC


Today’s learners face an uncertain present and a rapidly changing future that demand far different skills and knowledge than were needed in the 20th century. We also know so much more about enabling deep, powerful learning than we ever did before. Our collective future depends on how well young people prepare for the challenges and opportunities of 21st-century life.


While the idea of teaching critical thinking has been bandied around in education circles since at least the time of John Dewey, it has taken greater prominence in the education debates with the advent of the term “21st century skills” and discussions of deeper learning. There is increasing agreement among education reformers that critical thinking is an essential ingredient for long-term success for all of our students.

However, there are still those in the education establishment and in the media who argue that critical thinking isn’t really a thing, or that these skills aren’t well defined and, even if they could be defined, they can’t be taught or assessed.

To those naysayers, I have to disagree. Critical thinking is a thing. We can define it; we can teach it; and we can assess it. In fact, as part of a multi-year Assessment for Learning Project, Two Rivers Public Charter School in Washington, D.C., has done just that.

Before I dive into what we have done, I want to acknowledge that some of the criticism has merit.

First, there are those that argue that critical thinking can only exist when students have a vast fund of knowledge. Meaning that a student cannot think critically if they don’t have something substantive about which to think. I agree. Students do need a robust foundation of core content knowledge to effectively think critically. Schools still have a responsibility for building students’ content knowledge.

However, I would argue that students don’t need to wait to think critically until after they have mastered some arbitrary amount of knowledge. They can start building critical thinking skills when they walk in the door. All students come to school with experience and knowledge which they can immediately think critically about. In fact, some of the thinking that they learn to do helps augment and solidify the discipline-specific academic knowledge that they are learning.

The second criticism is that critical thinking skills are always highly contextual. In this argument, the critics make the point that the types of thinking that students do in history is categorically different from the types of thinking students do in science or math. Thus, the idea of teaching broadly defined, content-neutral critical thinking skills is impossible. I agree that there are domain-specific thinking skills that students should learn in each discipline. However, I also believe that there are several generalizable skills that elementary school students can learn that have broad applicability to their academic and social lives. That is what we have done at Two Rivers.

Defining Critical Thinking Skills

We began this work by first defining what we mean by critical thinking. After a review of the literature and looking at the practice at other schools, we identified five constructs that encompass a set of broadly applicable skills: schema development and activation; effective reasoning; creativity and innovation; problem solving; and decision making.


We then created rubrics to provide a concrete vision of what each of these constructs looks like in practice. Working with the Stanford Center for Assessment, Learning and Equity (SCALE), we refined these rubrics to capture clear and discrete skills.

For example, we defined effective reasoning as the skill of creating an evidence-based claim: students need to construct a claim, identify relevant support, link their support to their claim, and identify possible questions or counter claims. Rubrics provide an explicit vision of the skill of effective reasoning for students and teachers. By breaking the rubrics down for different grade bands, we have been able not only to describe what reasoning is but also to delineate how the skills develop in students from preschool through 8th grade.
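
To show how such a rubric might be represented in software, here is a sketch in which the four components of effective reasoning come from the description above, while the grade-band label and level descriptors are invented placeholders rather than Two Rivers' actual rubric language.

```python
"""A sketch of how a rubric like the one described above might be represented
in code. The four components come from the text; the grade bands and level
descriptors here are invented placeholders, not Two Rivers' actual rubric."""

EFFECTIVE_REASONING_RUBRIC = {
    "construct a claim": {
        "grades 3-5": {1: "claim is missing or off-topic",
                       4: "claim is clear, precise, and arguable"},
    },
    "identify relevant support": {
        "grades 3-5": {1: "support is absent or irrelevant",
                       4: "support is relevant and sufficient"},
    },
    "link support to claim": {
        "grades 3-5": {1: "no explanation of how support bears on the claim",
                       4: "reasoning explicitly connects each piece of support"},
    },
    "identify questions or counterclaims": {
        "grades 3-5": {1: "no alternative views considered",
                       4: "credible counterclaims raised and addressed"},
    },
}

def describe(component, grade_band, level):
    """Return the descriptor for one component at one level, if defined."""
    return EFFECTIVE_REASONING_RUBRIC[component][grade_band].get(level, "no descriptor")

print(describe("construct a claim", "grades 3-5", 4))
```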


Before moving on, I want to freely acknowledge that in narrowly defining reasoning as the construction of evidence-based claims we have disregarded some elements of reasoning that students can and should learn. For example, the difference between constructing claims through deductive versus inductive means is not highlighted in our definition. However, by privileging a definition that has broad applicability across disciplines, we are able to gain traction in developing the roots of critical thinking: in this case, the ability to formulate well-supported claims or arguments.

Teaching Critical Thinking Skills

The definitions of critical thinking constructs were only useful to us in as much as they translated into practical skills that teachers could teach and students could learn and use. Consequently, we have found that to teach a set of cognitive skills, we needed thinking routines that defined the regular application of these critical thinking and problem-solving skills across domains. Building on Harvard’s Project Zero Visible Thinking work, we have named routines aligned with each of our constructs.

For example, with the construct of effective reasoning, we aligned the Claim-Support-Question thinking routine to our rubric. Teachers then were able to teach students that whenever they were making an argument, the norm in the class was to use the routine in constructing their claim and support. The flexibility of the routine has allowed us to apply it from preschool through 8th grade and across disciplines from science to economics and from math to literacy.

[Image: anchor chart for argumentative writing]

Kathryn Mancino, a 5th grade teacher at Two Rivers, has deliberately taught three of our thinking routines to students using the anchor charts above. Her charts name the components of each routine and have a place for students to record when they’ve used it and what they have figured out about the routine. By using this structure with a chart that can be added to throughout the year, students see the routines as broadly applicable across disciplines and are able to refine their application over time.

Assessing Critical Thinking Skills

By defining specific constructs of critical thinking and building thinking routines that support their implementation in classrooms, we have operated under the assumption that students are developing skills that they will be able to transfer to other settings. However, we recognized both the importance and the challenge of gathering reliable data to confirm this.

With this in mind, we have developed a series of short performance tasks around novel discipline-neutral contexts in which students can apply the constructs of thinking. Through these tasks, we have been able to provide an opportunity for students to demonstrate their ability to transfer the types of thinking beyond the original classroom setting. Once again, we have worked with SCALE to define tasks where students easily access the content but where the cognitive lift requires them to demonstrate their thinking abilities.

These assessments demonstrate that it is possible to capture meaningful data on students’ critical thinking abilities. They are not intended to be high-stakes accountability measures. Instead, they are designed to give students, teachers, and school leaders discrete formative data on hard-to-measure skills.

While it is clearly difficult, and we have not solved all of the challenges to scaling assessments of critical thinking, we can define, teach, and assess these skills. In fact, knowing how important they are for the economy of the future and our democracy, it is essential that we do.

Jeff Heyck-Williams (He, His, Him)

Director of the Two Rivers Learning Institute

Jeff Heyck-Williams is the director of the Two Rivers Learning Institute and a founder of Two Rivers Public Charter School. He has led work around creating school-wide cultures of mathematics, developing assessments of critical thinking and problem-solving, and supporting project-based learning.



Performance Assessment of Critical Thinking: Conceptualization, Design, and Implementation


  • 1 Lynch School of Education and Human Development, Boston College, Chestnut Hill, MA, United States
  • 2 Graduate School of Education, Stanford University, Stanford, CA, United States
  • 3 Department of Business and Economics Education, Johannes Gutenberg University, Mainz, Germany

Enhancing students’ critical thinking (CT) skills is an essential goal of higher education. This article presents a systematic approach to conceptualizing and measuring CT. CT generally comprises the following mental processes: identifying, evaluating, and analyzing a problem; interpreting information; synthesizing evidence; and reporting a conclusion. We further posit that CT also involves dealing with dilemmas involving ambiguity or conflicts among principles and contradictory information. We argue that performance assessment provides the most realistic—and most credible—approach to measuring CT. From this conceptualization and construct definition, we describe one possible framework for building performance assessments of CT with attention to extended performance tasks within the assessment system. The framework is a product of an ongoing, collaborative effort, the International Performance Assessment of Learning (iPAL). The framework comprises four main aspects: (1) The storyline describes a carefully curated version of a complex, real-world situation. (2) The challenge frames the task to be accomplished. (3) A portfolio of documents in a range of formats is drawn from multiple sources chosen to have specific characteristics. (4) The scoring rubric comprises a set of scales each linked to a facet of the construct. We discuss a number of use cases, as well as the challenges that arise with the use and valid interpretation of performance assessments. The final section presents elements of the iPAL research program that involve various refinements and extensions of the assessment framework, a number of empirical studies, along with linkages to current work in online reading and information processing.

Introduction

In their mission statements, most colleges declare that a principal goal is to develop students’ higher-order cognitive skills such as critical thinking (CT) and reasoning (e.g., Shavelson, 2010 ; Hyytinen et al., 2019 ). The importance of CT is echoed by business leaders ( Association of American Colleges and Universities [AACU], 2018 ), as well as by college faculty (for curricular analyses in Germany, see e.g., Zlatkin-Troitschanskaia et al., 2018 ). Indeed, in the 2019 administration of the Faculty Survey of Student Engagement (FSSE), 93% of faculty reported that they “very much” or “quite a bit” structure their courses to support student development with respect to thinking critically and analytically. In a listing of 21st century skills, CT was the most highly ranked among FSSE respondents ( Indiana University, 2019 ). Nevertheless, there is considerable evidence that many college students do not develop these skills to a satisfactory standard ( Arum and Roksa, 2011 ; Shavelson et al., 2019 ; Zlatkin-Troitschanskaia et al., 2019 ). This state of affairs represents a serious challenge to higher education – and to society at large.

In view of the importance of CT, as well as evidence of substantial variation in its development during college, its proper measurement is essential to tracking progress in skill development and to providing useful feedback to both teachers and learners. Feedback can help focus students’ attention on key skill areas in need of improvement, and provide insight to teachers on choices of pedagogical strategies and time allocation. Moreover, comparative studies at the program and institutional level can inform higher education leaders and policy makers.

The conceptualization and definition of CT presented here is closely related to models of information processing and online reasoning, the skills that are the focus of this special issue. These two skills are especially germane to the learning environments that college students experience today when much of their academic work is done online. Ideally, students should be capable of more than naïve Internet search, followed by copy-and-paste (e.g., McGrew et al., 2017 ); rather, for example, they should be able to critically evaluate both sources of evidence and the quality of the evidence itself in light of a given purpose ( Leu et al., 2020 ).

In this paper, we present a systematic approach to conceptualizing CT. From that conceptualization and construct definition, we present one possible framework for building performance assessments of CT with particular attention to extended performance tasks within the test environment. The penultimate section discusses some of the challenges that arise with the use and valid interpretation of performance assessment scores. We conclude the paper with a section on future perspectives in an emerging field of research – the iPAL program.

Conceptual Foundations, Definition and Measurement of Critical Thinking

In this section, we briefly review the concept of CT and its definition. In accordance with the principles of evidence-centered design (ECD; Mislevy et al., 2003), the conceptualization drives the measurement of the construct; that is, implementation of ECD directly links aspects of the assessment framework to specific facets of the construct. We then argue that performance assessments designed in accordance with such an assessment framework provide the most realistic—and most credible—approach to measuring CT. The section concludes with a sketch of an approach to CT measurement grounded in performance assessment.

Concept and Definition of Critical Thinking

Taxonomies of 21st century skills ( Pellegrino and Hilton, 2012 ) abound, and it is neither surprising that CT appears in most taxonomies of learning, nor that there are many different approaches to defining and operationalizing the construct of CT. There is, however, general agreement that CT is a multifaceted construct ( Liu et al., 2014 ). Liu et al. (2014) identified five key facets of CT: (i) evaluating evidence and the use of evidence; (ii) analyzing arguments; (iii) understanding implications and consequences; (iv) developing sound arguments; and (v) understanding causation and explanation.

There is empirical support for these facets from college faculty. A 2016–2017 survey conducted by the Higher Education Research Institute (HERI) at the University of California, Los Angeles found that a substantial majority of faculty respondents “frequently” encouraged students to: (i) evaluate the quality or reliability of the information they receive; (ii) recognize biases that affect their thinking; (iii) analyze multiple sources of information before coming to a conclusion; and (iv) support their opinions with a logical argument ( Stolzenberg et al., 2019 ).

There is general agreement that CT involves the following mental processes: identifying, evaluating, and analyzing a problem; interpreting information; synthesizing evidence; and reporting a conclusion (e.g., Erwin and Sebrell, 2003 ; Kosslyn and Nelson, 2017 ; Shavelson et al., 2018 ). We further suggest that CT includes dealing with dilemmas of ambiguity or conflict among principles and contradictory information ( Oser and Biedermann, 2020 ).

Importantly, Oser and Biedermann (2020) posit that CT can be manifested at three levels. The first level, Critical Analysis , is the most complex of the three levels. Critical Analysis requires both knowledge in a specific discipline (conceptual) and procedural analytical (deduction, inclusion, etc.) knowledge. The second level is Critical Reflection , which involves more generic skills “… necessary for every responsible member of a society” (p. 90). It is “a basic attitude that must be taken into consideration if (new) information is questioned to be true or false, reliable or not reliable, moral or immoral etc.” (p. 90). To engage in Critical Reflection, one needs not only apply analytic reasoning, but also adopt a reflective stance toward the political, social, and other consequences of choosing a course of action. It also involves analyzing the potential motives of various actors involved in the dilemma of interest. The third level, Critical Alertness , involves questioning one’s own or others’ thinking from a skeptical point of view.

Wheeler and Haertel (1993) categorized higher-order skills, such as CT, into two types: (i) when solving problems and making decisions in professional and everyday life, for instance, related to civic affairs and the environment; and (ii) in situations where various mental processes (e.g., comparing, evaluating, and justifying) are developed through formal instruction, usually in a discipline. Hence, in both settings, individuals must confront situations that typically involve a problematic event, contradictory information, and possibly conflicting principles. Indeed, there is an ongoing debate concerning whether CT should be evaluated using generic or discipline-based assessments ( Nagel et al., 2020 ). Whether CT skills are conceptualized as generic or discipline-specific has implications for how they are assessed and how they are incorporated into the classroom.

In the iPAL project, CT is characterized as a multifaceted construct that comprises conceptualizing, analyzing, drawing inferences or synthesizing information, evaluating claims, and applying the results of these reasoning processes to various purposes (e.g., solve a problem, decide on a course of action, find an answer to a given question or reach a conclusion) ( Shavelson et al., 2019 ). In the course of carrying out a CT task, an individual typically engages in activities such as specifying or clarifying a problem; deciding what information is relevant to the problem; evaluating the trustworthiness of information; avoiding judgmental errors based on “fast thinking”; avoiding biases and stereotypes; recognizing different perspectives and how they can reframe a situation; considering the consequences of alternative courses of actions; and communicating clearly and concisely decisions and actions. The order in which activities are carried out can vary among individuals and the processes can be non-linear and reciprocal.

In this article, we focus on generic CT skills. The importance of these skills derives not only from their utility in academic and professional settings, but also the many situations involving challenging moral and ethical issues – often framed in terms of conflicting principles and/or interests – to which individuals have to apply these skills ( Kegan, 1994 ; Tessier-Lavigne, 2020 ). Conflicts and dilemmas are ubiquitous in the contexts in which adults find themselves: work, family, civil society. Moreover, to remain viable in the global economic environment – one characterized by increased competition and advances in second generation artificial intelligence (AI) – today’s college students will need to continually develop and leverage their CT skills. Ideally, colleges offer a supportive environment in which students can develop and practice effective approaches to reasoning about and acting in learning, professional and everyday situations.

Measurement of Critical Thinking

Critical thinking is a multifaceted construct that poses many challenges to those who would develop relevant and valid assessments. For those interested in current approaches to the measurement of CT that are not the focus of this paper, consult Zlatkin-Troitschanskaia et al. (2018).

In this paper, we have singled out performance assessment as it offers important advantages to measuring CT. Extant tests of CT typically employ response formats such as forced-choice or short-answer, and scenario-based tasks (for an overview, see Liu et al., 2014 ). They all suffer from moderate to severe construct underrepresentation; that is, they fail to capture important facets of the CT construct such as perspective taking and communication. High fidelity performance tasks are viewed as more authentic in that they provide a problem context and require responses that are more similar to what individuals confront in the real world than what is offered by traditional multiple-choice items ( Messick, 1994 ; Braun, 2019 ). This greater verisimilitude promises higher levels of construct representation and lower levels of construct-irrelevant variance. Such performance tasks have the capacity to measure facets of CT that are imperfectly assessed, if at all, using traditional assessments ( Lane and Stone, 2006 ; Braun, 2019 ; Shavelson et al., 2019 ). However, these assertions must be empirically validated, and the measures should be subjected to psychometric analyses. Evidence of the reliability, validity, and interpretative challenges of performance assessment (PA) are extensively detailed in Davey et al. (2015) .

We adopt the following definition of performance assessment:

A performance assessment (sometimes called a work sample when assessing job performance) … is an activity or set of activities that requires test takers, either individually or in groups, to generate products or performances in response to a complex, most often real-world task. These products and performances provide observable evidence bearing on test takers’ knowledge, skills, and abilities—their competencies—in completing the assessment ( Davey et al., 2015 , p. 10).

A performance assessment typically includes an extended performance task and short constructed-response and selected-response (i.e., multiple-choice) tasks (for examples, see Zlatkin-Troitschanskaia and Shavelson, 2019 ). In this paper, we refer to both individual performance- and constructed-response tasks as performance tasks (PT) (For an example, see Table 1 in section “iPAL Assessment Framework”).


Table 1. The iPAL assessment framework.

An Approach to Performance Assessment of Critical Thinking: The iPAL Program

The approach to CT presented here is the result of ongoing work undertaken by the International Performance Assessment of Learning collaborative (iPAL). iPAL is an international consortium of volunteers, primarily from academia, who have come together to address the dearth in higher education of research and practice in measuring CT with performance tasks (Shavelson et al., 2018). In this section, we present iPAL’s assessment framework as the basis of measuring CT, with examples along the way.

iPAL Background

The iPAL assessment framework builds on the Council for Aid to Education’s Collegiate Learning Assessment (CLA). The CLA was designed to measure cross-disciplinary, generic competencies, such as CT, analytic reasoning, problem solving, and written communication (Klein et al., 2007; Shavelson, 2010). Ideally, each PA contained an extended PT (e.g., examining a range of evidential materials related to the crash of an aircraft) and two short PT’s in which students either critique an argument or provide a solution in response to a real-world societal issue.

Motivated by considerations of adequate reliability, the CLA was modified in 2012 to create the CLA+. The CLA+ includes two subtests: a PT and a 25-item Selected Response Question (SRQ) section. The PT presents a document or problem statement and an assignment based on that document which elicits an open-ended response. The CLA+ added the SRQ section (which is not linked substantively to the PT scenario) to increase the number of student responses to obtain more reliable estimates of performance at the student-level than could be achieved with a single PT (Zahner, 2013; Davey et al., 2015).

iPAL Assessment Framework

Methodological Foundations

The iPAL framework evolved from the Collegiate Learning Assessment developed by Klein et al. (2007) . It was also informed by the results from the AHELO pilot study ( Organisation for Economic Co-operation and Development [OECD], 2012 , 2013 ), as well as the KoKoHs research program in Germany (for an overview see, Zlatkin-Troitschanskaia et al., 2017 , 2020 ). The ongoing refinement of the iPAL framework has been guided in part by the principles of Evidence Centered Design (ECD) ( Mislevy et al., 2003 ; Mislevy and Haertel, 2006 ; Haertel and Fujii, 2017 ).

In educational measurement, an assessment framework plays a critical intermediary role between the theoretical formulation of the construct and the development of the assessment instrument containing tasks (or items) intended to elicit evidence with respect to that construct ( Mislevy et al., 2003 ). Builders of the assessment framework draw on the construct theory and operationalize it in a way that provides explicit guidance to PT’s developers. Thus, the framework should reflect the relevant facets of the construct, where relevance is determined by substantive theory or an appropriate alternative such as behavioral samples from real-world situations of interest (criterion-sampling; McClelland, 1973 ), as well as the intended use(s) (for an example, see Shavelson et al., 2019 ). By following the requirements and guidelines embodied in the framework, instrument developers strengthen the claim of construct validity for the instrument ( Messick, 1994 ).

An assessment framework can be specified at different levels of granularity: an assessment battery (“omnibus” assessment, for an example see below), a single performance task, or a specific component of an assessment ( Shavelson, 2010 ; Davey et al., 2015 ). In the iPAL program, a performance assessment comprises one or more extended performance tasks and additional selected-response and short constructed-response items. The focus of the framework specified below is on a single PT intended to elicit evidence with respect to some facets of CT, such as the evaluation of the trustworthiness of the documents provided and the capacity to address conflicts of principles.

From the ECD perspective, an assessment is an instrument for generating information to support an evidentiary argument and, therefore, the intended inferences (claims) must guide each stage of the design process. The construct of interest is operationalized through the Student Model , which represents the target knowledge, skills, and abilities, as well as the relationships among them. The student model should also make explicit the assumptions regarding student competencies in foundational skills or content knowledge. The Task Model specifies the features of the problems or items posed to the respondent, with the goal of eliciting the evidence desired. The assessment framework also describes the collection of task models comprising the instrument, with considerations of construct validity, various psychometric characteristics (e.g., reliability) and practical constraints (e.g., testing time and cost). The student model provides grounds for evidence of validity, especially cognitive validity; namely, that the students are thinking critically in responding to the task(s).

In the present context, the target construct (CT) is the competence of individuals to think critically, which entails solving complex, real-world problems, and clearly communicating their conclusions or recommendations for action based on trustworthy, relevant and unbiased information. The situations, drawn from actual events, are challenging and may arise in many possible settings. In contrast to more reductionist approaches to assessment development, the iPAL approach and framework rests on the assumption that properly addressing these situational demands requires the application of a constellation of CT skills appropriate to the particular task presented (e.g., Shavelson, 2010 , 2013 ). For a PT, the assessment framework must also specify the rubric by which the responses will be evaluated. The rubric must be properly linked to the target construct so that the resulting score profile constitutes evidence that is both relevant and interpretable in terms of the student model (for an example, see Zlatkin-Troitschanskaia et al., 2019 ).

iPAL Task Framework

The iPAL ‘omnibus’ framework comprises four main aspects: A storyline , a challenge , a document library , and a scoring rubric . Table 1 displays these aspects, brief descriptions of each, and the corresponding examples drawn from an iPAL performance assessment (Version adapted from original in Hyytinen and Toom, 2019 ). Storylines are drawn from various domains; for example, the worlds of business, public policy, civics, medicine, and family. They often involve moral and/or ethical considerations. Deriving an appropriate storyline from a real-world situation requires careful consideration of which features are to be kept in toto , which adapted for purposes of the assessment, and which to be discarded. Framing the challenge demands care in wording so that there is minimal ambiguity in what is required of the respondent. The difficulty of the challenge depends, in large part, on the nature and extent of the information provided in the document library , the amount of scaffolding included, as well as the scope of the required response. The amount of information and the scope of the challenge should be commensurate with the amount of time available. As is evident from the table, the characteristics of the documents in the library are intended to elicit responses related to facets of CT. For example, with regard to bias, the information provided is intended to play to judgmental errors due to fast thinking and/or motivational reasoning. Ideally, the situation should accommodate multiple solutions of varying degrees of merit.
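
To make the four aspects concrete, the following sketch represents a performance task as a simple data structure. The aspect names follow the framework described above; the storyline, documents, and rubric entries are invented and are not drawn from an actual iPAL task.

```python
"""A sketch of the four iPAL aspects as a data structure. The aspect names
follow the framework described above; the example content is invented and
is not an actual iPAL task."""
from dataclasses import dataclass
from typing import List, Dict

@dataclass
class SourceDocument:
    title: str
    format: str          # e.g., news article, data table, blog post
    trustworthy: bool    # built-in property the task designer controls

@dataclass
class PerformanceTask:
    storyline: str                      # curated real-world situation
    challenge: str                      # what the respondent must produce
    document_library: List[SourceDocument]
    scoring_rubric: Dict[str, str]      # facet of CT -> what raters look for

example = PerformanceTask(
    storyline="A town council must decide whether to approve a new waste plant.",
    challenge="Write a recommendation to the council, justifying it with the documents.",
    document_library=[
        SourceDocument("Independent environmental audit", "report", True),
        SourceDocument("Op-ed funded by the plant operator", "opinion piece", False),
    ],
    scoring_rubric={
        "evaluating evidence": "distinguishes trustworthy from biased sources",
        "developing sound arguments": "recommendation follows from cited evidence",
    },
)
print(example.challenge)
```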

The dimensions of the scoring rubric are derived from the Task Model and Student Model ( Mislevy et al., 2003 ) and signal which features are to be extracted from the response and indicate how they are to be evaluated. There should be a direct link between the evaluation of the evidence and the claims that are made with respect to the key features of the task model and student model . More specifically, the task model specifies the various manipulations embodied in the PA and so informs scoring, while the student model specifies the capacities students employ in more or less effectively responding to the tasks. The score scales for each of the five facets of CT (see section “Concept and Definition of Critical Thinking”) can be specified using appropriate behavioral anchors (for examples, see Zlatkin-Troitschanskaia and Shavelson, 2019 ). Of particular importance is the evaluation of the response with respect to the last dimension of the scoring rubric; namely, the overall coherence and persuasiveness of the argument, building on the explicit or implicit characteristics related to the first five dimensions. The scoring process must be monitored carefully to ensure that (trained) raters are judging each response based on the same types of features and evaluation criteria ( Braun, 2019 ) as indicated by interrater agreement coefficients.
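
As an illustration of the kind of interrater agreement check mentioned above, the sketch below computes exact agreement and Cohen's kappa for two hypothetical raters. Kappa is one common agreement coefficient; it is not necessarily the index used in the iPAL studies, and the ratings are invented.

```python
"""Illustrative interrater agreement check (invented ratings; Cohen's kappa is
one common coefficient, not necessarily the one used in the iPAL studies)."""
from collections import Counter

rater_a = [3, 2, 4, 4, 1, 3, 2, 4, 3, 2]   # scores given by rater A
rater_b = [3, 2, 4, 3, 1, 3, 2, 4, 2, 2]   # rater B, same ten responses

n = len(rater_a)
observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

# Chance agreement from each rater's marginal score distribution.
count_a, count_b = Counter(rater_a), Counter(rater_b)
expected = sum(count_a[c] * count_b[c] for c in set(rater_a) | set(rater_b)) / n**2

kappa = (observed - expected) / (1 - expected)
print(f"exact agreement: {observed:.2f}, Cohen's kappa: {kappa:.2f}")
```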

The scoring rubric of the iPAL omnibus framework can be modified for specific tasks ( Lane and Stone, 2006 ). This generic rubric helps ensure consistency across rubrics for different storylines. For example, Zlatkin-Troitschanskaia et al. (2019 , p. 473) used the following scoring scheme:

Based on our construct definition of CT and its four dimensions: (D1-Info) recognizing and evaluating information, (D2-Decision) recognizing and evaluating arguments and making decisions, (D3-Conseq) recognizing and evaluating the consequences of decisions, and (D4-Writing), we developed a corresponding analytic dimensional scoring … The students’ performance is evaluated along the four dimensions, which in turn are subdivided into a total of 23 indicators as (sub)categories of CT … For each dimension, we sought detailed evidence in students’ responses for the indicators and scored them on a six-point Likert-type scale. In order to reduce judgment distortions, an elaborate procedure of ‘behaviorally anchored rating scales’ (Smith and Kendall, 1963) was applied by assigning concrete behavioral expectations to certain scale points (Bernardin et al., 1976). To this end, we defined the scale levels by short descriptions of typical behavior and anchored them with concrete examples. … We trained four raters in 1 day using a specially developed training course to evaluate students’ performance along the 23 indicators clustered into four dimensions (for a description of the rater training, see Klotzer, 2018).
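
The following sketch illustrates the general logic of such dimensional scoring: indicator ratings on a six-point scale are averaged within each dimension to yield a score profile. The ratings and the allocation of indicators to dimensions are invented and do not reproduce the authors' 23-indicator scheme.

```python
"""A minimal sketch of dimensional scoring in the spirit of the scheme quoted
above: indicator ratings on a six-point scale are averaged within each
dimension to give a score profile. The ratings and the split of indicators
across dimensions are invented for illustration."""
from statistics import mean

# One student's ratings, grouped by dimension (values 1-6).
ratings = {
    "D1-Info":     [4, 5, 3, 4, 4],
    "D2-Decision": [3, 3, 4, 2, 3, 3],
    "D3-Conseq":   [2, 3, 2, 3],
    "D4-Writing":  [5, 4, 5],
}

profile = {dim: round(mean(scores), 2) for dim, scores in ratings.items()}
overall = round(mean(mean(scores) for scores in ratings.values()), 2)
print(profile)
print("unweighted overall score:", overall)
```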

Shavelson et al. (2019) examined the interrater agreement of the scoring scheme developed by Zlatkin-Troitschanskaia et al. (2019) and “found that with 23 items and 2 raters the generalizability (“reliability”) coefficient for total scores to be 0.74 (with 4 raters, 0.84)” ( Shavelson et al., 2019 , p. 15). In the study by Zlatkin-Troitschanskaia et al. (2019 , p. 478) three score profiles were identified (low-, middle-, and high-performer) for students. Proper interpretation of such profiles requires care. For example, there may be multiple possible explanations for low scores such as poor CT skills, a lack of a disposition to engage with the challenge, or the two attributes jointly. These alternative explanations for student performance can potentially pose a threat to the evidentiary argument. In this case, auxiliary information may be available to aid in resolving the ambiguity. For example, student responses to selected- and short-constructed-response items in the PA can provide relevant information about the levels of the different skills possessed by the student. When sufficient data are available, the scores can be modeled statistically and/or qualitatively in such a way as to bring them to bear on the technical quality or interpretability of the claims of the assessment: reliability, validity, and utility evidence ( Davey et al., 2015 ; Zlatkin-Troitschanskaia et al., 2019 ). These kinds of concerns are less critical when PT’s are used in classroom settings. The instructor can draw on other sources of evidence, including direct discussion with the student.
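
The relation between the reported two-rater and four-rater coefficients can be approximated with the Spearman-Brown prophecy formula, a standard shortcut for projecting how score reliability changes with the number of raters. Applied to the reported two-rater value of 0.74, it yields roughly 0.85 for four raters, close to the 0.84 obtained from the authors' generalizability analysis.

```python
"""Spearman-Brown projection of score reliability as raters are added. This is
a standard psychometric shortcut, not the generalizability analysis actually
reported by Shavelson et al. (2019); it roughly reproduces their figures."""

def spearman_brown(reliability, factor):
    """Projected reliability when the number of raters (or items) is
    multiplied by `factor`."""
    return factor * reliability / (1 + (factor - 1) * reliability)

two_rater = 0.74
# Recover the implied single-rater reliability, then project to four raters.
one_rater = spearman_brown(two_rater, 0.5)
four_rater = spearman_brown(two_rater, 2)
print(f"implied single-rater reliability: {one_rater:.2f}")
print(f"projected four-rater reliability: {four_rater:.2f}")   # about 0.85
```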

Use of iPAL Performance Assessments in Educational Practice: Evidence From Preliminary Validation Studies

The assessment framework described here supports the development of a PT in a general setting. Many modifications are possible and, indeed, desirable. If the PT is to be more deeply embedded in a certain discipline (e.g., economics, law, or medicine), for example, then the framework must specify characteristics of the narrative and the complementary documents as to the breadth and depth of disciplinary knowledge that is represented.

To date, preliminary field trials employing the omnibus framework (i.e., a full set of documents) have indicated that 60 min is generally inadequate for students to engage with the full set of complementary documents and to craft a complete response to the challenge (for an example, see Shavelson et al., 2019). Accordingly, it would be helpful to develop modified frameworks for PT’s that require substantially less time; one example is a short performance assessment of civic online reasoning with response times of 10 to 50 min (Wineburg et al., 2016). Such frameworks could be derived from the omnibus framework by focusing on fewer facets of CT and specifying the characteristics of the complementary documents to be included – or, perhaps, choices among sets of documents. In principle, one could build a ‘family’ of PT’s, each using the same (or nearly the same) storyline and a subset of the full collection of complementary documents.

Paul and Elder (2007) argue that the goal of CT assessments should be to provide faculty with important information about how well their instruction supports the development of students’ CT. In that spirit, the full family of PT’s could represent all facets of the construct while affording instructors and students more specific insights into strengths and weaknesses with respect to particular facets of CT. Moreover, the framework should be expanded to include the design of a set of short-answer and/or multiple-choice items to accompany the PT. Ideally, these additional items would be based on the same narrative as the PT to collect more nuanced information on students’ precursor skills, such as reading comprehension, while enhancing the overall reliability of the assessment. Areas where students are under-prepared could be addressed before, or even in parallel with, the development of the focal CT skills. The parallel approach follows the co-requisite model of developmental education. In other settings (e.g., for summative assessment), these complementary items would be administered after the PT to augment the evidence in relation to the various claims. The full PT, taking 90 min or more, could serve as a capstone assessment.

As we transition from simply delivering paper-based assessments by computer to taking full advantage of the affordances of a digital platform, we should learn from the hard-won lessons of the past so that we can make swifter progress with fewer missteps. In that regard, we must take validity as the touchstone – assessment design, development and deployment must all be tightly linked to the operational definition of the CT construct. Considerations of reliability and practicality come into play with various use cases that highlight different purposes for the assessment (for future perspectives, see next section).

The iPAL assessment framework represents a feasible compromise between commercial, standardized assessments of CT (e.g., Liu et al., 2014), on the one hand, and, on the other, freedom for individual faculty to develop assessment tasks according to idiosyncratic models. It imposes a degree of standardization on both task development and scoring, while still allowing faculty some flexibility to tailor the assessment to their needs. In so doing, it addresses a key weakness of the AAC&U’s VALUE initiative (see footnote 2; retrieved 5/7/2020), which has achieved wide acceptance among United States colleges.

The VALUE initiative has produced generic scoring rubrics for 15 domains, including CT, problem-solving, and written communication. A rubric for a particular skill domain (e.g., critical thinking) has five to six dimensions, with four ordered performance levels for each dimension (1 = lowest, 4 = highest). The performance levels are accompanied by language intended to clearly differentiate among them (see footnote 3). Faculty are asked to submit student work products from a senior-level course that is intended to yield evidence with respect to student learning outcomes in a particular domain and that, they believe, can elicit performances at the highest level. The collection of work products is then graded by faculty from other institutions who have been trained to apply the rubrics.
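
To make that structure concrete, here is a minimal sketch of how a VALUE-style rubric could be represented as data, with dimensions and four ordered level descriptors. The dimension names and descriptors are abbreviated placeholders loosely inspired by, but not quoted from, the AAC&U materials.

```python
# Minimal sketch of a VALUE-style rubric: a few dimensions, each with four
# ordered performance levels (1 = lowest, 4 = highest). Dimension names and
# level descriptors are abbreviated placeholders, not the AAC&U text.
from dataclasses import dataclass


@dataclass(frozen=True)
class Dimension:
    name: str
    levels: tuple[str, str, str, str]  # descriptors for levels 1..4


CT_RUBRIC = [
    Dimension("Explanation of issues",
              ("states issue", "describes issue", "clarifies issue", "frames issue fully")),
    Dimension("Evidence",
              ("repeats sources", "some interpretation", "interprets sources", "questions sources")),
    Dimension("Conclusions",
              ("asserted", "loosely tied", "logically tied", "reflective and qualified")),
]


def describe_rating(ratings: dict[str, int]) -> list[str]:
    """Translate numeric level ratings into the corresponding descriptors."""
    out = []
    for dim in CT_RUBRIC:
        level = ratings[dim.name]
        if level not in (1, 2, 3, 4):
            raise ValueError(f"Level for '{dim.name}' must be 1-4")
        out.append(f"{dim.name}: level {level} ({dim.levels[level - 1]})")
    return out


print("\n".join(describe_rating(
    {"Explanation of issues": 3, "Evidence": 2, "Conclusions": 4})))
```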

A principal difficulty is that there is neither a common framework to guide the design of the challenge nor any control over task complexity and difficulty. Consequently, there is substantial heterogeneity in the quality and evidential value of the submitted responses. This also causes difficulties with task scoring and inter-rater reliability. Shavelson et al. (2009) discuss some of the problems arising with non-standardized collections of student work.

In this context, one advantage of the iPAL framework is that it can provide valuable guidance and an explicit structure for faculty in developing performance tasks for both instruction and formative assessment. When faculty design assessments, their focus is typically on content coverage rather than other potentially important characteristics, such as the degree of construct representation and the adequacy of their scoring procedures (Braun, 2019).

Concluding Reflections

Challenges to Interpretation and Implementation

Performance tasks such as those generated by iPAL are attractive instruments for assessing CT skills (e.g., Shavelson, 2010; Shavelson et al., 2019). The attraction rests mainly on the assumption that elaborated PT’s are more authentic (direct) and capture facets of the target construct more completely (i.e., possess greater construct representation) than the widely used selected-response tests. However, as Messick (1994) noted, authenticity is a “promissory note” that must be redeemed with empirical research. In practice, there are trade-offs among authenticity, construct validity, and psychometric qualities such as reliability (Davey et al., 2015).

One reason for Messick’s (1994) caution is that authenticity does not guarantee construct validity. The latter must be established by drawing on multiple sources of evidence (American Educational Research Association et al., 2014). Following the ECD principles in designing and developing the PT, as well as the associated scoring rubrics, constitutes an important type of evidence. Further, as Leighton (2019) argues, response process data (“cognitive validity”) are needed to validate claims regarding the cognitive complexity of PT’s. Relevant data can be obtained through cognitive laboratory studies involving methods such as think-aloud protocols or eye-tracking. Although time-consuming and expensive, such studies can yield not only evidence of validity but also valuable information to guide refinements of the PT.

Going forward, iPAL PT’s must be subjected to validation studies as recommended in the Standards for Educational and Psychological Testing (American Educational Research Association et al., 2014). With a particular focus on the criterion “relationships to other variables,” the framework should include assumptions about the theoretically expected relationships among the indicators assessed by the PT, as well as the indicators’ relationships to external variables such as intelligence or prior (task-relevant) knowledge.

Complementing the necessity of evaluating construct validity, there is the need to consider potential sources of construct-irrelevant variance (CIV). One pertains to student motivation, which is typically greater when the stakes are higher. If students are not motivated, then their performance is likely to be impacted by factors unrelated to their (construct-relevant) ability (Lane and Stone, 2006; Braun et al., 2011; Shavelson, 2013). Differential motivation across groups can also bias comparisons. Student motivation might be enhanced if the PT is administered in the context of a course with the promise of generating useful feedback on students’ skill profiles.

Construct-irrelevant variance can also occur when students are not equally prepared for the format of the PT or do not fully appreciate the response requirements. This source of CIV could be alleviated by providing students with practice PT’s. Finally, the use of novel forms of documentation, such as those drawn from the Internet, can introduce CIV through differential familiarity with those forms of representation or their contents. Interestingly, this suggests that there may be a conflict between enhancing construct representation and reducing CIV.

Another potential source of CIV is related to response evaluation. Even with training, human raters can vary in accuracy and in their use of the full score range. In addition, raters may attend to features of responses that are unrelated to the target construct, such as the length of the students’ responses or the frequency of grammatical errors (Lane and Stone, 2006). Some of these sources of variance could be addressed in an online environment, where word processing software could alert students to potential grammatical and spelling errors before they submit their final work product.

Performance tasks generally take longer to administer and are more costly than traditional assessments, making it more difficult to reliably measure student performance (Messick, 1994; Davey et al., 2015). Indeed, it is well known that more than one performance task is needed to obtain high reliability (Shavelson, 2013). This is due to both student-task interactions and variability in scoring. Sources of student-task interactions include differential familiarity with the topic (Hyytinen and Toom, 2019) and differential motivation to engage with the task. The level of reliability required, however, depends on the context of use. For formative assessment as part of an instructional program, reliability can be lower than when the assessment is used for summative purposes. In the former case, other types of evidence are generally available to support interpretation and guide pedagogical decisions. Further studies are needed to obtain estimates of reliability in typical instructional settings.
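
The point that a single performance task rarely yields high reliability can be illustrated with a simple generalizability-style projection. In the sketch below the variance components are invented for illustration; only the structure of the calculation (person variance versus person-by-task interaction and error, averaged over tasks) reflects the argument in the text.

```python
# Illustrative generalizability-style projection of score reliability as a
# function of the number of performance tasks. The variance components below
# are hypothetical; they are chosen only to show why person-by-task
# interaction variance forces the use of several tasks (cf. Shavelson, 2013).

def g_coefficient(var_person: float, var_person_task: float,
                  var_error: float, n_tasks: int) -> float:
    """Relative G coefficient for a persons-by-tasks design with scores
    averaged over n_tasks tasks."""
    return var_person / (var_person + (var_person_task + var_error) / n_tasks)


VAR_PERSON, VAR_PX_TASK, VAR_ERROR = 0.30, 0.35, 0.15  # assumed components

for n in (1, 2, 3, 5):
    print(f"{n} task(s): G = {g_coefficient(VAR_PERSON, VAR_PX_TASK, VAR_ERROR, n):.2f}")
# 1 task(s): G = 0.38
# 2 task(s): G = 0.55
# 3 task(s): G = 0.64
# 5 task(s): G = 0.75
```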

With sufficient data, more sophisticated psychometric analyses become possible. One challenge is that the assumption of unidimensionality required by many psychometric models may be untenable for performance tasks (Davey et al., 2015). Davey et al. (2015) provide the example of a mathematics assessment that requires students to demonstrate not only their mathematics skills but also their written communication skills. Although the iPAL framework does not explicitly address students’ reading comprehension and organization skills, students will likely need to call on these abilities to accomplish the task. Moreover, as the operational definition of CT makes evident, the student must not only deploy several skills in responding to the challenge of the PT but also carry out component tasks in sequence. The former requirement strongly indicates the need for a multidimensional IRT model, while the latter suggests that the usual assumption of local item independence may well be problematic (Lane and Stone, 2006). At the same time, the analytic scoring rubric should facilitate the use of latent class analysis to partition data from large groups into meaningful categories (Zlatkin-Troitschanskaia et al., 2019).
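
As an illustration of the last point, the sketch below partitions analytic sub-scores into three profile groups with a Gaussian mixture model, a common stand-in for latent class analysis when the indicators are treated as continuous. The simulated data and the choice of three classes are assumptions made for the example; they do not reproduce the analysis in Zlatkin-Troitschanskaia et al. (2019).

```python
# Sketch: grouping students into performance profiles from analytic rubric
# sub-scores with a Gaussian mixture model (a continuous-indicator analogue
# of latent class analysis). Data are simulated for illustration only.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Simulate sub-scores on four CT dimensions (1-6 scale) for three
# hypothetical groups of students: low, middle, and high performers.
low    = rng.normal(2.0, 0.5, size=(60, 4))
middle = rng.normal(3.5, 0.5, size=(80, 4))
high   = rng.normal(5.0, 0.5, size=(60, 4))
scores = np.clip(np.vstack([low, middle, high]), 1, 6)

# Fit a three-class mixture and inspect the recovered profile means.
model = GaussianMixture(n_components=3, random_state=0).fit(scores)
labels = model.predict(scores)

for k in range(3):
    print(f"class {k}: n = {np.sum(labels == k):3d}, "
          f"mean profile = {model.means_[k].round(2)}")
```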

Future Perspectives

Although the iPAL consortium has made substantial progress in the assessment of CT, much remains to be done. Further refinement of existing PT’s and their adaptation to different languages and cultures must continue. To this point, there are a number of examples: The refugee crisis PT (cited in Table 1) was translated and adapted from Finnish to US English and then to Colombian Spanish. A PT concerning kidney transplants was translated and adapted from German to US English. Finally, two PT’s based on ‘legacy admissions’ to US colleges were translated and adapted to Colombian Spanish.

With respect to data collection, there is a need for sufficient data to support psychometric analysis of student responses, especially the relationships among the different components of the scoring rubric, as this would inform both task development and response evaluation (Zlatkin-Troitschanskaia et al., 2019). In addition, more intensive study of response processes through cognitive laboratories and the like is needed to strengthen the evidential argument for construct validity (Leighton, 2019). We are currently conducting empirical studies, collecting data on both iPAL PT’s and other measures of CT. These studies will provide evidence of convergent and discriminant validity.

At the same time, efforts should be directed at further development to support the different ways CT PT’s might be used (i.e., use cases), especially those that call for formative use of PT’s. Incorporating formative assessment into courses can plausibly be expected to improve students’ competency acquisition (Zlatkin-Troitschanskaia et al., 2017). With suitable choices of storylines, appropriate combinations of (modified) PT’s, supplemented by short-answer and multiple-choice items, could be interwoven into ordinary classroom activities. The supplementary items may be completely separate from the PT’s (as is the case with the CLA+), loosely coupled with the PT’s (as in drawing on the same storyline), or tightly linked to the PT’s (as in requiring elaboration of certain components of the response to the PT).

As an alternative to such integration, stand-alone modules could be embedded in courses to yield evidence of students’ generic CT skills. Core curriculum courses or general education courses offer ideal settings for embedding performance assessments. If these assessments were administered to a representative sample of students in each cohort over their years in college, the results would yield important information on the development of CT skills at a population level. As another example, these PA’s could be used to assess the competence profiles of students entering Bachelor’s or graduate-level programs as a basis for more targeted instructional support.

Thus, in considering different use cases for the assessment of CT, it is evident that several modifications of the iPAL omnibus assessment framework are needed. As noted earlier, assessments built according to this framework are demanding with respect both to the preliminary work required to develop a task and to the time required to complete it properly. Accordingly, it would be helpful to have modified versions of the framework, focusing on one or two facets of the CT construct and calling for a smaller number of supplementary documents. The challenge to the student should be suitably reduced.

Some members of the iPAL collaborative have developed PT’s that are embedded in disciplines such as engineering, law and education (Crump et al., 2019; for teacher education examples, see Jeschke et al., 2019). These are proving to be of great interest to various stakeholders and further development is likely. Consequently, it is essential that an appropriate assessment framework be established and implemented. It is both a conceptual and an empirical question as to whether a single framework can guide development in different domains.

Performance Assessment in an Online Learning Environment

Over the last 15 years, increasing amounts of time in both college and work have been spent using computers and other electronic devices. This has led to the formulation of models of the new literacies that attempt to capture some key characteristics of these activities. A prominent example is the model proposed by Leu et al. (2020). The model frames online reading as a process of problem-based inquiry involving five practices that occur during online research and comprehension:

1. Reading to identify important questions,

2. Reading to locate information,

3. Reading to critically evaluate information,

4. Reading to synthesize online information, and

5. Reading and writing to communicate online information.

The parallels with the iPAL definition of CT are evident and suggest there may be benefits to closer links between these two lines of research. For example, a report by Leu et al. (2014) describes empirical studies comparing assessments of online reading using either open-ended or multiple-choice response formats.

The iPAL consortium has begun to take advantage of the affordances of the online environment (for examples, see Schmidt et al. and Nagel et al. in this special issue). Most obviously, supplementary materials can now include archival photographs, audio recordings, or videos. Additional tasks might include the online search for relevant documents, though this would add considerably to the time demands. This online search could occur within a simulated Internet environment, as is the case for the IEA’s ePIRLS assessment (Mullis et al., 2017).

The prospect of having access to a wealth of materials that can add to task authenticity is exciting. Yet it can also add ambiguity and information overload. Increased authenticity, then, should be weighed against validity concerns and the time required to absorb the content in these materials. Modifications of the design framework and extensive empirical testing will be required to decide on appropriate trade-offs. A related possibility is to employ some of these materials in short-answer (or even selected-response) items that supplement the main PT. Response formats could include highlighting text or using a drag-and-drop menu to construct a response. Students’ responses could be automatically scored, thereby containing costs. With automated scoring, feedback to students and faculty, including suggestions for next steps in strengthening CT skills, could also be provided without adding to faculty workload. Therefore, taking advantage of the online environment to incorporate new types of supplementary documents, and perhaps to introduce new response formats, should be a high priority. Finally, further investigation of the overlap between this formulation of CT and the characterization of online reading promulgated by Leu et al. (2020) is a promising direction to pursue.
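
As a concrete, if simplified, illustration of the kind of automated scoring such response formats would allow, the sketch below scores a “highlight the relevant sentences” item by comparing the set of sentences a student selects against a keyed set. The item, the key, and the partial-credit rule are hypothetical; operational scoring engines would be considerably more elaborate.

```python
# Sketch of automated scoring for a hypothetical "highlight the sentences
# that support the claim" item. The key and the partial-credit rule (overlap
# between selected and keyed sentence indices) are illustrative only.

def score_highlight_item(selected: set[int], key: set[int]) -> float:
    """Return a 0-1 partial-credit score based on overlap with the key."""
    if not selected and not key:
        return 1.0
    union = selected | key
    return len(selected & key) / len(union) if union else 0.0


KEYED_SENTENCES = {2, 5, 7}          # indices of the keyed support sentences

print(score_highlight_item({2, 5, 7}, KEYED_SENTENCES))     # 1.0  (exact match)
print(score_highlight_item({2, 5}, KEYED_SENTENCES))        # ~0.67 (one missed)
print(score_highlight_item({1, 2, 5, 7}, KEYED_SENTENCES))  # 0.75 (one extraneous)
```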

Data Availability Statement

All datasets generated for this study are included in the article/supplementary material.

Author Contributions

HB wrote the article. RS, OZ-T, and KB were involved in the preparation and revision of the article and co-wrote the manuscript. All authors contributed to the article and approved the submitted version.

Funding

This study was funded in part by the Spencer Foundation (Grant No. 201700123).

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

We would like to thank all the researchers who have participated in the iPAL program.

Footnotes

  1. ^ https://www.ipal-rd.com/
  2. ^ https://www.aacu.org/value
  3. ^ When test results are reported by means of substantively defined categories, the scoring is termed “criterion-referenced.” This is in contrast to results reported as percentiles; such scoring is termed “norm-referenced.”

References

American Educational Research Association, American Psychological Association, and National Council on Measurement in Education (2014). Standards for Educational and Psychological Testing. Washington, DC: American Educational Research Association.

Arum, R., and Roksa, J. (2011). Academically Adrift: Limited Learning on College Campuses. Chicago, IL: University of Chicago Press.

Association of American Colleges and Universities (n.d.). VALUE: What is value? Available online at: https://www.aacu.org/value (accessed May 7, 2020).

Association of American Colleges and Universities [AACU] (2018). Fulfilling the American Dream: Liberal Education and the Future of Work. Available online at: https://www.aacu.org/research/2018-future-of-work (accessed May 1, 2020).

Braun, H. (2019). Performance assessment and standardization in higher education: a problematic conjunction? Br. J. Educ. Psychol. 89, 429–440. doi: 10.1111/bjep.12274

Braun, H. I., Kirsch, I., and Yamamoto, K. (2011). An experimental study of the effects of monetary incentives on performance on the 12th grade NAEP reading assessment. Teach. Coll. Rec. 113, 2309–2344.

Crump, N., Sepulveda, C., Fajardo, A., and Aguilera, A. (2019). Systematization of performance tests in critical thinking: an interdisciplinary construction experience. Rev. Estud. Educ. 2, 17–47.

Davey, T., Ferrara, S., Shavelson, R., Holland, P., Webb, N., and Wise, L. (2015). Psychometric Considerations for the Next Generation of Performance Assessment. Washington, DC: Center for K-12 Assessment & Performance Management, Educational Testing Service.

Erwin, T. D., and Sebrell, K. W. (2003). Assessment of critical thinking: ETS’s tasks in critical thinking. J. Gen. Educ. 52, 50–70. doi: 10.1353/jge.2003.0019

Haertel, G. D., and Fujii, R. (2017). “Evidence-centered design and postsecondary assessment,” in Handbook on Measurement, Assessment, and Evaluation in Higher Education, 2nd Edn, eds C. Secolsky and D. B. Denison (Abingdon: Routledge), 313–339. doi: 10.4324/9781315709307-26

Hyytinen, H., and Toom, A. (2019). Developing a performance assessment task in the Finnish higher education context: conceptual and empirical insights. Br. J. Educ. Psychol. 89, 551–563. doi: 10.1111/bjep.12283

Hyytinen, H., Toom, A., and Shavelson, R. J. (2019). “Enhancing scientific thinking through the development of critical thinking in higher education,” in Redefining Scientific Thinking for Higher Education: Higher-Order Thinking, Evidence-Based Reasoning and Research Skills, eds M. Murtonen and K. Balloo (London: Palgrave MacMillan).

Indiana University (2019). FSSE 2019 Frequencies: FSSE 2019 Aggregate. Available online at: http://fsse.indiana.edu/pdf/FSSE_IR_2019/summary_tables/FSSE19_Frequencies_(FSSE_2019).pdf (accessed May 1, 2020).

Jeschke, C., Kuhn, C., Lindmeier, A., Zlatkin-Troitschanskaia, O., Saas, H., and Heinze, A. (2019). Performance assessment to investigate the domain specificity of instructional skills among pre-service and in-service teachers of mathematics and economics. Br. J. Educ. Psychol. 89, 538–550. doi: 10.1111/bjep.12277

Kegan, R. (1994). In Over Our Heads: The Mental Demands of Modern Life. Cambridge, MA: Harvard University Press.

Klein, S., Benjamin, R., Shavelson, R., and Bolus, R. (2007). The collegiate learning assessment: facts and fantasies. Eval. Rev. 31, 415–439. doi: 10.1177/0193841x07303318

Kosslyn, S. M., and Nelson, B. (2017). Building the Intentional University: Minerva and the Future of Higher Education. Cambridge, MAL: The MIT Press.

Lane, S., and Stone, C. A. (2006). “Performance assessment,” in Educational Measurement, 4th Edn, ed. R. L. Brennan (Lanham, MD: Rowman & Littlefield Publishers), 387–432.

Leighton, J. P. (2019). The risk–return trade-off: performance assessments and cognitive validation of inferences. Br. J. Educ. Psychol. 89, 441–455. doi: 10.1111/bjep.12271

Leu, D. J., Kiili, C., Forzani, E., Zawilinski, L., McVerry, J. G., and O’Byrne, W. I. (2020). “The new literacies of online research and comprehension,” in The Concise Encyclopedia of Applied Linguistics, ed. C. A. Chapelle (Oxford: Wiley-Blackwell), 844–852.

Leu, D. J., Kulikowich, J. M., Kennedy, C., and Maykel, C. (2014). “The ORCA Project: designing technology-based assessments for online research,” in Paper Presented at the American Educational Research Association Annual Meeting, Philadelphia, PA.

Liu, O. L., Frankel, L., and Roohr, K. C. (2014). Assessing critical thinking in higher education: current state and directions for next-generation assessments. ETS Res. Rep. Ser. 1, 1–23. doi: 10.1002/ets2.12009

McClelland, D. C. (1973). Testing for competence rather than for “intelligence.”. Am. Psychol. 28, 1–14. doi: 10.1037/h0034092

McGrew, S., Ortega, T., Breakstone, J., and Wineburg, S. (2017). The challenge that’s bigger than fake news: civic reasoning in a social media environment. Am. Educ. 4, 4-9, 39.

Mejía, A., Mariño, J. P., and Molina, A. (2019). Incorporating perspective analysis into critical thinking performance assessments. Br. J. Educ. Psychol. 89, 456–467. doi: 10.1111/bjep.12297

Messick, S. (1994). The interplay of evidence and consequences in the validation of performance assessments. Educ. Res. 23, 13–23. doi: 10.3102/0013189x023002013

Mislevy, R. J., Almond, R. G., and Lukas, J. F. (2003). A brief introduction to evidence-centered design. ETS Res. Rep. Ser. 2003, i–29. doi: 10.1002/j.2333-8504.2003.tb01908.x

Mislevy, R. J., and Haertel, G. D. (2006). Implications of evidence-centered design for educational testing. Educ. Meas. Issues Pract. 25, 6–20. doi: 10.1111/j.1745-3992.2006.00075.x

Mullis, I. V. S., Martin, M. O., Foy, P., and Hooper, M. (2017). ePIRLS 2016 International Results in Online Informational Reading. Available online at: http://timssandpirls.bc.edu/pirls2016/international-results/ (accessed May 1, 2020).

Nagel, M.-T., Zlatkin-Troitschanskaia, O., Schmidt, S., and Beck, K. (2020). “Performance assessment of generic and domain-specific skills in higher education economics,” in Student Learning in German Higher Education, eds O. Zlatkin-Troitschanskaia, H. A. Pant, M. Toepper, and C. Lautenbach (Berlin: Springer), 281–299. doi: 10.1007/978-3-658-27886-1_14

Organisation for Economic Co-operation and Development [OECD] (2012). AHELO: Feasibility Study Report, Vol. 1: Design and Implementation. Paris: OECD.

Organisation for Economic Co-operation and Development [OECD] (2013). AHELO: Feasibility Study Report, Vol. 2: Data Analysis and National Experiences. Paris: OECD.

Oser, F. K., and Biedermann, H. (2020). “A three-level model for critical thinking: critical alertness, critical reflection, and critical analysis,” in Frontiers and Advances in Positive Learning in the Age of Information (PLATO), ed. O. Zlatkin-Troitschanskaia (Cham: Springer), 89–106. doi: 10.1007/978-3-030-26578-6_7

Paul, R., and Elder, L. (2007). Consequential validity: using assessment to drive instruction. Found. Crit. Think. 29, 31–40.

Pellegrino, J. W., and Hilton, M. L. (eds) (2012). Education for life and work: Developing Transferable Knowledge and Skills in the 21st Century. Washington DC: National Academies Press.

Shavelson, R. (2010). Measuring College Learning Responsibly: Accountability in a New Era. Redwood City, CA: Stanford University Press.

Shavelson, R. J. (2013). On an approach to testing and modeling competence. Educ. Psychol. 48, 73–86. doi: 10.1080/00461520.2013.779483

Shavelson, R. J., Zlatkin-Troitschanskaia, O., Beck, K., Schmidt, S., and Marino, J. P. (2019). Assessment of university students’ critical thinking: next generation performance assessment. Int. J. Test. 19, 337–362. doi: 10.1080/15305058.2018.1543309

Shavelson, R. J., Zlatkin-Troitschanskaia, O., and Marino, J. P. (2018). “International performance assessment of learning in higher education (iPAL): research and development,” in Assessment of Learning Outcomes in Higher Education: Cross-National Comparisons and Perspectives, eds O. Zlatkin-Troitschanskaia, M. Toepper, H. A. Pant, C. Lautenbach, and C. Kuhn (Berlin: Springer), 193–214. doi: 10.1007/978-3-319-74338-7_10

Shavelson, R. J., Klein, S., and Benjamin, R. (2009). The limitations of portfolios. Inside Higher Educ. Available online at: https://www.insidehighered.com/views/2009/10/16/limitations-portfolios

Stolzenberg, E. B., Eagan, M. K., Zimmerman, H. B., Berdan Lozano, J., Cesar-Davis, N. M., Aragon, M. C., et al. (2019). Undergraduate Teaching Faculty: The HERI Faculty Survey 2016–2017. Los Angeles, CA: UCLA.

Tessier-Lavigne, M. (2020). Putting Ethics at the Heart of Innovation. Stanford, CA: Stanford Magazine.

Wheeler, P., and Haertel, G. D. (1993). Resource Handbook on Performance Assessment and Measurement: A Tool for Students, Practitioners, and Policymakers. Palm Coast, FL: Owl Press.

Wineburg, S., McGrew, S., Breakstone, J., and Ortega, T. (2016). Evaluating Information: The Cornerstone of Civic Online Reasoning. Executive Summary. Stanford, CA: Stanford History Education Group.

Zahner, D. (2013). Reliability and Validity–CLA+. Council for Aid to Education. Available online at: https://pdfs.semanticscholar.org/91ae/8edfac44bce3bed37d8c9091da01d6db3776.pdf

Zlatkin-Troitschanskaia, O., and Shavelson, R. J. (2019). Performance assessment of student learning in higher education [Special issue]. Br. J. Educ. Psychol. 89, i–iv, 413–563.

Zlatkin-Troitschanskaia, O., Pant, H. A., Lautenbach, C., Molerov, D., Toepper, M., and Brückner, S. (2017). Modeling and Measuring Competencies in Higher Education: Approaches to Challenges in Higher Education Policy and Practice. Berlin: Springer VS.

Zlatkin-Troitschanskaia, O., Pant, H. A., Toepper, M., and Lautenbach, C. (eds) (2020). Student Learning in German Higher Education: Innovative Measurement Approaches and Research Results. Wiesbaden: Springer.

Zlatkin-Troitschanskaia, O., Shavelson, R. J., and Pant, H. A. (2018). “Assessment of learning outcomes in higher education: international comparisons and perspectives,” in Handbook on Measurement, Assessment, and Evaluation in Higher Education, 2nd Edn, eds C. Secolsky and D. B. Denison (Abingdon: Routledge), 686–697.

Zlatkin-Troitschanskaia, O., Shavelson, R. J., Schmidt, S., and Beck, K. (2019). On the complementarity of holistic and analytic approaches to performance assessment scoring. Br. J. Educ. Psychol. 89, 468–484. doi: 10.1111/bjep.12286

Keywords : critical thinking, performance assessment, assessment framework, scoring rubric, evidence-centered design, 21st century skills, higher education

Citation: Braun HI, Shavelson RJ, Zlatkin-Troitschanskaia O and Borowiec K (2020) Performance Assessment of Critical Thinking: Conceptualization, Design, and Implementation. Front. Educ. 5:156. doi: 10.3389/feduc.2020.00156

Received: 30 May 2020; Accepted: 04 August 2020; Published: 08 September 2020.


Copyright © 2020 Braun, Shavelson, Zlatkin-Troitschanskaia and Borowiec. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Henry I. Braun, [email protected]

This article is part of the Research Topic: Assessing Information Processing and Online Reasoning as a Prerequisite for Learning in Higher Education.


A Brief Guide for Teaching and Assessing Critical Thinking in Psychology

In my first year of college teaching, a student approached me one day after class and politely asked, “What did you mean by the word ‘evidence’?” I tried to hide my shock at what I took to be a very naive question. Upon further reflection, however, I realized that this was actually a good question, for which the usual approaches to teaching psychology provided too few answers. During the next several years, I developed lessons and techniques to help psychology students learn how to evaluate the strengths and weaknesses of scientific and nonscientific kinds of evidence and to help them draw sound conclusions. It seemed to me that learning about the quality of evidence and drawing appropriate conclusions from scientific research were central to teaching critical thinking (CT) in psychology.

In this article, I have attempted to provide guidelines to psychology instructors on how to teach CT, describing techniques I developed over 20 years of teaching. More importantly, the techniques and approach described below are ones that are supported by scientific research. Classroom examples illustrate the use of the guidelines and how assessment can be integrated into CT skill instruction.

Overview of the Guidelines

Confusion about the definition of CT has been a major obstacle to teaching and assessing it (Halonen, 1995; Williams, 1999). To deal with this problem, we have defined CT as reflective thinking involved in the evaluation of evidence relevant to a claim so that a sound or good conclusion can be drawn from the evidence (Bensley, 1998). One virtue of this definition is that it can be applied to many thinking tasks in psychology. The claims and conclusions psychological scientists make include hypotheses, theoretical statements, interpretations of research findings, or diagnoses of mental disorders. Evidence can be the results of an experiment, case study, naturalistic observation study, or psychological test. Less formally, evidence can be anecdotes, introspective reports, commonsense beliefs, or statements of authority. Evaluating evidence and drawing appropriate conclusions, along with other skills such as distinguishing arguments from nonarguments and finding assumptions, are collectively called argument analysis skills. Many CT experts take argument analysis skills to be fundamental CT skills (e.g., Ennis, 1987; Halpern, 1998). Psychology students need argument analysis skills to evaluate psychological claims in their work and in everyday discourse.

Some instructors expect their students will improve CT skills like argument analysis skills by simply immersing them in challenging course work. Others expect improvement because they use a textbook with special CT questions or modules, give lectures that critically review the literature, or have students complete written assignments. While these and other traditional techniques may help, a growing body of research suggests they are not sufficient to efficiently produce measurable changes in CT skills. Our research on the acquisition of argument analysis skills in psychology (Bensley, Crowe, Bernhardt, Buckner, & Allman, in press) and on critical reading skills (Bensley & Haynes, 1995; Spero & Bensley, 2009) suggests that more explicit, direct instruction of CT skills is necessary. These results concur with the results of an earlier review of CT programs by Chance (1986) and a recent meta-analysis by Abrami et al. (2008).

Based on these and other findings, the following guidelines describe an approach to explicit instruction in which instructors can directly infuse CT skills and assessment into their courses. With infusion, instructors can use relevant content to teach CT rules and concepts along with the subject matter. Directly infusing CT skills into course work involves targeting specific CT skills; making CT rules, criteria, and methods explicit; providing guided practice in the form of exercises focused on assessing skills; and giving feedback on practice and assessments. These components are similar to ones found in effective, direct instruction approaches (Walberg, 2006). They also resemble approaches to teaching CT proposed by Angelo (1995), Beyer (1997), and Halpern (1998). Importantly, this approach has been successful in teaching CT skills in psychology (e.g., Bensley et al., in press; Bensley & Haynes, 1995; Nieto & Saiz, 2008; Penningroth, Despain, & Gray, 2007). Directly infusing CT skill instruction can also enrich content instruction without sacrificing learning of subject matter (Solon, 2007). The following seven guidelines, illustrated by CT lessons and assessments, explicate this process.

Seven Guidelines for Teaching and Assessing Critical Thinking

1. Motivate your students to think critically

Critical thinking takes effort. Without proper motivation, students are less inclined to engage in it. Therefore, it is good to arouse interest right away and foster commitment to improving CT throughout a course. One motivational strategy is to explain why CT is important to effective, professional behavior. Often, telling a compelling story that illustrates the consequences of failing to think critically can motivate students. For example, the tragic death of 10-year-old Candace Newmaker at the hands of her therapists practicing attachment therapy illustrates the perils of using a therapy that has not been supported by good empirical evidence (Lilienfeld, 2007).

Instructors can also pique interest by taking a class poll posing an interesting question on which students are likely to have an opinion. For example, asking students how many think that the full moon can lead to increases in abnormal behavior can be used to introduce the difference between empirical fact and opinion or commonsense belief. After asking students how psychologists answer such questions, instructors might go over the meta-analysis of Rotton and Kelly (1985). Their review found that almost all of the 37 studies they examined showed no association between the phase of the moon and abnormal behavior, with only a few, usually poorly controlled, studies supporting it. The effect size over all studies was very small (.01). Instructors can use this to illustrate how psychologists draw a conclusion based on the quality and quantity of research studies, as opposed to what many people commonly believe. For other interesting thinking errors and misconceptions related to psychology, see Bensley (1998; 2002; 2008), Halpern (2003), Ruscio (2006), Stanovich (2007), and Sternberg (2007).

Attitudes and dispositions can also affect motivation to think critically. If students lack certain CT dispositions such as open-mindedness, fair-mindedness, and skepticism, they will be less likely to think critically even if they have CT skills (Halpern, 1998). Instructors might point out that even great scientists noted for their powers of reasoning sometimes fail to think critically when they are not disposed to use their skills. For example, Alfred Russel Wallace, who used his considerable CT skills to help develop the concept of natural selection, also believed in spiritualistic contact with the dead. Despite considerable evidence that mediums claiming to contact the dead were really faking such contact, Wallace continued to believe in it (Bensley, 2006). Likewise, the great American psychologist William James, whose reasoning skills helped him develop the seeds of important contemporary theories, believed in spiritualism despite evidence to the contrary.

2. Clearly state the CT goals and objectives for your class

Once students are motivated, the instructor should focus them on what skills they will work on during the course. The APA task force on learning goals and objectives for psychology listed CT as one of 10 major goals for students (Halonen et al., 2002). Under critical thinking, the task force further specified outcomes such as evaluating the quality of information, identifying and evaluating the source and credibility of information, and recognizing and defending against thinking errors and fallacies. Instructors should publish goals like these in their CT course objectives in their syllabi and, more specifically, as assignment objectives in their assignments. Given the pragmatic penchant of students for studying what is needed to succeed in a course, this should help motivate and focus them.

To make instruction efficient, course objectives and lesson objectives should explicitly target CT skills to be improved. Objectives should specify the behavior that will change in a way that can be measured. A course objective might read, “After taking this course, you will be able to analyze arguments found in psychological and everyday discussions.” When the goal of a lesson is to practice and improve specific microskills that make up argument analysis, an assignment objective might read “After successfully completing this assignment, you will be able to identify different kinds of evidence in a psychological discussion.” Or another might read “After successfully completing this assignment, you will be able to distinguish arguments from nonarguments.” Students might demonstrate they have reached these objectives by showing the behavior of correctly labeling the kinds of evidence presented in a passage or by indicating whether an argument or merely a claim has been made. By stating objectives in the form of assessable behaviors, the instructor can test these as assessment hypotheses.

Sometimes when the goal is to teach students how to decide which CT skills are appropriate in a situation, the instructor may not want to identify specific skills. Instead, a lesson objective might read, “After successfully completing this assignment, you will be able to decide which skills and knowledge are appropriate for critically analyzing a discussion in psychology.”

3. Find opportunities to infuse CT that fit content and skill requirements of your course

To improve their CT skills, students must be given opportunities to practice them. Different courses present different opportunities for infusion and practice. Stand-alone CT courses usually provide the most opportunities to infuse CT. For example, the Frostburg State University Psychology Department has a senior seminar called “Thinking like a Psychologist” in which students complete lessons giving them practice in argument analysis, critical reading, critically evaluating information on the Internet, distinguishing science from pseudoscience, applying their knowledge and CT skills in simulations of psychological practice, and other activities.

In more typical subject-oriented courses, instructors must find specific content and types of tasks conducive to explicit CT skill instruction. For example, research methods courses present several opportunities to teach argument analysis skills. Instructors can have students critically evaluate the quality of evidence provided by studies using different research methods and designs they find in PsycINFO and Internet sources. This, in turn, could help students write better critical evaluations of research for research reports.

A cognitive psychology teacher might assign a critical evaluation of the evidence on an interesting question discussed in textbook literature reviews. For example, students might evaluate the evidence relevant to the question of whether people have flashbulb memories such as accurately remembering the 9-11 attack. This provides the opportunity to teach them that many of the studies, although informative, are quasi-experimental and cannot show causation. Or, students might analyze the arguments in a TV program such as the fascinating Nova program Kidnapped by Aliens on people who recall having been abducted by aliens.

4. Use guided practice, explicitly modeling and scaffolding CT

Guided practice involves modeling and supporting the practice of target skills, and providing feedback on progress towards skill attainment. Research has shown that guided practice helps students acquire thinking skills more efficiently than unguided and discovery approaches (Mayer, 2004).

Instructors can model the use of CT rules, criteria, and procedures for evaluating evidence and drawing conclusions in many ways. They could provide worked examples of problems, writing samples displaying good CT, or real-world examples of good and bad thinking found in the media. They might also think out loud as they evaluate arguments in class to model the process of thinking.

To help students learn to use complex rules in thinking, instructors should initially scaffold student thinking. Scaffolding involves providing product guidelines, rules, and other frameworks to support the process of thinking. Table 1 shows guidelines like those found in Bensley (1998) describing nonscientific kinds of evidence that can support student efforts to evaluate evidence in everyday psychological discussions. Likewise, Table 2 provides guidelines like those found in Bensley (1998) and Wade and Tavris (2005) describing various kinds of scientific research methods and designs that differ in the quality of evidence they provide for psychological arguments.

In the cognitive lesson on flashbulb memory described earlier, students use the framework in Table 2 to evaluate the kinds of evidence in the literature review. Table 1 can help them evaluate the kinds of evidence found in the Nova video Kidnapped by Aliens. Specifically, they could use it to contrast scientific authority with less credible authority. The video includes statements by scientific authorities like Elizabeth Loftus, based on her extensive research, contrasted with the nonscientific authority of Bud Hopkins, an artist turned hypnotherapist and author of popular books on alien abduction. Loftus argues that the memories of alien abduction in the children interviewed by Hopkins were reconstructed around the suggestive interview questions he posed. Therefore, his conclusion that the children and other people in the video were recalling actual abduction experiences was based on anecdotes, unreliable self-reports, and other weak evidence.

Modeling, scaffolding, and guided practice are especially useful in helping students first acquire CT skills. After sufficient practice, however, instructors should fade these supports and have students complete more challenging assignments without them to promote transfer.

5. Align assessment with practice of specific CT skills

Test questions and other assessments of performance should be similar to practice questions and problems in the skills targeted but differ in content. For example, we have developed a series of practice and quiz questions about the kinds of evidence found in Table 1 used in everyday situations but which differ in subject matter from practice to quiz. Likewise, other questions employ research evidence examples corresponding to Table 2. Questions ask students to identify kinds of evidence, evaluate the quality of the evidence, distinguish arguments from nonarguments, and find assumptions in the examples, with practice examples differing in content from assessment items.

6. Provide feedback and encourage students to reflect on it

Instructors should focus feedback on the degree of attainment of CT skill objectives in the lesson or assessment. The purpose of feedback is to help students learn how to correct faulty thinking so that in the future they monitor their thinking and avoid such problems. This should increase their metacognition or awareness and control of their thinking, an important goal of CT instruction (Halpern, 1998).

Students must use their feedback for it to improve their CT skills. In the CT exercises and critical reading assignments, students receive feedback in the form of corrected responses and written feedback on open-ended questions. They should be advised that paying attention to feedback on earlier work and assessments should improve their performance on later assessments.

7. Reflect on feedback and assessment results to improve CT instruction

Instructors should use the feedback they provide to students and the results of ongoing assessments to ‘close the loop,’ that is, use these outcomes to address deficiencies in performance and improve instruction. In actual practice, teaching and assessment strategies rarely work optimally the first time. Instructors must be willing to tinker with these to make needed improvements. Reflection on reliable and valid assessment results provides a scientific means to systematically improve instruction and assessment.

Instructors may find the direct infusion approach as summarized in the seven guidelines to be efficient, especially in helping students acquire basic CT skills, as research has shown. They may especially appreciate how it allows them to take a scientific approach to the improvement of instruction. Although the direct infusion approach seems to efficiently promote acquisition of CT skills, more research is needed to find out if students transfer their skills outside of the classroom or whether this approach needs adjustment to promote transfer.

Table 1. Strengths and Weaknesses of Nonscientific Sources and Kinds of Evidence

Table 2. Strengths and Weaknesses of Scientific Research Methods/Designs Used as Sources of Evidence

References

Abrami, P. C., Bernard, R. M., Borokhovski, E., Wade, A., Surkes, M. A., Tamim, R., et al. (2008). Instructional interventions affecting critical thinking skills and dispositions: A stage 1 meta-analysis. Review of Educational Research, 78(4), 1102–1134.

Angelo, T. A. (1995). Classroom assessment for critical thinking. Teaching of Psychology, 22(1), 6–7.

Bensley, D.A. (1998). Critical thinking in psychology: A unified skills approach. Pacific Grove, CA: Brooks/Cole.

Bensley, D.A. (2002). Science and pseudoscience: A critical thinking primer. In M. Shermer (Ed.), The Skeptic encyclopedia of pseudoscience (pp. 195–203). Santa Barbara, CA: ABC-CLIO.

Bensley, D.A. (2006). Why great thinkers sometimes fail to think critically. Skeptical Inquirer, 30, 47–52.

Bensley, D.A. (2008). Can you learn to think more like a psychologist? The Psychologist, 21, 128–129.

Bensley, D.A., Crowe, D., Bernhardt, P., Buckner, C., & Allman, A. (in press). Teaching and assessing critical thinking skills for argument analysis in psychology. Teaching of Psychology.

Bensley, D.A., & Haynes, C. (1995). The acquisition of general purpose strategic knowledge for argumentation. Teaching of Psychology, 22, 41–45.

Beyer, B.K. (1997). Improving student thinking: A comprehensive approach. Boston: Allyn & Bacon.

Chance, P. (1986). Thinking in the classroom: A review of programs. New York: Teachers College Press.

Ennis, R.H. (1987). A taxonomy of critical thinking dispositions and abilities. In J. B. Baron & R. J. Sternberg (Eds.), Teaching thinking skills: Theory and practice (pp. 9–26). New York: Freeman.

Halonen, J.S. (1995). Demystifying critical thinking. Teaching of Psychology, 22, 75–81.

Halonen, J.S., Appleby, D.C., Brewer, C.L., Buskist, W., Gillem, A. R., Halpern, D. F., et al. (APA Task Force on Undergraduate Major Competencies). (2002). Undergraduate psychology major learning goals and outcomes: A report. Washington, DC: American Psychological Association. Retrieved August 27, 2008, from http://www.apa.org/ed/pcue/reports.html

Halpern, D.F. (1998). Teaching critical thinking for transfer across domains: Dispositions, skills, structure training, and metacognitive monitoring. American Psychologist, 53, 449–455.

Halpern, D.F. (2003). Thought and knowledge: An introduction to critical thinking (3rd ed.). Mahwah, NJ: Erlbaum.

Lilienfeld, S.O. (2007). Psychological treatments that cause harm. Perspectives on Psychological Science, 2, 53–70.

Mayer, R.E. (2004). Should there be a three-strikes rule against pure discovery learning? The case for guided methods of instruction. American Psychologist, 59, 14–19.

Nieto, A.M., & Saiz, C. (2008). Evaluation of Halpern’s “structural component” for improving critical thinking. The Spanish Journal of Psychology, 11(1), 266–274.

Penningroth, S.L., Despain, L.H., & Gray, M.J. (2007). A course designed to improve psychological critical thinking. Teaching of Psychology, 34, 153–157.

Rotton, J., & Kelly, I. (1985). Much ado about the full moon: A meta-analysis of lunar-lunacy research. Psychological Bulletin, 97, 286–306.

Ruscio, J. (2006). Critical thinking in psychology: Separating sense from nonsense. Belmont, CA: Wadsworth.

Solon, T. (2007). Generic critical thinking infusion and course content learning in introductory psychology. Journal of Instructional Psychology, 34(2), 972–987.

Stanovich, K.E. (2007). How to think straight about psychology (8th ed.). Boston: Pearson.

Sternberg, R.J. (2007). Critical thinking in psychology: It really is critical. In R. J. Sternberg, H. L. Roediger, & D. F. Halpern (Eds.), Critical thinking in psychology (pp. 289–296). Cambridge, UK: Cambridge University Press.

Wade, C., & Tavris, C. (2005). Invitation to psychology (3rd ed.). Upper Saddle River, NJ: Prentice Hall.

Walberg, H.J. (2006). Improving educational productivity: A review of extant research. In R. F. Subotnik & H. J. Walberg (Eds.), The scientific basis of educational productivity (pp. 103–159). Greenwich, CT: Information Age.

Williams, R.L. (1999). Operational definitions and assessment of higher-order cognitive constructs. Educational Psychology Review, 11, 411–427.


About the Author

D. Alan Bensley is Professor of Psychology at Frostburg State University. He received his Master’s and PhD degrees in cognitive psychology from Rutgers University. His main teaching and research interests concern the improvement of critical thinking and other cognitive skills. He coordinates assessment for his department and is developing a battery of instruments to assess critical thinking in psychology. He can be reached by email at [email protected]

Association for Psychological Science, December 2010, Vol. 23, No. 10


Assessment of Critical Thinking

First Online: 10 December 2023

Dirk Jahn & Michael Cursio, Fortbildungszentrum Hochschullehre FBZHL, Friedrich-Alexander-Universität Erlangen-Nürnberg, Fürth, Germany

The term “to assess” has various meanings, such as to judge, evaluate, estimate, gauge, or determine. Assessment is therefore a diagnostic inventory of certain characteristics of a section of observable reality on the basis of defined criteria. In a pedagogical context, assessments aim to make learners’ knowledge, skills, or attitudes observable in certain application situations and to assess them on the basis of observation criteria.


To give an example: Holistic Critical Thinking Rubric from East Georgia College; Available at https://studylib.net/doc/7608742/east-georgia-college-holistic-critical-thinking-rubric-cr… (04/03/2020).



Source: Jahn, D., & Cursio, M. (2023). Assessment of Critical Thinking. In Critical Thinking. Springer VS, Wiesbaden. https://doi.org/10.1007/978-3-658-41543-3_8



A New Method for Assessing Critical Thinking in the Classroom

Ahrash N. Bissell is a research associate in biology at the Academic Resource Center, Duke University, Durham, NC 27708, where he studies teaching and learning innovation in science, as well as animal behavior and evolution.

Paula P. Lemons is an assistant professor of the practice of biology in Duke's Department of Biology, where she is responsible for the required introductory biology course and works on curriculum and graduate student development.


Ahrash N. Bissell, Paula P. Lemons, A New Method for Assessing Critical Thinking in the Classroom, BioScience , Volume 56, Issue 1, January 2006, Pages 66–72, https://doi.org/10.1641/0006-3568(2006)056[0066:ANMFAC]2.0.CO;2


To promote higher-order thinking in college students, we undertook an effort to learn how to assess critical-thinking skills in an introductory biology course. Using Bloom's taxonomy of educational objectives to define critical thinking, we developed a process by which (a) questions are prepared with both content and critical-thinking skills in mind, and (b) grading rubrics are prepared in advance that specify how to evaluate both the content and critical-thinking aspects of an answer. Using this methodology has clarified the course goals (for us and the students), improved student metacognition, and exposed student misconceptions about course content. We describe the rationale for our process, give detailed examples of the assessment method, and elaborate on the advantages of assessing students in this manner.

Several years ago, we launched a journey toward understanding what it means to teach critical thinking. At that time, we were both biology instructors working together on teaching an introductory biology course at Duke University, and we wanted to help students develop higher-order thinking skills—to do something more sophisticated than recite back to us facts they had memorized from lectures or the textbook (i.e., what many of them had been asked to do in previous biology courses).

The justification for our journey is well supported by the science education literature. Many college and university faculty believe that critical thinking should be a primary objective of a college education (Yuretich 2004), and numerous national commissions have called for critical-thinking development (e.g., AAAS 1989, NAS–NRC 2003). Yet when trying to implement critical thinking as an explicit goal in introductory biology, we found ourselves without a well-defined scheme for its assessment.

And we were not alone. Despite the interest among faculty in critical thinking as a learning goal, many faculty believe that critical thinking cannot be assessed or they have no method for doing so (Beyer 1984, Cromwell 1992, Aviles 1999). Consider a 1995 study from the Commission on Teacher Credentialing in California and the Center for Critical Thinking at Sonoma State University (Paul et al. 1997). These groups initiated a study of college and university faculty throughout California to assess current teaching practices and knowledge of critical thinking. They found that although 89 percent of the faculty surveyed claimed that critical thinking is a primary objective in their courses, only 19 percent could explain what critical thinking is, and only 9 percent of these faculty were teaching critical thinking in any apparent way (Paul et al. 1997). This observation is supported by evidence from other sources more specific to the sciences, which suggests that many introductory science, technology, engineering, and math (STEM) courses do not encourage the development of critical-thinking abilities (Fox and Hackerman 2003, Handelsman et al. 2004).

Why is it that so many faculty want their students to think critically but are hard-pressed to provide evidence that they understand critical thinking or that their students have learned how to do it?

We identified two major impediments to the assimilation of pedagogical techniques that enhance critical-thinking abilities. First, there is the problem of defining “critical thinking.” Different definitions of the term abound (Facione 1990, Aretz et al. 1997, Fisher and Scriven 1997). Not surprisingly, many college instructors and researchers report that this variability greatly impedes progress on all fronts (Beyer 1984, Resnick 1987). However, there is also widespread agreement that most of the definitions share some basic features, and that they all probably address some component of critical thinking (Potts 1994). Thus, we decided that generating a consensus definition is less important than simply choosing a definition that meets our needs and consistently applying it. We chose Bloom's taxonomy of educational objectives (Bloom 1956), which is a well-accepted explanation for different types of learning and is widely applied in the development of learning objectives for teaching and assessment (e.g., Aviles 1999).

Bloom's taxonomy delineates six categories of learning: basic knowledge, secondary comprehension, application, analysis, synthesis, and evaluation (box 1). The first two categories, basic knowledge and secondary comprehension, do not require critical-thinking skills, but the last four—application, analysis, synthesis, and evaluation—all require the higher-order thinking that characterizes critical thought. The definitions for these categories provide a smooth transition from educational theory to practice by suggesting specific assessment designs that researchers and instructors can use to evaluate student skills in any given category. Other researchers and even entire departments have investigated how to apply Bloom's taxonomy to refine questions and drive teaching strategies (e.g., Aviles 1999, Anderson and Krathwohl 2001). Nonetheless, the assessments developed as part of these efforts cannot be used to measure critical thinking independent of content.

The second major impediment to developing critical thinking in the classroom is the difficulty that faculty face in measuring critical-thinking ability per se. It is relatively straightforward to assess students' knowledge of content; however, many faculty lack the time and resources to design assessments that accurately measure critical-thinking ability (Facione 1990, Paul et al. 1997, Aviles 1999). A large body of literature already exists showing that critical thinking can be assessed (e.g., Cromwell 1992, Fisher and Scriven 1997). The critical-thinking assessments that have been most rigorously tested are subject-independent assessments. These assessments presumably have the advantage of allowing measurements of critical-thinking ability regardless of the context, thus making it possible to compare different groups of people (Aretz et al. 1997, Facione et al. 2000). Previous studies have demonstrated a positive correlation between the outcomes of these subject-independent tests and students' performance in a course or on a task (e.g., Onwuegbuzie 2001). Such studies serve to illustrate that critical thinking per se is worth assessing, or at least that it has some relationship to students' understanding of the material and to their performance on exams. Still, generalized assessments of critical-thinking ability are almost never used in a typical classroom setting (Haas and Keeley 1998). There are several problems with such general tests, including the following:

Faculty doubt that the measurements indicate anything useful about discipline-specific knowledge.

Administering these tests takes time away from the content of the course and can be costly; thus, they are viewed as “wasted” time.

Most faculty lack the time to learn the underlying structure and theory behind the tests, and so it is unclear to them why such a test would be worthwhile.

Recognizing the problems with standardized, discipline-independent assessments of critical thinking, we developed an assessment methodology to enable the design of questions that clearly measure both the content we want students to know and the cognitive skills we want them to obtain. Ideally, this methodology should allow for discipline-specific (i.e., content-based) questions in which the critical-thinking component can be explicitly dissected and scored. Furthermore, we built on the work of others who have used Bloom's taxonomy to drive assessment decisions by using this taxonomy to explicitly define the skills that are required for each question. Finally, we crafted a system for developing scoring rubrics that allows for independent assessment of both the content and the skills required for each question. It is this methodology we have begun applying to introductory biology.

Designing, testing, and scoring discipline-specific assessments of critical thinking

Our methodology consists of several steps. First, we write questions that require both biological knowledge and critical-thinking skills. Second, we document the particular content and critical-thinking skills required (e.g., application, analysis, synthesis) and then devise a scoring rubric for the question. Our scheme is a synthesis of the work of others who have devised rubrics that independently assess either content (Porter 2002, Ebert-May et al. 2003, Middendorf and Pace 2004) or critical-thinking skills (Facione et al. 2000). Third, we subject these questions to a test of validity by submitting them for review to colleagues who are experts in biology and biological education. Fourth, we administer the assessments to students and score them on the basis of the rubric that we established in advance. On average, the first two steps of the process take about an hour for a new question; substantially less time is required when revising existing questions. For the third step, the speed at which questions can be reviewed and validated depends on the number and quality of professional colleagues, but this step is not crucial in terms of trying out a new question in a course.
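To make the record-keeping in the first two steps concrete, a question can be stored together with its content expectations, the Bloom's-taxonomy skills it targets, and its rubric elements before any grading happens. The Python sketch below is only illustrative: the class names, fields, and point values are hypothetical stand-ins rather than anything taken from the published materials, but it shows the kind of information that is fixed in advance.

from dataclasses import dataclass, field

@dataclass
class RubricElement:
    label: str          # e.g., "IA", "IB", "II", "III"
    description: str    # what a complete answer must contain
    points: int         # weight chosen by the instructor

@dataclass
class AssessmentQuestion:
    prompt: str
    content_topics: list        # discipline content the question requires
    bloom_skills: list          # e.g., ["application", "analysis"]
    rubric: list = field(default_factory=list)

    def max_score(self) -> int:
        return sum(el.points for el in self.rubric)

# Hypothetical instance modeled loosely on example 1 (mRNA transport).
question1 = AssessmentQuestion(
    prompt="How does mRNA move from the nucleus to the cytoplasm? Justify your answer.",
    content_topics=["mRNA structure", "lipid bilayer structure"],
    bloom_skills=["application", "analysis"],
    rubric=[
        RubricElement("IA", "mRNA carries charged phosphate groups", 2),
        RubricElement("IB", "the lipid bilayer has a hydrophobic interior", 2),
        RubricElement("II", "a charged molecule cannot diffuse through the bilayer", 2),
        RubricElement("III", "therefore an alternative mechanism (a protein channel) is needed", 2),
    ],
)

print(question1.max_score())  # 8

Because each rubric element carries its own point value, the weighting between content knowledge and critical-thinking skill can be adjusted question by question without changing anything else in the workflow.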

Figures 1–3 illustrate three examples of our methodology. These examples are just a few of the many questions that we have already developed for use in biology. The methodology appears to be robust for different types of questions and varying degrees of open-endedness in the answers. The questions themselves (as the students would see them on an exam) are shown in figures 1a, 2a, and 3a. These questions are akin to other “advanced” questions in biology in that the students must bring several concepts together to give the correct answers. A substantial fraction of the points is based on the rationale presented in the answers, and the students are alerted to this fact ahead of time. The first step in evaluating the effectiveness of these problems is to clearly define the expectations for each question, as described in the captions for figures 1–3 (note that components of this framework borrow ideas from the work of Ebert-May and colleagues [2003]). These expectations are valid when the student gets the correct and complete answer, or when the student answers the question by drawing on the expected content. However, it is possible to apply the correct critical-thinking skills to these problems while getting some aspect of the content wrong or employing an alternative body of content. This insight is a key element of our assessment technique. Thus, we designed a grading rubric that explicitly clarifies the intersection of the content and the skills (detailed below), illustrated in figures 1b, 2b, and 3b.

Example 1: Chemical transport across cell membranes

A student who forgets (or never learned) about the structure of messenger RNA (mRNA) or the action of lipid bilayers can get one or both of these concepts wrong in the answer, losing two points for each incorrect part (figure 1b). However, as the rubric shows, the student can get some points if these incorrect answers are correctly rationalized to lead to an appropriate conclusion. For example, a student might indicate that mRNA is a neutral molecule (zero points) but say that lipid bilayers act as barriers to charged molecules (plus two points). In this case, the correct rationalization would be that mRNA can diffuse through the lipid bilayer, which means choice 1 is correct (figure 1b). The relative point values can be assigned by the professor to favor either the critical-thinking component or the knowledge needed. Thus, students can only get all of the points by knowing and understanding the content and then applying and analyzing it correctly. However, students can still get some points, even if they don't know all of the content, as long as the justification for the choice follows logically from the original errors in understanding.

Note that it is not possible to get any points for the choice if a step is missing in the rationalization. Thus, a student who correctly indicates the answers to IA and IB, and then skips the analysis, does not get any credit for the choice, even if it is correct. The reason this grading scheme works is that the choice makes no sense out of context, and the context must be provided by the student.
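The partial-credit logic just described can be read as a small decision procedure: content points are earned for factually correct premises, a choice with no connecting analysis earns nothing, and the choice itself is credited only when it follows from whatever premises the student actually stated. The sketch below is a simplified, hypothetical rendering of that logic for example 1; the field names and point values are illustrative, not the authors' published rubric.

# Hypothetical sketch of the partial-credit scheme: content elements earn
# their own points, and the final choice earns points only if the analysis
# step is present and the choice follows from the premises the student
# actually stated (even if those premises are factually wrong).

def score_answer(answer, points_per_element=2, points_for_choice=2):
    """answer is a dict like:
    {
        "mrna_charged": True or False,          # student's stated premise
        "bilayer_blocks_charge": True or False, # student's stated premise
        "analysis_given": True or False,        # did the student connect premises to a choice?
        "choice": 1 or 2,                       # 1 = diffusion, 2 = protein channel
    }
    """
    score = 0
    # Content points: awarded only for factually correct premises.
    if answer["mrna_charged"]:
        score += points_per_element
    if answer["bilayer_blocks_charge"]:
        score += points_per_element

    # No analysis, no credit for the choice, even if the choice happens to be right.
    if not answer["analysis_given"]:
        return score

    # Decision-point carryover: which choice follows from the student's own premises?
    needs_channel = answer["mrna_charged"] and answer["bilayer_blocks_charge"]
    consistent_choice = 2 if needs_channel else 1
    if answer["choice"] == consistent_choice:
        score += points_for_choice
    return score

# A student who wrongly calls mRNA neutral but correctly says bilayers block
# charged molecules, and concludes that diffusion works (choice 1), is
# internally consistent: 0 + 2 content points, plus 2 for the rationalized choice.
print(score_answer({"mrna_charged": False, "bilayer_blocks_charge": True,
                    "analysis_given": True, "choice": 1}))  # 4

The same carry-over idea extends to examples 2 and 3 below, where each decision point feeds the next and credit stops as soon as the chain of reasoning breaks.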

Example 2: Genetic drift in toads

One area of confusion that is tested by this question is the distinction between selection and drift, and the fact that environmental conditions affect these two evolutionary processes differently. Assuming the student defines genetic drift properly, then a series of logical deductions about the action of genetic drift as it pertains to the map should lead to the conclusion that the island (I) population is the correct choice. However, as figure 2b illustrates, there are many opportunities for erroneous understanding that can nonetheless be rationalized appropriately. Students may incorrectly define genetic drift (confusing it with natural selection), but then also incorrectly state that smaller populations are more likely to evolve via selection, which leads to the conclusion that species I (the island scrub toad) is most likely to change. Each part of the answer can be viewed as a decision point, and whichever decision the student makes is then carried over to the next step. In this manner, instructors can reward students who are internally consistent, even if they err somewhere early on. They also avoid rewarding students who are not consistent anywhere but happen to choose the “best” answer, which of course is only true if they get all of the components correct.

Example 3: DNA, cell division, and cancer

This question requires that students apply their knowledge of the bonding between members of DNA base pairs to understand how an anti-cancer drug works. It is less open-ended than the other two examples because there is really only one correct conclusion, which is essentially given within the question (i.e., nitrogen mustard is an anticancer drug). As a result, most of the errors in answering the question arise from skipping one or more of the steps necessary for describing the connection between base-pair bonding and an anticancer drug. As figure 3b indicates, the awarding of points stops when the logical flow of ideas stops; no points are awarded for the final conclusion unless it is supported by evidence presented in the answer.

As with any short-answer examination, grading is vastly easier when the students' answers are clear and employ good reasoning. But this is not typical, and the difficulty for most graders lies in the awarding of partial credit, where much time can be wasted trying to decipher whether a student's answer suggests a level of knowledge or skill that deserves some points. This methodology can greatly reduce these types of difficulties, since the range of acceptable answers should be clearly predefined. We routinely find that the questions for which we have developed a rubric require less time and effort for graders—and produce more valid scoring—than questions for which the rubric is developed during the grading process. The rubric is also subject to further refinement as a result of repeated use, in which case the time and effort needed for grading decreases even more. Of course, an instructor can save time in grading by carefully designing any type of assessment, but we believe that the level of rigor has been particularly low in assessments intended to measure critical-thinking skills, and that more explicit assessments of critical-thinking skills are needed in STEM courses.

We have successfully implemented these types of assessments in a large (approximately 150 students) introductory biology course at Duke University. We are currently gathering data on student performance that allow us to assess mastery of content at several different skill levels at the same time that we test mastery of skills using different types of content. Although we have not yet completed these analyses, we have already found that the use of this assessment methodology has positively affected the course in a number of ways.

For example, thinking in advance about what we want questions to accomplish in terms of both content and critical thinking has enabled us to be explicit with students about the skills they need to develop in order to succeed in the course. We have reviewed questions and grading rubrics in our lectures and made examples of them available to students outside of class. As a result of this exposure, students were more aware of the quality of responses we expected for questions and could easily cross-reference their own responses with our explicit guidelines. These efforts helped students reflect on and improve their thinking (and writing) abilities—a concept referred to as metacognition. Conversations with students suggested that we were in fact teaching metacognition, which is known to have positive effects on learning (Fink 2003, Tuckman 2003, Juwah et al. 2004). Many students have communicated that they never understood how to "think critically" in other courses, even if they were asked to, because they were never shown what it means.

In addition to these benefits, we have found that student answers to these types of questions have provided exceptional formative feedback for us to refine and improve our assessments and teaching practices. In some cases, the feedback comes in the form of apparent student misconceptions of the course material. For instance, in example 1, we found that 36 percent of the students who opted for choice 2 either lacked a rationale or offered a rationale that was incorrect. These results suggested that many students were selecting an answer based on intuition or on a foggy memory of the material discussed in class, as opposed to sound analysis of the scenarios presented and application of the facts about membranes and RNA to those scenarios. To investigate this phenomenon further, we used a different form of the question later in the same semester, in which the molecule of interest was uncharged, which means that it can pass through the membrane without the use of a channel. Unfortunately, 43 percent of the class still proposed that this molecule would not pass through the membrane. Because this question was more complex, we expected that many students would have errors in reasoning. However, this large percentage was disappointing and may illustrate that many students failed to fully understand membrane structure, perhaps as a result of preconceived (and hard to change) notions about this material, or perhaps because of the manner in which they were exposed to it. Before we discovered this finding, the students learned about membranes through the standard combination of lectures and assigned readings in the textbook. Other components of the course are taught using alternative methods, such as peer interaction in lecture, role-playing, one-minute essays, and other techniques. We are changing the way we teach about membrane structure and function, and we will be using this same assessment methodology to measure the impact of these instructional changes on student understanding.

In other cases, the formative feedback arises when students demonstrate either limited understanding or unexpected insights about the material. In example 2, although we expected students to be most confused about the distinction between drift and selection, we found instead that the most common mistake was the failure to adequately describe genetic drift. Most students could only talk about genetic drift in terms of specific examples that create the conditions for genetic drift (e.g., founder effect). Many students even used the terms “genetic drift” and “founder effect” interchangeably. This type of revelation, which is a common result of the assessment methodology we are describing, allows for responsive changes in the grading rubric as well as in teaching approaches. For example, we might ask another class a version of the question in which the “I” population is moved onto the mainland, thus eliminating the easy rationale for assuming the founder effect.

In example 3, although the most predictable answer involves knowledge of DNA replication, an alternative approach begins with a different focus. The impact of nitrogen mustard is the same (it prevents unzipping of DNA), but the impact is on DNA transcription, not DNA replication. Since the cell cannot function without proper transcription, it is reasonable to assume that cell division would also cease, effectively stopping tumor growth. Here, the instructor can decide whether the question is broad enough to allow for this level of insight, especially if the student employs appropriate logic throughout. The rubric can be easily amended as desired for immediate or subsequent use. Alternatively, the question can be rewritten to further constrain the possible answers or to encourage an even greater diversity of responses. Since one characteristic of critical thinking is the awareness that a given question may have more than one correct answer, this methodology allows alternative answers to be considered and possibly built into the scoring rubric. Overall, this type of feedback has proved valuable in helping us identify specific areas of the course that need further refining to improve student learning and understanding.

Conclusions

We can imagine that some biology instructors might still be reluctant to use this methodology, despite its advantages for student learning, because of time constraints and other practical concerns. But our assessment methodology offers three particular advantages that can help alleviate these worries. First, these types of assessments demand content knowledge, so there are no “wasted” questions. Second, the assessments are flexible, in that they can be easily amended to accommodate unforeseen answers, and can be weighted to favor either the critical-thinking component or the content component. Third, the assessments can be more rapidly and reliably scored than other “open-ended” questions because of the highly refined format of the scoring rubrics.

We are currently studying individual gains in learning for both specific content (e.g., membrane structure and function, forces of evolution, and DNA replication) and critical-thinking skills (e.g., application or analysis) from one time point to another in a semester. Most instructors recognize that their discipline contains particular concepts that are known to be difficult for most students, and we are hoping that our investigations will clarify how to help students learn essential biological concepts more deeply.

We are also examining the transferability of skills developed in one context (e.g., introductory biology) to a different context (e.g., introductory physics). These types of investigations have created collaborative opportunities with instructors in other STEM disciplines at Duke University who are interested in improving student learning and curriculum goals. The critical-thinking assessments described here offer an entryway into understanding relationships between teaching practices, student learning, and assessment designs. We are currently parlaying this methodology into an interdisciplinary effort to enhance critical-thinking instruction across the STEM disciplines at Duke University, guided in part by the collaborative working-group model described by Middendorf and Pace (2004) and Cox (2004). Although we are also working with other interested faculty within biology, we have found that these types of assessments are most needed and desired in the large introductory classes across disciplines, and conversations across disciplinary lines have helped faculty to see the value of assessing critical-thinking skills as a distinct goal from measuring content acquisition.

We would like to acknowledge the members of the Duke Consortium for the Scholarship of Teaching and Learning for their advice and feedback. In addition, we thank Jeremy Hyman for his contributions to the design of the questions used as examples. This work was supported in part by a grant to P. P. L. from Duke University's Arts and Sciences Committee on Faculty Research.

References cited

[AAAS] American Association for the Advancement of Science. 1989. Science for All Americans. New York: Oxford University Press.

Anderson LW, Krathwohl D, eds. 2001. A Taxonomy for Learning, Teaching, and Assessing: A Revision of Bloom's Taxonomy of Educational Objectives. New York: Longman.

Aretz AJ, Bolen MT, Devereux KE. 1997. Critical thinking assessment of college students. Journal of College Reading and Learning 28: 12–23.

Aviles CB. 1999. Understanding and Testing for “Critical Thinking” with Bloom's Taxonomy of Educational Objectives. Washington (DC): ERIC Clearinghouse. Report no. SO032219.

Beyer BK. 1984. Improving thinking skills—defining the problem. Phi Delta Kappan 65: 486–490.

Bloom BS, ed. 1956. Taxonomy of Educational Objectives: The Classification of Educational Goals. New York: McKay.

Cox M. 2004. Introduction to faculty learning communities. New Directions for Teaching and Learning 97: 5–23.

Cromwell LS. 1992. Assessing critical thinking. New Directions for Community Colleges 77: 37–50.

Ebert-May D, Batzli J, Lim H. 2003. Disciplinary research strategies for assessment of learning. BioScience 53: 1221–1228.

Facione PA. 1990. Critical thinking: A statement of expert consensus for purposes of educational assessment and instruction. Pages 1–19 in Facione PA, ed. The Delphi Report. Millbrae (CA): California Academic Press.

Facione PA, Facione NC, Giancarlo CA. 2000. The disposition toward critical thinking: Its character, measurement, and relationship to critical thinking skill. Informal Logic 20: 61–84.

Fink LD. 2003. Creating Significant Learning Experiences. San Francisco: Jossey-Bass.

Fisher A, Scriven M. 1997. Critical Thinking: Its Definition and Assessment. Point Reyes (CA): Edgepress.

Fox MA, Hackerman N, eds. 2003. Evaluating and Improving Undergraduate Teaching in Science, Technology, Engineering, and Mathematics. Washington (DC): National Academies Press.

Haas PF, Keeley SM. 1998. Coping with faculty resistance to teaching critical thinking. College Teaching 46: 63–67.

Handelsman J. 2004. Scientific teaching. Science 304: 521–522.

Juwah C, Macfarlane-Dick D, Matthew B, Nicol D, Ross D, Smith B. 2004. Enhancing Student Learning through Effective Formative Feedback. York (United Kingdom): Higher Education Academy Generic Centre.

Middendorf J, Pace D. 2004. Decoding the disciplines: A model for helping students learn disciplinary ways of thinking. New Directions for Teaching and Learning 98: 1–12.

[NAS–NRC] National Academy of Sciences–National Research Council. 2003. BIO 2010: Transforming Undergraduate Education for Future Research Biologists. Washington (DC): National Academies Press.

Onwuegbuzie AJ. 2001. Critical thinking skills: A comparison of doctoral- and master's-level students. College Student Journal 35: 477–481.

Paul RW, Elder L, Bartell T. 1997. California Teacher Preparation for Instruction in Critical Thinking: Research Findings and Policy Recommendations. Santa Rosa (CA): Foundation for Critical Thinking.

Porter AC. 2002. Measuring the content of instruction: Uses in research and practice. Educational Researcher 31: 3–14.

Potts B. 1994. Strategies for Teaching Critical Thinking in Education. Washington (DC): ERIC Clearinghouse. Report no. ED385606.

Resnick LB. 1987. Education and Learning to Think. Washington (DC): National Academy Press.

Tuckman BW. 2003. The effect of learning and motivation strategies training on college students' achievement. Journal of College Student Development 44: 430–437.

Yuretich RF. 2004. Encouraging critical thinking: Measuring skills in large introductory science classes. Journal of College Science Teaching 33: 40–46.

Box 1. Bloom's taxonomy of educational objectives

Bloom's taxonomy subdivides the academic skills that students might need into six different categories, listed below. The first three categories are considered to be hierarchical: basic knowledge requires no critical-thinking skills, secondary comprehension expands on basic knowledge but also requires no critical thinking, and application requires higher-order thinking about the knowledge that a student constructs. The last three categories are also considered higher-order skills that require critical thinking, but they are not necessarily hierarchical. Note that correctly using the higher-order skills requires both knowledge and comprehension of the content, so all levels of thinking should be encouraged.

Basic knowledge: memorizing facts, figures, and basic processes.

Secondary comprehension: understanding and illustrating the facts.

Application: generalizing the facts to other contexts and situations.

Analysis: understanding why the facts are the way they are; breaking problems down.

Synthesis: making connections between different elements on one's own.

Evaluation: critically using one's knowledge to ascertain the quality of information.

(a) Example question 1: mRNA diffusion through a nuclear membrane. (b) Grading rubric for example question 1. To answer this question, students need knowledge of the chemical structure of mRNA and lipid bilayers, and two types of critical-thinking skill, application (addressing mRNA movement in the cell using knowledge of mRNA structure and lipid bilayers) and analysis (examining both scenarios and deciding which is more likely to be the way that mRNA moves from the nucleus to the cytoplasm). Complete answers will include each of the following elements (IA, IB, II, and III). (IA) mRNA is a molecule that has charged phosphate groups. (IB) The lipid bilayer has a hydrophobic interior. (Note: mentioning the hydrophilic head groups is not essential to this answer.) (II) The lipid bilayer cannot accommodate a charged molecule via diffusion, thereby eliminating choice 1. (III) Since diffusion won't work, an alternative mechanism is needed. Therefore, choice 2 is more likely (mRNAs move through a protein channel).

(a) Example question 2: Genetic drift in toads. (b) Grading rubric for example question 2. To answer this question, students need knowledge of (1) the definition of genetic drift, (2) the relationship between population size and the likelihood of random genetic drift, and (3) the possible relationship between range size and population size (reading the map). They also need two types of critical-thinking skill, application and analysis (students must correctly define genetic drift and state its relationship to population size, and then apply that knowledge to an analysis of likely population sizes based on the map). Complete answers will include each of the following elements (I, IIA, IIB, and III): (I) Genetic drift refers to the occurrence of random genetic changes in a population. (IIA) Random genetic changes are most likely to occur in smaller populations. (IIB) Species I has the smallest overall range, suggesting that it also has the smallest effective population size. (III) Therefore, species I is most likely to be affected by genetic drift. (Note: There are some plausible reasons why the species with larger ranges might actually have smaller effective population sizes, but the burden of fully rationalizing this conclusion falls on the student)

(a) Example question 3: DNA, cell division, and cancer. (b) Grading rubric for example question 3. To answer this question, students need to know that (1) double-stranded DNA is held together by weak hydrogen (H) bonding between members of a base pair; (2) in DNA replication, base pairs must separate to allow for copying; and (3) DNA replication prepares a cell for division by producing two complete copies of the genetic material. They also need the critical-thinking skill of application (applying their knowledge to determine why nitrogen mustard interferes with cell division and tumor growth). Complete answers will include each of the following (I, IIa, IIb, III, and IV): (I) An anticancer drug must somehow stop cell division, thus halting tumor growth. (IIa) Normally, bonding between DNA base pairs is weak (H bonding). (IIb) When DNA replication occurs, these weak bonds are broken (the DNA is “unzipped”) to allow for copying, which is necessary for cell division. (III) Strong, covalent bonds between base pairs (as formed by nitrogen mustard) cannot be broken, so no DNA copying occurs and cell division ceases. (IV) Therefore, nitrogen mustard is an anticancer drug.



How to Assess Critical Thinking

October 11, 2008, by The Critical Thinking Co. Staff

Developing appropriate testing and evaluation of students is an important part of building critical thinking practice into your teaching. If students know that you expect them to think critically on tests, and the necessary guidelines and preparation are given beforehand, they are more likely to take a critical thinking approach to learning all course material. Design test items that require higher-order thinking skills such as analysis, synthesis, and evaluation, rather than simple recall of facts; ask students to explain and justify all claims made; instruct them to make inferences or draw conclusions that go beyond the given data. Essays and problems are the most obvious forms of item to use for testing these skills, but well-constructed multiple-choice items can also work well. Consider carefully how you will evaluate and grade tests that require critical thinking, and develop clear criteria that can be shared with the students.

In order to make informed decisions about student critical thinking and learning, you need to assess student performance and behavior in class as well as on tests and assignments. Paying careful attention to signs of inattention or frustration, and asking students to explain them, can provide much valuable information about what may need to change in your teaching approach; similarly, signs of strong engagement or interest can tell you a great deal about what you are doing well to get students to think. Brief classroom assessment instruments, such as asking students to write down the clearest and most confusing points for them in a class session, can be very helpful for collecting a lot of information quickly about student thinking and understanding.

Critical Thinking Assessment: 4 Ways to Test Applicants

Post Author - Juste Semetaite

In the current age of information overload, critical thinking (CT) is a vital skill for sifting fact from fiction. Fake news, scams, and disinformation can have a negative impact on individuals as well as businesses. Ultimately, those with stronger CT skills can help lead their team with logical thinking, evidence-based motivation, and smarter decisions.

Today, most roles require critical thinking skills. And understanding how to test and evaluate critical thinking skills can not only help differentiate candidates but may even predict job performance.

This article will cover:

  • What is critical thinking?
  • Critical thinking vs problem-solving
  • 5 critical thinking sub-skills
  • The importance of assessing critical thinking skills
  • 4 ways to leverage critical thinking assessments

What is critical thinking?

Critical thinking is the process of analyzing and evaluating information in a logical way. Though valued as a skill as far back as the era of the early philosophers, it is just as vital today. For candidates to succeed in the digital economy, they need modern thinking skills that help them think critically.

Whether we realize it or not, we process huge amounts of data and information on a daily basis: everything from social media and online news to data from apps like Strava, on top of all the key metrics related to our professional roles.

Without a shadow of a doubt, correctly interpreting information — and recognizing disinformation — is an essential skill in today’s workplace and everyday life. And that’s also why teaching critical thinking skills in education is so important to prepare the next generation for the challenges they will face in the modern workplace.

Critical thinking isn't about being constantly negative or critical of everything. It's about objectivity and having an open, inquisitive mind. To think critically is to analyze issues based on hard evidence (as opposed to personal opinions, biases, etc.) in order to build a thorough understanding of what's really going on. And from this place of thorough understanding, you can make better decisions and solve problems more effectively. (Bernard Marr)

Today, candidates with CT skills think and reason independently, question data, and use their findings to contribute actively to their team rather than passively taking in or accepting information as fact.

Why are critical thinking skills important?

In the workplace, those with strong CT skills no longer rely on their gut or instinct for their decisions. They’re able to problem-solve more effectively by analyzing situations systematically.

With these skills, they think objectively about information and other points of view and look for evidence to support their findings rather than simply accepting opinions or conclusions as facts.

When employees can turn critical thinking into a habit, it ultimately reduces personal bias and helps them be more open to their teammates’ suggestions — improving how teams collaborate and collectively solve problems.

Critical thinking vs. Problem solving – what’s the difference?

Let’s explore the difference between these two similar concepts in more detail.

Critical thinking is about processing and analyzing information to reach an objective, evidence-based conclusion. Let’s take a look at an example of critical thinking in action:

  • A member of the team suggests using a new app they've heard about to automate and speed up candidate screening. Some like the idea, but others in the team share reasons why they don't support it. So you visit the software website and look at the key features and benefits yourself; then you might look for reviews about it and ask your HR counterparts what they think of it. The reviews look promising, and a few of your fellow practitioners say it's worked well for them. Next, you look into the costs compared to the solution your team is already using and calculate that the return on investment (ROI) is good. You arrive at the conclusion that it'd be worth testing the platform with the free trial version and recommend this to your team.

On the other hand, problem solving can involve many of the same skills as critical thinking, such as observing and evaluating. Still, it focuses on identifying business obstacles and coming up with solutions. So, let's return to the example of the candidate screening software and see how it might work differently in the context of problem-solving:

  • For weeks, the talent acquisition team has complained about how long it takes to screen candidates manually. One of the team members decides to look for a solution to their problem. They assess the team’s current processes and resources and how to best solve the issues. In their research, they discover the new candidate screening platform and test out its functionality for a few days. They feel it would benefit the team and suggest it at the next meeting. Great problem solving, HR leader!


What are the 5 sub-skills that make up critical thinking?

[Figure: the sub-skills of the critical thinking competency]

Now that we've established what CT is, let's break it down into the five core sub-skills that make up a critical thinking mindset.

  • Observation: Being observant of your environment is the first step to thinking critically. Observant employees can even identify a potential problem before it becomes one.
  • Analysis: Once you've observed the issue or problem, you can begin to analyze its parts. It's about asking questions, researching, and evaluating the findings objectively. This is an essential skill, especially for someone in a management role.
  • Inference: Drawing a conclusion from limited information. To do this effectively may require in-depth knowledge of a field. Candidates with this skill can contribute a lot of value to a startup, for instance, where initially there may be little data available for information processing.
  • Communication: Expressing ideas and your reasoning clearly and persuasively, as well as actively listening to colleagues' suggestions or viewpoints. When all members of a team or department can communicate and consider different perspectives, it helps tasks move along swiftly and smoothly.
  • Problem solving: Once you begin implementing a chosen solution, you may still encounter teething problems. At that point, problem-solving skills will help you decide on the best option and how to overcome the obstacles that stand between you and your goal.

What is a critical thinking assessment test?

Though there are a few different ways to assess critical thinking, such as the Collegiate Learning Assessment, one of the most well-known tests is the Watson Glaser™ Critical Thinking Appraisal.

Critical thinking tests, or critical reasoning tests, are psychometric tests used in recruitment at all levels, graduate, professional and managerial, but predominantly in the legal sector. However, it is not uncommon to find companies in other sectors using critical thinking tests as part of their selection process. This is an intense test, focusing primarily on your analytical, or critical thinking, skills.

These tests are usually timed and typically include multiple-choice items, short answers, or short scenario-based questions to assess students or prospective candidates. They test candidates' ability to interpret data without bias, find logical links between information, and separate facts from false data.

[Figure: critical thinking example questions from the Watson-Glaser test rubric]

But how do these tests measure critical thinking?

In addition to educational and psychological testing, many employers today use critical thinking tests to assess a person's ability to question information: to ask What, Why, and How of the data. A standard critical thinking test breaks down this aptitude by examining the following five components (a minimal scoring sketch follows the list):

  • assumption – analyzing a scenario to determine whether any assumptions have been made
  • deduction – choosing which deductions follow logically
  • evaluating evidence – weighing the evidence for and against a claim
  • inference – drawing conclusions from observed facts
  • interpretation – judging the accuracy of a stated conclusion based on a given scenario
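As a rough illustration of how results on such a test might be reported, the Python sketch below (a hypothetical structure with made-up data, not any test publisher's actual scoring) tags each item with the component it probes and rolls responses up into a per-component profile plus an overall percentage.

from collections import defaultdict

COMPONENTS = ["assumption", "deduction", "evaluating evidence", "inference", "interpretation"]

def score_test(responses):
    """responses: list of (component, is_correct) tuples, one per test item."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for component, is_correct in responses:
        total[component] += 1
        correct[component] += int(is_correct)
    # Per-component profile as (correct, attempted), plus an overall percentage.
    profile = {c: (correct[c], total[c]) for c in COMPONENTS if total[c]}
    overall = sum(correct.values()) / sum(total.values()) * 100
    return profile, round(overall, 1)

profile, overall = score_test([
    ("assumption", True), ("assumption", False),
    ("deduction", True), ("inference", True), ("interpretation", False),
])
print(profile)   # {'assumption': (1, 2), 'deduction': (1, 1), 'inference': (1, 1), 'interpretation': (0, 1)}
print(overall)   # 60.0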

Why is it important to assess critical thinking skills during the recruitment process?

Critical thinking may be considered a soft skill, but it has become a prerequisite in certain industries, like software, and for many roles. Marketing managers, project managers, accountants, and healthcare professionals, for example, all require a degree of CT skill to perform their roles.

The kinds of businesses that require critical thinking include technology, engineering, healthcare, the legal sector, scientific research, and education. These industries are typically very technical and rely on data. People working in these fields research and use data to draw logical conclusions that help them work smarter and more efficiently.

In the hiring process, test takers with good critical thinking skills stand out. Why? Because they are able to demonstrate their ability to collaborate, problem-solve, and manage pressure in a rational, logical manner. As a result, they're more likely to make the right business decisions that boost efficiency and, ultimately, a business's bottom line.

[Figure: critical thinking assessment template for evaluating candidates]

Examples of jobs that rely on critical thinking skills

Critical thinking is not rocket science, but it is an important skill when making decisions — especially when the correct answer is not obvious. Here are a few examples of job roles that rely on critical thinking dispositions:

  • Computer programmers or developers: may use critical thinking and other advanced skills in a variety of ways, from debugging code to analyzing the problem, finding potential causes, and coming up with suitable solutions. They also use CT when there is no clear roadmap to rely on, such as when building a new app or feature.
  • Criminologists: must have critical thinking abilities to observe criminal behavior objectively and to analyze the problem in such a way that they can be confident in the conclusions they present to the authorities.
  • Medical professionals: need to diagnose their patients' conditions through observation, communication, analysis, and complex problem solving to decide on the best treatment.
  • Air traffic controllers: need a clear, calm head to deal with their high-stress job. They observe traffic, communicate with pilots, and constantly problem-solve to avoid airplane collisions.
  • Legal professionals: use logic and reasoning to analyze various cases – even before deciding whether they'll take on a case – and then use their excellent communication skills to sway people over to their reasoning in a trial setting.
  • Project managers: have to deal with a lot of moving parts at the same time. To keep projects on time and on budget, they continually observe and analyze the progress of project components, communicate continually with the team and external stakeholders, and work to solve any problems that crop up.

What are the risks of not testing for critical thinking?

By not evaluating critical thinking beforehand, you may end up hiring candidates with poor CT skills. Especially when hiring business leaders and filling key positions, this has the potential to wreak havoc on a business. Their inaccurate assumptions are more likely to lead to bad decisions, which could cost the company money.

Weak critical thinking can cause a number of issues for your organization, which justifies the expense or added effort of asking candidates to complete critical thinking tests during the hiring process. For example, poor CT skills may result in:

  • making mistakes
  • not being able to take action when needed
  • working off false assumptions
  • unnecessary strain on work relationships

4 ways to assess critical thinking skills in candidates

Now that we’ve seen how important it is for most candidates today to have strong critical thinking skills, let’s take a look at some of the assessment instruments the talent acquisition team can use.

#1 – A homework assignment

A homework assignment is a task that assesses whether test takers have the right skills for a role. If critical thinking is essential for a particular job, you could provide candidates with a homework assignment that specifically tests their ability to:

  • accurately interpret relevant evidence
  • reach logical conclusions
  • judge information skeptically
  • communicate their own viewpoint, and others', backed by facts


Tip : use Toggl Hire’s skills screening tests to easily filter out the good candidates first and speed up your hiring process.

#2 – Behavioral and situational interview questions

Ask the candidate to provide examples of situations when they used CT for solving problems or making a decision. This can provide insight into the candidate’s ability to analyze information and make informed decisions. For example:

Critical thinking example questions:

  • Tell me about a time when you had to make a really difficult decision at work.
  • What would you do in a situation where your manager made a mistake in a presentation or report?
  • How would you respond if a colleague shared a new idea or solution with you?
  • How do you evaluate the potential outcomes of different actions or decisions?
  • Can you describe a situation where you had to think on your feet and come up with a creative solution to a problem?
  • How do you ensure that your decision-making is based on relevant and accurate information?


#3 – Discuss the candidate’s critical thinking skills with their references

Additionally, the hiring manager can ask the candidate’s references about how the candidate demonstrated CT skills in the past.

  • Can you recall a time when (the candidate) had to convince you to choose an alternative solution to a problem?
  • Tell me about a time when (the candidate) had to solve a team disagreement regarding a project.

#4 – Critical thinking tests

Ask the candidate to complete a critical thinking test and score it against critical thinking rubrics. You can then share feedback on their test scores with them and explore their willingness to improve, if necessary. Or compare their score to other applicants' scores and prioritize those with higher scores if the role truly requires a critical thinker.
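
As a rough illustration of that comparison step, the sketch below ranks applicants by their overall test score and keeps those at or above a cutoff. The names, scores, and the cutoff value are all made up; only the shortlisting logic is the point.

```python
# Minimal sketch (assumed data shapes, not a specific vendor's API): rank
# applicants by their critical thinking test score and shortlist those at
# or above a cutoff that the hiring team has chosen for the role.
def shortlist(candidates: dict[str, float], cutoff: float = 70.0) -> list[tuple[str, float]]:
    """Return (name, score) pairs at or above the cutoff, best first."""
    passed = [(name, score) for name, score in candidates.items() if score >= cutoff]
    return sorted(passed, key=lambda pair: pair[1], reverse=True)

applicants = {"Avery": 82.0, "Sam": 64.5, "Noor": 91.0, "Lee": 70.0}
print(shortlist(applicants))   # [('Noor', 91.0), ('Avery', 82.0), ('Lee', 70.0)]
```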

Create your next critical thinking assessment with Toggl Hire

Assessing critical thinking skills is becoming a key component of the hiring process, especially for roles that require a particularly advanced skillset. Critical thinking is an indicator of future performance. Candidates who clearly demonstrate these skills have a lot to offer companies, from better decision-making to more productive relationships and cost savings.

If your team needs help automating the screening process, and creating custom skills tests based on specific roles, try Toggl Hire’s skills test questions engine or the Custom Test Builder to create the exact questions you want from scratch.

Juste Semetaite

Juste loves investigating through writing. A copywriter by trade, she spent the last ten years in startups, telling stories and building marketing teams. She works at Toggl Hire and writes about how businesses can recruit really great people.


Critical Thinking

Developing the right mindset and skills.

By the Mind Tools Content Team

We make hundreds of decisions every day and, whether we realize it or not, we're all critical thinkers.

We use critical thinking each time we weigh up our options, prioritize our responsibilities, or think about the likely effects of our actions. It's a crucial skill that helps us to cut out misinformation and make wise decisions. The trouble is, we're not always very good at it!

In this article, we'll explore the key skills that you need to develop your critical thinking skills, and how to adopt a critical thinking mindset, so that you can make well-informed decisions.

What Is Critical Thinking?

Critical thinking is the discipline of rigorously and skillfully using information, experience, observation, and reasoning to guide your decisions, actions, and beliefs. You'll need to actively question every step of your thinking process to do it well.

Collecting, analyzing and evaluating information is an important skill in life, and a highly valued asset in the workplace. People who score highly in critical thinking assessments are also rated by their managers as having good problem-solving skills, creativity, strong decision-making skills, and good overall performance. [1]

Key Critical Thinking Skills

Critical thinkers possess a set of key characteristics which help them to question information and their own thinking. Focus on the following areas to develop your critical thinking skills:

Curiosity

Being willing and able to explore alternative approaches and experimental ideas is crucial. Can you think through "what if" scenarios, create plausible options, and test out your theories? If not, you'll tend to write off ideas and options too soon, so you may miss the best answer to your situation.

To nurture your curiosity, stay up to date with facts and trends. You'll overlook important information if you allow yourself to become "blinkered," so always be open to new information.

But don't stop there! Look for opposing views or evidence to challenge your information, and seek clarification when things are unclear. This will help you to reassess your beliefs and make a well-informed decision later. Read our article, Opening Closed Minds , for more ways to stay receptive.

Logical Thinking

You must be skilled at reasoning and extending logic to come up with plausible options or outcomes.

It's also important to emphasize logic over emotion. Emotion can be motivating but it can also lead you to take hasty and unwise action, so control your emotions and be cautious in your judgments. Know when a conclusion is "fact" and when it is not. "Could-be-true" conclusions are based on assumptions and must be tested further. Read our article, Logical Fallacies , for help with this.

Use creative problem solving to balance cold logic. By thinking outside of the box you can identify new possible outcomes by using pieces of information that you already have.

Self-Awareness

Many of the decisions we make in life are subtly informed by our values and beliefs. These influences are called cognitive biases and it can be difficult to identify them in ourselves because they're often subconscious.

Practicing self-awareness will allow you to reflect on the beliefs you have and the choices you make. You'll then be better equipped to challenge your own thinking and make improved, unbiased decisions.

One particularly useful tool for critical thinking is the Ladder of Inference . It allows you to test and validate your thinking process, rather than jumping to poorly supported conclusions.

Developing a Critical Thinking Mindset

Combine the above skills with the right mindset so that you can make better decisions and adopt more effective courses of action. You can develop your critical thinking mindset by following this process:

Gather Information

First, collect data, opinions and facts on the issue that you need to solve. Draw on what you already know, and turn to new sources of information to help inform your understanding. Consider what gaps there are in your knowledge and seek to fill them. And look for information that challenges your assumptions and beliefs.

Be sure to verify the authority and authenticity of your sources. Not everything you read is true! Use this checklist to ensure that your information is valid:

  • Are your information sources trustworthy? (For example, well-respected authors, trusted colleagues or peers, recognized industry publications, websites, blogs, etc.)
  • Is the information you have gathered up to date?
  • Has the information received any direct criticism?
  • Does the information have any errors or inaccuracies?
  • Is there any evidence to support or corroborate the information you have gathered?
  • Is the information you have gathered subjective or biased in any way? (For example, is it based on opinion rather than fact? Is any of the information you have gathered designed to promote a particular service or organization?)

If any information appears to be irrelevant or invalid, don't include it in your decision making. But don't omit information just because you disagree with it, or your final decision will be flawed and biased.

Now observe the information you have gathered, and interpret it. What are the key findings and main takeaways? What does the evidence point to? Start to build one or two possible arguments based on what you have found.

You'll need to look for the details within the mass of information, so use your powers of observation to identify any patterns or similarities. You can then analyze and extend these trends to make sensible predictions about the future.

To help you to sift through the multiple ideas and theories, it can be useful to group and order items according to their characteristics. From here, you can compare and contrast the different items. And once you've determined how similar or different things are from one another, Paired Comparison Analysis can help you to analyze them.
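
If it helps to see the mechanics, here is a small sketch of the tallying step behind Paired Comparison Analysis: each pair of options is compared once, the preferred option earns a point, and the totals give a ranking. The full technique also lets you weight how strongly one option beats another; the options and the preference function below are placeholders you would replace with real judgments.

```python
# Rough sketch of Paired Comparison Analysis: compare every pair of options,
# give the preferred option of each pair a point, and rank by total points.
# The `prefer` callback stands in for a human judgment on each pair.
from itertools import combinations
from collections import Counter

def paired_comparison(options: list[str], prefer) -> list[tuple[str, int]]:
    """Tally wins from pairwise comparisons and return options ranked by wins."""
    wins = Counter({option: 0 for option in options})
    for a, b in combinations(options, 2):
        wins[prefer(a, b)] += 1          # prefer(a, b) returns the preferred option
    return wins.most_common()

# Example with a hard-coded preference order standing in for real judgments
order = ["reduce costs", "improve quality", "hire more staff"]
prefer = lambda a, b: a if order.index(a) < order.index(b) else b
print(paired_comparison(order, prefer))
```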

The final step involves challenging the information and rationalizing its arguments.

Apply the laws of reason (induction, deduction, analogy) to judge an argument and determine its merits. To do this, it's essential that you can determine the significance and validity of an argument to put it in the correct perspective. Take a look at our article, Rational Thinking , for more information about how to do this.

Once you have considered all of the arguments and options rationally, you can finally make an informed decision.

Afterward, take time to reflect on what you have learned and what you found challenging. Step back from the detail of your decision or problem, and look at the bigger picture. Record what you've learned from your observations and experience.

Critical thinking involves rigorously and skilfully using information, experience, observation, and reasoning to guide your decisions, actions and beliefs. It's a useful skill in the workplace and in life.

You'll need to be curious and creative to explore alternative possibilities, but rational to apply logic, and self-aware to identify when your beliefs could affect your decisions or actions.

You can demonstrate a high level of critical thinking by validating your information, analyzing its meaning, and finally evaluating the argument.

Critical Thinking Infographic

See Critical Thinking represented in our infographic: An Elementary Guide to Critical Thinking .


Critical thinking definition


Critical thinking, as described by Oxford Languages, is the objective analysis and evaluation of an issue in order to form a judgement.

The critical thinking process requires the active and skillful evaluation, assessment, and/or synthesis of information obtained from, or generated by, observation, knowledge, reflection, acumen, or conversation, as a guide to belief and action, which is why it is often used in education and academics.

Some may even view it as a backbone of modern thought.

However, it's a skill, and skills must be trained and encouraged if they are to be used to their full potential.

People turn to various approaches to improve their critical thinking, such as:

  • Developing technical and problem-solving skills
  • Engaging in more active listening
  • Actively questioning their assumptions and beliefs
  • Seeking out more diversity of thought
  • Opening up their intellectual curiosity

Is critical thinking useful in writing?

Critical thinking can help in planning your paper and making it more concise, but this isn't obvious at first. We have pinpointed some of the questions you should ask yourself when bringing critical thinking into your writing:

  • What information should be included?
  • Which information resources should the author look to?
  • What degree of technical knowledge should the report assume its audience has?
  • What is the most effective way to show information?
  • How should the report be organized?
  • How should it be designed?
  • What tone and level of language difficulty should the document have?

Using critical thinking comes down not only to the outline of your paper; it also raises the question: how can we use critical thinking to solve problems within our writing's topic?

Let's say you have a PowerPoint presentation on how critical thinking can reduce poverty in the United States. You'll first have to define critical thinking for your audience, then use plenty of critical thinking questions and synonyms to familiarize them with your methods and start the thinking process.

Are there any services that can help me use more critical thinking?

We understand that it's difficult to learn how to use critical thinking more effectively from just one article, but our service is here to help.

We are a team specializing in writing essays and other assignments for college students and anyone else who needs a helping hand. We cover a wide range of topics, offer high-quality work, always deliver on time, and aim to leave our customers completely satisfied with what they ordered.

The ordering process is fully online, and it goes as follows:

  • Select the topic and the deadline of your essay.
  • Provide us with any details, requirements, statements that should be emphasized or particular parts of the essay writing process you struggle with.
  • Leave the email address where your completed order will be sent.
  • Select your preferred payment type, sit back, and relax!

With years of experience on the market, professionally degreed essay writers, 24/7 online customer support, and incredibly low prices, you won't find a service offering a better deal than ours.


Comparison Of Assessment Methods Of Critical Thinking


Critical thinking has been defined over time by many philosophers, educators, and researchers. Although the definitions contain a series of common skills, they vary in their details. Critical thinking is important because it enables individuals to make logical decisions and can improve quality of life. It is a deliberate act of systematically analysing and evaluating the elements of reasoning. With all this in mind, assessing people's critical thinking skills is difficult. The main assessment methods are multiple choice tests, multiple choice questions with justification, short essays, and performance tests. The most common method among them is the multiple choice test, which is easy to administer; however, such tests have disadvantages, such as limited comprehensiveness and questionable validity. Hence there are alternative methods, whose suitability depends on the purpose of assessment. This paper compares a series of assessment methods for critical thinking and suggests an appropriate way to assess critical thinking in project-based learning. For this purpose, 29 high school students' critical thinking skills were investigated in project-based learning using multiple choice tests, multiple choice questions with justification, short essays, and performance tests. The advantages and disadvantages of the assessment methods are compared in order to identify a suitable method for project-based learning. Keywords: critical thinking, assessment, project-based learning, rubric

Introduction

Critical thinking is a popular term in education and even in daily life. It is not actually a new term: many philosophers, authors, and educators have used it under different names and have applied and taught it in different ways. In the modern era, Dewey (1933) described it as an active process of examining a belief by analysing the reasons for it and its consequences. He called this process "reflective thinking".

Ennis (1987) defined the term as "reasonable reflective thinking that is focused on deciding what to believe or do".

Brookfield (1987) extended and clarified the definition as "a process that involves identifying and challenging assumptions, becoming aware of the importance of context in creating meaning, imagining and exploring reflective skepticisms … a reflective dimensions, more than the cognitive activity of analyzing arguments – it is emotive as well as rational".

The American Philosophical Association organized several panels over two years to discuss the definition, instruction, and assessment of critical thinking. Forty-six experts from different disciplines participated and contributed to the Delphi Report. They reached a consensus definition of critical thinking as "purposeful, self-regulatory judgment which results in interpretation, analysis, evaluation, and inference, as well as explanation of the evidential, conceptual, methodological, criteriological, or contextual considerations upon which that judgment is based" (Facione, 1990).

Definitions of critical thinking share basic features. Although reaching one common definition is difficult, experts have identified common components that are defined in more or less the same way.

Critical thinking is important because it enables individuals to make logical decisions and can improve quality of life. It is a deliberate act of systematically analysing and evaluating the elements of reasoning (Brookfield, 1987; Elder & Paul, 2000; Paul & Elder, 2000).

Assessment of Critical Thinking

Assessing critical thinking skills is as difficult as defining critical thinking. The purpose and objectives of assessment depend on the definition of and approach to critical thinking. The purposes for assessing students' critical thinking skills include diagnosing the level of students' critical thinking and giving them feedback, motivating students to improve, informing teachers and schools about their success, doing research on critical thinking, and providing information for educational programs that students decide to enter (Ennis, 1993). Beyond education, critical thinking skills are also assessed to select employees (retrieved from http://www.linklatersgraduates.co.uk/application-process/critical-thinking-test).

Critical thinking is assessed using multiple choice tests, multiple choice questions with justification, short essays or case studies, and performance tests. The choice of method depends on the test maker's purpose and the number of test takers. These methods, with their advantages and disadvantages, are described below.

General content-based multiple choice tests are usually used to select prospective students and employees or to do research on critical thinking. For large-scale use these tests are convenient: they save time, are cheap, and their validity and reliability have been checked (Ennis, 1993). Well-known general content-based tests are the Cornell Critical Thinking Test Level X and Level Z and the Watson Glaser Critical Thinking Appraisal (Ennis, 1993), the California Critical Thinking Skills Test and the California Critical Thinking Dispositions Inventory (retrieved from https://www.insightassessment.com/Products), and the International Critical Thinking Test (retrieved from http://www.criticalthinking.org/pages/international-critical-thinking-test/619).

General content-based tests may not assess students' critical thinking skills well when applied after a course, since critical thinking is embedded in the course and the test may not contain anything related to the course subjects. Teachers may also be unfamiliar with the underlying structure and theory behind general content-based tests and lack the time to learn them, which is why using these tests can be seen as ineffective (Haas & Keeley, 1998).

Critical thinking essay tests are general content-based and more comprehensive than multiple choice tests. The Ennis-Weir Critical Thinking Essay Test is a common essay test. Essay tests present realistic tasks and permit students to justify their responses; however, they are more expensive in time and/or money and are less secure (Ennis, 1993). CIE also offers thinking skills exams that include multiple choice questions, essays, and case studies, assessing critical thinking through case studies that consist of a scenario and open-ended questions (retrieved from http://www.cie.org.uk/programmes-and-qualifications/cambridge-international-as-and-a-level-thinking-skills-9694/).

Teachers and researchers may also construct their own tests, which can be good in terms of comprehensiveness, especially for subject-specific assessment. They can mix the methods and form a multiple choice test with written justification. Such a test is easy to prepare and check, like a multiple choice test, while remaining open-ended by allowing students to defend their responses; students can even receive credit by defending their answers well (Ennis, 1993).

Performance assessments can be based on real-life cases and may be used for subject-specific assessment, at the cost of reduced comprehensiveness. Although using real-life cases is an advantage for the learner, this assessment method is disadvantageous for teachers, who must devote more time to the performance process and to assessing the performance (Ennis, 1993).

Rubrics and Critical Thinking Assessment

Formative assessments collect data on students' performance during instruction and on student work that will promote students' abilities and higher-level thinking. That is why formative assessments should be authentic and multidimensional (Peverini, 2009). Rubrics are used as formative assessment tools that guide students to produce quality work and help teachers evaluate that work fairly. According to Popham (1997), the essential elements of a rubric are evaluative criteria, quality definitions, and a scoring scale. Rubrics may be used for performance assessments as well as for open-ended questions.

Rubrics offer benefits for assessment. Tasks and criteria are categorized, so rubrics guide students through the process and help teachers give proper feedback (Tierney & Simon, 2004).
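
As a purely illustrative sketch, Popham's three elements can be represented as plain data, with a small helper that averages the level awarded on each criterion. The criteria, level descriptions, and equal weighting are invented for the example and are not taken from the BIE or WSU rubrics mentioned below.

```python
# Illustrative sketch only: Popham's rubric elements (evaluative criteria,
# quality definitions, a scoring scale) as plain data, plus a helper that
# scores one piece of student work. All names and levels are placeholders.
rubric = {
    "scale": [1, 2, 3, 4],   # 1 = beginning ... 4 = exemplary
    "criteria": {
        "identifies assumptions": {1: "not addressed", 4: "assumptions stated and questioned"},
        "evaluates evidence":     {1: "evidence ignored", 4: "evidence weighed for reliability"},
        "draws conclusions":      {1: "unsupported claims", 4: "conclusions follow from evidence"},
    },
}

def rubric_score(ratings: dict[str, int], rubric: dict) -> float:
    """Average the level awarded on each criterion (every criterion must be rated)."""
    assert set(ratings) == set(rubric["criteria"]), "rate every criterion"
    assert all(level in rubric["scale"] for level in ratings.values())
    return sum(ratings.values()) / len(ratings)

one_project = {"identifies assumptions": 3, "evaluates evidence": 2, "draws conclusions": 3}
print(rubric_score(one_project, rubric))   # about 2.67 on the 1-4 scale
```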

The Foundation and Center for Critical Thinking are educational non-profit organizations whose goal is to improve education from primary school to university. They study and teach critical thinking and develop assessment tools such as multiple choice tests and rubrics (retrieved from www.criticalthinking.org/files/CriticalThinkingGrid.doc).

BIE is a nonprofit educational organization whose stated priority is to help teachers and to prepare students for successful lives. It produces and shares PBL products such as books, articles, and critical thinking rubrics.

Universities also work on assessment tools for critical thinking. The WSU critical thinking rubric is an exemplary one. WSU investigated the implementation of the rubric in undergraduate courses: the mean score for courses in which the rubric was used (M = 3.3) was higher than the mean score for courses in which it was not (M = 2.44) (retrieved from http://web.uri.edu/assessment/files/WSU-Critical-Thinking-Rubric-2009.pdf).

Problem Statement

Assessment of critical thinking is difficult in terms of reliability and required time, and choosing a proper assessment method is important for teachers. For that reason, four methods are compared and their advantages and disadvantages analyzed.

Research Questions

In this study two questions are investigated: "Which methods are more reliable?" and "Which methods make more effective use of teachers' and students' time?"

Purpose of the Study

The purpose of the study was to find a proper method for assessing students' critical thinking skills. Four methods (multiple choice, multiple choice with justification, short answer, and a rubric for a performance test) were applied and compared.

Subjects: 9th- and 10th-grade students (n = 29) worked on a chemistry project for 5 weeks and presented their products over 2 weeks. All course materials and assessment instruments were in English. Each group included three members: one high-, one medium-, and one low-achieving student, based on the mean of their first-semester exam scores.

Research Methods

Learning and improvement in thinking require time and unfold over years (Paul & Elder, 2000). Differences in background beliefs and assumptions between students and teachers, and differences or similarities in instruction, make it disadvantageous to use a general content-based test (Ennis, 1993). Therefore I used discipline-specific assessment tools for this 7-week study.

Students worked in a PBL environment. A rubric designed for PBL was used to assess students' critical thinking skills while they prepared their projects (retrieved from www.bie.org/object/document/6_12_critical_thinking_rubric_ccss_aligned).

After the presentations, a multiple choice test was administered whose chemistry-related questions were selected from science reasoning tests. The test has two parts: the first part consists of 3 cases with 5-7 multiple choice questions per case, and the second part contains multiple choice questions with a justification section designed to reveal students' reasons for their choices.

Open-ended questions were selected from the Cambridge Thinking Skills Test. They present a chemistry-related case study told by different sources; students answered questions about the reliability of the sources and the possible outcomes.

The rubric was used to assess students' performance while they were preparing their projects; the other tools were applied after the presentations. Means and standard deviations of the test results and rubric scores were calculated and are presented in Table 1.

As seen in Table 1, the mean of the multiple choice test (M = 63.3) is the highest and the mean of the open-ended questions (M = 46.7) is the lowest.
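
For readers who want to reproduce this kind of summary, the snippet below computes a mean and standard deviation per instrument using Python's statistics module. The score lists are invented placeholders standing in for the raw data behind Table 1, which is not reproduced here.

```python
# Sketch of the descriptive statistics reported in Table 1 (mean and SD per
# instrument). The score lists are placeholder data, not the study's results.
from statistics import mean, stdev

scores = {
    "multiple choice": [75, 60, 55, 70, 65],   # placeholder data
    "open-ended":      [50, 40, 45, 55, 43],   # placeholder data
}

for instrument, values in scores.items():
    print(f"{instrument}: M = {mean(values):.1f}, SD = {stdev(values):.1f}")
```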

The mean of the multiple choice test with justification lies between the means of the multiple choice test and the case study; its results are broken down in Table 2.

Several students earned points by choosing correct answers without any explanation, while a few earned points by explaining their choices even though they circled the wrong option. The mean of the justification part is lower than that of the multiple choice part and close to that of the case study. Some answers in the justification part were left without any explanation.

The mean values decreased when an explanation was required. The results of the multiple choice test without explanation were better than the others: similar questions were asked in the multiple choice test with justification, and its mean is lower than that of the plain multiple choice test.

The results of the open-ended questions were similar to those of the justification part. There were some poor answers, such as "no evidence", "quite reliable", and "source A is a good evidence". Some students did not explain their answers or did not support their thoughts with the sources; without an explanation it is unclear whether they understood the case and answered correctly.

Rubrics for performance assessment guide students and teachers by stating criteria for each part of the project; teachers then follow these criteria to check students' actions and work.

Preparing a discipline-specific multiple choice test or a multiple choice test with justification takes less time than preparing a discipline-specific case study. Performance assessment takes more time both to prepare and to apply. Tests can be prepared in a few hours and administered in a single lesson, whereas preparing a performance assessment requires several hours, and teachers are occupied with assessment during the lessons in order to assess all students on all parts of their projects.

The multiple choice test and the multiple choice test with justification are checked quickly. Checking the case study takes time, since each answer must be read and assessed. Performance assessment keeps the teacher busy throughout the lessons in which it is applied, because the teacher must observe students' actions and compare them with the criteria in the rubric.

In terms of the time needed to prepare the test and to assess students' responses, the multiple choice test and the multiple choice test with justification have the advantage of taking less time.

The rubric is a helpful tool: teachers can assess students' work by checking the criteria on it and can therefore give clear feedback on each part, and the results are more reliable. On the other hand, performance assessment takes up time and is practical only for small classes. Case studies with open-ended questions are also more reliable and need time, though not as much as performance assessment. The multiple choice test with justification is a mixture of open-ended and multiple choice questions: preparation and checking are easy, as with multiple choice, while asking for an explanation improves reliability, as with case studies. Case studies and multiple choice tests with justification can be applied to small and medium-sized classes. Multiple choice tests are advantageous in terms of time but not reliability, and are best suited to large classes.

The justification part is important because, without it, teachers cannot be sure whether students really knew the answer and selected the option deliberately or simply circled an option at random, as shown in Table 2.
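
One way to operationalize that point is sketched below: an item is worth a point for the chosen option plus additional points for the quality of the written justification, so a well-defended wrong answer can still earn credit, as the paper observed. The particular point split is an assumption for illustration; the paper does not specify its scoring weights.

```python
# Hedged sketch of scoring a multiple choice item with written justification.
# The split (1 point for the option, up to 2 for the justification) is an
# assumption, not the study's actual rubric.
def score_item(chosen: str, correct: str, justification_quality: int) -> int:
    """justification_quality: 0 = none/poor, 1 = partial, 2 = sound reasoning."""
    points = 1 if chosen == correct else 0   # credit for the option itself
    points += justification_quality          # credit for defending the choice
    return points                            # 0-3 per item

print(score_item("B", "B", 0))  # correct option, no explanation -> 1
print(score_item("C", "B", 2))  # wrong option, well defended    -> 2
```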

  • Brookfield, S. D. (1987). Developing critical thinkers. San Francisco: Jossey-Bass.
  • Dewey, J. (1933). How we think: A restatement of the relation of reflective thinking to the educative process. Boston: D.C. Heath.
  • Elder, L., & Paul, R. (2000). Critical thinking: Nine strategies for everyday life, Part II. Journal of Developmental Education, 24(2), 38–39.
  • Ennis, R. H. (1987). A taxonomy of critical thinking dispositions and abilities. In J. Baron & R. Sternberg (Eds.), Teaching thinking skills: Theory and practice (pp. 9–26). New York: W. H. Freeman.
  • Ennis, R. H. (1993). Critical thinking assessment. Theory into Practice, 32(3), 179–186.
  • Facione, P. A. (1990). Critical thinking: A statement of expert consensus for purposes of educational assessment and instruction. American Philosophical Association. ERIC Document ED 315-423.
  • Haas, P. F., & Keeley, S. M. (1998). Coping with faculty resistance to teaching critical thinking. College Teaching, 46, 63–67.
  • Paul, R., & Elder, L. (2000). Critical thinking: Nine strategies for everyday life, Part I. Journal of Developmental Education, 24(1), 40–41.
  • Peverini, S. (2009). The value of teacher expertise and good judgment: Recent inspiring reading about assessment. Language Arts, 86(5), 398–402.
  • Popham, J. (1997). What's wrong (and what's right) with rubrics. Educational Leadership, 55(2), 72–75.
  • Tierney, R., & Simon, M. (2004). What's still wrong with rubrics: Focusing on the consistency of performance criteria across scale levels. Practical Assessment, Research & Evaluation.
  • Retrieved from http://www.cie.org.uk/programmes-and-qualifications/cambridge-international-as-and-a-level-thinking-skills-9694/
  • Retrieved from https://assessment.trinity.duke.edu/documents/WashingtonStateUniversityCriticalThinkingProjectResourceGuide_000.pdf
  • Retrieved from https://www.insightassessment.com/Products
  • Retrieved from http://www.criticalthinking.org/pages/international-critical-thinking-test/619
  • Retrieved from http://web.uri.edu/assessment/files/WSU-Critical-Thinking-Rubric-2009.pdf
  • Retrieved from www.bie.org/object/document/6_12_critical_thinking_rubric_ccss_aligned
  • Retrieved from www.criticalthinking.org/files/CriticalThinkingGrid.doc


Cite this article as:

Şentürk, N. (2018). Comparison Of Assessment Methods Of Critical Thinking. In V. Chis, & I. Albulescu (Eds.), Education, Reflection, Development – ERD 2017, vol 41. European Proceedings of Social and Behavioural Sciences (pp. 398-404). Future Academy. https://doi.org/10.15405/epsbs.2018.06.47


How to assess critical thinking skills in candidates


It’s not always easy to assess critical thinking skills, but there is a solution – a reliable critical thinking skills test.  It’s the ideal way to get accurate data on a candidate’s critical thinking skills as a whole, as well as a wide range of individual skill sets in your candidates. 

TestGorilla’s comprehensive critical thinking skills test can help you determine whether your candidates can evaluate complex challenges and solve problems using a critical thinking approach. Discover more about how to assess critical thinking skills below.

Table of contents

  • What does critical thinking mean?
  • Why candidates' critical thinking abilities should align with your industry
  • Assessing critical thinking skills with skills tests
  • What are the benefits of using a critical thinking skills assessment?
  • Use TestGorilla's critical thinking tests to hire with confidence

Top critical thinking skills indicate that a candidate or employee will be able to gather all the information needed for a task or project, objectively assess data, and conclude how to complete tasks based on what they have learned from their research.

Although more information may appear to be useful when solving tasks, what’s essential is having the right facts and data; more can sometimes equal less in terms of deductive reasoning because some facts may be irrelevant in certain situations. As a result, critical thinking also entails filtering out irrelevant data.

Critical thinking skills are made up of a group of sub-skills that help candidates solve problems and complete tasks, such as research and analytical skills, assessing and interpreting information with critical judgment skills, sharing findings with excellent communication skills, and problem-solving skills in the workplace .

Using research and analytical skills to gather and conceptualize data

Candidates with exceptional critical thinking skills should first be able to gather the necessary information for the task at hand. This might include consulting various sources of information and conducting research on the project.

Assessing and evaluating information with critical judgment skills

To tackle the task, candidates and employees will need to evaluate the information they've gathered using appropriate methods.

Before taking any action, they should objectively evaluate the validity of the research they have gathered to ensure the data is reliable and accurate using objective, critical judgment skills.

For instance, before writing a blog post or article, content writers should evaluate the credibility of their sources, which helps them overcome the challenge of producing a factually accurate piece.

Using data inference to make complex decisions and make conclusions

Candidates and employees must also be able to recognize patterns, trends, or correlations in data to resolve complex problems and arrive at a conclusion. 

For example, learning from the meanings of qualitative or quantitative data related to the marketing funnel, or analyzing conversion or click-through rates to determine how to optimize a marketing strategy.

Sharing findings and receiving advice with communication skills

Employees must be able to communicate effectively to inform their colleagues of their findings when solving problems. This requires not only clear verbal communication skills but also the ability to listen and interpret the advice of colleagues when discussing the next steps needed to solve the problem.

For instance, if your software development team is looking for ways to fix and roll out a cloud license service, team members will need to share what they’ve learned about their clients and listen to other team members to fix the issues and make the service available.

Approaching the problem with problem-solving skills

To solve problems, candidates and employees will need problem-solving skills as part of their critical thinking mindset. This will include:

  • Breaking the problem down into smaller parts
  • Using analytical and critical thinking to solve the problem
  • Implementing the appropriate strategies to address each of the smaller issues

Since each industry requires different critical thinking sub-skills, the strongest candidates will have critical thinking abilities that are relevant to the industry in which they work. Your candidates should demonstrate that they have the necessary critical thinking sub-skills for the vacant position.

Here is a rundown of several industries and the critical thinking skills they require.

Accountant positions

Accounting candidates must have attention to detail to spot anomalies in figures, as well as critical thinking skills to interpret the story behind the numbers.

They should also be able to use critical thinking and deductive reasoning to understand the causes of numerical anomalies and devise strategies to solve them.

Business analyst roles

Data inference and communication skills are two critical thinking sub-skills that business analysts should possess. Candidates for these positions should be able to make critical decisions and provide feedback to business leaders that can shape the future of the company.

They should also have critical judgment skills that allow them to make crucial decisions to improve the company’s performance.

Teaching and education jobs

Teachers and educators must be able to think critically to teach and mentor students. If you're hiring a teacher, can your candidates effectively work out their students' learning styles, and can they come up with strategies to deliver lessons that are easily understood?

The problem-solving sub-skill is also important in this field when dealing with behavioral issues in the classroom. Teachers will need to identify the cause of behavioral issues and implement methods to resolve and address them using critical thinking dispositions.

They may also be involved in teaching critical thinking skills or cognitive skills to their students, which necessitates a well-planned teaching strategy.

Examples of roles where critical thinking skills are essential

Legal roles

As stated in Lawyer Monthly, critical thinking for legal positions entails remaining objective when inferring case-related information and comparing the information you have with the facts of relevant cases.

Legal cases also require more than just a recount of the facts related to the case. Candidates must be able to use critical thinking skills to evaluate the strengths of their arguments and present them persuasively and coherently to solve cases.

Software developer/engineer positions

Switchup emphasizes the importance of critical thinking skills for software developers, explaining that solving problems as they arise is made possible with critical thinking.

Two critical thinking sub-skills for software developers are excellent organizational and time management capabilities, as well as problem-solving abilities. 

Can your software developer candidate create successful software solutions using the data provided to them? Can they effectively communicate with their team members to efficiently resolve errors or bugs in their code? 

Marketing / SEO roles

Candidates for SEO positions must be able to conduct ongoing keyword research and expand on keyword opportunities. Part of this involves researching keywords related to the articles produced by the marketing team and expanding on keyword clusters, which also requires critical thinking.

Because you’ll need to look for several critical thinking sub-skills in candidates and employees, here are four steps you can take to assess them with critical thinking skills tests and make the process easier.

Choose from a selection of skills tests to create an assessment and assess your candidates’ critical thinking skills. These might include:

  • Problem-solving skills
  • Communication skills
  • Attention to detail

Part of a critical thinking assessment involves asking candidates to decide on the validity of a concluding statement using a set of information and evidence. Look for candidates who recognize which conclusions are linked to the evidence and which are not; those who can differentiate between the two have top critical thinking skills.

During a critical thinking skills assessment, candidates will be expected to deduce the strength and validity of a set of arguments. Can your candidates deduce which of the arguments adequately support the main statement, and which fail to do so?

Use the results of this part of the assessment and make comparisons between candidates to recognize which candidates have the best critical thinking skills.

How skilled are your candidates at evaluating assumptions? This is another way to assess critical thinking skills using a skills assessment. Candidates who can use deductive reasoning skills to deduce whether an assumption is true have the best critical thinking skills.

Use the results to generate interview questions and invite candidates to the next round.

There are four major advantages to using a critical thinking assessment to test a candidate’s or employee’s critical thinking ability:

Assess critical thinking skills to find out if candidates can enhance productivity

Critical thinking can make a difference in communication and, as a result, productivity in the workplace. For this reason, knowing which candidates have the required critical thinking skills that can boost productivity is crucial; it’s one reason why assessing critical thinking skills is beneficial to your business.

Evaluate critical thinking to learn if candidates can save your organization time

Can your candidates use critical thinking skills to save your organization time when completing a project or task? Assessing critical thinking skills is beneficial as it helps you find out if candidates or employees are capable of this. 

Assessing critical thinking skills with a timed critical thinking assessment is also beneficial as you’ll learn which candidates are the strongest for your organization. Candidates who can correctly solve these challenging deductive reasoning questions within a shorter time than others can potentially help your business save time.

Benefits of using a critical thinking skills assessment

Find out if a candidate’s critical thinking skills suit your business

Since assessing critical thinking skills will show you whether a candidate's approach to critical thinking suits your organization, use a critical thinking assessment for a variety of roles, and keep in mind that 75% of professions describe critical thinking as a required skill.

If you’re hiring for many different positions within your organization such as legal secretary roles or lawyer positions, and they all require critical thinking skills, analytical skills, or deductive reasoning abilities, a critical thinking skills test will rigorously test your candidates for each of these particular skills in your business. 

Bridge a critical thinking and analytical skills gap in your organization

The University of Massachusetts suggests that there is ample evidence that critical thinking can be taught, and assessing critical thinking skills can help with this. In the context of the workplace and skills testing, employers can use critical thinking test results to assess the baseline of their employees' critical thinking and analytical skills.

They can then plan towards suitable training sessions to enhance their analytical and critical thinking skills, which is why assessing critical thinking skills is beneficial. 

For instance, to enhance critical thinking skills, teach your employees to:

  • Weigh up the pros and cons of a situation
  • Conduct the right research
  • Break down projects into smaller sections
  • Double-check a potential solution to a problem

There's no denying that critical thinking tests can help you make better hiring decisions. However, keep in mind that there are several sub-skills to consider and that the specific role you're hiring for will require a distinct set of critical thinking sub-skills. Critical thinking tests can assist you in hiring talent with the right critical thinking skills for your organization. Evaluating critical thinking doesn't have to be complicated: hire with confidence by checking out TestGorilla's critical thinking tests and hire the best, faster and bias-free.


A Short Guide to Building Your Team’s Critical Thinking Skills

  • Matt Plummer


Critical thinking isn’t an innate skill. It can be learned.

Most employers lack an effective way to objectively assess critical thinking skills and most managers don’t know how to provide specific instruction to team members in need of becoming better thinkers. Instead, most managers employ a sink-or-swim approach, ultimately creating work-arounds to keep those who can’t figure out how to “swim” from making important decisions. But it doesn’t have to be this way. To demystify what critical thinking is and how it is developed, the author’s team turned to three research-backed models: The Halpern Critical Thinking Assessment, Pearson’s RED Critical Thinking Model, and Bloom’s Taxonomy. Using these models, they developed the Critical Thinking Roadmap, a framework that breaks critical thinking down into four measurable phases: the ability to execute, synthesize, recommend, and generate.

With critical thinking ranking among the most in-demand skills for job candidates , you would think that educational institutions would prepare candidates well to be exceptional thinkers, and employers would be adept at developing such skills in existing employees. Unfortunately, both are largely untrue.


  • Matt Plummer (@mtplummer) is the founder of Zarvana, which offers online programs and coaching services to help working professionals become more productive by developing time-saving habits. Before starting Zarvana, Matt spent six years at Bain & Company spin-out, The Bridgespan Group, a strategy and management consulting firm for nonprofits, foundations, and philanthropists.  


IMAGES

  1. 10 Essential Critical Thinking Skills (And How to Improve Them

    critical thinking assessment method

  2. PPT

    critical thinking assessment method

  3. The benefits of critical thinking for students and how to develop it

    critical thinking assessment method

  4. Critical Thinking Assessment: 4 Ways to Test Applicants

    critical thinking assessment method

  5. CRITICAL THINKING SKILLS. 1. Analytical Part of critical thinking…

    critical thinking assessment method

  6. Critical Thinking

    critical thinking assessment method

VIDEO

  1. Critical Thinking Assessment Series [Disk 2] [Part 3]

  2. Example Assessment 3 Presentation

  3. Critical Thinking Assessment Series [Disk 3] [Part 5]

  4. Critical Thinking Assessment Series [Disk 2] [Part 4]

  5. Non-Verbal Communication: Pre-Thinking Assessment-2

  6. Planning for Rigor: Critical Thinking, Assessment, Planning Grade 7

COMMENTS

  1. Assessing Critical Thinking in Higher Education: Current State and

    Critical thinking is one of the most frequently discussed higher order skills, believed to play a central role in logical thinking, decision making, and problem solving (Butler, 2012; Halpern, 2003).It is also a highly contentious skill in that researchers debate about its definition; its amenability to assessment; its degree of generality or specificity; and the evidence of its practical ...

  2. Critical Thinking > Assessment (Stanford Encyclopedia of Philosophy)

    The Critical Thinking Assessment Test (CAT) is unique among them in being designed for use by college faculty to help them improve their development of students' critical thinking skills (Haynes et al. 2015; Haynes & Stein 2021). ... which incorporated a method of controlling for background beliefs articulated and defended by Norris (1985 ...

  3. Critical Thinking Testing and Assessment

    The purpose of assessing instruction for critical thinking is improving the teaching of discipline-based thinking (historical, biological, sociological, mathematical, etc.) It is to improve students' abilities to think their way through content using disciplined skill in reasoning. The more particular we can be about what we want students to ...

  4. Teaching, Measuring & Assessing Critical Thinking Skills

    Yes, We Can Define, Teach, and Assess Critical Thinking Skills. Critical thinking is a thing. We can define it; we can teach it; and we can assess it. While the idea of teaching critical thinking has been bandied around in education circles since at least the time of John Dewey, it has taken greater prominence in the education debates with the ...

  5. Frontiers

    An Approach to Performance Assessment of Critical Thinking: The iPAL Program. The approach to CT presented here is the result of ongoing work undertaken by the International Performance Assessment of Learning collaborative (iPAL 1). iPAL is an international consortium of volunteers, primarily from academia, who have come together to address the dearth in higher education of research and ...

  6. A Brief Guide for Teaching and Assessing Critical Thinking in

    Instructional interventions affecting critical thinking skills and dispositions: A stage 1 meta-analysis. Review of Educational Research, 4, 1102-1134. Angelo, T. A. (1995). Classroom assessment for critical thinking. Teaching of Psychology, 22(1), 6-7. Bensley, D.A. (1998). Critical thinking in psychology: A unified skills approach.

  7. Guidelines for a Scientific Approach to Critical Thinking Assessment

    This article examines benefits of taking a scientific approach to critical thinking assessment and proposes guidelines for planning, conducting, and using assessment research. ... Assessment of methodological and causal reasoning skills in research methods classes. Presented at the 81st annual meeting of the Eastern Psychological Association ...

  8. Best practices for teaching and assessing critical thinking in

    Critical thinking assessment methods. Scholars tend to agree that critical thinking skills and dispositions are challenging to teach and learn (Abrami et al., 2014; Arum & Roksa, 2011; Behar-Horenstein & Niu, 2011; Ennis, 1985; Norris, 1985; Willingham, 2008). However, as discussed earlier, it is indeed possible to develop critical ...

  9. Assessment of Critical Thinking

    2.1 Observing Learners in the Process of Critical Thinking. The desire for empirical assessment of competence in CT has spawned a variety of different lines of argument and assessment procedures based on them, depending on intent, tradition, and associated conceptual understanding (Jahn, 2012a). Depending on what is understood by CT and what function the assessment is supposed to have, there ...

  10. PDF The NPEC Sourcebook on Assessment, Volume I

    The NPEC Sourcebook on Assessment, Volume 1: Definitions and Assessment Methods for Critical Thinking, Problem Solving, and Writing is a compendium of information about tests used to assess the three skills. Volume 1 is a tool for people who are seeking comparative data about the policy-relevance of

  11. HCTA Halpern Critical Thinking Assessment

    HCTA is the first test that enables a content-representative assessment of recognition and recall aspects of critical thinking. The development of critical thinking skills is listed as the most important outcome of education and the most prized ability for high-level success in the workforce (Stanovich, 2009). Stanovich describes the ability to think critically as "what intelligence tests miss."

  12. [PDF] Assessing Critical Thinking

    Debate, as a teaching tool, has a place in pedagogical methods because it allows students to enhance critical thinking through investigating arguments, engaging in research, gathering information, performing analysis, assessing arguments, questioning assumptions, and demonstrating interpersonal skills.

  13. New Method for Assessing Critical Thinking in the Classroom

    Abstract. To promote higher-order thinking in college students, we undertook an effort to learn how to assess critical-thinking skills in an introductory biology course. Using Bloom's taxonomy of educational objectives to define critical thinking, we developed a process by which (a) questions are prepared with both content and critical-thinking ...

  14. (PDF) Assessing critical thinking skills

    Abstract. Tennessee Technological University has been exploring methods of assessing critical thinking skills as part of a performance funding initiative since 2000. Our experiences over the last ...

  15. How to Assess Critical Thinking

    Assessing Critical Thinking. October 11, 2008, by The Critical Thinking Co. Staff. Developing appropriate testing and evaluation of students is an important part of building critical thinking practice into your teaching. If students know that you expect them to think critically on tests, and the necessary guidelines and preparation are given ...

  16. PDF Assessing Students' Minds: Developing Critical Thinking or Fitting into

    promote equality in the education sector. The assessment methods should allow students to debate, compare and analyze ideas through critical thinking, inquiring, and understanding for applying the learned knowledge into real life. Thus, the importance of an inquiry-based curriculum and assessment is stressed.

  17. Critical Thinking Assessment: 4 Ways to Test Applicants

    A standard critical thinking test breaks down this aptitude by examining five components, including: assumption (analyzing a scenario to determine whether any assumptions have been made), deduction (the ability to choose which deductions are logical), and evaluating evidence (in support of and against something). A hedged scoring sketch along these lines appears after this list.

  18. Critical Thinking

    Critical thinking is the discipline of rigorously and skillfully using information, experience, observation, and reasoning to guide your decisions, actions, and beliefs. You'll need to actively question every step of your thinking process to do it well. Collecting, analyzing and evaluating information is an important skill in life, and a highly ...

  19. Using Critical Thinking in Essays and other Assignments

    Critical thinking, as described by Oxford Languages, is the objective analysis and evaluation of an issue in order to form a judgement. The active and skillful evaluation, assessment, and synthesis of information obtained from, or generated by, observation, knowledge, reflection, acumen, or conversation, as a guide to belief and action, requires the critical thinking process ...

  20. Comparison Of Assessment Methods Of Critical Thinking

    Critical thinking is assessed by using multiple-choice tests, multiple-choice questions with justification, short essays or case studies, and performance tests. The choice of assessment method depends on the test makers' purpose and the number of test takers. These methods are given with their advantages and disadvantages below.

  21. How to assess critical thinking skills in candidates

    Problem-solving skills. Communication skills. Attention to detail. Part of a critical thinking assessment involves asking candidates to judge the validity of a stated conclusion by using a set of information and evidence. Look for candidates who recognize which conclusions are linked to the evidence and which are not; those that can ...

  22. A Short Guide to Building Your Team's Critical Thinking Skills

    To demystify what critical thinking is and how it is developed, the author's team turned to three research-backed models: The Halpern Critical Thinking Assessment, Pearson's RED Critical ...

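To make the component-wise breakdown described in entry 17 more concrete, here is a minimal scoring sketch in Python. The component names (assumption, deduction, evaluating evidence), the item identifiers, and the answer key are hypothetical illustrations rather than the scoring procedure of any published instrument; the sketch only shows how per-component sub-scores and a total score could be aggregated from keyed multiple-choice responses.

```python
from collections import defaultdict

# Hypothetical answer key: each item maps to (component, keyed answer).
# Components and items are illustrative, not drawn from a real test.
ANSWER_KEY = {
    "q1": ("assumption", "B"),
    "q2": ("assumption", "D"),
    "q3": ("deduction", "A"),
    "q4": ("deduction", "C"),
    "q5": ("evaluating_evidence", "B"),
    "q6": ("evaluating_evidence", "A"),
}

def score_responses(responses):
    """Return per-component proportions correct and an overall proportion correct."""
    correct = defaultdict(int)
    attempted = defaultdict(int)
    for item, (component, keyed) in ANSWER_KEY.items():
        attempted[component] += 1
        if responses.get(item) == keyed:
            correct[component] += 1
    subscores = {c: correct[c] / attempted[c] for c in attempted}
    total = sum(correct.values()) / len(ANSWER_KEY)
    return {"subscores": subscores, "total": total}

if __name__ == "__main__":
    # One hypothetical test taker's responses (q2 and q5 are answered incorrectly).
    example = {"q1": "B", "q2": "A", "q3": "A", "q4": "C", "q5": "C", "q6": "A"}
    print(score_responses(example))
    # subscores: assumption 0.5, deduction 1.0, evaluating_evidence 0.5; total 4/6
```

A real instrument would typically weight items, handle omitted responses explicitly, and report sub-scores only when each is separately defensible; the sketch deliberately ignores those complications.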