Redesigning a General Education Science Course to Promote Critical Thinking

    Published Online: https://doi.org/10.1187/cbe.15-02-0032

    Abstract

    Recent studies question the effectiveness of a traditional university curriculum in helping students improve their critical thinking and scientific literacy. We developed an introductory, general education (gen ed) science course to overcome both deficiencies. The course, titled Foundations of Science, differs from most gen ed science offerings in that it is interdisciplinary; emphasizes the nature of science along with, rather than primarily, the findings of science; incorporates case studies, such as the vaccine-autism controversy; teaches the basics of argumentation and logical fallacies; contrasts science with pseudoscience; and addresses psychological factors that might otherwise lead students to reject scientific ideas they find uncomfortable. Using a pretest versus posttest design, we show that students who completed the experimental course significantly improved their critical-thinking skills and were more willing to engage scientific theories the general public finds controversial (e.g., evolution), while students who completed a traditional gen ed science course did not. Our results demonstrate that a gen ed science course emphasizing the process and application of science rather than just scientific facts can lead to improved critical thinking and scientific literacy.

    INTRODUCTION

    If we teach only the findings and products of science—no matter how useful and even inspiring they may be—without communicating its critical method, how can the average person possibly distinguish science from pseudoscience?

    Sagan, 1996, p. 21

    A primary goal of education in general, and higher education in particular, is to improve the critical-thinking skills of students (Facione et al., 1995; Van Gelder, 2005; Bok, 2006). Sadly, higher education appears insufficient to the task, with recent studies (Arum and Roksa, 2010; Arum et al., 2011; Pascarella et al., 2011) showing minimal gains in students’ critical-thinking and analytical skills during their undergraduate careers, reducing their employment potential upon graduation (Arum and Roksa, 2014). Science courses, with their focus on evidence and logic, should provide exemplary exposure to and training in critical thinking. Here, too, we appear to be failing, both at the level of individual science classes and programmatically in the science core, given the ineffectiveness of these courses to either improve students’ scientific knowledge or mitigate their acceptance of pseudoscientific claims (Walker et al., 2002; Johnson and Pigliucci, 2004; Impey et al., 2011; Carmel and Yezierski, 2013).

    The inadequacy of standard approaches to teaching science is demonstrated by the fact that 93% of American adults and 78% of those with college degrees are scientifically illiterate (Hazen, 2002); that is, they do not understand science as an empirically based method of inquiry, they lack knowledge of fundamental scientific facts, and they are unable to understand the science-related material published in a newspaper such as the Washington Post (Miller, 1998, 2012). Such deficiencies extend to science majors as well. For example, a study of 170 undergraduates at the University of Tennessee found that, while science majors knew more science facts than non–science majors, there were no differences between the two groups in their conceptual understanding of science or their belief in pseudoscience (Johnson and Pigliucci, 2004). This poor understanding of science adversely affects the ability of individuals to make informed decisions about science-related issues, including well-established theories like the big bang, which is rejected by nearly two-thirds of Americans (National Science Foundation, 2014). The woeful lack of scientific literacy similarly provides insight into the public (though not scientific) controversies surrounding such issues as evolution (Miller et al., 2006), global climate change (Morrison, 2011; Reardon, 2011), and the safety of childhood immunizations (Mnookin, 2011; Offit, 2011). In short, there appears to be a gap between a fundamental goal of science education, to produce scientifically literate citizens, and the results of the pedagogical approaches intended to meet this goal. Particularly troublesome is the ripple effect of inadequate science education at the university level, leading to poor teacher preparation and threatening the quality of science instruction in our public schools (Eve and Dunn, 1990; Rutledge and Warden, 2000).

    Commonly identified causes of the impotency of science courses, especially the introductory courses taken by the majority of college students, are their tendency to focus on scientific “facts” rather than on the nature of science (Johnson and Pigliucci, 2004; Alberts, 2005), often reinforced by exams that reward memorization over higher-order thinking (Alberts, 2009; Momsen et al., 2010); the reluctance to directly engage students’ misconceptions (Alters and Nelson, 2002; Nelson, 2008; Alberts, 2005; Verhey, 2005); the failure to connect “science as a way of knowing” with decisions faced by students in their daily lives (Kuhn, 1993; Walker et al., 2002); and the resistance of faculty trained in more innovative pedagogical approaches to actually employ them (Ebert-May et al., 2011). The traditional approach to science education not only fosters scientific illiteracy, but also alienates many students from science (Seymour and Hewitt, 1997; Ede, 2000; Johnson, 2007) and, ultimately, jeopardizes America’s global competitiveness (National Academy of Sciences, National Academy of Engineering, and Institute of Medicine, 2010). While methods emphasizing active learning demonstrate significant pedagogical improvements for students majoring in the sciences (Freeman et al., 2014), ∼85% of the 1.8 million students graduating from college annually in the United States are not science majors (Snyder and Dillow, 2013). Our goal, therefore, was to develop and test an intervention targeting this larger, frequently overlooked, yet extremely important audience. But what would scientific literacy comprise for students completing only one or two science courses during their college careers? What tools could we use to measure said literacy? And how might we best, in a single course or two, help our students achieve it?

    Our answer to these questions was an integrative, general education (gen ed) science course titled Foundations of Science (FoS), selected as the centerpiece of the Quality Enhancement Plan for reaffirmation at Sam Houston State University (SHSU; Sam Houston State University, 2009). Per Sagan’s (1996) admonition, the FoS course focuses as much on the nature of science as on its facts. We intentionally sought to demystify the process of science by selecting examples, such as the vaccine-autism controversy, that not only held the students’ attention but also, and as importantly, helped demonstrate the utility of “evidentiary thinking” in their daily lives. A brief list of the central tenets of the course is provided below; more detail is available in the “Expanded Course Rationale and Structure” in our Supplemental Material.

    Critical Thinking

    Our central hypothesis was that critical thinking—defined as the ability to draw reasonable conclusions based on evidence, logic, and intellectual honesty—is inherent to scientific reasoning (Facione, 1990, 2015; American Association for the Advancement of Science [AAAS], 1993; Bernstein et al., 2006) and is therefore an essential aspect of scientific literacy. Scientific literacy, then, can best be achieved by offering an alternative type of integrated science course that focuses on these foundations rather than on the traditional “memorize the facts” approach to science education. A simple, operational approach to critical thinking is provided by Bernstein et al. (2006) via a set of questions one should ask when presented with a claim (e.g., vaccines cause autism, global warming is a hoax, there are no transitional fossils). 1) What am I being asked to accept? 2) What evidence supports the claim? 3) Are there alternative explanations/hypotheses? And, finally, 4) what evidence supports the alternatives? The most likely explanation is the one that is best supported. Evidence matters, but only when all of the evidence for and against each of the competing hypotheses has been examined—fully, thoughtfully, and honestly. Sounds like science, doesn’t it? But how can we get science-phobic college students to use it? Perhaps by focusing on topics the non–science student finds interesting, including astrology, homeopathy, Bigfoot, and even intelligent design. But aren’t these ideas just pseudoscientific nonsense? Of course, but students need to understand why they are pseudo rather than real science, and critical thinking/scientific literacy is the key. This is the approach adopted by Theodore Schick and Lewis Vaughn (2014) in How to Think about Weird Things: Critical Thinking for a New Age, one of the two main texts we adopt in the course.

    This text and the course also help students identify and analyze the validity and soundness of arguments. We include a discussion of common heuristics and several logical fallacies, some examples being correlation proves causation, appeal to the masses, and ad hominem attacks. An understanding and awareness of strong versus weak arguments, and the informal fallacies used to surreptitiously circumvent the former, are essential to critical thinking and to the evaluation of claims—whether scientific or pseudoscientific.

    Integrating Content with Process

    While there has been a clarion call for teachers to focus more on scientific process and less on scientific facts (Rutherford and Ahlgren, 1990; AAAS, 1993, 2010), content still matters. Therefore, in addition to the critical-thinking text by Schick and Vaughn, we also use an integrated science textbook (e.g., Hewitt et al., 2013; Trefil and Hazen, 2013) as our second text, typically a custom printing that includes only those chapters whose content we cover in the course. We are fortunate that our course includes both “lecture” and “lab” components, providing multiple, weekly opportunities for active learning. We employ, as a cornerstone of our approach, case studies we have built specifically for the FoS course. Cases, we have found, permit us to teach content and process at the same time, in a manner that engages the non–science student. One of our cases, for example, examines the purported connection between vaccines and autism (Rowe, 2010). Working in small groups, students examine the data from Andrew Wakefield et al.’s (1998) paper, the proverbial match that lit the current firestorm of antivaccine hysteria (Mnookin, 2011; Offit, 2011). After dissecting Wakefield’s data and his conclusions, students are tasked with designing a better study. In so doing, they learn a great deal about sample size, replication, double-blind studies, and scientific honesty, that is, the procedural underpinnings of good science. But the students also learn about antibodies, antigens, herd immunity, and autism spectrum disorders, that is, the findings of science. Similarly, in a case in which students use the science of ecology to go “hunting” for the Loch Ness monster (Rowe, 2015), they must learn and then apply scientific “findings” ranging from the second law of thermodynamics to minimum viable population sizes to postglacial rebound. A large part of the success we witness in our experimental course is due, we believe, to this integration of scientific facts with scientific process.

    Addressing Cognitive Barriers

    An emphasis on evidentiary thinking combined with an integration of content and process will achieve little if students are unable or unwilling to objectively evaluate a claim, hypothesis, or theory. Cognitive barriers can stand in the way of rational decision making (Posner et al., 1982; Sinatra et al., 2008). We designed the FoS course to overcome two such barriers. One hurdle is people’s personal experiences, which, for many, trump critical thinking (Chabris and Simons, 2010). If something feels real, looks real, tastes real, if we saw it, experienced it, then it must be true. Zinc is not effective against the common cold? Why, then, did my headache disappear when I used zinc-infused cough drops? Vaccines do not cause autism? What else could explain why my son stopped walking two days after his MMR shot? To help students understand the limitations of anecdotal evidence, including their own personal experiences, we guide them through an exploration of the science of perception and memory. We use illusions to show how our brain unconsciously takes shortcuts that can lead to misperceptions. And we employ simple exercises to demonstrate the malleability and fallibility of memories. Critical thinking requires we recognize that our perceptions and our memories may be flawed.

    The second barrier starts once perceptions and memories have solidified into an opinion. Opinions, once formed, resist change; the more important the belief, the more stubbornly we hang onto it, even in the light of contradictory evidence (Tavris and Aronson, 2007). An honest evaluation of competing explanations requires that students understand cognitive dissonance and its servant twins, expectation bias and confirmation bias. Facts do not matter to someone who does not want to hear them, and evidence is easily discounted when examined with prejudice. Indeed, simply throwing facts at biased conclusions may cause further retrenchment as, for example, was demonstrated in a recent study (Nyhan et al., 2014) of the rebellion against childhood immunizations. Results of the study, which surveyed 1759 parents, are discouraging, in that an intervention presenting the overwhelming evidence that vaccines do not cause autism made parents less likely to vaccinate, not more (Nyhan et al., 2014).

    Social judgment theory (SJT) offers an explanation of Nyhan et al.’s (2014) counterintuitive results. SJT postulates there is a range, a latitude, of ideas similar to a person’s current position he or she might be willing to consider as being true if presented with information that supports the idea. However, if the idea is too different from the person’s initial belief, if it lies outside his or her latitude of acceptance, it will be rejected (Erwin, 2014). Furthermore, the more involved a person is with a view, the wider the latitudes of rejection and the narrower the latitudes of acceptance (Benoit, n.d.). If we want students to understand and accept the big bang theory and the theory of evolution, ideas many find uncomfortable, we cannot simply present the overwhelming evidence in favor of these ideas; we must also accommodate and overcome the dissonance these explanations engender. SJT was, therefore, a central, guiding tenet in the topical organization of the course, briefly outlined below. Topics in the first third of the course are, we believe, the most unusual, so we focus on those here. Additional details of the topics included in the course, the reasons we included them, and the materials we used to teach them can be found in the “Expanded Course Rationale and Structure” in our Supplemental Material, along with a copy of an example course syllabus.

    Topical Organization

    We begin the course by discussing the witch hunts of the 14th through 18th centuries. By some accounts, more than half a million innocent victims were horribly tortured and then killed under the mistaken belief they were the cause of miscarriages, crop failures, and storms, that is, calamities and misfortunes we now know have underlying natural, not supernatural, causes (Sagan, 1996; Cawthorne, 2004). A question we frequently pose to the students is “What is the harm in believing in something that is not true?” The students, having no personal stake in the fates of these historical victims, easily grasp the importance of evidence, skepticism, and the need for multiple working hypotheses when seeking causal explanations.

    Lest the students think witch hunts are a thing of the past, we segue to a discussion of modern witch hunts, with a focus on the satanic ritual abuse mass hysteria of the 1980s and 1990s (Nathan and Snedeker, 2001). As with the earlier hunts, hundreds of people were accused, convicted, and sent to jail, even though there was little or no empirical evidence to support the allegations (Lanning, 1992). Here, too, the students, with little emotional investment and, thus, little dissonance, draw the reasonable conclusion that scientific literacy, evidence, and critical thinking are good things, because they prevent harm.

    We then discuss the nature of science as a systematic, objective, and reliable means of evaluating testable claims. Mindful of SJT, we do not dismiss other ways of knowing (e.g., intuition, spirituality) but highlight the strengths and successes of the scientific approach, including its unique reliance on evidence, skepticism, logic, multiple working hypotheses, and Occam’s razor, that is, the foundations of science. We stress the importance of self-correction, a characteristic unique to science yet frequently misunderstood by students as a weakness. And, using examples, we introduce students to the pernicious effects of dissonance, dishonesty, and bias as impediments to understanding.

    The next section of the course deals with the limits to perception and memory mentioned earlier, topics critical for understanding why anecdotal evidence, eyewitness accounts, and even personal experiences are insufficient for accepting a claim. By this point in the course, students are beginning to understand Richard Feynman’s famous quote “The first principle is that you must not fool yourself and you are the easiest person to fool” (Feynman and Leighton, 1985, p. 343). If their own perceptions and memories can be faulty, might not some of their opinions be too?

    The remainder of the course covers content more typical of an integrative science course, including but not limited to cosmology, geology, cell biology, and ecology, with somewhat atypical side trips to explore the paranormal and investigate alternative medical therapies. But even here, we attempt to capture the nonmajors’ attention by having them analyze claims they find engaging; they learn a lot about plate tectonics, for example, by investigating the claim that a continent, Atlantis in this case, can disappear.

    The theory of evolution is, by design, reserved for the last week of the course. By then, most students recognize the importance of evidence and logic and critical thinking. They have sharpened the tools in their “baloney detection kit” (Sagan, 1996) and understand that it is not just snake-oil salesmen who market baloney but that we are pretty good at selling it to ourselves. With latitudes of acceptance broadened, they are ready to tackle the scientific theory many find the most discomforting of all.

    METHODS

    Institutional Setting

    Our experiment was conducted at SHSU, a public, doctoral research university located in Huntsville, Texas. Founded in 1879, it offers 138 bachelor’s, master’s, and doctoral degrees. With the exception of an underrepresentation of Asians, the ethnic composition of SHSU broadly matches that of the United States, with 57% of its 19,000-plus students self-reporting as Caucasian/white, 18% as Hispanic, 17% as African American/black, 1% as Asian, and 4% as either multiracial or other ethnicities. Two percent are classified as international. The average age of the institution’s undergraduates is 22 yr. Approximately half of the students are first-generation college students. Because the FoS course is an open-enrollment, gen ed core science course with no prerequisites, the demographic makeup of the course likely represents that of the university. We compared the effectiveness of the FoS course with several traditional introductory science courses for nonmajors taught at the university, courses which, as gen ed survey courses, should also reflect the demographics of the university as a whole.

    Experimental Approach

    We used a pretest versus posttest design to assess the effectiveness of the FoS. Our treatment group consisted of several sections of the experimental course taught over multiple semesters (Table 1). Our comparison group was composed of several different, traditional gen ed science courses, also sampled over multiple semesters, offered by the departments of chemistry, physics, biology, and geography/geology (Table 1). During the study period of Fall semester 2008 through Fall semester 2012, the average class size in each section of our experimental FoS course was 51.75 (±1.17 SE) students; the lab/discussion sections that accompanied the FoS course were capped at 30 students/section. Over the same period, average class size in the traditional courses that formed our comparison group was 51.00 (±6.07 SE) students. All of the comparison courses also included a lab, similarly capped at 30 students.

    Table 1. CAT scores in traditional versus experimental gen ed science courses, by semester

    Assessment Tools

    To examine changes in student analytical skills, we used the Critical thinking Assessment Test (CAT) developed by the Center for Assessment & Improvement of Learning at Tennessee Tech University (TTU; Stein and Haynes, 2011; Stein et al., 2007). The CAT exam assesses several aspects of critical thinking, including the evaluation and interpretation of information, problem solving, creative thinking, and communication. Student skills encompassed by the CAT include their ability to interpret graphs and equations, solve basic math problems, identify logical fallacies, recognize when additional information might be needed to evaluate a claim, understand the limitations of correlational data, and develop alternative explanations for a claim. These aspects of the CAT exam conform to accepted constructs that characterize critical thinking (Facione, 1990, 2015), and align well with those taught in the FoS course, which specifically emphasizes the ability to draw appropriate conclusions based on multiple working hypotheses, evidence, and reason. The CAT instrument consists of 15 questions, most of which are short-answer responses. More than 200 institutions of higher education are now using the CAT for assessing programmatic changes designed to improve critical thinking among college students, permitting us to compare our results not only with traditional gen ed science courses being taught at our own institution but also with national norms.

    To examine changes in the attitudes of students about science in general, and controversial scientific theories in particular, we used the Measure of Acceptance of the Theory of Evolution (MATE), a 20-question, Likert-scale survey (Rutledge and Warden, 1999; Rutledge and Sadler, 2007) that has been widely used for assessing the acceptance of evolutionary theory among high school teachers and college students (Moore and Cotner, 2009; Nadelson and Southerland, 2010; Peker et al., 2010; Kim and Nehm, 2011; Abraham et al., 2012).

    Beginning in the Fall of 2010, approximately half the students in each of the experimental and comparison courses were assessed pre- and postcourse using the CAT, the other half with the MATE. The pretests were administered during the second week of the term, while the posttests were given in the penultimate week of classes. Instructors teaching both the FoS and the traditional courses agreed on identical incentives each semester, with the exception of Fall 2010: as no credit (baseline data before creation of the FoS) or as extra credit/part of the course grade thereafter (Table 1). Details regarding how the incentive was applied are provided in the example course syllabus in our Supplemental Material.

    All CAT exams were graded using a modified rubric that enabled the exams to be graded quickly. These scores were used to assign performance points to the students. A subset of all the CAT exams from each course was randomly selected for formal grading using the rubric developed by the Center for Assessment & Improvement of Learning at TTU. Based on the grading procedures established by the center, graders were blind to the identity of the student, whether an exam was a pretest or posttest, and the treatment group. Results of the formal grading are reported herein.

    The MATE was coupled with a locally developed assessment not presented in this publication. Because the responses on the MATE assessment represent personal opinions and attitudes, no incentives were provided to students for their responses on the MATE, and they were informed that their answers would not be graded. However, students were still able to earn rewards equivalent to those of students taking the CAT based on their performance on the locally developed assessment tool.

    Assessment Reliability and Validity

    Arguments regarding the effectiveness of the FoS course demand both reliability and validity. While these concepts are frequently ignored (Campbell and Nehm, 2013), researchers who address the issues of reliability and validity often mistake them as required properties of one’s assessment tools rather than, correctly, as characteristics of the interpretations we make from the tools’ results (Cronbach and Meehl, 1955; Messick, 1995; Brown, 2005; Campbell and Nehm, 2013). The reliability and validity of interpretations based on the CAT have strong evidentiary support (Tennessee Technological University, 2010; Stein and Haynes, 2011; Stein et al., 2007, 2010).

    Interpretations based on the MATE also have demonstrated reliability and validity, at least for certain populations (Rutledge and Warden, 1999; Rutledge and Sadler, 2007). A recent study (Wagler and Wagler, 2013), however, found the MATE lacked construct validity for Hispanic elementary education majors and questioned the utility of the tool for assessing student acceptance of evolutionary theory. Our results do not support this criticism, an argument we present more fully in our Discussion.

    Statistical Analyses

    Pretest versus posttest changes in student scores on the CAT were analyzed using a matched-pairs t test. Personal identifiers were not available in our MATE assessments, preventing the use of a matched-pairs t test; we therefore used a less powerful independent-samples t test when analyzing the MATE results. Assessments of end-of-semester scores in our experimental course (the FoS) versus those in comparison courses (traditional gen ed science courses) were also made using t tests for independent samples, as were analyses of our FoS results versus the national norms available from the Center for Assessment & Improvement of Learning at TTU. The sample data in all tests were examined for violations of the parametric assumptions of normality and variance equality. Where needed, t tests assuming unequal sample variances were applied, while data violating the assumption of normality were log-transformed. In the few cases in which transformations failed to generate a normal distribution, we reduced our α value from 0.05 to 0.025 (Keppel, 1982). An analysis of covariance (ANCOVA) compared the postcourse CAT score for the FoS course with traditional courses while accounting for a student’s entering ability by using his or her precourse CAT score as the covariate. The ANCOVA assumptions of regression-slope homogeneity and treatment-covariate independence were met. As a further aid to understanding the strength of our results (Maher et al., 2013), we also report our effect sizes (Cohen’s d). Results presented in the text are mean ± 1 SE.
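    As a rough illustration of the two kinds of t statistic described above, the following Python sketch computes a matched-pairs t and an unequal-variance (Welch) t from raw scores using only the standard library. The function names and the sample scores are hypothetical and are not taken from the study; p values are omitted because the standard library does not provide a t distribution.

```python
import math
from statistics import mean, stdev, variance


def paired_t(pre, post):
    """Matched-pairs t statistic and degrees of freedom for pre/post scores."""
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


def welch_t(x, y):
    """Independent-samples t statistic without assuming equal variances."""
    vx = variance(x) / len(x)  # squared standard error of group x
    vy = variance(y) / len(y)  # squared standard error of group y
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


# Hypothetical scores for six students, for illustration only
pre_scores = [12, 15, 14, 10, 18, 16]
post_scores = [16, 18, 17, 15, 21, 19]
print(paired_t(pre_scores, post_scores))
print(welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))
```

    The matched-pairs form works on each student's own change in score, which is why it requires personal identifiers and is more powerful than the independent-samples form when they are available.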

    Sample Sizes

    CAT.

    We have CAT results for eight semesters (Table 1), beginning in the Fall of 2008 and ending in the Fall of 2012 (the CAT assessment tool was not used in the Spring of 2012). A total of 475 SHSU undergraduate students have been assessed via the CAT; 203 students representing our comparison group from six different traditional gen ed science courses (with one course, introductory geography, being assessed twice); and 272 students representing our experimental treatment consisting of six different semesters of our FoS course. During the first two semesters of this experiment, we administered the CAT once at the end of the semester, and only in our traditional gen ed science courses, restricting us to a “postcourse” comparison on the full data set. Beginning with the first offering of our experimental course in the Fall of 2009, we administered the CAT both at the beginning and again at the end of the semester to three different traditional gen ed science courses and six semesters of the FoS course, permitting us to use a more powerful “pre- versus postcourse” evaluation comparing the effectiveness of our experimental FoS with traditional gen ed science courses. We also compared the CAT performance of both treatment groups with the national norms for students attending 4-yr colleges and universities, a database of nearly 39,300 students available from the Center for Assessment & Improvement of Learning at TTU.

    MATE.

    We have MATE results for five semesters, beginning in the Fall of 2010 and ending in the Fall of 2012. We have pretest MATE scores from 1443 undergraduate students: 561 from three different traditional gen ed science courses and 882 representing five different semesters of our experimental FoS course. Similarly, we have posttest MATE scores from 1250 undergraduates, with 417 representing the three traditional courses and 833 from the five semesters of the FoS course.

    RESULTS

    Critical Thinking

    FoS Experiment versus Traditional Gen Ed Science Courses.

    Our results are robust and consistent; quite simply, students who complete the experimental FoS course show significant improvement in their critical-thinking skills, as measured by the CAT, while students who complete a traditional gen ed science course do not. In no semester, for example, did students completing a traditional course show improvement in their critical-thinking scores (all p values > 0.49; Table 1), while students completing the experimental course showed highly significant improvement each semester (all p values < 0.01, Cohen’s d typically > 0.70; Table 1). An analysis of pooled end-of-course (posttest only) CAT scores for all six semesters of the FoS course (Table 1, rows 8–13) versus the pooled posttest CAT scores for all six traditional gen ed science courses (Table 1, rows 1–7) reinforces this finding; students completing the FoS course scored significantly higher (19.76 ± 0.35) than did students completing a traditional (14.83 ± 0.37) introductory science course for nonmajors (t(473) = 4.93, p < 0.001, Cohen’s d = 0.89; Figure 1A). A comparison of our pooled pre- versus posttest CAT scores for all six semesters of the FoS course (Table 1, rows 8–13) versus the pooled CAT scores for the three different gen ed science courses (introductory environmental studies, introductory physics, and introductory chemistry) for which we had pre- and postcourse CAT test scores (Table 1, rows 5–7) shows similar results. Students who completed the FoS course showed highly significant improvement in critical thinking (pretest = 15.45 ± 0.34, posttest = 19.76 ± 0.35; t(271) = 13.43, p < 0.001, Cohen’s d = 0.76), while there was no change in the critical-thinking scores for students completing a traditional course (pretest = 14.17 ± 0.64, posttest = 14.61 ± 0.72; t(50) = 0.80, p = 0.43; Figure 1B).
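    The effect sizes reported above can be approximately recovered from the summary statistics alone. The sketch below assumes the pooled-standard-deviation form of Cohen’s d and converts each standard error back to a standard deviation (SD = SE × √n). The text does not state which variant of d was used, so that choice is an assumption, and the function name is illustrative.

```python
import math


def cohens_d_from_summary(m1, se1, n1, m2, se2, n2):
    """Cohen's d (pooled SD) from group means, standard errors, and sizes."""
    var1 = (se1 * math.sqrt(n1)) ** 2  # variance recovered from SE and n
    var2 = (se2 * math.sqrt(n2)) ** 2
    pooled_sd = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd


# FoS posttest (n = 272) vs. traditional posttest (n = 203)
d_between = cohens_d_from_summary(19.76, 0.35, 272, 14.83, 0.37, 203)
# FoS posttest vs. FoS pretest (same 272 students)
d_within = cohens_d_from_summary(19.76, 0.35, 272, 15.45, 0.34, 272)
print(round(d_between, 2), round(d_within, 2))
```

    Under these assumptions the two values round to 0.89 and 0.76, in line with the effect sizes given in the text.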

    Figure 1. Students who complete the experimental FoS course show significant improvement in their critical-thinking scores, as measured by the CAT, while students who complete a traditional gen ed science course do not. Histograms show means + 1 SE. (A) Pooled end-of-course (posttest) CAT scores for all six semesters of the FoS course (Table 1, rows 8–13) vs. the pooled posttest CAT scores for all six traditional gen ed science courses (Table 1, rows 1–7). (B) Pooled pre- vs. posttest CAT scores for all six semesters of the FoS course (Table 1, rows 8–13) vs. the pooled CAT scores for the three different gen ed science courses (introductory environmental studies, introductory physics, and introductory chemistry) for which we had pre- and postcourse CAT test scores (Table 1, rows 5–7). (C) Posttest CAT scores adjusted by pretest CAT scores for the same data set used in B.

    The slightly higher pretest CAT scores for students in the experimental course relative to students taking a traditional course (15.45 vs. 14.17, respectively, Figure 1B) might suggest the significant pre- versus posttest improvement in the former represents a cohort rather than a treatment effect; that is, students selecting an experimental course like FoS may possess better critical-thinking skills to begin with, generating more improvement over the course of a semester regardless of the science course. To assess this, we ran an ANCOVA on the postcourse CAT scores using each student’s precourse CAT score as a covariate. Results adjusting for each student’s entry-level critical-thinking ability still showed a highly significant effect of our experimental treatment (Figure 1C). That is, students who complete the FoS course show significantly better postcourse CAT scores than their peers who complete a traditional course, even when differences in students’ precourse critical-thinking abilities are taken into account (mean adjusted postcourse critical-thinking score in the experimental FoS course = 19.64 ± 0.65, mean adjusted postcourse critical-thinking score in traditional courses = 15.26 ± 0.28; F(1, 320) = 38.29, p < 0.001, Cohen’s d = 0.339).
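
    The covariate adjustment described above can be illustrated with a minimal sketch. It assumes a classical one-covariate ANCOVA in which both groups share a common within-group slope; the function name `adjusted_means` and the six (pretest, posttest) pairs are invented purely for illustration and are not the study's data or the authors' analysis code.

```python
from statistics import mean


def adjusted_means(groups):
    """Return covariate-adjusted posttest means for each group.

    groups maps a group name to a list of (pretest, posttest) pairs.
    Assumes a single slope shared by all groups (the usual ANCOVA assumption).
    """
    sxy = 0.0  # pooled within-group cross-products of pretest and posttest
    sxx = 0.0  # pooled within-group sum of squares of pretest
    for pairs in groups.values():
        pre_mean = mean(pre for pre, _ in pairs)
        post_mean = mean(post for _, post in pairs)
        sxy += sum((pre - pre_mean) * (post - post_mean) for pre, post in pairs)
        sxx += sum((pre - pre_mean) ** 2 for pre, _ in pairs)
    slope = sxy / sxx

    # Every group is evaluated at the same pretest value: the grand pretest mean.
    grand_pre = mean(pre for pairs in groups.values() for pre, _ in pairs)
    return {
        name: mean(post for _, post in pairs)
        - slope * (mean(pre for pre, _ in pairs) - grand_pre)
        for name, pairs in groups.items()
    }


# Hypothetical toy scores, NOT data from this study.
toy = {
    "FoS": [(14, 18), (16, 20), (18, 22)],
    "traditional": [(12, 13), (14, 14), (16, 15)],
}
adjusted = adjusted_means(toy)
print(adjusted)
```

    In this toy example the group that starts higher on the pretest has its posttest mean adjusted downward (20 to 19.25) and the other group is adjusted upward (14 to 14.75), so the remaining gap reflects differences not explained by entry-level scores.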

    Lower- versus Upper-Division Students and Comparison with National Norms

    Analyzing our results by class standing not only presents a more detailed picture of where our intervention might be most effective but also permits a comparison with national norms. We have pre- and posttest CAT scores for 166 students who completed the FoS course when they were freshmen or sophomores (i.e., lower-division students), and for 106 students who completed the course when they were juniors or seniors (i.e., upper-division students). Lower-division students enrolling in the FoS course have significantly higher pretest CAT scores (14.80 ± 0.40) than do lower-division students nationally (13.66 ± 0.05, t(165) = 2.827, p < 0.01, Cohen’s d = 0.22), and their posttest CAT scores at the end of the semester exceed that national norm by a highly significant margin (19.54 ± 0.41, t(165) = 14.305, p < 0.001, Cohen’s d = 1.13). Indeed, the average posttest CAT score for lower-division FoS students is comparable to the national mean (19.04 ± 0.05) for upper-division (junior/senior) students (t(165) = 1.063, p = 0.289; Figure 2A).

    Figure 2. Non–science students selecting to enroll in one of their gen ed science courses as entry-level freshmen or sophomores may represent a different subset of students than those who delay taking such core courses until they are juniors or seniors, but both cohorts show highly significant improvement in their critical-thinking ability after completing the FoS course. Histograms show means + 1 SE. (A) Pretest and posttest CAT scores of lower-division (LD; i.e., freshman/sophomore) FoS students (pooled over all six semesters, rows 8–13 in Table 1) compared with national norms. (B) Pretest and posttest CAT scores of upper-division (UD; i.e., junior/senior) FoS students (again pooled over all six semesters, rows 8–13 in Table 1) compared with national norms.

    The results for our upper-division students are quite different. Pretest and posttest CAT scores of upper-division FoS students (again pooled over all six semesters, rows 8–13 in Table 1) compared with national norms show that upper-division FoS students have pretest CAT scores (16.48 ± 0.60) significantly below the national average (19.04 ± 0.05) for juniors and seniors (t(105) = −4.287, p < 0.001, Cohen’s d = −0.42); this deficit is erased, however, after one semester in our experimental course (posttest FoS CAT = 20.12 ± 0.63; t(105) = 1.717, p = 0.090; Figure 2B).
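
    The comparisons with national norms can be approximated from the reported summary statistics alone. The sketch below assumes a one-sample t test that treats the national mean as a fixed reference value (the reported degrees of freedom, n − 1, are consistent with this, but the text does not state the exact procedure); the function name `one_sample_summary` is hypothetical.

```python
import math


def one_sample_summary(sample_mean, sample_se, n, reference_mean):
    """Return (t, df, cohens_d) for a sample mean tested against a fixed value."""
    t = (sample_mean - reference_mean) / sample_se
    df = n - 1
    # For a one-sample test, d = (mean - reference) / SD and SD = SE * sqrt(n),
    # so d simplifies to t / sqrt(n).
    d = t / math.sqrt(n)
    return t, df, d


# Lower-division FoS pretest vs. the national lower-division mean.
t_ld, df_ld, d_ld = one_sample_summary(14.80, 0.40, 166, 13.66)
print(f"lower division: t({df_ld}) = {t_ld:.2f}, d = {d_ld:.2f}")

# Upper-division FoS pretest vs. the national upper-division mean.
t_ud, df_ud, d_ud = one_sample_summary(16.48, 0.60, 106, 19.04)
print(f"upper division: t({df_ud}) = {t_ud:.2f}, d = {d_ud:.2f}")
```

    Because the inputs are the rounded means and standard errors given in the text, the recomputed values (t of roughly 2.85 and −4.27) land close to, but not exactly on, the reported statistics (2.827 and −4.287).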

    Student Acceptance of the Theory of Evolution

    Results on the MATE parallel those from the CAT; in no semester did students completing a traditional course show improvement in their acceptance of evolutionary theory (all p values > 0.27; Table 2), while students completing the experimental course showed highly significant improvement each semester (all p values ≤ 0.001, all Cohen’s d > 0.43; Table 2). A pooled analysis comparing students across all semesters in the experimental course with students from the three different traditional courses further highlights the success of the experimental approach; students who completed the FoS course showed highly significant improvement in their acceptance of evolution (pretest = 66.17 ± 0.45, posttest = 75.45 ± 0.49; t(1686.15) = 13.93, p < 0.001, Cohen’s d = 0.67), while there was no change in the acceptance of evolution for students completing a traditional course (pretest = 65.27 ± 0.56, posttest = 64.91 ± 0.71; t(976) = 0.40, p = 0.69; Figure 3).

    Figure 3. Students who complete the experimental FoS course show a significant increase in their acceptance of evolution, as measured by the MATE, while students who complete a traditional gen ed science course do not. Pooled pre- vs. posttest MATE scores for five semesters of the FoS course (Table 2, rows 4–8) vs. the pooled MATE scores for the three different gen ed science courses (introductory environmental studies, introductory physics, and introductory chemistry) for which we had pre- and postcourse MATE scores (Table 2, rows 1–3). Histograms show means + 1 SE.

    Table 2. MATE scores in traditional versus experimental gen ed science courses, by semester

    DISCUSSION

    Critical Thinking

    Our results demonstrate that an introductory, gen ed science course for nonmajors, a course focusing on the nature of science rather than just its facts, can lead to highly significant improvements, with large effect sizes, in the ability of college students to think critically. Most college courses do not significantly improve CAT performance in a pre/post design; substantive gains are typically observed only at the program/institutional level (Center for Assessment & Improvement of Learning, TTU, unpublished data). Moreover, results from more than 200 institutions using the CAT show the average improvement in critical thinking observed over 4 yr of a typical undergraduate curriculum is 26% (Harris et al., 2014); students who successfully completed the FoS course improved their CAT scores by almost 28% (15.45 vs. 19.76; Figure 1B). In short, students who complete a single-semester FoS course demonstrate levels of improvement in their critical-thinking skills typically requiring multiple years of college experience, demonstrating that it is possible to teach higher-order thinking skills to nonmajors in a single science course they are required to take, many begrudgingly.
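
    The "almost 28%" figure follows directly from the pooled pretest and posttest means. A minimal sketch of the arithmetic, assuming the gain is expressed relative to the pretest mean (the helper name `percent_gain` is hypothetical):

```python
def percent_gain(pre, post):
    """Percentage change from the pretest score to the posttest score."""
    return 100.0 * (post - pre) / pre


# Pooled FoS means reported in the Results (Figure 1B).
fos_gain = percent_gain(15.45, 19.76)
print(f"FoS single-semester gain: {fos_gain:.1f}%")
```

    The gain works out to about 27.9%, which is the value compared against the 26% average improvement reported over 4 yr of a typical undergraduate curriculum.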

    A finer-grained analysis of our results further illustrates the need to rethink how we are teaching our gen ed science courses. The pretest CAT score for our lower-division students, pooled over all six semesters, was significantly higher than the national average for this age group (Figure 2A). By the end of the semester, our lower-division students’ critical-thinking scores moved well beyond the national norm for freshmen/sophomores and were comparable to the CAT scores achieved by juniors and seniors nationwide (Figure 2A). This is the good news.

    The pattern for our upper-division students, however, is more worrisome, as their pretest CAT average is significantly lower than the national mean for juniors and seniors (Figure 2B). Given that our lower-division students start with significantly better CAT scores than their peers nationally, results showing that our juniors and seniors are significantly worse (before taking the FoS course) than their countrywide counterparts might suggest our institutional curriculum degrades rather than improves a student’s critical-thinking skills. An alternative interpretation is that the non–science students who choose, as freshmen or sophomores, to take one of their science requirements, especially an experimental course like the FoS course, represent a cohort different from the students who delay taking their core science courses until near the end of their undergraduate careers. The former may be less science-phobic than the latter and, thus, more practiced at and receptive to evidentiary thinking. If this interpretation is correct, as science educators, we need to embrace pedagogies that connect with our more anxious students, lest their experiences further alienate them from science as a way of knowing. The approaches adopted in the FoS course may be part of the solution, as the significant deficit in critical thinking we observe in upper-division students, compared with national norms, is gone by the end of the semester (Figure 2B).

    Student Acceptance of Evolutionary Theory

    Results also demonstrate that our experimental course led to significant improvements, again with large effect sizes, in the willingness of students to engage with the theory of evolution. But to what degree? Rutledge and Sadler (2007), authors of the MATE, have identified five levels of acceptance associated with their instrument: very high (89–100), high (76–88), moderate (65–75), low (53–64), and very low (≤52). At the beginning of the semester, students in the FoS course exhibited, on average, borderline low to moderate (66.17 ± 0.45) scores on the MATE, improving to the boundary between moderate and high acceptance by the end of the course (75.45 ± 0.49). While we hoped for greater improvement, the end-of-course MATE scores for FoS students are comparable with those of both high school biology teachers in Indiana (77.59 ± 0.84; Rutledge and Warden, 2000) and preservice high school science teachers in Korea (73.79 ± 1.00; Kim and Nehm, 2011). A study of introductory biology students (both majors and nonmajors) attending a public university in Wisconsin who completed a special module exploring macroevolution and its misconceptions (Abraham et al., 2012), also employing a pretest versus posttest design, deserves special mention given the similarities to our experiment. The average postintervention MATE score for the Wisconsin students (75.0 ± 0.52) was similar to the average post-FoS MATE score for students in this study (75.45 ± 0.49). The preintervention scores for students in the two studies, however, were dramatically different (70.8 ± 1.14 for nonmajors, 73.0 ± 0.58 for majors in the Wisconsin study; 66.17 ± 0.45 for the nonmajors in this study), as were the effect sizes of the two interventions (Cohen’s d for Wisconsin = 0.19; Cohen’s d for this study = 0.67). The similarities in postintervention scores given the dissimilarities in preintervention scores of these two comparable studies suggest we have much to learn about the factors influencing student acceptance of evolutionary theory. To contribute, we plan additional analyses, mining our database to examine the effects of gender, ethnicity, high school grade point average, and student attitudes on the MATE and on the CAT.
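
    The acceptance levels quoted above can be applied to the pooled means with a small lookup. This is only a sketch: the published bands are stated for whole-number scores, so treating each band's lower bound as a cutoff for fractional means is an assumption made here, and the names `MATE_BANDS` and `mate_category` are hypothetical.

```python
# Lower bounds of the acceptance bands as quoted in the text, highest first.
MATE_BANDS = [
    (89, "very high"),
    (76, "high"),
    (65, "moderate"),
    (53, "low"),
]


def mate_category(score):
    """Map a MATE score to an acceptance level.

    Assumption: a fractional mean belongs to the highest band whose lower
    bound it reaches; anything below the lowest bound is "very low".
    """
    for lower_bound, label in MATE_BANDS:
        if score >= lower_bound:
            return label
    return "very low"


pre_label = mate_category(66.17)   # pooled FoS pretest mean
post_label = mate_category(75.45)  # pooled FoS posttest mean
print(pre_label, post_label)
```

    Under this cutoff rule both pooled means fall in the moderate band, with the posttest mean sitting just under the lower bound of the high band, which matches the description of scores moving toward the moderate/high boundary.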

    Instructors (who are also colleagues and friends) in the traditional gen ed science courses that served as our comparison group were disappointed their students showed no improvement in critical thinking after a semester of science. But, they argued reasonably, why should we expect student acceptance of evolutionary theory to improve in introductory gen ed chemistry or physics classes, given that biological evolution is not discussed in such courses? Four points are relevant, the last being most important. First, we suggest that all college graduates, science majors or not, should appreciate how the term “theory,” used scientifically, differs from its conversational definition. Second, evolutionary theory was covered in the environmental studies course (Table 2) in which we used the MATE, yet students still failed to demonstrate improvement in their acceptance of the theory in this traditionally taught gen ed science course. Third, even though evolution is a topic we address explicitly in the FoS course, it is covered during the last week of the semester, the week following the posttest administration of the MATE.

    The most important issue, however, relates to what the MATE may be measuring. Several authors have argued that the MATE more likely measures an individual’s knowledge about evolution rather than his or her acceptance of the theory (Smith, 2010a; Wagler and Wagler, 2013). And while it is generally presumed that some content knowledge is required for a student to accept evolution as the best explanation of biological diversity, evidence also suggests that dispositional change may be required before a student is willing to entertain the theory (Sinatra et al., 2003; Smith, 2010a,b). Whether the MATE measures an individual’s content knowledge about evolution or his or her disposition toward the theory is beyond the scope of this analysis. Our results, however, are robust; a course focusing on the nature of science and applying SJT leads to significantly improved engagement of the non–science college student with evolution (see also Pigliucci, 2007; Lombrozo et al., 2008).

    Assessment Validity, Revisited

    Wagler and Wagler (2013) criticized the construct validity and, thus, the generalizability of the MATE for populations other than the high school teachers used to originally test the tool’s validity (Rutledge and Warden, 1999). The Waglers found, for example, that the MATE lacked construct validity for their sample of Hispanic college students majoring in elementary education. Construct validity is the degree to which a test actually measures the mental attribute it claims to measure (Brown, 2000); for the MATE, the attribute is thought to be an individual’s acceptance of the theory of evolution (Rutledge and Warden, 1999). One technique for assessing construct validity uses factor analyses with structural equation modeling to identify the number of dimensions of the construct; if a significant unifying dimension or dimensions cannot be identified, the tool may be suspect; this was the approach used to demonstrate that the MATE lacked construct validity for preservice teachers (Wagler and Wagler, 2013). We applied the same technique to our MATE results and similarly found that no model, either uni- or multidimensional, could be fitted to the data (unpublished data). But researchers should never rely on a single method for assessing the validity of their interpretations (Cronbach and Meehl, 1955; Messick, 1995; Brown, 2000, 2005; Campbell and Nehm, 2013). Two related experimental approaches for assessing the construct validity of a test are intervention studies and differential-groups studies (Cronbach and Meehl, 1955; Messick, 1995; Brown, 2000, 2005). In the former, a group is tested before and following their exposure to the construct; significant improvement demonstrates the construct validity of the intervention. Differential-groups studies employ two groups, one presented with the construct, the other not; significantly better scores by the informed group similarly demonstrate the validity of the training. 

    We used both approaches in this study; the “construct” was a novel gen ed science course (the FoS) focusing on the nature of science rather than just its facts (for more details please see “Expanded Course Rationale and Structure” in our Supplemental Materials). Students who completed the training demonstrated, over multiple sections of the course spanning multiple years, highly significant improvement both in their critical-thinking skills (as measured by the CAT; Table 1 and associated figures) and in their willingness to engage the theory of evolution (assessed with the MATE; Table 2 and associated figures). Students who did not receive this training, those who instead completed a traditional gen ed science course, showed no improvement on either metric. While validity is never absolute (Messick, 1995; Brown, 2005; Campbell and Nehm, 2013), we argue that the power and consistency of our results are strong validation of the success of the intervention.

    CONCLUSIONS

    Students completing the FoS course significantly improve their critical-thinking skills. Given the ineffectiveness of gen ed science courses in particular (Impey et al., 2011, 2012) and the college curriculum more broadly (Arum and Roksa, 2010, 2014) in producing such change, we are proud to share our successes. But we recognize the improvements we demonstrate, both in critical thinking and in the willingness of students to engage with scientific ideas they often reject, are a snapshot in time, an improvement over a single semester. Our hope, of course, is that students completing an experimental course like the FoS would, upon graduation, be more scientifically literate as adults, that they would understand and value science as a way of knowing, and that they could digest a science-related story in the Washington Post (Miller, 1998). As a single litmus test, would it not be wonderful if all college graduates, not just our science, technology, engineering, and mathematics students, had the confidence and the ability to make intelligent decisions about whether or not to vaccinate their children? We all depend on an educated citizenry with the skills to make, quite literally, just such life-and-death decisions. We must design and teach our nonmajors science courses toward this end.

    ACKNOWLEDGMENTS

    This project was supported through the Quality Enhancement Plan at SHSU. We thank Brent Rahlwes, Cheramie Trahan, Samantha Martin, and Kelsey Pearman, outstanding teaching assistants who not only led the case studies during the lab sections associated with the course but also enthusiastically helped improve the material; Joe Hill, for contributing course materials and teaching a section of the class; Tim Tripp, for having taught several sections of the course; Rita Caso, for her sound advice and unwavering support in her capacity as director of the Office of Institutional Research & Assessment at SHSU; and Cory Kohn, Louise Mead, Ross Nehm, and two anonymous reviewers for providing valuable suggestions on an earlier version of this article. All electronic data files (i.e., xlsx and sav) and reports (i.e., pdf) associated with the CAT are stored on a limited-access, secure FTP server administered by the Center for Assessment & Improvement of Learning at TTU. The physical copy of each CAT test has been digitized (pdf) and stored on a limited-access, secure hard drive; all physical tests were then destroyed. The MATE data are stored on a limited-access, secure server administered by the Office of Institutional Effectiveness at SHSU. Approval for this study was granted by the Internal Review Board Committee (#2013-04-7942) of SHSU.

    REFERENCES

  • Abraham JK, Perez KE, Downey N, Herron JC, Meir E (2012). Short lesson plan associated with increased acceptance of evolutionary theory and potential change in three alternate conceptions of macroevolution in undergraduate students. CBE Life Sci Educ 11, 152-164.
  • Alberts B (2005). A wakeup call for science faculty. Cell 123, 739-741.
  • Alberts B (2009). Redefining science education. Science 323, 437.
  • Alters BJ, Nelson CE (2002). Perspective: teaching evolution in higher education. Evolution 56, 1891-1901.
  • American Association for the Advancement of Science (AAAS) (1993). Project 2061–Benchmarks for Science Literacy: A Tool for Curriculum Reform, Washington, DC.
  • AAAS (2010). Vision and Change: A Call to Action, Washington, DC.
  • Arum R, Roksa J (2010). Academically Adrift: Limited Learning on College Campuses, Chicago: University of Chicago Press.
  • Arum R, Roksa J (2014). Aspiring Adults Adrift: Tentative Transitions of College Graduates, Chicago: University of Chicago Press.
  • Arum R, Roksa J, Cho E (2011). Improving Undergraduate Learning: Findings and Policy Recommendations from the SSRC-CLA Longitudinal Project, Brooklyn, NY: Social Science Research Council.
  • Benoit WL. Persuasion, Communication Institute for Online Scholarship, www.cios.org/encyclopedia/persuasion/Esocial_judgment_1theory.htm (accessed 6 February 2015).
  • Bernstein DA, Penner LA, Clarke-Stewart A, Roy EJ (2006). Psychology, Boston: Houghton Mifflin.
  • Bok D (2006). Our Underachieving Colleges: A Candid Look at How Much Students Learn and Why They Should be Learning More, Princeton, NJ: Princeton University Press.
  • Brown JD (2000). What is construct validity. Shiken: JALT Test Eval SIG Newsl 4, 8-12.
  • Brown JD (2005). Testing in Language Programs: A Comprehensive Guide to English Language Assessment, New York: McGraw-Hill.
  • Campbell CE, Nehm RH (2013). A critical analysis of assessment quality in genomics and bioinformatics education research. CBE Life Sci Educ 12, 530-541.
  • Carmel JH, Yezierski EJ (2013). Are we keeping the promise? Investigation of students’ critical thinking growth. J Coll Sci Teach 42, 71-81.
  • Cawthorne N (2004). Witch Hunt: History of a Persecution, London: Chartwell.
  • Chabris C, Simons D (2010). The Invisible Gorilla, New York: Crown.
  • Cronbach LJ, Meehl PE (1955). Construct validity in psychological tests. Psych Bull 52, 281-302.
  • Ebert-May D, Derting TL, Hodder J, Momsen JL, Long TM, Jardeleza SE (2011). What we say is not what we do: effective evaluation of faculty professional development programs. BioScience 61, 550-558.
  • Ede A (2000). Has science education become an enemy of scientific rationality. Skeptical Inquirer 24, 48-51.
  • Erwin P (2014). Attitudes and Persuasion, New York: Psychology Press.
  • Eve RA, Dunn D (1990). Psychic powers, astrology and creationism in the classroom? Evidence of pseudoscientific beliefs among high school biology and life science teachers. Am Biol Teach 52, 10-21.
  • Facione PA (1990). Critical Thinking: A Statement of Expert Consensus for Purposes of Educational Assessment and Instruction, Millbrae, CA: California Academic Press.
  • Facione PA (2015). Critical Thinking: What It Is and Why It Counts, San Jose, CA: California Academic Press, www.insightassessment.com/Resources/Tools-For-Teaching-For-and-About-Thinking/Critical-Thinking-What-It-Is-and-Why-It-Counts/Critical-Thinking-What-It-Is-and-Why-It-Counts-PDF (accessed 21 April 2015).
  • Facione PA, Sánchez CA, Facione NC, Gainen J (1995). The disposition toward critical thinking. J Gen Educ 44, 1-25.
  • Feynman RP, Leighton R (1985). “Surely You’re Joking, Mr. Feynman!”: Adventures of a Curious Character, New York: W.W. Norton.
  • Freeman S, Eddy SL, McDonough M, Smith MK, Okoroafor N, Jordt H, Wenderoth MP (2014). Active learning increases student performance in science, engineering, and mathematics. Proc Natl Acad Sci USA 111, 8410-8415.
  • Harris K, Stein B, Haynes A, Lisic E, Leming K (2014). Identifying courses that improve students’ critical thinking skills using the CAT instrument: a case study. Proceedings of the 10th Annual International Joint Conferences on Computer, Information, System Sciences, and Engineering 10, 1-4.
  • Hazen RM (2002). Why should you be scientifically literate. ActionBioscience www.actionbioscience.org/education/hazen.html (accessed 7 February 2015).
  • Hewitt PG, Lyons SA, Suchocki JA, Yeh J (2013). Conceptual Integrated Science, 2nd ed. Boston: Addison-Wesley.
  • Impey C, Buxner S, Antonellis J (2012). Non-scientific beliefs among undergraduate students. Astron Educ Rev 11, 1-12.
  • Impey C, Buxner S, Antonellis J, Johnson E, King C (2011). A twenty-year survey of science literacy among college undergraduates. J Coll Sci Teach 40, 31-37.
  • Johnson AC (2007). Unintended consequences: how science professors discourage women of color. Sci Educ 91, 805-821.
  • Johnson M, Pigliucci M (2004). Is knowledge of science associated with higher skepticism of pseudoscientific claims. Am Biol Teach 66, 536-548.
  • Keppel G (1982). Design and Analysis: A Researcher’s Handbook, 2nd ed. Englewood Cliffs, NJ: Prentice Hall.
  • Kim SY, Nehm RH (2011). A cross-cultural comparison of Korean and American science teachers’ views of evolution and the nature of science. Int J Sci Educ 33, 197-227.
  • Kuhn D (1993). Science as argument: implications for teaching and learning scientific thinking. Sci Educ 77, 319-337.
  • Lanning KV (1992). Investigator’s Guide to Allegations of “Ritual” Child Abuse, Quantico, VA: Federal Bureau of Investigation.
  • Lombrozo T, Thanukos A, Weisberg M (2008). The importance of understanding the nature of science for accepting evolution. Evol Educ Outreach 1, 290-298.
  • Maher JM, Markey JC, Ebert-May D (2013). The other half of the story: effect size analysis in quantitative research. CBE Life Sci Educ 12, 345-351.
  • Messick S (1995). Validity of psychological assessment: validation of inferences from persons’ responses and performances as scientific inquiry into score meaning. Am Psychol 50, 741-749.
  • Miller JD (1998). The measurement of civic scientific literacy. Public Underst Sci 7, 203-223.
  • Miller JD (2012). What colleges and universities need to do to advance civic scientific literacy and preserve American democracy. Liberal Education 98, www.aacu.org/publications-research/periodicals/what-colleges-and-universities-need-do-advance-civic-scientific (accessed 7 February 2015).
  • Miller JD, Scott EC, Okamoto S (2006). Public acceptance of evolution. Science 313, 765-766.
  • Mnookin S (2011). The Panic Virus: A True Story of Medicine, Science, and Fear, New York: Simon & Schuster.
  • Momsen JL, Long TM, Wyse SA, Ebert-May D (2010). Just the facts? Introductory undergraduate biology courses focus on low-level cognitive skills. CBE Life Sci Educ 9, 435-440.
  • Moore R, Cotner S (2009). Educational malpractice: the impact of including creationism in high school biology courses. Evol Educ Outreach 2, 95-100.
  • Morrison D (2011). Science denialism: evolution and climate change. Reports Natl Center Sci Educ 31, 1-10.
  • Nadelson LS, Southerland SA (2010). Examining the interaction of acceptance and understanding: how does the relationship change with a focus on macroevolution. Evol Educ Outreach 3, 82-88.
  • Nathan D, Snedeker M (2001). Satan’s Silence: Ritual Abuse and the Making of a Modern American Witch Hunt, Lincoln, NE: Author’s Choice Press.
  • National Academy of Sciences, National Academy of Engineering, and Institute of Medicine (2010). Rising above the Gathering Storm, Revisited: Rapidly Approaching Category 5, Washington, DC: National Academies Press.
  • National Science Foundation (2014). Science and Engineering Indicators 2014, Arlington, VA: National Science Board.
  • Nelson CE (2008). Teaching evolution (and all of biology) more effectively: strategies for engagement, critical reasoning, and confronting misconceptions. Integr Comp Biol 48, 213-225.
  • Nyhan B, Reifler J, Richey S, Freed GL (2014). Effective messages in vaccine promotion: a randomized trial. Pediatrics 133, E835-E842.
  • Offit PA (2011). Deadly Choices: How the Anti-Vaccine Movement Threatens Us All, New York: Basic.
  • Pascarella ET, Blaich C, Martin GL, Hanson JM (2011). How robust are the findings of academically adrift. Change 43, 20-24.
  • Peker D, Comert G, Kence A (2010). Three decades of anti-evolution campaign and its results: Turkish undergraduates’ acceptance and understanding of the biological evolution theory. Sci Educ 19, 739-755.
  • Pigliucci M (2007). The evolution-creation wars: why teaching more science just is not enough. McGill J Educ 42, 285-306.
  • Posner GJ, Strike KA, Hewson PW, Gertzog WA (1982). Accommodation of a scientific conception: toward a theory of conceptual change. Sci Educ 66, 211-227.
  • Reardon S (2011). Climate change sparks battles in classroom. Science 333, 688-689.
  • Rowe MP (2010). Tragic choices: autism, measles, and the MMR vaccine. In: Science Stories: Using Case Studies to Teach Critical Thinking, ed. CF Herreid, NA Schiller, and KF Herreid, Arlington, VA: NSTA Press, http://sciencecases.lib.buffalo.edu/cs/collection/detail.asp?case_id=576&id=576 (accessed 7 February 2015).
  • Rowe MP (2015). Crazy about cryptids! An ecological hunt for Nessie and other legendary creatures. National Center for Case Study Teaching in Science, http://sciencecases.lib.buffalo.edu/cs/collection/detail.asp?case_id=779&id=779 (accessed 26 June 2015).
  • Rutherford FJ, Ahlgren A (1990). Science for All Americans, Oxford, UK: Oxford University Press.
  • Rutledge ML, Sadler KC (2007). Reliability of the measure of acceptance of the theory of evolution (MATE) instrument with university students. Am Biol Teach 69, 332-335.
  • Rutledge ML, Warden MA (1999). The development and validation of the measure of acceptance of the theory of evolution instrument. Sch Sci Math 99, 13-18.
  • Rutledge ML, Warden MA (2000). Evolutionary theory, the nature of science and high school biology teachers: critical relationships. Am Biol Teach 62, 23-31.
  • Sagan C (1996). The Demon-haunted World: Science as a Candle in the Dark, New York: Ballantine.
  • Sam Houston State University (2009). QEP: SHSU: Foundations of Science www.shsu.edu/qep/documents/QualityEnhancementPlanCombined.pdf (accessed 7 February 2015).
  • Schick T, Vaughn L (2014). How to Think about Weird Things: Critical Thinking for a New Age, New York: McGraw-Hill.
  • Seymour E, Hewitt NM (1997). Talking about Leaving: Why Undergraduates Leave the Sciences, Boulder, CO: Westview.
  • Sinatra G, Brem S, Evans EM (2008). Changing minds? Implications of conceptual change for teaching and learning about biological evolution. Evol Educ Outreach 1, 189-195.
  • Sinatra GM, Southerland SA, McConaughy F, Demastes JW (2003). Intentions and beliefs in students’ understanding and acceptance of biological evolution. J Res Sci Teach 40, 510-528.
  • Smith MU (2010a). Current status of research in teaching and learning evolution: I. Philosophical/epistemological issues. Sci Educ 19, 523-538.
  • Smith MU (2010b). Current status of research in teaching and learning evolution: II. Pedagogical issues. Sci Educ 19, 539-571.
  • Snyder TD, Dillow SA (2013). Digest of Education Statistics, 2012, Washington, DC: National Center for Education Statistics.
  • Stein B, Haynes A (2011). Engaging faculty in the assessment and improvement of students’ critical thinking using the critical thinking assessment test. Change 43, 44-49.
  • Stein B, Haynes A, Redding M, Ennis T, Cecil M (2007). Assessing critical thinking in STEM and beyond. In: Innovations in E-Learning, Instruction Technology, Assessment, and Engineering Education, ed. M Iskander, Dordrecht, Netherlands: Springer.
  • Stein B, Haynes A, Redding M, Harris K, Tylka M, Lisic E (2010). Faculty driven assessment of critical thinking: national dissemination of the CAT instrument. In: Technological Developments in Networking, Education and Automation, ed. K Elleithy, T Sobh, M Iskander, V Kapila, MA Karim, and A Mahmood, Dordrecht, Netherlands: Springer.
  • Tavris C, Aronson E (2007). Mistakes Were Made (but Not by Me), Orlando, FL: Harcourt.
  • Tennessee Technological University (2010). CAT Instrument Technical Information, www.tntech.edu/files/cat/reports/CAT_Technical_Information_V7.pdf (accessed 7 February 2015).
  • Trefil J, Hazen RM (2013). The Sciences: An Integrated Approach, 7th ed. Hoboken, NJ: Wiley.
  • Van Gelder T (2005). Teaching critical thinking. Coll Teach 45, 41-46.
  • Verhey SD (2005). The effect of engaging prior learning on student attitudes toward creationism and evolution. BioScience 55, 996-1003.
  • Wagler A, Wagler R (2013). Addressing the lack of measurement invariance for the measure of acceptance of the theory of evolution. Int J Sci Educ 35, 2278-2298.
  • Wakefield A, Murch SH, Anthony A, Linnell J, Casson DM (1998). Ileal-lymphoid-nodular hyperplasia, non-specific colitis, and pervasive developmental disorder in children. Lancet 351, 637-641.
  • Walker WR, Hoekstra SJ, Vogl RJ (2002). Science education is no guarantee of skepticism. Skeptic 9, 24-27.