Grade Inflation

Student evaluations

Only about 30 percent of institutions used student evaluations in 1973 (Wilson, 1998). Today, student ratings have gained widespread acceptance as a measure of teaching effectiveness in North American colleges and universities. Almost all post-secondary institutions have some plan for student evaluation of teaching effectiveness, a term that generally refers to the degree to which a teacher helps students achieve educational goals (McKeachie, 1986). Even when other data are available, student evaluations are assumed to be a better measure of teaching effectiveness because only students observe the professor throughout a course (Howard, Conway, and Maxwell, 1985). Results of student evaluations are used both as diagnostic feedback to instructors and as evidence in decisions on faculty retention, tenure, and promotion (Murray, 1988).

Some writers (e.g., Lichty, Vose, and Peterson, 1978; Zangenehzadeh, 1988) assume that student evaluations of faculty are among the main factors generating grade inflation. Correa (2001) concluded that excessive reliance on student evaluations is indeed likely to reduce academic standards and student achievement and to promote grade inflation. Trout (1997a) contends that "course evaluations contribute significantly to grade inflation in a dumbed-down curriculum" (p. 51). In a US national survey of deans of colleges of education and of colleges of arts and sciences, over 70 percent of the respondents agreed that the use of student evaluation as a consideration for promotion and tenure was a major reason for grade inflation (Nelson and Lynch, 1984).

It seems that the beginnings of grade inflation paralleled student clamour for more say in their education. It was also in the mid-1970s that the terms of the compact between faculty and their institutions shifted as the ideals of corporate America moved into academe. Evaluation of faculty by students emerged as one response, since evaluations allowed students more involvement in their education. Once evaluations were in place, corporate tacticians could impose draconian standards for tenure and promotion, in part by elevating student evaluations to a primary role. Hence, Murray (1988) reports that while students believe that their evaluations are largely ignored, many faculty members believe that the use of student ratings in personnel decisions causes teachers to inflate grades and weaken instructional content in an attempt to buy positive evaluations from students.

A body of early research on student ratings concluded generally that ratings provide reliable and valid information on instructor effectiveness (e.g., Costen, Greenough, and Menges, 1971). Other studies questioned the effectiveness of student evaluations, noting that they "have only modest agreement with some criteria of effective teaching" (Marsh, 1984, p. 749). In addition, the question of biasing influences on student objectivity remained unresolved (see Stumpf and Freedman, 1979). Many variables may impair the validity of student ratings, including gender, grading leniency, course difficulty, instructor popularity, student interest, course workload, class size, reasons for taking the course, and students' GPAs (see Blunt, 1991). Today, with the increased emphasis on the use of student evaluations for critical personnel decisions, less sanguine results are emerging. Research finds that instructors' evaluations of students (that is, the grades they assign) are, among other variables, a source of contamination of student ratings of instructor performance (Blunt, 1991; Chacko, 1983; Stumpf and Freedman, 1979).

It is now well established that students' evaluative ratings of instruction correlate positively with expected course grades (Greenwald and Gillmore, 1997); studies in various disciplines have shown a significant correlation between student ratings of instructors and the grades expected by students (e.g., Cashin, 1988; Goldberg and Callahan, 1991; Hudson, 1989). A study at the University of Washington, for example, found that professors who were easy graders received better student evaluations than did professors who were tougher (Archibold, 1998; Wilson, 1998). Similarly, Brodie's 1998 correlational study of 1,939 student evaluations from 75 first-year university classes representing 15 disciplines found that even though grading leniency decreases learning, easy courses received high student evaluations. In separate laboratory experiments, researchers (Perkins, Guerin, and Schleh, 1990; Snyder and Clair, 1976) likewise found that students who were randomly assigned higher grades rated the professor higher than students who were assigned lower grades.

Certainly, undergraduates can be sincere in their comments, offering praise and acknowledgement. However, Lundrum (1999) found that students do not discriminate well between evaluating the course, the instructor, and their own performance. Sometimes their comments address personal matters rather than teaching, or are used to punish a professor (see Wilson, 1998), a risk heightened by the new wave of students who are quick to criticize high grading standards (Trout, 1998).

Brodie (1998) states that "by themselves high student evaluations do not indicate that a professor is an effective teacher" (p. 17, original punctuation). Nevertheless, despite intense disagreements over whether student evaluations actually address teaching effectiveness, they are widely used by administrators to judge faculty, so that student evaluations of teaching bear heavily on the institutional reward system. When rigorous teaching and assessment are equated with professional shortcomings, rigorous graders can become casualties of a system where the emphasis has shifted to keeping the customer satisfied. If faculty are pressured to conform to student expectations or face retribution, grade inflation may emerge.

© Margret Winzer, 2005. This site last updated: May 3, 2005