American Sociological Association Statement on the Use of Student Evaluations of Teaching (SETs) for Retention and Promotion
September 2019
Most faculty in North America are evaluated, in
part, on their teaching effectiveness. This is
typically measured with student evaluations of
teaching (SETs), instruments that ask students to
rate instructors on a series of mostly closed-ended
items. Because these instruments are
cheap, easy to implement, and provide a simple
way to gather information, they are the most
common method used to evaluate faculty
teaching for hiring, tenure, promotion, contract
renewal, and merit raises.
Despite the ubiquity of SETs, a growing body of
evidence suggests that their use in personnel
decisions is problematic. SETs are weakly related
to other measures of teaching effectiveness and
student learning (Boring, Ottoboni, and Stark
2016; Uttl, White, and Gonzalez 2017); they are
used in statistically problematic ways (e.g.,
categorical measures are treated as interval,
response rates are ignored, small differences are
given undue weight, and distributions are not
reported) (Boysen 2015; Stark and Freishtat
2014); and they can be influenced by course
characteristics like time of day, subject, class
size, and whether the course is required, all of
which are unrelated to teaching effectiveness.
In addition, in both observational studies and
experiments, SETs have been found to be biased
against women and people of color (for recent
reviews of the literature, see Basow and Martin
2012 and Spooren, Brockx, and Mortelmans
2015). For example, students rate women
instructors lower than they rate men, even when
they exhibit the same teaching behaviors
(Boring, Ottoboni, and Stark 2016; MacNell,
Driscoll, and Hunt 2015), and students use
stereotypically gendered language in how they
evaluate their instructors (Mitchell and Martin
2018).
[…]
3. SETs should not be used to compare
individual faculty members to each other or
to a department average. As part of a holistic
assessment, they can appropriately be used
to document patterns in an instructor’s
feedback over time.
4. If quantitative scores are reported, they
should include distributions, sample sizes,
and response rates for each question on the
instrument (Stark and Freishtat 2014). This
provides an interpretative context for the
scores (e.g., items with low response rates
should be given little weight).
5. Evaluators (e.g., chairs, deans, hiring
committees, tenure and promotion
committees) should be trained in how to
interpret and use SETs as part of a holistic
assessment of teaching effectiveness (see
Linse 2017 for specific guidance).