The authors illustrate how a Rasch model can guide the development of a new affective measurement instrument, the Learning to Teach for Social Justice-Beliefs scale. The results provide strong evidence of a meaningful continuum of attitudes about teaching for social justice ranging from those easier to endorse to those more difficult to endorse.
**********
The idea of preparing teachers to teach for social justice is prevalent in a loosely related collection of teacher preparation programs, partnerships, grassroots teacher and community groups, and other initiatives in the United States and elsewhere. Despite national attention, however, there is considerable variation in meanings of the phrase teacher education for social justice, and, in general, this has not been a well-theorized term (North, 2006). Very generally speaking, however, most definitions (e.g., Adams, Bell, & Griffin, 1997; Cochran-Smith, 1999, 2004; Michelli & Keiser, 2005; Oakes & Lipton, 1999; Sleeter, 1996; Villegas & Lucas, 2002; Zeichner, 2003) have in common explicit recognition of the marked disparities in educational opportunities, resources, achievement, and long-term outcomes between minority and low-income pupil groups and their White, middle-class peers. This is coupled with the position that teachers have the potential to be both educators and activists committed to the democratic ideal and to reducing the inequities in American society. Teacher education for social justice, then, is teacher preparation deliberately designed to provide the social, intellectual, and organizational contexts to foster teaching for social justice in schools accommodating students in kindergarten through 12th grade (K-12).
Teaching for social justice in K-12 schools has as its primary consideration promoting pupils' learning (academic, social, emotional, and civic) and enhancing pupils' life chances, including challenging the structures, curriculum, labels, and school arrangements that limit or inhibit life chances. This agenda builds on a wide-ranging body of scholarship, practice, and grassroots efforts, including multicultural theory and pedagogy; research on effective practices in diverse classrooms; critical analyses of education and society; research on culture, language, and identity; organization at the grassroots community level to change schools; and theories related to the role of education in democratic societies. Teaching for social justice builds on and requires knowledge (i.e., knowledge of content, pedagogy, learners, cultures, schooling, communities, as well as knowledge of self), interpretive frameworks (i.e., ways of understanding and acting on the events and processes of schooling based on the integration of knowledge with beliefs, values, ethics, moral commitments, and attitudes), and practices (including subject-specific pedagogies and strategies for supporting the learning process of English language learners [ELL], pupils with special needs, and pupils from a range of socioeconomic backgrounds). Teaching for social justice also involves teacher commitment to being part of larger social movements by working as advocates and activists for their pupils.
In this article, we assume that teaching for social justice is a legitimate and measurable outcome of teacher education. The purpose of this article is to present evidence of the extent to which this assumption has been met. Specifically, we present the psychometric characteristics of the Learning to Teach for Social Justice-Beliefs scale. This includes the operational definition of the construct, the item development and pilot testing procedures, item analysis results of both classical test theory (CTT) and item response theory (IRT; Rasch) procedures, and evidence of discriminant validity.
METHOD
Variable Definition
Learning to teach for social justice is conceptualized in terms of six core components: teachers' knowledge, skill, and interpretive frameworks; teachers' beliefs, attitudes, and values; classroom practice and pedagogy; community participation; teachers' learning in inquiry communities; and promoting pupils' academic, social-emotional, and civic learning. In developing the scale to measure this variable, the Survey Team of the Boston College-Teachers for a New Era (BC-TNE) project (Ludlow et al., 2007) came to a common understanding that any variable, even one as complex as learning to teach for social justice, may be conceptualized as a continuum along which people differ. In the academic setting addressed in this study, this means that teachers would differ in the extent to which they understand, accept, and are prepared to teach in ways consistent with the social justice principles just described.
Measurement Models
The current field of psychometrics relies on two primary measurement models, CTT and IRT. CTT is based on the simple yet powerful concept that an individual's observed score, defined as the total score on some measurement instrument, is made up of two unobservable, theoretical components: a true score and an error score (X = True + Error; Gulliksen, 1950; Lord & Novick, 1968; Spearman, 1904). Although the true score is never actually known, it is possible to generate estimates of the extent to which measurement error affected the observed score, thereby reducing the extent to which the true score is captured by the observed score. Hence, great effort is expended to estimate and reduce measurement error because the more measurement error can be reduced, the more confidence there is that the observed score accurately represents the true score.
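The decomposition X = True + Error can be illustrated with a small simulation. The sketch below is not drawn from the study data: the score distribution, the standard deviations, and the function names are hypothetical choices made only for illustration. It uses the standard CTT definition of reliability as the ratio of true-score variance to observed-score variance, and shows that reliability rises as measurement error shrinks.

```python
import random
import statistics


def simulate_ctt(n_persons=5000, true_sd=10.0, error_sd=5.0, seed=0):
    """Simulate observed scores X = T + E for hypothetical examinees.

    All numeric settings are illustrative assumptions, not study values.
    """
    rng = random.Random(seed)
    # True scores: unobservable in practice, known here only because we simulate.
    true_scores = [rng.gauss(50.0, true_sd) for _ in range(n_persons)]
    # Observed score = true score + random measurement error with mean zero.
    observed = [t + rng.gauss(0.0, error_sd) for t in true_scores]
    return observed, true_scores


def reliability(observed, true_scores):
    """CTT reliability: share of observed-score variance due to true scores."""
    return statistics.pvariance(true_scores) / statistics.pvariance(observed)


observed, true_scores = simulate_ctt()
low_error = reliability(*simulate_ctt(error_sd=2.0))    # little measurement error
high_error = reliability(*simulate_ctt(error_sd=10.0))  # error as large as true spread

print(f"reliability with small error: {low_error:.2f}")
print(f"reliability with large error: {high_error:.2f}")
```

In practice the true scores are unknown, so reliability must be estimated indirectly (e.g., through internal consistency or test-retest procedures) rather than computed directly as it is here.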
The basic psychometric tools of CTT include factor analysis, reliability analyses (e.g., test-retest, internal consistency, and inter- and intrarater reliability), and validity analyses (e.g., content, construct, discriminant, divergent, predictive, consequential).
The limitations of CTT (e.g., the ability estimate of a person is dependent on the difficulty of the items, the standard error applies equally to all ability levels, item discrimination can be too high) have been widely recognized (Brennan, 2001; Hattie, Jaeger, & Bond, 1999; Masters, 1988; Traub, 1997; Wainer, 1986) and have led many investigators to use the principles underlying IRT. The IRT models differ, however, in how the probability of a specific response to a specific item is estimated (Hambleton, Swaminathan, & Rogers, 1991; Lord & Novick, 1968).
In general, IRT models are differentiated by the number of parameters associated with various item-specific characteristics. These characteristics are generally referred to as item difficulty, item discrimination, and item pseudoguessing parameters. Models that estimate item difficulty alone, difficulty and discrimination, or all three characteristics are referred to as one-parameter (1-PL, or the Rasch model), two-parameter (2-PL), and three-parameter (3-PL) logistic models, respectively (van der Linden & Hambleton, 1997).
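The relation among the three logistic models can be sketched in a few lines of code. The sketch assumes a dichotomous item (endorse or not endorse) and uses the conventional symbols theta for the person parameter, b for item difficulty, a for item discrimination, and c for the pseudoguessing parameter. The function names are hypothetical and the parameter values are arbitrary.

```python
import math


def three_pl(theta, b, a=1.0, c=0.0):
    """Probability of a positive response under the 3-PL logistic model.

    theta: person location; b: item difficulty;
    a: item discrimination; c: pseudoguessing (lower asymptote).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def two_pl(theta, b, a):
    """2-PL: the 3-PL with the pseudoguessing parameter fixed at zero."""
    return three_pl(theta, b, a=a, c=0.0)


def rasch(theta, b):
    """Rasch form: discrimination fixed at 1 and no pseudoguessing."""
    return three_pl(theta, b, a=1.0, c=0.0)


# A person located exactly at an item's difficulty has a 0.5 probability
# of a positive response under the Rasch model.
print(rasch(theta=1.0, b=1.0))
# A higher discrimination steepens the curve around the item difficulty.
print(two_pl(theta=1.0, b=0.0, a=2.0))
# A nonzero lower asymptote raises the probability for low-scoring persons.
print(three_pl(theta=0.0, b=0.0, a=1.0, c=0.2))
```

Written this way, the Rasch model appears as the special case in which only item difficulty varies from item to item.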
The more significant distinction between these models, however, is that they differ in their fundamental purposes. Rasch models are used as confirmatory tests of the extent to which scales have been successfully developed according to explicit a priori measurement criteria. These criteria include the requirements that (a) items define a unidimensional continuum in the domain and (b) items follow a strictly hierarchical ordering in their definition of the domain. If the responses of study participants to the scale items suggest misfit to these criteria, particularly regarding hierarchical ordering, then the items are examined for the purpose of strengthening them; the Rasch measurement model is not discarded or modified. In contrast, 2-PL and 3-PL models are designed to maximize the extent to which item response variation can be accounted for; they are statistical models subject to reexpression in any way that reduces residual variation. Hence, because the Rasch model is a constrained special case of these models, 2-PL and 3-PL models will generally fit a given data set at least as well as, and usually better than, a Rasch model.
From a Rasch measurement perspective, better fit is not a sufficient reason for choosing an IRT model. Rasch models are preferred because they dictate the way analysts think about and subsequently construct measurement instruments. When the data fit the model, the continuous scale is analogous to a linear ruler that is invariant in terms of level of ease or difficulty of accomplishing the task for any individual appropriate for testing. With regard to measuring the variable learning to teach for social justice, a Rasch model was used, not because it fit the data better than any other model, but because when the data fit the model, teacher candidates could be ordered along a continuum based on their endorsements of simpler to more complex beliefs.
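The ruler analogy can be made concrete with a brief numerical sketch. Under the Rasch model for a dichotomous item, the log-odds of endorsement equal theta minus b, so the gap between two items is the same for every person. Once discriminations are allowed to differ, as in a 2-PL model, the gap depends on the person and the item ordering can even reverse. The item parameters and function names below are hypothetical and chosen only to show this contrast.

```python
import math


def log_odds(theta, b, a=1.0):
    """Log-odds of endorsing an item with difficulty b and discrimination a."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return math.log(p / (1.0 - p))


def item_gap(theta, easy_item, hard_item):
    """Log-odds advantage of the nominally easier item for a given person.

    Each item is a (difficulty, discrimination) pair.
    """
    return log_odds(theta, *easy_item) - log_odds(theta, *hard_item)


thetas = [-2.0, 0.0, 2.0]  # three hypothetical persons, low to high

# Rasch case: both items share a discrimination of 1.
rasch_gaps = [item_gap(t, (-1.0, 1.0), (1.0, 1.0)) for t in thetas]

# 2-PL case: same difficulties, but unequal discriminations.
twopl_gaps = [item_gap(t, (-1.0, 0.5), (1.0, 2.0)) for t in thetas]

print("Rasch gaps:", [round(g, 3) for g in rasch_gaps])
print("2-PL gaps: ", [round(g, 3) for g in twopl_gaps])
```

In the Rasch case the gap is constant across persons, which is what allows items to be read as fixed marks on a single continuum. In the 2-PL case the gap shrinks and changes sign for the highest person, so no single hierarchical ordering of the two items holds for everyone.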
Although this notion may seem obvious to some, to measure a human characteristic well, one must know a great deal about it. This is true whether one's interests are in functional ability (Coster, Haley, Ludlow, Andres, & Ni, 2004; Coster, Ludlow, & Mancini, 1999; Haley, Ludlow, & Coster, 1993; Ludlow & Haley, 1992), test anxiety (Ludlow & Guida, 1991), reading ability (Ludlow & Hillocks,...