Article Excerpt
A method for estimating criterion validity of scales with homogeneous components is outlined. It accomplishes point and interval estimation of interrelationship indices between composite scores and criterion variables and is useful for testing hypotheses about criterion validity of measurement instruments. The method can also be used with missing data.
**********
Counseling and developmental research critically depends on quality of measurement. The validity coefficient is a major index of this quality because it reflects the degree to which an instrument assesses what it purports to evaluate. For this reason, validity of measurement has received a great deal of attention from methodologists and substantive researchers during the past century (e.g., Crocker & Algina, 1986). With the high popularity of scales, composites, subscales, tests, testlets, inventories, questionnaires, or self-reports in the behavioral and social disciplines, the issue of their quality has been of special importance and the focus of substantial interest in the literature across a number of decades. Reliability of multiple-component instruments seems to have attracted the larger share of this interest (e.g., Raykov, 2001, and references therein). Unlike reliability, scale validity and especially approaches to its routine point and interval estimation in empirical settings seem to have been the concern of considerably fewer treatments.
This article helps to redress this imbalance in the literature. The goal is to describe a method for estimating the criterion validity of overall scores obtained from unidimensional measurement instruments. The approach permits one to routinely obtain (a) point estimates for relationship indices of composites with given criterion variables as well as (b) ranges of plausible values for these indices in studied subject populations. The procedure is readily applicable with the increasingly popular latent variable modeling program Mplus (L. K. Muthen & Muthen, 2006) and allows researchers to evaluate, on a regular basis, the degree to which a scale under consideration relates to a criterion measure. In addition, the proposed method can easily be used for testing hypotheses about scale validity with congeneric measures and can also be directly utilized with missing data (under the assumptions indicated below). In conjunction with earlier methods for estimating scale reliability that are based on covariance structure analysis (e.g., Bollen, 1989; Raykov, 1997), the approach described in the following section can contribute significantly to developing increasingly valid and reliable measurement instruments.
BACKGROUND AND NOTATION
To accomplish the aims of this article, I consider an empirical situation that frequently occurs in counseling research: the need to determine the criterion validity of a composite consisting of a given set of measures. Let us denote such measures by $X_1, X_2, \ldots, X_k$ and assume that they are congeneric ($k > 1$; Joreskog, 1971). This testable assumption means that the scale components assess the same underlying latent dimension, designated $\xi$, with possibly different units and origins of measurement as well as error variances. Hence,
$$X_i = T_i + \varepsilon_i = \alpha_i + \beta_i \xi + \varepsilon_i \qquad (1)$$
holds ($i = 1, 2, \ldots, k$), where $T_1, T_2, \ldots, T_k$ and $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_k$ are respectively the true and error scores of the consecutive measures (e.g., one can take $\xi = T_1$; Lord & Novick, 1968). (If $k = 2$, additional identifying restrictions will be needed, such as indicator loading equality [true score-equivalent measures] and/or error variance equality [e.g., parallel measures; Lord & Novick, 1968]. Furthermore, because the intercept parameters are not consequential for validity, for convenience, I assume them all equal to zero [e.g., Bollen, 1989]. Similarly, I assume that not all $\beta$ and error variance parameters vanish simultaneously, a condition easily fulfilled in empirical research.) For model identification purposes, I assume $\mathrm{Var}(\xi) = 1$, where $\mathrm{Var}(\cdot)$ denotes variance in a studied subject population, and note that $\mathrm{Cov}(\xi, \varepsilon_i) = 0$ holds, with $\mathrm{Cov}(\cdot, \cdot)$ symbolizing covariance ($i = 1, 2, \ldots, k$; e.g., Zimmerman, 1975). For the sake of simplicity in the following developments, I assume that the measurement errors are uncorrelated. However, this assumption can be relaxed, and the method outlined in what follows can easily be extended to cover the alternative case of correlated errors as well.
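As an illustration of what Equation 1 and the accompanying assumptions imply, the following minimal Python sketch simulates congeneric measures and compares their sample covariance matrix with the model-implied one. With zero intercepts, unit latent variance, and uncorrelated errors, the implied variance of the ith measure is its squared loading plus its error variance, and the implied covariance of two different measures is the product of their loadings. The loading and error variance values, the function names, and the use of normally distributed scores are assumptions made only for this illustration. They are not taken from the article, and the sketch is not the Mplus-based procedure the article describes.

```python
import numpy as np


def implied_covariance(beta, theta):
    """Covariance matrix implied by Equation 1 with Var(xi) = 1 and
    uncorrelated errors: Sigma = beta beta' + diag(theta)."""
    beta = np.asarray(beta, dtype=float)
    theta = np.asarray(theta, dtype=float)
    return np.outer(beta, beta) + np.diag(theta)


def simulate_congeneric(beta, theta, n, seed=0):
    """Draw n observations of X_i = beta_i * xi + eps_i (intercepts set to zero).

    Normality of xi and eps is an illustrative assumption only.
    """
    rng = np.random.default_rng(seed)
    beta = np.asarray(beta, dtype=float)
    theta = np.asarray(theta, dtype=float)
    xi = rng.standard_normal(n)  # latent dimension with unit variance
    # Independent errors, one column per measure, scaled to variance theta_i
    eps = rng.standard_normal((n, beta.size)) * np.sqrt(theta)
    return xi[:, None] * beta + eps


# Hypothetical loadings and error variances for k = 4 measures
beta = [0.8, 0.6, 0.7, 0.5]
theta = [0.36, 0.64, 0.51, 0.75]

X = simulate_congeneric(beta, theta, n=200_000)
sample_cov = np.cov(X, rowvar=False)
max_abs_diff = np.abs(sample_cov - implied_covariance(beta, theta)).max()
print(f"largest sample-vs-implied covariance discrepancy: {max_abs_diff:.4f}")
```

In a large simulated sample the discrepancy should be close to zero, which is the sense in which the congeneric assumption is testable against observed covariances.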
In this article, I am interested in the composite sum score $Z = X_1 + X_2 + \cdots + X_k$, which is very frequently used in behavioral studies as an overall indicator of a latent construct of concern. Specifically, I focus on the validity...