1. INTRODUCTION
Let [X.sub.i] = ([X.sub.i1],...,[X.sub.ip])' for i = 1,...,n be independent p-dimensional random vectors, each from a distribution with location vector [mu] = ([[mu].sub.1],...,[[mu].sub.p]). We assume that each [X.sub.i] is directionally symmetric; that is, ([X.sub.i] - [mu])/||[X.sub.i] - [mu]|| and -([X.sub.i] - [mu])/||[X.sub.i] - [mu]|| are identically distributed, where ||X|| = ([x.sub.1.sup.2] + ... + [x.sub.p.sup.2])[.sup.1/2]. However, we do not assume that the vectors of observations themselves have the same distribution. These assumptions hold for a random sample from an elliptical distribution, of which the multivariate normal distribution is a special case. They also hold for any symmetric distribution for which ([X.sub.i] - [mu]) and -([X.sub.i] - [mu]) are identically distributed. However, the class of directionally symmetric distributions is more general because it also includes different kinds of skewed distributions. Consequently, [mu] may be, but is not necessarily, the mean vector. We wish to test the hypotheses
[H.sub.0] : [mu] = 0 vs. [H.sub.1] : [mu] [greater than or equal to] 0, with at least one strict inequality, (1)
where [greater than or equal to] is to be taken on a componentwise basis, that is, ([[mu].sub.1] [greater than or equal to] 0,...,[[mu].sub.p] [greater than or equal to] 0).
Assuming that the observations come from a multivariate normal population [N.sub.p]([mu], [SIGMA]), different testing procedures have been proposed. Perlman (1969) derived the likelihood ratio test (LRT) when the covariance matrix is unknown. The null distribution of the LRT depends on the unknown covariance matrix, but Perlman gave a bound for the critical point that can be used to perform a conservative test at a given level. However, the LRT has been criticized for the resulting loss of power. Perlman and Wu (1999) came to its defense, but judging from the discussion that follows their article, it seems that the matter has not been resolved. Silvapulle (1995) proposed a test similar to that of Hotelling that is asymptotically equivalent to the LRT. Wang and McDermott (1998a) proposed a conditional version of the LRT. This test can be difficult to use in practice because it requires the numerical evaluation of integrals. Wang and McDermott (1998b) used a similar approach to derive a conditional Hotelling's test. Perlman and Wu (2002) proposed another conditional version of the LRT that is simpler to implement. Schucany, Frawley, Gray, and Wang (1999) proposed using the bootstrap to estimate the P value of the LRT. As long as the computation of the LRT statistic is practical, this is a good way to avoid the use of the conservative critical point of Perlman, and it also relaxes the normality assumption by letting the test adapt itself to the data. Sen and Tsai (1999) proposed a union-intersection (UI) test related to the LRT. As with the LRT, the direct application of this method results in a conservative test. However, Sen and Tsai proposed a two-stage version of their test and of the LRT, both of which are unbiased. A possible disadvantage of this approach is that it requires an arbitrary choice of two quantities, a first-stage sample size and a p X p matrix. 
With the exception of the bootstrap LRT that can adapt itself to other distributions, all the tests mentioned previously rely on the normality assumption. To make the procedure less dependent on the normality assumption, Mudholkar, Kost, and Subbaiah (2001) proposed stepwise tests and robust stepwise tests based on trimmed means.
Another approach consists of taking a procedure that tests [H.sub.0] against an alternative different from, but related to, [H.sub.1] and then seeing how it performs for the one-sided problem. In that vein, Tang (1994) investigated tests for half-space alternatives, and Follmann (1996) proposed a simple and intuitive procedure to test [H.sub.0] against another related alternative specifying that the sum of the means is greater than 0.
Sen and Silvapulle (2002) gave an interesting review of some aspects of inference under inequality constraints, which includes the one-sided problem.
As far as we know, no tests exhibiting a finite-sample distribution-free property, under [H.sub.0], over a large class of models have been proposed for the multivariate one-sided problem in the literature. This article aims to fill that gap. We propose a multivariate sign test for the one-sided alternative. Under the null hypothesis, the test statistic is conditionally distribution-free under very mild assumptions. The conditional distribution can be used to compute conditional P values. We provide a simple way to compute the test statistic and give a characterization of its conditional null distribution. We also provide a step-by-step procedure that can be used to implement the test in practice. For bivariate data, we give an explicit formula for the conditional null distribution of the test statistic. The performance of the test against some competitors is investigated with an extensive simulation study and a real data example is presented.
The test statistic is presented in Section 2. Section 3 shows how to make the test statistic conditionally distribution-free under the null hypothesis. A characterization of its conditional null distribution is also given. The results from a simulation study are presented in Section 4. An illustration with real data and some concluding remarks are given in Section 5.
2. TEST STATISTIC
Let [S.sup.p] = {a [member of] [R.sup.p] : ||a|| = 1} be the set of p-dimensional unit vectors. For any a [member of] [S.sup.p], let [Q.sub.i](a) = a'[X.sub.i] for i = 1,...,n. The scalar [Q.sub.i](a) represents the signed length of the projection of [X.sub.i] on the directed line passing through a. For the rest of this article, we simply refer to these as the projected points on the line a. Let [psi] be the function defined by [psi](u) = 1 or 0 as u > 0 or u [less than or equal to] 0. Define
S(a) = [n.summation over (i=1)] [psi]([Q.sub.i](a)) (2)
to be the number of positive observations among the projected points on the line a. The statistic S(a) is thus equivalent to the sign statistic computed on the projected points. Assume throughout that, for any a [member of] [S.sup.p], P([Q.sub.i](a) = 0) = 0 for any i. Then, under the hypothesis [H.sub.0], [psi]([Q.sub.1](a)),...,[psi]([Q.sub.n](a)) are independent random variables taking the values 0 or 1, each with probability 1/2.
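The definition of S(a) in (2) can be illustrated with a short Python sketch. The function name `sign_count`, the use of NumPy, and the toy data are assumptions made for illustration only; they are not part of the original article. For a fixed direction a, the count is a sum of n independent 0/1 variables with probability 1/2 under [H.sub.0], so it behaves like an ordinary univariate sign statistic.

```python
import numpy as np

def sign_count(X, a):
    """Return S(a): the number of strictly positive projections a'X_i.

    X is an (n, p) array of observations; a is a direction in R^p.
    The direction is normalized first, which does not change any sign.
    """
    X = np.asarray(X, dtype=float)
    a = np.asarray(a, dtype=float)
    a = a / np.linalg.norm(a)
    q = X @ a                    # projected points Q_1(a), ..., Q_n(a)
    return int(np.sum(q > 0))    # psi(u) = 1 if u > 0, else 0

# Hypothetical toy data (n = 4, p = 2), for illustration only.
X = np.array([[1.0, 2.0], [-1.0, 0.5], [0.5, -2.0], [-1.0, -1.0]])
print(sign_count(X, [1.0, 0.0]))   # counts points with positive first coordinate
print(sign_count(X, [1.0, 1.0]))   # counts points with positive coordinate sum
```

When no projection is exactly zero, S(a) + S(-a) = n, which is why both very high and very low values of S(a) are informative for the unrestricted alternative.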
Consider briefly the problem of testing [H.sub.0] against the usual unrestricted alternative
[H*.sub.1] : [mu] [not equal to] 0. (3)
For bivariate data, Hodges (1955) was the first to propose a sign test for this problem. His test is based on
[sup.sub.a[member of][S.sup.p]] S(a). (4)
The basic idea behind this approach is that, for a given line a, a high (or low) value of S(a) provides evidence against the null hypothesis. The test statistic simply takes the maximum of those values over all lines. For bivariate data, this statistic is strictly distribution-free under [H.sub.0] for directionally symmetric distributions. Larocque, Tardif, and van Eeden (2000) studied different ways of averaging the values of S(a), as a varies, to construct a test statistic. Chaudhuri and Sengupta (1993) and Neuhaus and Zhu (1999) studied the statistic (4) in the general p-dimensional case. Neuhaus and Zhu showed that this test statistic is strictly distribution-free under [H.sub.0] for distributions with elliptical directions, a class of distributions introduced by Randles (1989), which is slightly less general than the class of directionally symmetric distributions. More precisely, [X.sub.i] is said to come from a distribution with elliptical directions if there exists a p X p nonsingular matrix D such that D([X.sub.i] - [mu])/||D([X.sub.i] - [mu])|| is distributed uniformly on [S.sup.p].
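The definition of a distribution with elliptical directions can be made concrete by simulation. The following Python sketch is an assumption-laden illustration, not a construction given in the article: it draws [u.sub.i] uniformly on the unit sphere, picks positive radii [r.sub.i] that deliberately differ in distribution across observations, and sets [X.sub.i] = [mu] + [r.sub.i][D.sup.-1][u.sub.i]. The function name, the exponential radii, and the example matrix D are all hypothetical choices.

```python
import numpy as np

def sample_elliptical_directions(n, mu, D, rng):
    """Draw n points X_i such that D(X_i - mu)/||D(X_i - mu)|| is uniform on the unit sphere.

    Returns the sample X (n, p) and the underlying uniform directions u (n, p).
    """
    mu = np.asarray(mu, dtype=float)
    D = np.asarray(D, dtype=float)
    p = mu.size
    z = rng.standard_normal((n, p))
    u = z / np.linalg.norm(z, axis=1, keepdims=True)   # uniform directions on the sphere
    # Radii with a different scale for each i: the X_i need not be identically distributed.
    r = rng.exponential(scale=np.arange(1, n + 1))
    X = mu + (r[:, None] * u) @ np.linalg.inv(D).T      # X_i = mu + r_i * D^{-1} u_i
    return X, u

rng = np.random.default_rng(0)
mu = np.array([0.0, 0.0])
D = np.array([[2.0, 1.0], [0.0, 1.0]])   # any nonsingular p x p matrix
X, u = sample_elliptical_directions(50, mu, D, rng)

# Recover the standardized directions and compare with the uniform draws.
V = (X - mu) @ D.T
V = V / np.linalg.norm(V, axis=1, keepdims=True)
print(np.allclose(V, u))
```

Because the radii play no role once the vectors are normalized, the construction shows why only the directions, and not the full distributions, need to share a common structure.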
Now consider the one-sided problem (1). Roy's (1953) union-intersection (UI) principle can be used to motivate our test statistic; see also Sen and Silvapulle (2002). Following Sen and Tsai (1999), let b [member of] [DELTA] = {b [member of] [R.sup.p] : b [greater than or equal to] 0, ||b|| > 0}. If we define
[H.sub.0,b] : b'[mu] = 0 and [H.sub.1,b] : b'[mu] > 0,
then
[H.sub.0] = [[intersection].[b[member of][DELTA]]] [H.sub.0,b] and [H.sub.1] = [[union].[b[member of][DELTA]]] [H.sub.1,b].
For each b [member of] [DELTA], a one-sided univariate test statistic T(b) can be used to test [H.sub.0,b] against [H.sub.1,b], and then a test for testing [H.sub.0] against [H.sub.1] can be based on [sup.sub.b[member of][DELTA]] T(b). If the univariate statistic is scale invariant, we can restrict the search of the supremum to the vectors satisfying ||b|| = 1. Then the test statistic reduces to [sup.sub.a[member of][S.sub.+.sup.p]] T(a), where T(a) is the univariate test statistic computed with the projected points [Q.sub.1](a),...,[Q.sub.n](a) and [S.sub.+.sup.p] = {a [member of] [R.sup.p] : ||a|| = 1, a [greater than or equal to] 0}.
Sen and Tsai (1999) proposed such a test by using the univariate t test. Here, we propose using the sign test instead. Our test can also be seen as an adaptation of the statistic (4) for the one-sided problem and is based on
S = [sup.sub.a[member of][S.sub.+.sup.p]] S(a). (5)
The hypothesis [H.sub.0] will be rejected for large values of S. The statistic S is scale and permutation invariant. This means that the value of S is the same whether we compute it from the original sample [X.sub.1],...,[X.sub.n] or from any sample of the type cF[X.sub.1],...,cF[X.sub.n], where c > 0 is a scalar and F is a p X p permutation matrix.
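For bivariate data, the one-sided statistic and its invariance can be checked numerically. The sketch below assumes, following the union-intersection construction above, that S is the supremum of S(a) over unit vectors a with nonnegative components. It is not the authors' algorithm: it relies on the observation that, for p = 2 and a = (cos t, sin t), S(a) is piecewise constant in t and can change only at angles where a is orthogonal to some [X.sub.i], so evaluating S(a) at the midpoint of each arc of [0, pi/2] is enough. The function name and the data are hypothetical.

```python
import numpy as np

def one_sided_sign_stat_2d(X):
    """Maximum of S(a) over a = (cos t, sin t) with 0 <= t <= pi/2, for bivariate data."""
    X = np.asarray(X, dtype=float)
    theta = np.arctan2(X[:, 1], X[:, 0])
    # a'X_i can change sign only where a is orthogonal to X_i.
    crit = np.concatenate([theta + np.pi / 2, theta - np.pi / 2]) % (2 * np.pi)
    inside = crit[(crit > 0.0) & (crit < np.pi / 2)]
    knots = np.unique(np.concatenate([[0.0], inside, [np.pi / 2]]))
    mids = (knots[:-1] + knots[1:]) / 2           # one direction per arc
    A = np.column_stack([np.cos(mids), np.sin(mids)])
    counts = (X @ A.T > 0).sum(axis=0)            # S(a) on each arc
    return int(counts.max())

rng = np.random.default_rng(0)
X = rng.standard_normal((30, 2)) + np.array([0.3, 0.0])   # hypothetical sample

c = 3.5                                  # positive scalar
F = np.array([[0.0, 1.0], [1.0, 0.0]])   # permutation matrix swapping the coordinates
X_transformed = c * X @ F.T              # rows are c F X_i

print(one_sided_sign_stat_2d(X), one_sided_sign_stat_2d(X_transformed))
```

Multiplying by c > 0 leaves every sign unchanged, and swapping the coordinates maps the angle t to pi/2 - t, which stays in the same quadrant, so the two printed values agree.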
The statistic S can be quickly and easily calculated when p...