|
1. Introduction
Recent advances in sensing technology mean that currently hundreds of process characteristics can be measured for modern manufacturing processes and these can be used to determine product quality. The process measurements are usually correlated and multivariate Statistical Process Control (SPC) charts have been developed that allow the determination of whether the correlated process variables are simultaneously in an in-control state. Once an out-of-control signal is triggered by a multivariate SPC chart, a search for the root causes is instigated to identify the abnormal variables or combinations of variables so that the process can be restored to an in-control condition.
It is frequently important to simultaneously monitor the mean shifts of multiple data streams. For a specified false alarm rate [alpha], one approach to the multivariate monitoring problem is to construct individual [alpha]-level control charts for each of the p variables under consideration. However, it is well known that such an approach is unsatisfactory since it ignores the correlation between the variables and can allow the overall false alarm rate to be much larger than [alpha]. On the other hand, if an individual false alarm rate of [alpha]/p is used, then the Bonferroni inequality ensures that the overall false alarm rate is no greater than the nominal level [alpha]. This procedure, however, is not sensitive enough to detect many process shifts since the actual overall false alarm rate tends to be much smaller than [alpha] due to the correlation between variables.
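The trade-off described above can be illustrated with a small simulation. The following is a minimal sketch, not taken from the article: it assumes p standardized in-control variables with a common pairwise correlation rho (an equicorrelated model chosen only for illustration), and the function name, sample size, and seed are hypothetical. It estimates how often at least one of p individual two-sided charts signals when each chart uses a given individual false alarm rate.

```python
import math
import random
from statistics import NormalDist


def overall_false_alarm_rate(p, alpha_individual, rho=0.0, n_sim=20000, seed=1):
    """Monte Carlo estimate of the chance that at least one of p individual
    two-sided charts signals while the process is in control.

    Assumed model (not from the article): equicorrelated standard normal
    variables built from one shared factor,
    x_i = sqrt(rho) * f + sqrt(1 - rho) * e_i.
    """
    rng = random.Random(seed)
    # Two-sided control limit for a single standardized variable
    limit = NormalDist().inv_cdf(1.0 - alpha_individual / 2.0)
    a, b = math.sqrt(rho), math.sqrt(1.0 - rho)
    signals = 0
    for _ in range(n_sim):
        f = rng.gauss(0.0, 1.0)  # shared factor that induces the correlation
        if any(abs(a * f + b * rng.gauss(0.0, 1.0)) > limit for _ in range(p)):
            signals += 1
    return signals / n_sim


p, alpha = 10, 0.05
# Closed form for independent variables: 1 - (1 - alpha)^p, about 0.40 here
exact_independent = 1 - (1 - alpha) ** p
naive = overall_false_alarm_rate(p, alpha, rho=0.0)
bonferroni_independent = overall_false_alarm_rate(p, alpha / p, rho=0.0)
bonferroni_correlated = overall_false_alarm_rate(p, alpha / p, rho=0.8)
```

Under these assumptions, the individual [alpha]-level charts give an overall rate far above [alpha], while the [alpha]/p limits stay below [alpha] and become more conservative as the correlation grows, which is the loss of sensitivity noted above.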
To avoid monitoring p variables separately, a simple alternative is to monitor a single aggregate measure that signals when inappropriate behavior occurs and then apply diagnostic methods to identify which variables are responsible for the signal. Hotelling's [T.sup.2] chart (Hotelling, 1947) is such an alternative; its statistic follows a [[chi].sup.2] distribution when the process variables are normally distributed. To locate abnormal variables following a signal from the [T.sup.2] chart, different signal decompositions have been developed to identify the major contributors to the alarm. Jackson (1985) proposed a decomposition based on Principal Components (PCs) to identify the most influential linear combinations of the variables. Although the identified directions of the PCs can sometimes have physical interpretations, they are not always meaningful to the experimenter. To attribute signals to individual variables, Doganaksoy et al. (1991) discussed simultaneous test procedures for diagnosing individual variables. Mason et al. (1995) developed a bottom-up orthogonal [T.sup.2] decomposition, which, however, depends on a prespecified order of variables and is therefore hard to implement unless shift structures are known a priori (Mason et al., 1997).
Several multivariate methods have been developed to simultaneously monitor and identify multiple abnormal variables. Hawkins (1991, 1993) proposed a regression-adjusted chart by implicitly assuming only one shift variable. Hayter and Tsui (1994) suggested a naive procedure--the M chart--based on the construction of exact simultaneous confidence intervals for each of the variable means. On the other hand, Runger (1996) introduced an improvement of the [T.sup.2] chart--the [U.sup.2] chart--based on a priori knowledge of the potential shift space, which is unknown in many situations. The [U.sup.2] chart is essentially a conventional [T.sup.2] chart in a reduced space, which improves statistical performance. When process fault information is not available, Runger et al. (2007) presented a process-oriented basis representation to express process variables as linear combinations of fault patterns and further developed optimal [U.sup.2] charts computed from the process-oriented coefficients. Using PCs, Gonzalez and Sanchez (2008) investigated principal alarms by identifying the variable factors and simulated fault patterns for out-of-control situations.
The above multivariate control charts have serious shortcomings in practice, as pointed out in Jiang and Tsui (2008). Specifically, although Hotelling's [T.sup.2] chart possesses certain optimal properties in terms of average run length, it cannot be directly used for fault diagnosis. The regression-adjusted chart may perform extremely poorly when the shift pattern aligns with the correlation structure among variables, although it has a better detection performance when only one variable shifts. The M chart ignores correlations among variables and performs unsatisfactorily in most cases when only a small portion of the variables change. The [U.sup.2] chart cannot be used efficiently without a priori information on the non-fault variables. This paper proposes an extension of the [U.sup.2] chart that uses the idea of M charts to adaptively capture shift information from sample measurements. Unlike the conventional [U.sup.2] chart, the proposed procedure is data oriented, which allows the sample data to project the fault information. Unlike Doganaksoy et al. (1991), where the M statistic is used for fault diagnosis following [T.sup.2] signals, the proposed procedure takes advantage of pretest information provided by the M statistic to construct an adaptive [U.sup.2] statistic. Where no confusion arises, we use the terms adaptive [T.sup.2] chart and adaptive [U.sup.2] chart interchangeably.
No chart can uniformly outperform all other charts in detecting shift patterns (Anderson, 1984; Jiang and Tsui, 2008). It is therefore not practical to design a control chart that is the best performer in every scenario; however, it is theoretically possible to construct a control chart that performs close to the best in most scenarios. It is shown later that the proposed adaptive [T.sup.2] chart is not the best in terms of run length performance compared with the above individual multivariate control charts in some scenarios. The contribution of the proposed method is that it includes Hotelling's [T.sup.2] chart and the M chart as special cases, and is robust in detecting various shift patterns while its run length performance can be tuned to be competitive with the best chart when parameters are appropriately chosen. It may not necessarily replace Hotelling's [T.sup.2] chart in practice, but may outperform the latter in most cases while safeguarding against worst-case situations. More importantly, it provides a reduced subset of variables ready for diagnosis once an out-of-control signal is triggered, which can significantly lower the burden of fault diagnosis in practice. Simulation results show that the fault diagnosis capability of the proposed approach is often superior to other charts, especially when fault patterns match the in-control process variations (Gonzalez and Sanchez, 2008).
The rest of this paper is organized as follows. In Section 2, several popular multivariate control charts are briefly reviewed and the integration of SPC monitoring and fault diagnosis is elaborated. In Section 3, an adaptive [T.sup.2] control chart is proposed to integrate pre-test information into process monitoring. In Section 4, Average Run Length (ARL) performance and fault diagnosis probabilities of these control charts are extensively evaluated when the vector dimension is two. In Section 5, an illustrative example is presented to compare various control charts for monitoring high-dimensional vectors. Concluding remarks and future research issues are discussed in Section 6.
2. Multivariate process monitoring and diagnosis
Assume a p-dimensional process measurement x = ([x.sub.1], ..., [x.sub.p])' is normally distributed as N([mu], [SIGMA]) with probability density:
$$L(x) = (2\pi)^{-p/2}\,|\Sigma|^{-1/2}\exp\left[-\tfrac{1}{2}(x - \mu)'\,\Sigma^{-1}(x - \mu)\right], \tag{1}$$
where [mu] = ([[mu].sub.1], [[mu].sub.2], ..., [[mu].sub.p])' is the mean vector and [SIGMA] = [([[sigma].sub.ij]).sub.[pxp]] is the covariance matrix. In this paper, we assume that [mu] and [SIGMA] are known or can be estimated from an initial large pool of observations and we are interested in detecting changes in the mean vector [mu]. Without loss of generality, assume [mu] = 0 for simplicity.
To test if the process measurement x has a mean vector 0, without knowing the fault patterns a priori, Jiang and Tsui (2008) show that Hotelling's [T.sup.2] chart statistic
$$T^2(x) = x'\,\Sigma^{-1}x, \tag{2}$$
can be derived from generalized likelihood ratio test principles and is the most powerful "affine invariant" test regarding any parameterizations or linear transformations (Anderson, 1984). It issues a signal when [T.sup.2] > [c.sub.T] with [c.sub.T] = [[chi].sub.[p,[alpha]].sup.2], the (1 - [alpha]) quantile of the [[chi].sub.p.sup.2] distribution. When [SIGMA] is unknown and replaced by the sample covariance matrix [^.[SIGMA]] obtained from a sample of size n, Hotelling's statistic, when multiplied by the constant
$$\frac{n(n - p)}{p(n + 1)(n - 1)},$$
follows an F(p, n - p) distribution, and the critical value can be obtained accordingly (Tracy et al., 1992). Although this paper is mainly concerned with individual observations x, it is not difficult to generalize the presented results to the sample mean [bar.x] of subgroups of size m. The [T.sup.2] statistic then becomes [T.sup.2] = m[bar.x]'[[SIGMA].sup.-1][bar.x], where [bar.x] ~ N(0, [SIGMA]/m).
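As a concrete illustration of statistic (2) and its control limit, the following minimal sketch covers only the bivariate case (p = 2) with a known covariance matrix. The function names and the numerical values are hypothetical and chosen for illustration. For p = 2 the chi-square upper tail probability is exp(-c/2), so the critical value has the closed form -2 ln([alpha]) and no statistical library is needed.

```python
import math


def t2_statistic(x, sigma, m=1):
    """Hotelling's T^2 = m * x' Sigma^{-1} x for a bivariate observation
    (m = 1) or a subgroup mean of size m, with in-control mean 0."""
    (s11, s12), (s21, s22) = sigma
    det = s11 * s22 - s12 * s21
    if det <= 0:
        raise ValueError("sigma must be positive definite")
    x1, x2 = x
    # Quadratic form x' Sigma^{-1} x using the closed-form 2x2 inverse
    quad = (s22 * x1 * x1 - (s12 + s21) * x1 * x2 + s11 * x2 * x2) / det
    return m * quad


def chi2_critical_value_2df(alpha):
    """Upper-alpha point of the chi-square distribution with 2 degrees of
    freedom: P(chi2_2 > c) = exp(-c / 2), hence c = -2 ln(alpha)."""
    return -2.0 * math.log(alpha)


# Illustrative values only: unit variances, correlation 0.8
sigma = [[1.0, 0.8], [0.8, 1.0]]
c_t = chi2_critical_value_2df(0.05)           # about 5.99
t2_along = t2_statistic((1.5, 1.5), sigma)    # shift along the correlation
t2_against = t2_statistic((1.5, -1.5), sigma) # shift against the correlation
signal_along = t2_along > c_t
signal_against = t2_against > c_t
```

With these illustrative numbers, both observations lie within 1.96 standard deviations on each variable, yet only the one that contradicts the positive correlation produces a [T.sup.2] value above the limit. This is the kind of joint behavior that individual charts cannot capture.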
To facilitate fault diagnosis, Hayter and Tsui (1994) proposed an M chart which used the construction of exact simultaneous confidence intervals for each of the variable means to provide an exact false alarm rate of [alpha], i.e., a signal is triggered when:
$$M(x) = \max_{1 \le i \le p} \frac{|x_i|}{\sqrt{\sigma_{ii}}} > c_M, \tag{3}$$
where the critical point [c.sub.M] is defined by
$$P\left(\frac{|X_i|}{\sqrt{\sigma_{ii}}} \le c_M;\ 1 \le i \le p\right) = 1 - \alpha. \tag{4}$$
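The critical point defined in (4) generally has no closed form when the variables are correlated. One way to approximate it is by simulation, sketched below for the bivariate case only. This is an illustrative approach and not necessarily the computation used by Hayter and Tsui (1994); the function names, sample size, and seed are hypothetical.

```python
import math
import random
from statistics import NormalDist


def m_chart_critical_value(rho, alpha, n_sim=50000, seed=7):
    """Approximate c_M in P(max_i |X_i| / sqrt(sigma_ii) <= c_M) = 1 - alpha
    for a bivariate normal with correlation rho, by Monte Carlo."""
    rng = random.Random(seed)
    b = math.sqrt(1.0 - rho * rho)
    maxima = []
    for _ in range(n_sim):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + b * rng.gauss(0.0, 1.0)  # Corr(z1, z2) = rho
        maxima.append(max(abs(z1), abs(z2)))
    maxima.sort()
    # Empirical (1 - alpha) quantile of the in-control maximum
    return maxima[math.ceil((1.0 - alpha) * n_sim) - 1]


def m_chart_flags(x, sigma_diag, c_m):
    """Indices of variables whose standardized magnitude exceeds c_M."""
    return [i for i, (xi, sii) in enumerate(zip(x, sigma_diag))
            if abs(xi) / math.sqrt(sii) > c_m]


alpha = 0.05
c_independent = m_chart_critical_value(0.0, alpha)
c_correlated = m_chart_critical_value(0.9, alpha)
# Exact value for two independent variables: (2 * Phi(c) - 1)^2 = 1 - alpha
c_exact_independent = NormalDist().inv_cdf((1.0 + math.sqrt(1.0 - alpha)) / 2.0)
flagged = m_chart_flags((3.0, 0.5), (1.0, 1.0), c_independent)
```

In this sketch the simulated limit for independent variables can be checked against the exact value, and the limit for strongly correlated variables falls between the single-variable limit of about 1.96 and the independent-case limit. The helper `m_chart_flags` returns the variables whose standardized values exceed the limit.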
Variables that exceed the control limit are deemed to be out of control. To take into account correlations...
|