Home | Business News | Browse by Publication | J | Journal of the American Statistical Association

A hierarchical framework for modeling and forecasting web server workload.

Publication: Journal of the American Statistical Association
Publication Date: 01-SEP-05
Format: Online
Delivery: Immediate Online Access

Article Excerpt
1. INTRODUCTION

A web server farm is a cluster (or system) of servers shared by several websites and maintained by a host service provider. Computing resources are dynamically (rather than statically) allocated according to the specific needs at different websites. A service-level agreement is used to define the quality of service (QoS) for different classes of service requests and the revenue/cost model. In such an environment, accurate online prediction of server workload, with sufficiently long horizon, is key to ensuring that adequate computing resources are allocated in a timely manner to meet temporary increases of service requests of some classes at some sites while achieving certain systemwide performance objectives, such as maintaining the QoS requirements of the entire server farm or maximizing the total revenue of the server farm under the QoS constraints (Almeida and Menasce 2002; Chandra, Gong, and Shenoy 2003; Doyle, Chase, Asad, Jen, and Vahdat 2003; Pazel et al. 2002).

A key measure of server workload is the amount of service requests per unit time (or the request arrival rate). It can be, for example, the total size or number of files requested per unit time; it can also be the total number of operations requested per unit time. A time series of such requests usually fluctuates randomly over time, with different statistical characteristics over different time windows in different time granularities. A time series of requests can be stationary but self-similar (i.e., long-range dependent) and/or heavy-tailed over a small time window (in, e.g., seconds or minutes) at a fine time granularity (in, e.g., milliseconds) (Arlitt and Williamson 1997; Crovella and Bestavros 1997; Gallardo, Makrakis, and Angulo 2001; Kant and Won 1999); it can also exhibit strong daily patterns and nonstationarity over a bigger time window (in, e.g., days or weeks) at a coarser time granularity (in, e.g., minutes), where the nonstationarity may occur during weekends and holidays as well as in different periods of the day (Arlitt and Williamson 1997; Arlitt, Krishnamurthy, and Rolia 2001; Kant and Won 1999; Squillante, Yao, and Zhang 1999). Whereas the first type of data has been intensively studied for the purpose of improving the performance of routing and scheduling (load balancing), the second type of data plays an important role in designing optimal strategies of dynamic resource allocation for the purpose of efficiently and smoothly managing web server farms. The focus of this article is on the second type of data.

It has been shown that short-term prediction, with prediction horizons less than a few minutes, is useful in detecting anomalies of workload for early warning of service disruptions (Hellerstein, Zhang, and Shahabuddin 1999; Shen and Hellerstein 2000) and in mitigating temporary QoS deterioration by dynamic resource allocation (Chandra et al. 2003). In this article a hierarchical approach is developed to produce not only short-term prediction, but also long-term prediction, with lead time equaling 1 day or more, by exploiting the daily patterns in the time series of requests. Similar problems occur in other applications, such as electric utility management (Douglas, Breipohl, Lee, and Adapa 1998; Gross and Galiana 1987; Moghram and Rahman 1989; Nogales, Contreras, Conejo, and Espinola 2002; Park, Park, and Lee 1991; Ramanathan, Engle, Granger, Vahid-Araghi, and Brace 1997; Sargunaraj, Sen Gupta, and Devi 1997). Although the short-term prediction is needed to accommodate rapid fluctuations, the long-term prediction can be used to enhance the flexibility of dynamic resource management, because it reduces the magnitude (hence complexity) of short-term adjustment and provides sufficient lead time for strategic planning.

Although several methods have been considered for workload (or traffic) modeling and/or prediction based on traditional techniques, such as seasonal autoregressive integrated moving-average (SARIMA) (Basu, Mukherjee, and Klivansky 1996; Bolot and Hoschka 1996; Chandra and Eckberg 1997; Chen 2002), the method proposed in this article has the following desirable properties:

1. It provides not only point predictions, but also simultaneous confidence bands that can be used to support flexible service-level agreements (e.g., probability-based service guarantees) and the corresponding optimal strategies of resource allocation.

2. It has the capability of automatically handling nonstationarity in both daily patterns and short-term fluctuations, because it allows the model parameters to change with time.

3. It is computationally efficient, because it uses fast algorithms to compute the estimates of time-varying parameters on-line.

4. Its hierarchical approach allows easy diagnostic checking of model adequacy.

In the proposed method, the time series of requests is decomposed into a long-term component and a short-term component, where the former is represented by a linear combination of certain basis functions with random amplitudes and the latter is represented by a traditional time series model. The basis functions in the long-term component are selected to compress the daily patterns into a low-dimensional subspace suitable for efficient modeling and computation; the amplitudes of the basis functions are modeled as random processes to handle the trend and fluctuation of daily patterns. A simple multiregime model is used to deal with abrupt changes due to predetermined events such as weekdays and weekends. On-line adaptive algorithms are used to compute the estimates of time-varying parameters in both long-term and short-term models that are optimized specifically for the given prediction horizons. Key characteristics of service requests, such as time-dependent variability (heteroscedasticity), heavy-tailed non-Gaussian distribution, and residual serial dependence (correlation), are taken into account to construct simultaneous confidence bands for both long-term and short-term predictions.

2. HIERARCHICAL FRAMEWORK

Although the proposed method is applicable in more general situations, let us consider, for simplicity of presentation, an exemplary server farm where service requests are measured at the server level. Assume that the requests are recorded at each server and aggregated in nonoverlapping time intervals of length [DELTA] > to form a time series. An example is shown in Figure 1, where the time series comprises the logarithm of total file sizes of hypertext transfer protocol (HTTP) requests within every 5-minute interval (i.e., [DELTA] = 5 minutes) over a period of 13 days at a commercial website. This time series is used as a working example throughout the article. Note that the log transform is used here to eliminate, or reduce, dependence of the (local) variability on the (local) mean of the untransformed file sizes. (The former tends to rise with the latter.) In general, let [x.sub.k](t) [member of] R denote the amount of service requests measured at server k in time interval [DELTA]t (k = 1,..., s; t = 1, 2,...), where s is the total number of servers available.

[FIGURE 1 OMITTED]

As shown in Figure 1, the time series contains prominent periodic (but not deterministic) daily patterns. This is a typical characteristic of the websites of a wide range of industries (e.g., retail, finance, and insurance) and for many types of request measurements (e.g., total file sizes or file counts, total HTTP operations). Let the periodicity of such periodic patterns (i.e., the number of samples per period) be denoted by p; for example, p = 288 for the daily patterns shown in Figure 1. Once p is given, the time index t can be expressed as t = ([tau] - 1)p + r (r = 1,..., p; [tau] = 1, 2,...), meaning that t is the rth time interval (of length [DELTA]) in period [tau].

Motivated by examples such as Figure 1, let us consider the following model:

[x.sub.k](t) = [y.sub.k](t) + [z.sub.k](t), [y.sub.k](t) := [summation over (j[member of][C.sub.k])][[xi].sub.jk]([tau])[[phi].sub.jk](t), (1)

where for each given k, [C.sub.k] is a subset of [I.sub.p] := {1,..., p}, the [[phi].sub.jk](t) are p-periodic functions of t such that [[phi].sub.jk] := vec{[[phi].sub.jk](t)}[.sub.t=1.sup.p] (j = 1,..., p) form a basis (not necessarily orthogonal) of [l.sub.2]([R.sup.p]), the [[xi].sub.jk]([tau]) are random variables, and [z.sub.k](t) is a random process with mean 0.

In this model the periodic, long-term patterns are represented by [y.sub.k](t) as a linear combination of a subset of the basis functions whose amplitudes fluctuate randomly from one period to another; the remaining intraperiod, short-term fluctuations are represented by [z.sub.k](t) = [x.sub.k](t) - [y.sub.k](t) as a mean-0 random process. Li and Hinich (2002) provided a general discussion on this model and its comparison with traditional models, such as SARIMA and periodic autoregression (PAR).

Although more sophisticated models (e.g., general state-space models and nonlinear dynamic models) are possible alternatives, it is assumed in this article that [xi]([tau]) := vec{[[xi].sub.jk]([tau]) : j [member of] [C.sub.k], k = 1,..., s} and z(t) := vec{[z.sub.k](t)}[.sub.k=1.sup.s] are both AR processes, not only because the AR models can approximate more complicated linear models, but also because the AR parameters required for prediction can be easily obtained by computationally efficient algorithms.

If [xi]([tau]) is a stationary process, then the long-term component, [y.sub.k](t), is a cyclostationary process with a p-periodic mean function; that is, it satisfies

E{[y.sub.k](t + p)} = E{[y.sub.k](t)}

and

cov{[y.sub.k](t + p), [y.sub.k](t' + p)} = cov{[y.sub.k](t), [y.sub.k](t')},

for all t and t'. This constraint on [y.sub.k](t) can be relaxed for greater flexibility by allowing the parameters of [xi]([tau]) to change with time. For example, by allowing the mean of [xi]([tau]) to vary with [tau], one can introduce in [y.sub.k](t) a periodic or...

View this article FREE - Now for a Limited Time, try Goliath Business News
Free for 3 Days!



More articles from Journal of the American Statistical Association
Nonparametric inferences for additive models., September 01, 2005
Semiparametric regression analysis of longitudinal data with informati..., September 01, 2005
Dynamical correlation for multivariate longitudinal data., September 01, 2005
Estimation of long memory in the presence of a smooth nonparametric tr..., September 01, 2005
Measurement error in linear autoregressive models., September 01, 2005

Looking for additional articles?
Search our database of over 3 million articles.

Looking for more in-depth information on this industry?
Search our complete database of Industry & Market reports by text, subject, publication name or publication date.

About Goliath
Whether you're looking for sales prospects, competitive information, company analysis or best practices in managing your organization, Goliath can help you meet your business needs.

Our extensive business information databases empower business professionals with both the breadth and depth of credible, authoritative information they need to support their business goals. Whether it be strategic planning, sales prospecting, company research or defining management best practices - Goliath is your leading source for accurate information.