In this paper, we provide theoretical arguments and empirical evidence for how Genetic Algorithms (GA) can be used for efficient estimation of macro-level diffusion models. Using simulations, we find that GA and Sequential Search-Based Nonlinear Least Squares (SSB-NLS) provide comparable parameter estimates when data including peak sales are used, for a range of error variances and true parameter values commonly encountered in the literature. From empirical analyses, we find that the forecasting performance of the GA estimates is better than that of SSB-NLS, Augmented Filter, Hierarchical Bayes, and Kalman Filter when only pre-peak sales data are available for estimation. When sales data until the peak time period are available for estimation, SSB-NLS is able to obtain parameter estimates when the starting values provided are the estimates from GA. The estimates from GA are not biased and do not change in a systematic fashion when post-peak sales data are used, whereas the estimates from SSB-NLS are biased and change in a systematic fashion. In summary, we find that GA may be better suited for diffusion model estimation under the three conditions where SSB-NLS has been found to have problems.
Key words: Bass model; starting values; systematic change and bias; closed-form solution; nonlinear least squares; genetic algorithms; pre-peak sales forecasting
History: This paper was received August 30, 2002, and was with the authors 4 months for 5 revisions; processed by Roland Rust.
1. Introduction
Data available for estimation of the Bass model or its extensions are usually restricted to a set of 12 to 15 observations. There are two reasons for this. First, sales data are typically collected annually to avoid fluctuations in sales within a year and seasonality issues. Second, sales of most of the new products tend to stop growing, and in fact start decreasing, after 7 to 10 years. Because a manager's interest is very likely to diminish after the growth stage of a new product, researchers have to work with a smaller dataset in many cases. A natural outcome of this problem is researchers' interest in exploring more and more sophisticated estimation techniques that can extract as much information as possible from smaller datasets with maximum efficiency. Alternatively, researchers have also investigated using information such as advance purchase orders (Moe and Fader 2002) and spatial dimensions of product adoption (Garber et al. 2003) for early prediction of new product sales.
The chronology of the various estimation techniques and the benefits and drawbacks of each method are outlined in Table 1. Of the three time-invariant estimation techniques employed in the diffusion literature, namely, Ordinary Least Squares (OLS), Maximum Likelihood (ML), and Nonlinear Least Squares (NLS), it is generally accepted that NLS is the best option among the current alternatives (Putsis and Srinivasan 2000). For the Bass (1969) model, NLS is applied to the equation
(1) $s(t) = m\,[F(t) - F(t-1)] + \varepsilon(t)$,
where $s(t)$ is the sales function, $m$ is the market potential parameter, $\varepsilon(t)$ is the normal additive error, and $F(t)$ is the cumulative distribution function of the time of adoption, given by
(2) $F(t) = \dfrac{1 - e^{-(p+q)t}}{1 + (q/p)\,e^{-(p+q)t}}$,
t = time period, p = coefficient of innovation, and q = coefficient of imitation.
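As a minimal sketch of Equations (1) and (2), the following Python functions evaluate the Bass cumulative distribution, the noise-free per-period sales, and the sum of squared errors that NLS minimizes. The function names and the parameter values m = 1000, p = 0.03, q = 0.38 are illustrative assumptions, not values taken from this paper.

```python
import math

def bass_cdf(t, p, q):
    """Equation (2): cumulative fraction of adopters by time t."""
    e = math.exp(-(p + q) * t)
    return (1.0 - e) / (1.0 + (q / p) * e)

def bass_sales(t, m, p, q):
    """Equation (1) without the error term: expected sales in period t."""
    return m * (bass_cdf(t, p, q) - bass_cdf(t - 1, p, q))

def sse(params, sales):
    """Sum of squared errors between observed sales and the model.

    `sales` is a list of observations for periods t = 1, 2, ..., T.
    """
    m, p, q = params
    return sum((s - bass_sales(t, m, p, q)) ** 2
               for t, s in enumerate(sales, start=1))

# Illustrative (assumed) parameter values, not estimates from the paper.
m_true, p_true, q_true = 1000.0, 0.03, 0.38
periods = 15
noise_free = [bass_sales(t, m_true, p_true, q_true) for t in range(1, periods + 1)]

peak_period = max(range(1, periods + 1), key=lambda t: noise_free[t - 1])
print("peak period:", peak_period)
print("SSE at the generating parameters:", sse((m_true, p_true, q_true), noise_free))
```

Any estimation routine, whether sequential search or GA, can be viewed as a strategy for choosing the triple (m, p, q) that makes `sse` as small as possible.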
NLS used in popular computer packages employs a sequential search technique to obtain parameter estimates. (1) The widely used sequential search-based (SSB) NLS places three major restrictions on the estimation that span every stage of the product lifecycle. SSB-NLS estimation seems to have problems with data that covers three stages of a diffusion curve: pre-peak sales, peak sales, and post-peak sales (see Figure 1). With the pre-peak sales data, SSB-NLS has been repeatedly found to not achieve convergence (Srinivasan and Mason 1986, Lenk and Rao 1990). With the peak-sales data, it has been found that SSB-NLS's convergence largely depends on the initial values one provides for the parameters. With the post-peak sales data, it has been found that the SSB-NLS estimates of the Bass model are biased and change systematically as we add datapoints from later years (Van den Bulte and Lilien 1997, Bemmaor and Lee 2002, Venkatesan et al. 2000).
[FIGURE 1 OMITTED]
In §2, we provide theoretical arguments and intuition for how GA is, under certain circumstances, able to arrive at globally optimal parameter estimates more efficiently even when the response surface is multimodal and noisy. We also show how SSB-NLS has a probability of converging to a locally optimal solution in these cases. In §3, using simulated data, we show that the estimates from GA are similar to estimates from SSB-NLS under commonly encountered error variances and parameter values, provided full datasets are used for estimation. Then, using empirical datasets, we compare the performance of GA with SSB-NLS and other techniques proposed in the literature when the data do not contain peak sales, when there are data until peak sales, and when datapoints are added sequentially to post-peak sales data. Based on the results of our analyses in §4 and Appendix 2, we conclude that GA is able to produce better parameter estimates than SSB-NLS, as evident in lower Mean Squared Errors (MSE) and Mean Absolute Deviation (MAD) under the three data-related scenarios mentioned above.
2. Intuitive Expectations for the Performance of GA
In this section, we provide intuitive reasons for why we expect GA to perform better than SSB-NLS and other search algorithms (e.g., a grid search) even though the estimation techniques use the identical objective function. Details on the estimation of the Bass model using GA are provided in Appendix A.
Comparison with SSB-NLS. SSB-NLS and GA differ in several important and substantive ways, and because of these differences they need not always provide the same solutions. A vast body of research outside of marketing has investigated the properties of parameter estimates from GA (Del Moral and Miclo 2001, Dorsey and Mayer 1995) and has also compared the performance of estimates from GA with the traditional gradient search algorithms used in SSB-NLS (Salomon 1998, Dorsey and Mayer 1995). We summarize the major findings from these studies below:
* SSB-NLS techniques use single-point gradient search algorithms to locate the parameters that optimize the objective function (minimum sum of squared errors).
* GA uses parallel, evolutionary search algorithms to locate parameters that optimize the objective function (minimum sum of squared errors in this case).
* Theoretical expectations are that GA has a higher probability of convergence to globally optimal solutions when datapoints are few, the number of parameters is large, the parameter space is multimodal, and the model is inherently nonlinear (Del Moral and Miclo 2001). SSB-NLS, in contrast, depends on smooth and mostly quadratic surfaces to ensure convergence to a local optimum, with the expectation that if appropriate starting values are chosen the local optimum will coincide with the global optimum (Seber and Wild 1988).
* The estimates from GA provide better fit and forecasting performance than those from SSB-NLS for highly nonlinear functions (Dorsey and Mayer 1995), and the two methods represent inherently different classes of optimization techniques with different properties (Salomon 1998).
We further expand on these key features using an illustrative example shown in Figure 2.
[FIGURE 2 OMITTED]
Suppose that the solution space is not smooth and that it has a surface as shown in Figure 2. The optimal points A and C are local but are very dominant because a vast majority of starting values will move towards these two local optima. The global optimal point B is reachable only from a few starting values. In SSB-NLS, unless you happen to start in one of those few spots, you would never reach the global optimum, however hard you try. This is because current NLS packages search sequentially: the next point in the search has to have a lower Sum of Squared Errors (SSE), i.e., there has to be a systematic continuity. For example, if we start at x1 (see Figure 2), SSB-NLS will take us only to x2 and never to x3 or x4. In GA, if we start at x1, the search will consider not only x2 in the next step (as NLS does) but also x3 and/or x4. In other words, in each iteration GA not only maintains the systematic search that sequential search-based NLS performs, but also evaluates nonsequential points, so it keeps trying random values simultaneously. This is the strength of GA: it combines the advantage of sequential search-based NLS with that of random search, and therefore has a much better chance of reaching the global optimum.
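The contrast can be illustrated with a toy sketch in Python. Everything here is an assumption made for illustration: the one-dimensional `surface` function (two wide local wells and one narrow global well, loosely in the spirit of points A, C, and B), the step-halving `sequential_search`, and the simple elitist `genetic_search` with random immigrants. It is not the GA specification used in this paper, nor the algorithm of any particular NLS package.

```python
import math
import random

def surface(x):
    """Toy stand-in for an SSE surface: wide local wells near x = 2 and
    x = 8, and a narrow but deeper global well near x = 5."""
    return (-1.0 * math.exp(-((x - 2.0) / 1.5) ** 2)
            - 2.0 * math.exp(-((x - 5.0) / 0.3) ** 2)
            - 1.0 * math.exp(-((x - 8.0) / 1.5) ** 2))

def sequential_search(x, step=0.1, tol=1e-6):
    """Move to a neighbouring point only if it lowers the objective;
    otherwise halve the step. Stops when the step is below tol."""
    while step > tol:
        candidate = min((x - step, x + step), key=surface)
        if surface(candidate) < surface(x):
            x = candidate
        else:
            step /= 2.0
    return x

def genetic_search(rng, lo=0.0, hi=10.0, pop_size=40,
                   generations=80, immigrants=5):
    """Elitist GA: keep the fittest quarter, breed children by blending
    two elite parents plus Gaussian mutation, and add random newcomers."""
    pop = [rng.uniform(lo, hi) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=surface)
        elite = pop[: pop_size // 4]
        children = []
        while len(elite) + len(children) + immigrants < pop_size:
            a, b = rng.sample(elite, 2)
            w = rng.random()
            child = w * a + (1.0 - w) * b + rng.gauss(0.0, 0.2)
            children.append(min(hi, max(lo, child)))
        fresh = [rng.uniform(lo, hi) for _ in range(immigrants)]
        pop = elite + children + fresh
    return min(pop, key=surface)

local_x = sequential_search(1.0)              # starts inside a wide local well
ga_x = genetic_search(random.Random(0))       # seeded for reproducibility
print("sequential search:", local_x, surface(local_x))
print("genetic search:   ", ga_x, surface(ga_x))
```

In this sketch the sequential search never leaves the well it starts in, because every accepted move must lower the objective. The population-based search evaluates points scattered across the whole interval in each generation, which is what gives it a chance of landing in the narrow global well.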
One can also argue that by trying NLS with different starting values, one should theoretically reach the global optimum. However, the issue is that we currently do not have any algorithm to help us carry out such "searching for initial values." Take the Bass model, where we typically have m in the order of thousands, p in the order of thousandths and hundredths, and q in the order of hundredths and tenths. This means that with just three parameters, there are literally millions of possible starting value combinations. As mentioned above, we do not have a systematic way to check all these values. This problem becomes multifold:
(1) if we have a model such as the Generalized Bass Model (Bass et al. 1994) or cross-national diffusion models as in Kumar and Krishnan (2002) that have more than six parameters, and/or
(2) if we have only a few datapoints, in which case the solution space becomes very rough, and/or
(3) if...